LoginSignup
2
0

More than 3 years have passed since last update.

sql oracle icuライブラリを使用した半角⇔全角変換ファンクション

Last updated at Posted at 2019-12-24

まえがき

java言語拡張機能を用いたユーザー定義ファンクションを作成した。名寄せに使えるとおもう。

参考文献

Javaにおける文字列の全角⇔半角変換について

6.3.1 Javaクラスの作成または再利用

DBMS_LOCK.SLEEPの代替SQLをいろいろ考えてみよう(Oracle)

6.3.2 Javaクラスのロードおよび解決

6.3.3 Javaクラスの公開

6.3.4 ストアド・プロシージャのコール

2.5.7 Overview of Checking Java Uploads

環境

$java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
$sqlplus user01/ORACLE_PWD@ORCLPDB01

SQL*Plus: Release 19.0.0.0.0 - Production on 火 12月 24 23:11:52 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

最終正常ログイン時間: 火 12月 24 2019 22:50:24 +09:00


Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
に接続されました。
USER01@ORCLPDB01> 

スクリプト

まずは簡単なスクリプトから。

まず、java言語を利用したpublicstaticなメソッドを定義。
この例は引数なしのメソッド。

create or replace and resolve java source named Mock
as
public class Mock{
    public static String nack() {
        return "nnn";
    }
}
/

構文形式は以下のような感じ。

create or replace and resolve java source named クラス名
as
public class クラス名{
    public static 戻り値型 メソッド名() {
        return "nnn";//戻り値型の値を返却する処理;
    }
}
/

定義したjavaソースコードを接続しているスキーマ内で使用できるようにファンクション化。
戻り値の型はパッケージ名からのフル名称で指定してあげる。

create or replace function nack
return varchar2
as
    language java
    name 'Mock.nack() return java.lang.String';
/

sql文から実行。

select nack() from dual;

ぬぬぬ。

image.png

次は引数ありのメソッドを作成してみる。

sql文とjavaの型マッピングは以下を参考に。

7.2.2 データ型のマッピングについて

なお、create or replace and resolve java source named文を実行し、create or replace function文を実行せずに、sql文を実行すると、以下の様なoraエラーが出るが、実行しなおすと取得結果を得ることができる。

ORA-29549: クラスUSER01.Mockが変更されました。Javaのセッション状態はクリアされました
create or replace and resolve java source named Mock
as
import java.math.BigDecimal;
public class Mock{
    public static String nack(String s, BigDecimal n){
        return "args->"+"[String:"+s+"],[BigDecimal:"+n+"]";
    }
}
/

create or replace function nack(s varchar2,n number)
return varchar2
as
    language java
    name 'Mock.nack(java.lang.String,java.math.BigDecimal) return java.lang.String';
/

select nack(chr(level+96),level) from dual connect by level<=5;

image.png

次は標準ライブラリをインポートしたファンクションを作成してみる。

import文をas句の直後に差し込む。

create or replace and resolve java source named Mock
as
import java.text.Normalizer;
public class Mock{
    public static String nomaz(String s){
        return Normalizer.normalize(s,Normalizer.Form.NFKC);
    }
}
/

create or replace function nomaz(s varchar2)
return varchar2
as
    language java
    name 'Mock.nomaz(java.lang.String) return java.lang.String';
/

テストデータ投入。

開く
drop table test_tbl purge;
create table test_tbl (
    grp       number
    , unicode   varchar2(4000 byte)
    , pre       varchar2(4000 byte)
);
set define off;
insert into test_tbl (grp,unicode,pre) values (1,'FF01','!');
insert into test_tbl (grp,unicode,pre) values (1,'FF02','"');
insert into test_tbl (grp,unicode,pre) values (1,'FF03','#');
insert into test_tbl (grp,unicode,pre) values (1,'FF04','$');
insert into test_tbl (grp,unicode,pre) values (1,'FF05','%');
insert into test_tbl (grp,unicode,pre) values (1,'FF06','&');
insert into test_tbl (grp,unicode,pre) values (1,'FF07',''');
insert into test_tbl (grp,unicode,pre) values (1,'FF08','(');
insert into test_tbl (grp,unicode,pre) values (1,'FF09',')');
insert into test_tbl (grp,unicode,pre) values (1,'FF0A','*');
insert into test_tbl (grp,unicode,pre) values (1,'FF0B','+');
insert into test_tbl (grp,unicode,pre) values (1,'FF0C',',');
insert into test_tbl (grp,unicode,pre) values (1,'FF0D','-');
insert into test_tbl (grp,unicode,pre) values (1,'FF0E','.');
insert into test_tbl (grp,unicode,pre) values (1,'FF0F','/');
insert into test_tbl (grp,unicode,pre) values (1,'FF10','0');
insert into test_tbl (grp,unicode,pre) values (1,'FF11','1');
insert into test_tbl (grp,unicode,pre) values (1,'FF12','2');
insert into test_tbl (grp,unicode,pre) values (1,'FF13','3');
insert into test_tbl (grp,unicode,pre) values (1,'FF14','4');
insert into test_tbl (grp,unicode,pre) values (1,'FF15','5');
insert into test_tbl (grp,unicode,pre) values (1,'FF16','6');
insert into test_tbl (grp,unicode,pre) values (1,'FF17','7');
insert into test_tbl (grp,unicode,pre) values (1,'FF18','8');
insert into test_tbl (grp,unicode,pre) values (1,'FF19','9');
insert into test_tbl (grp,unicode,pre) values (1,'FF1A',':');
insert into test_tbl (grp,unicode,pre) values (1,'FF1B',';');
insert into test_tbl (grp,unicode,pre) values (1,'FF1C','<');
insert into test_tbl (grp,unicode,pre) values (1,'FF1D','=');
insert into test_tbl (grp,unicode,pre) values (1,'FF1E','>');
insert into test_tbl (grp,unicode,pre) values (1,'FF1F','?');
insert into test_tbl (grp,unicode,pre) values (1,'FF20','@');
insert into test_tbl (grp,unicode,pre) values (1,'FF21','A');
insert into test_tbl (grp,unicode,pre) values (1,'FF22','B');
insert into test_tbl (grp,unicode,pre) values (1,'FF23','C');
insert into test_tbl (grp,unicode,pre) values (1,'FF24','D');
insert into test_tbl (grp,unicode,pre) values (1,'FF25','E');
insert into test_tbl (grp,unicode,pre) values (1,'FF26','F');
insert into test_tbl (grp,unicode,pre) values (1,'FF27','G');
insert into test_tbl (grp,unicode,pre) values (1,'FF28','H');
insert into test_tbl (grp,unicode,pre) values (1,'FF29','I');
insert into test_tbl (grp,unicode,pre) values (1,'FF2A','J');
insert into test_tbl (grp,unicode,pre) values (1,'FF2B','K');
insert into test_tbl (grp,unicode,pre) values (1,'FF2C','L');
insert into test_tbl (grp,unicode,pre) values (1,'FF2D','M');
insert into test_tbl (grp,unicode,pre) values (1,'FF2E','N');
insert into test_tbl (grp,unicode,pre) values (1,'FF2F','O');
insert into test_tbl (grp,unicode,pre) values (1,'FF30','P');
insert into test_tbl (grp,unicode,pre) values (1,'FF31','Q');
insert into test_tbl (grp,unicode,pre) values (1,'FF32','R');
insert into test_tbl (grp,unicode,pre) values (1,'FF33','S');
insert into test_tbl (grp,unicode,pre) values (1,'FF34','T');
insert into test_tbl (grp,unicode,pre) values (1,'FF35','U');
insert into test_tbl (grp,unicode,pre) values (1,'FF36','V');
insert into test_tbl (grp,unicode,pre) values (1,'FF37','W');
insert into test_tbl (grp,unicode,pre) values (1,'FF38','X');
insert into test_tbl (grp,unicode,pre) values (1,'FF39','Y');
insert into test_tbl (grp,unicode,pre) values (1,'FF3A','Z');
insert into test_tbl (grp,unicode,pre) values (1,'FF3B','[');
insert into test_tbl (grp,unicode,pre) values (1,'FF3C','\');
insert into test_tbl (grp,unicode,pre) values (1,'FF3D',']');
insert into test_tbl (grp,unicode,pre) values (1,'FF3E','^');
insert into test_tbl (grp,unicode,pre) values (1,'FF3F','_');
insert into test_tbl (grp,unicode,pre) values (1,'FF40','`');
insert into test_tbl (grp,unicode,pre) values (1,'FF41','a');
insert into test_tbl (grp,unicode,pre) values (1,'FF42','b');
insert into test_tbl (grp,unicode,pre) values (1,'FF43','c');
insert into test_tbl (grp,unicode,pre) values (1,'FF44','d');
insert into test_tbl (grp,unicode,pre) values (1,'FF45','e');
insert into test_tbl (grp,unicode,pre) values (1,'FF46','f');
insert into test_tbl (grp,unicode,pre) values (1,'FF47','g');
insert into test_tbl (grp,unicode,pre) values (1,'FF48','h');
insert into test_tbl (grp,unicode,pre) values (1,'FF49','i');
insert into test_tbl (grp,unicode,pre) values (1,'FF4A','j');
insert into test_tbl (grp,unicode,pre) values (1,'FF4B','k');
insert into test_tbl (grp,unicode,pre) values (1,'FF4C','l');
insert into test_tbl (grp,unicode,pre) values (1,'FF4D','m');
insert into test_tbl (grp,unicode,pre) values (1,'FF4E','n');
insert into test_tbl (grp,unicode,pre) values (1,'FF4F','o');
insert into test_tbl (grp,unicode,pre) values (1,'FF50','p');
insert into test_tbl (grp,unicode,pre) values (1,'FF51','q');
insert into test_tbl (grp,unicode,pre) values (1,'FF52','r');
insert into test_tbl (grp,unicode,pre) values (1,'FF53','s');
insert into test_tbl (grp,unicode,pre) values (1,'FF54','t');
insert into test_tbl (grp,unicode,pre) values (1,'FF55','u');
insert into test_tbl (grp,unicode,pre) values (1,'FF56','v');
insert into test_tbl (grp,unicode,pre) values (1,'FF57','w');
insert into test_tbl (grp,unicode,pre) values (1,'FF58','x');
insert into test_tbl (grp,unicode,pre) values (1,'FF59','y');
insert into test_tbl (grp,unicode,pre) values (1,'FF5A','z');
insert into test_tbl (grp,unicode,pre) values (1,'FF5B','{');
insert into test_tbl (grp,unicode,pre) values (1,'FF5C','|');
insert into test_tbl (grp,unicode,pre) values (1,'FF5D','}');
insert into test_tbl (grp,unicode,pre) values (1,'FF5E','~');
insert into test_tbl (grp,unicode,pre) values (1,'FF5F','⦅');
insert into test_tbl (grp,unicode,pre) values (1,'FF60','⦆');
insert into test_tbl (grp,unicode,pre) values (1,'FF61','。');
insert into test_tbl (grp,unicode,pre) values (1,'FF62','「');
insert into test_tbl (grp,unicode,pre) values (1,'FF63','」');
insert into test_tbl (grp,unicode,pre) values (1,'FF64','、');
insert into test_tbl (grp,unicode,pre) values (1,'FF65','・');
insert into test_tbl (grp,unicode,pre) values (1,'FF66','ヲ');
insert into test_tbl (grp,unicode,pre) values (1,'FF67','ァ');
insert into test_tbl (grp,unicode,pre) values (1,'FF68','ィ');
insert into test_tbl (grp,unicode,pre) values (1,'FF69','ゥ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6A','ェ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6B','ォ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6C','ャ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6D','ュ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6E','ョ');
insert into test_tbl (grp,unicode,pre) values (1,'FF6F','ッ');
insert into test_tbl (grp,unicode,pre) values (1,'FF70','ー');
insert into test_tbl (grp,unicode,pre) values (1,'FF71','ア');
insert into test_tbl (grp,unicode,pre) values (1,'FF72','イ');
insert into test_tbl (grp,unicode,pre) values (1,'FF73','ウ');
insert into test_tbl (grp,unicode,pre) values (1,'FF74','エ');
insert into test_tbl (grp,unicode,pre) values (1,'FF75','オ');
insert into test_tbl (grp,unicode,pre) values (1,'FF76','カ');
insert into test_tbl (grp,unicode,pre) values (1,'FF77','キ');
insert into test_tbl (grp,unicode,pre) values (1,'FF78','ク');
insert into test_tbl (grp,unicode,pre) values (1,'FF79','ケ');
insert into test_tbl (grp,unicode,pre) values (1,'FF7A','コ');
insert into test_tbl (grp,unicode,pre) values (1,'FF7B','サ');
insert into test_tbl (grp,unicode,pre) values (1,'FF7C','シ');
insert into test_tbl (grp,unicode,pre) values (1,'FF7D','ス');
insert into test_tbl (grp,unicode,pre) values (1,'FF7E','セ');
insert into test_tbl (grp,unicode,pre) values (1,'FF7F','ソ');
insert into test_tbl (grp,unicode,pre) values (1,'FF80','タ');
insert into test_tbl (grp,unicode,pre) values (1,'FF81','チ');
insert into test_tbl (grp,unicode,pre) values (1,'FF82','ツ');
insert into test_tbl (grp,unicode,pre) values (1,'FF83','テ');
insert into test_tbl (grp,unicode,pre) values (1,'FF84','ト');
insert into test_tbl (grp,unicode,pre) values (1,'FF85','ナ');
insert into test_tbl (grp,unicode,pre) values (1,'FF86','ニ');
insert into test_tbl (grp,unicode,pre) values (1,'FF87','ヌ');
insert into test_tbl (grp,unicode,pre) values (1,'FF88','ネ');
insert into test_tbl (grp,unicode,pre) values (1,'FF89','ノ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8A','ハ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8B','ヒ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8C','フ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8D','ヘ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8E','ホ');
insert into test_tbl (grp,unicode,pre) values (1,'FF8F','マ');
insert into test_tbl (grp,unicode,pre) values (1,'FF90','ミ');
insert into test_tbl (grp,unicode,pre) values (1,'FF91','ム');
insert into test_tbl (grp,unicode,pre) values (1,'FF92','メ');
insert into test_tbl (grp,unicode,pre) values (1,'FF93','モ');
insert into test_tbl (grp,unicode,pre) values (1,'FF94','ヤ');
commit;

sql文を実行して、全角文字と半角文字それぞれ確認してみると、全角文字は半角文字に、半角文字は全角文字に変換されている。

with sub as(
    select
        grp
        ,row_number()over(partition by ceil(rownum/101) order by rownum) as subgrp
        ,unicode
        ,pre
        ,nomaz(pre) as post
    from test_tbl
)select * from sub where subgrp<=5;

image.png

最後に外部ライブラリを使用したファンクションを作成してみる。
これは事前にちょっとした準備が必要。

まずは外部ライブラリをダウンロード。今回はICUが提供してくれているjarライブラリを落す。落とした後、loadjavaコマンドを通してoracleデータベースのインスタンスにjarライブラリを使用できるように認識1させる。

$curl -LO https://github.com/unicode-org/icu/releases/download/release-65-1/icu4j-65_1.jar
$ls
icu4j-65_1.jar

loadjavaコマンドの在処。オラクルインストール時に一緒に梱包されているコマンドの一つぽい。

$which loadjava
/opt/oracle/product/19c/dbhome_1/bin/loadjava
$ll /opt/oracle/product/19c/dbhome_1/bin/loadjava
-rwxr-xr-x. 1 oracle oinstall 2459  4月 18  2019 /opt/oracle/product/19c/dbhome_1/bin/loadjava

loadjavaコマンドの詳細な使い方はhelpで。

開く
$loadjava -help
loadjava: load classes, jars, resources or properties files into Oracle.
Usage: loadjava [options] classes..jars..resources..properties...
Options:
  -d, -definer             gives the methods of the class definers rights
  -encoding <encoding>     the encoding to be used to read .java files 
  -f, -force               force loading of all classes.  Normally classes
                           identical to classes already loaded are not
                           reloaded.
  -g, -grant <grants>      grant execute permission on all loaded classes
                           and resources to the users and roles
                           listed in the comma separated list <grants>.
  -genmissing              if the classes and jars refer to classes that
                           are not present, generate dummy versions of 
                           the missing classes.
  -genmissingjar <jar>     In addition to the actions of -genmissing 
                           create a jar containing the generated classes.
  -h, -help                print out this message.
  -nousage                 do print a usage message if there are no file
                           arguments
  -noverify                do not verify bytecodes. 
  -o, -oci8                use JDBC OCI8.  The database must be an entry
                           from TNSNAMES.ORA file or a Net8 Name-Value list.
  -order                   resolve classes in "bottom up" order 
  -r, -resolve             resolve all the classes after loading them.  
  -nativecompile           natively compile methods in classes after loading 
                           and resolving them.  This option forces loaded 
                           classes to be resolved. 
  -resolver <resolver>     use <resolver> as the resolver for the loaded
                           classes.  As resolvers contain special characters
                           they should be quoted on the command line.
  -schema <schema>         loads into <schema> rather than schema of user
                           schema being loaded into.
  -s, -synonym             create public synonyms for the loaded classes.
                           You must have CREATE PUBLIC SYNONYM privilege.
  -tableschema <schema>    Use <schema> to hold all tables used by loadjava
                           instead of putting the tables in the schema 
                           where classes are being loaded.
  -t, -thin                use JDBC THIN.  The database must be of the form
                           <host>:<port>:<sid> or a Net8 Name-Value list.
  -user user/password@database   account and database where to load the
                            classes and resources.
  -v, -verbose             print some information as it loads the files.
                           recommended to find missing classes early.
  -proxy host:port         Host is the proxy host name or internet address.
                           Port is the proxy port number.
  -edition <edition name>  The application edition into which to load objects

jarファイルが存在するディレクトリで-userオプションの引数にユーザー名/パスワード@データベース名を指定し、後続の引数に対象のjarファイルを渡して実行。10秒ぐらい掛かった。インスタンスは落とさずに実行した。

[oracle💚af009cfe1ee6 (火 12月 24 23:53:01) ~]$ls icu4j-65_1.jar
icu4j-65_1.jar
[oracle💚af009cfe1ee6 (火 12月 24 23:53:01) ~]$loadjava -user user01/ORACLE_PWD@ORCLPDB01 icu4j-65_1.jar

上記の事前準備を終えたあと、今迄と同様にjava言語で記載したプログラムとそれを呼び出すファンクションを作成。
テストデータは同じ。

全角から半角へ変換
create or replace and resolve java source named Mock
as
import com.ibm.icu.text.Transliterator;
public class Mock{
    public static String full2half(String s){
        return Transliterator.getInstance("Fullwidth-Halfwidth").transliterate(s);
    }
}
/

create or replace function full2half(s varchar2)
return varchar2
as
    language java
    name 'Mock.full2half(java.lang.String) return java.lang.String';
/

with sub as(
    select
        grp
        ,row_number()over(partition by ceil(rownum/101) order by rownum) as subgrp
        ,unicode
        ,pre
        ,full2half(pre) as post
    from test_tbl
)select * from sub where subgrp<=5;

全角文字は半角文字に、半角文字は半角文字のまま。

image.png

半角から全角へ変換
create or replace and resolve java source named Mock
as
import com.ibm.icu.text.Transliterator;
public class Mock{
    public static String half2full(String s){
        return Transliterator.getInstance("Halfwidth-Fullwidth").transliterate(s);
    }
}
/

create or replace function half2full(s varchar2)
return varchar2
as
    language java
    name 'Mock.half2full(java.lang.String) return java.lang.String';
/

with sub as(
    select
        grp
        ,row_number()over(partition by ceil(rownum/101) order by rownum) as subgrp
        ,unicode
        ,pre
        ,half2full(pre) as post
    from test_tbl
)select * from sub where subgrp<=5;

全角文字は全角文字のまま、半角文字は全角文字に。

image.png

まとめ

さいごに構文形式をおさらい。

7.2.2 データ型のマッピングについて

--javaプログラムを定義
create or replace and resolve java source named クラス名
as
import com.ibm.icu.text.Transliterator;
public class クラス名{
    public static 戻り値の型 メソッド名(引数の型 s){
        //戻り値の型の値を返却するような処理を施したあと、その値をreturn文で返却。
        retutn 戻り値の型の値;
    }
}
/

--javaプログラムをファンクション化
create or replace function メソッド名(s javaの型と互換のあるsqlデータ型)
return javaの型と互換のあるsqlデータ型
as
    language java
    name 'クラス名.メソッド名(javaの型をフルパス表記) return javaの型をフルパス表記';
/

--sql文で使用
select メソッド名(tgt) from dual;

あとがき

javaの機能を外部ライブラリの助けを借りつつもsql文のファンクション形式で使用できるのは便利だと思う。
javaのapiに強い人が外部ライブラリを交えながら、色んなsqlファンクションを共有ないしは公開すると、表現の幅が広がって楽しいと思う。^^

相乗効果でoracleのsql盛り上がるといいな。

これを機にjavaのapiについて興味が出て来たので、いい機会だったと思う。

以上、ありがとうございました。

20191227追記

データ型のマッピングに関しては以下も参考になるか。


  1. 認識したか判定。メソッド検索できる。select object_name, object_type, status from user_objects where object_type in ('JAVA SOURCE', 'JAVA CLASS', 'JAVA RESOURCE')and regexp_like(object_name,'com.ibm.icu.text.Transliterator')order by object_type, object_name; 

2
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
0