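NFKC, NFKD
NFKC: Normalization Form Compatibility Composition
NFKD: Normalization Form Compatibility Decomposition
"Compatibility" here means that characters which are merely variants of one another are converted to a single unified form.
In my experiments, full-width digits, half-width digits, and circled digits such as ①② were all unified to half-width digits (１→1, ①→1), and half-width and full-width katakana were unified to full-width katakana (ｶﾞ→ガ).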
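The folding described above can be checked directly with `java.text.Normalizer`. This is a minimal sketch, not part of the original experiment: the class name `NfkcFoldingSketch` and the helper `nfkc` are made up for illustration, and the inputs are written as Unicode escapes so the full-width and half-width variants cannot be confused in the source file.

```java
import java.text.Normalizer;

public class NfkcFoldingSketch {

    // Hypothetical helper: normalize a string to NFKC.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String fullWidthOne = "\uFF11";      // full-width digit one
        String circledOne = "\u2460";        // circled digit one
        String halfWidthGa = "\uFF76\uFF9E"; // half-width KA + half-width dakuten

        // Each compatibility variant folds to the plain form under NFKC.
        System.out.println(nfkc(fullWidthOne).equals("1"));     // true
        System.out.println(nfkc(circledOne).equals("1"));       // true
        System.out.println(nfkc(halfWidthGa).equals("\u30AC")); // true (full-width GA)
    }
}
```

Folding with NFKC loses the distinction between the variants, so it suits search keys and comparisons better than text that must be displayed exactly as typed.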
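NFC, NFD
NFC: Normalization Form Canonical Composition
NFD: Normalization Form Canonical Decomposition
These are the non-compatibility (canonical) forms. Because they do not take compatibility into account,
[ｶﾞ, ガ] and [1, １, ①] are each treated as distinct characters.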
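To illustrate the point above, the sketch below applies NFC to each variant and compares the results. The class name `CanonicalFormsSketch` and the helper `sameUnderNfc` are hypothetical names chosen for this example only.

```java
import java.text.Normalizer;

public class CanonicalFormsSketch {

    // Hypothetical helper: true if the two strings are equal after NFC.
    static boolean sameUnderNfc(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String asciiOne = "1";
        String fullWidthOne = "\uFF11";      // full-width digit one
        String circledOne = "\u2460";        // circled digit one
        String fullWidthGa = "\u30AC";       // full-width GA
        String halfWidthGa = "\uFF76\uFF9E"; // half-width KA + half-width dakuten

        // NFC leaves compatibility variants alone, so none of these match.
        System.out.println(sameUnderNfc(asciiOne, fullWidthOne));   // false
        System.out.println(sameUnderNfc(asciiOne, circledOne));     // false
        System.out.println(sameUnderNfc(fullWidthGa, halfWidthGa)); // false
    }
}
```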
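Compose, Decompose
These choose whether the result is unified to the composed form (ガ as a single character) or to the decomposed form (カ followed by the combining dakuten, U+3099).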
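As a small illustration of composed versus decomposed forms, the sketch below converts ガ in both directions. The class name `ComposeDecomposeSketch` and its two helpers are assumptions made for this example; only `Normalizer.normalize` comes from the JDK.

```java
import java.text.Normalizer;

public class ComposeDecomposeSketch {

    // Hypothetical helpers wrapping the two canonical forms.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String composedGa = "\u30AC";         // GA as one precomposed character
        String decomposedGa = "\u30AB\u3099"; // KA + combining dakuten

        // NFD splits the precomposed character into base + combining mark.
        System.out.println(decompose(composedGa).equals(decomposedGa)); // true
        System.out.println(decompose(composedGa).length());             // 2

        // NFC joins base + combining mark back into one character.
        System.out.println(compose(decomposedGa).equals(composedGa));   // true
        System.out.println(compose(decomposedGa).length());             // 1
    }
}
```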
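package sampleproject;

import java.text.Normalizer;

/**
 * Prints each sample string under all four Unicode normalization forms.
 *
 * @author java
 */
public class SampleProject {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        char c1 = '\u3066'; // hiragana TE
        char c2 = '\u3099'; // combining dakuten (voiced sound mark)
        System.out.println(c1);
        System.out.println(c2);

        String s = "\u3066\u3099"; // TE + combining dakuten (decomposed DE)
        printall(s);
        s = "\u30AC"; // full-width katakana GA (precomposed)
        printall(s);
        s = "\uFF76\uFF9E"; // half-width katakana KA + half-width dakuten
        printall(s);
        s = "1"; // half-width (ASCII) digit
        printall(s);
        s = "\uFF11"; // full-width digit
        printall(s);
        s = "\u2460"; // circled digit one
        printall(s);
    }

    /** Prints the string as given, then in NFC, NFD, NFKC and NFKD. */
    static void printall(String str) {
        System.out.println("----------------");
        print(str, "");
        print(Normalizer.normalize(str, Normalizer.Form.NFC), "NFC");
        print(Normalizer.normalize(str, Normalizer.Form.NFD), "NFD");
        print(Normalizer.normalize(str, Normalizer.Form.NFKC), "NFKC");
        print(Normalizer.normalize(str, Normalizer.Form.NFKD), "NFKD");
    }

    /** Prints the form name, the string, and its length in UTF-16 code units. */
    static void print(String str, String opt) {
        System.out.println(opt + " " + str + ":" + str.length());
    }
}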
Output of the program:
て
゙
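----------------
で:2
NFC で:1
NFD で:2 // て + combining dakuten (U+3099)
NFKC で:1
NFKD で:2 // て + combining dakuten (U+3099)
----------------
ガ:1
NFC ガ:1
NFD ガ:2 // カ + combining dakuten (U+3099)
NFKC ガ:1
NFKD ガ:2 // カ + combining dakuten (U+3099)
----------------
ｶﾞ:2
NFC ｶﾞ:2
NFD ｶﾞ:2
NFKC ガ:1
NFKD ガ:2 // カ + combining dakuten (U+3099)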
----------------
1:1
NFC 1:1
NFD 1:1
NFKC 1:1
NFKD 1:1
----------------
１:1
NFC １:1
NFD １:1
NFKC 1:1
NFKD 1:1
----------------
①:1
NFC ①:1
NFD ①:1
NFKC 1:1
NFKD 1:1
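Because composed and decomposed strings often look identical on screen, the output above is easier to verify when the code points are printed as well. The following is a minimal sketch under that assumption; the class name `CodePointDumpSketch` and the helper `toCodePoints` are hypothetical and not part of the original program.

```java
import java.text.Normalizer;
import java.util.stream.Collectors;

public class CodePointDumpSketch {

    // Hypothetical helper: render each code point as U+XXXX, separated by spaces.
    static String toCodePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String halfWidthGa = "\uFF76\uFF9E"; // half-width KA + half-width dakuten

        // Print the code points of the input under every normalization form.
        for (Normalizer.Form form : Normalizer.Form.values()) {
            String normalized = Normalizer.normalize(halfWidthGa, form);
            System.out.println(form + " " + toCodePoints(normalized));
        }
    }
}
```

For the half-width input, NFC and NFD should leave U+FF76 U+FF9E untouched, NFKD should give U+30AB U+3099, and NFKC should give U+30AC, which matches the lengths 2, 2, 2, 1 in the output above.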