Byte-pattern tables for character encoding detection, parsing, and similar tasks.
Shift_JIS
| Length | byte[0] | byte[1] |
|---|---|---|
| 1 | 00-80, A0-DF, FD-FF | |
| 2 | 81-9F, E0-FC | 40-7E, 80-FC |
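The Shift_JIS table can be applied directly to split a byte string into characters. The following is a minimal sketch, assuming the byte ranges above are the only rule in play; the function names `sjis_char_length` and `sjis_split` are hypothetical, and the sketch does not check whether a 2-byte pattern is actually assigned to a character.

```python
def sjis_char_length(data: bytes, i: int = 0) -> int:
    """Return 1 or 2 for the character starting at data[i], or 0 if malformed."""
    b0 = data[i]
    if 0x81 <= b0 <= 0x9F or 0xE0 <= b0 <= 0xFC:
        # Lead byte of a 2-byte character: the trail byte must be 40-7E or 80-FC.
        if i + 1 < len(data):
            b1 = data[i + 1]
            if 0x40 <= b1 <= 0x7E or 0x80 <= b1 <= 0xFC:
                return 2
        return 0  # truncated input or trail byte out of range
    # Everything else (00-80, A0-DF, FD-FF) is a single byte in the table.
    return 1


def sjis_split(data: bytes) -> list:
    """Split data into 1- and 2-byte units according to the table."""
    chars = []
    i = 0
    while i < len(data):
        n = sjis_char_length(data, i)
        if n == 0:
            raise ValueError(f"malformed Shift_JIS sequence at offset {i}")
        chars.append(data[i:i + n])
        i += n
    return chars


# 82 A0 is a 2-byte character, 41 is ASCII "A", B1 is a 1-byte half-width katakana.
print(sjis_split(b"\x82\xa0\x41\xb1"))
```

Because the trail-byte range 40-7E overlaps ASCII, a scanner has to walk from the start of the string as above; it cannot tell from a single byte in isolation whether it is a character of its own or the second half of a 2-byte character.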
UTF-8
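- A UTF-8 byte pattern that deliberately uses a longer (overlong) form for a code point that could be represented in fewer bytes is treated as invalid, so those ranges are marked (invalid).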
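- The surrogate code points U+D800 - U+DFFF are reserved so that UTF-16 can encode U+10000 - U+10FFFF as surrogate pairs. They are not characters in their own right, so a UTF-8 sequence that encodes this range (ED A0-BF 80-BF) should not occur; RFC 3629 prohibits it.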
- Plane (Unicode)
  - As of Unicode 10.0 (2017-06-20), no characters were assigned in Planes 3-13 (U+30000 - U+DFFFF), including Plane 3 (U+30000 - U+3FFFF, TIP).
  - Starting with Unicode 13.0 (2020-03-10), characters are assigned in Plane 3 (U+30000 - U+3FFFF, TIP); Unicode 13.0 assigned 4,939 characters at U+30000-U+3134A. Planes 4-13 (U+40000 - U+DFFFF) are unassigned.
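The two rules above (overlong forms and surrogates are not valid UTF-8) can be observed with a strict decoder. The following is a small sketch using Python's built-in `utf-8` codec; the helper name `is_strict_utf8` and the sample byte strings are chosen here for illustration only.

```python
def is_strict_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


samples = {
    "U+002F, shortest form (2F)": b"\x2f",
    "U+002F, overlong 2-byte form (C0 AF)": b"\xc0\xaf",
    "U+002F, overlong 3-byte form (E0 80 AF)": b"\xe0\x80\xaf",
    "U+D800, surrogate (ED A0 80)": b"\xed\xa0\x80",
    "U+10000, 4-byte form (F0 90 80 80)": b"\xf0\x90\x80\x80",
}
for label, raw in samples.items():
    # Only the first and last samples are accepted.
    print(f"{label}: {is_strict_utf8(raw)}")

# UTF-16 represents U+10000 with the surrogate pair D800 DC00.
print("\U00010000".encode("utf-16-be").hex())  # d800dc00
```

Rejecting overlong forms matters in practice because a filter that looks only for the byte 2F would otherwise miss the same character hidden as C0 AF.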
| Length | Unicode | Description | byte[0] | byte[1] | byte[2] | byte[3] | byte[4] | byte[5] |
|---|---|---|---|---|---|---|---|---|
| 1 | U+0000 - U+007F | Basic Latin (ASCII) | 00-7F | | | | | |
| 2 | (invalid) | | C0-C1 | 80-BF | | | | |
| 2 | U+0080 - U+07FF | | C2-DF | 80-BF | | | | |
| 3 | (invalid) | | E0 | 80-9F | 80-BF | | | |
| 3 | U+0800 - U+0FFF | | E0 | A0-BF | 80-BF | | | |
| 3 | U+1000 - U+CFFF | | E1-EC | 80-BF | 80-BF | | | |
| 3 | U+D000 - U+D7FF | | ED | 80-9F | 80-BF | | | |
| 3 | U+D800 - U+DFFF | Surrogate Code Point | ED | A0-BF | 80-BF | | | |
| 3 | U+E000 - U+FFFF | U+E000-U+F8FF Private Use Area, U+F900-U+FFFD compatibility characters and specials | EE-EF | 80-BF | 80-BF | | | |
| 3 | (U+FEFF) | U+FEFF Byte Order Mark | EF | BB | BF | | | |
| 4 | (invalid) | | F0 | 80-8F | 80-BF | 80-BF | | |
| 4 | U+10000 - U+1FFFF | (Plane 1): Supplementary Multilingual Plane (SMP) | F0 | 90-9F | 80-BF | 80-BF | | |
| 4 | U+20000 - U+2FFFF | (Plane 2): Supplementary Ideographic Plane (SIP) | F0 | A0-AF | 80-BF | 80-BF | | |
| 4 | U+30000 - U+3FFFF | (Plane 3): Tertiary Ideographic Plane (TIP) | F0 | B0-BF | 80-BF | 80-BF | | |
| 4 | U+40000 - U+BFFFF | (Plane 4-11): unassigned | F1-F2 | 80-BF | 80-BF | 80-BF | | |
| 4 | U+C0000 - U+DFFFF | (Plane 12-13): unassigned | F3 | 80-9F | 80-BF | 80-BF | | |
| 4 | U+E0000 - U+EFFFF | (Plane 14): Supplementary Special-purpose Plane (SSP) | F3 | A0-AF | 80-BF | 80-BF | | |
| 4 | U+F0000 - U+FFFFF | (Plane 15): Supplementary Private Use Area A (SPUA-A) | F3 | B0-BF | 80-BF | 80-BF | | |
| 4 | U+100000 - U+10FFFF | (Plane 16): Supplementary Private Use Area B (SPUA-B) | F4 | 80-8F | 80-BF | 80-BF | | |
| | (U+110000 and above are outside the defined range) | | | | | | | |
| 4 | U+110000 - U+13FFFF | | F4 | 90-BF | 80-BF | 80-BF | | |
| 4 | U+140000 - U+1FFFFF | | F5-F7 | 80-BF | 80-BF | 80-BF | | |
| 5 | (invalid) | | F8 | 80-87 | 80-BF | 80-BF | 80-BF | |
| 5 | U+200000 - U+FFFFFF | | F8 | 88-BF | 80-BF | 80-BF | 80-BF | |
| 5 | U+1000000 - U+3FFFFFF | | F9-FB | 80-BF | 80-BF | 80-BF | 80-BF | |
| 6 | (invalid) | | FC | 80-83 | 80-BF | 80-BF | 80-BF | 80-BF |
| 6 | U+4000000 - U+3FFFFFFF | | FC | 84-BF | 80-BF | 80-BF | 80-BF | 80-BF |
| 6 | U+40000000 - U+7FFFFFFF | | FD | 80-BF | 80-BF | 80-BF | 80-BF | 80-BF |
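The UTF-8 table translates almost mechanically into a table-driven validator. The sketch below is one possible reading of it, under these assumptions: only rows up to U+10FFFF count as well-formed, and the (invalid) rows, the surrogate row, and every row from U+110000 upward are rejected, which is the restriction RFC 3629 places on UTF-8. Adjacent rows with the same lead byte are merged (for example F0 90-9F, A0-AF, and B0-BF become F0 90-BF). The names `WELL_FORMED`, `utf8_sequence_length`, and `is_well_formed_utf8` are hypothetical.

```python
# (lead-byte range, allowed range for byte[1], total length in bytes).
# Bytes after byte[1] are always plain continuation bytes, 80-BF.
WELL_FORMED = [
    ((0x00, 0x7F), None, 1),          # U+0000 - U+007F
    ((0xC2, 0xDF), (0x80, 0xBF), 2),  # U+0080 - U+07FF
    ((0xE0, 0xE0), (0xA0, 0xBF), 3),  # U+0800 - U+0FFF
    ((0xE1, 0xEC), (0x80, 0xBF), 3),  # U+1000 - U+CFFF
    ((0xED, 0xED), (0x80, 0x9F), 3),  # U+D000 - U+D7FF (surrogates excluded)
    ((0xEE, 0xEF), (0x80, 0xBF), 3),  # U+E000 - U+FFFF
    ((0xF0, 0xF0), (0x90, 0xBF), 4),  # U+10000 - U+3FFFF
    ((0xF1, 0xF3), (0x80, 0xBF), 4),  # U+40000 - U+FFFFF
    ((0xF4, 0xF4), (0x80, 0x8F), 4),  # U+100000 - U+10FFFF
]


def utf8_sequence_length(data: bytes, i: int = 0) -> int:
    """Return the length of the well-formed sequence at data[i], or 0 if none."""
    b0 = data[i]
    for (lo, hi), second, length in WELL_FORMED:
        if lo <= b0 <= hi:
            if i + length > len(data):
                return 0  # truncated sequence
            if length == 1:
                return 1
            s_lo, s_hi = second
            if not s_lo <= data[i + 1] <= s_hi:
                return 0  # overlong form, surrogate, or beyond U+10FFFF
            if all(0x80 <= b <= 0xBF for b in data[i + 2:i + length]):
                return length
            return 0
    return 0  # 80-BF, C0-C1, or F5-FF cannot start a sequence


def is_well_formed_utf8(data: bytes) -> bool:
    """Check a whole byte string against the table."""
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data, i)
        if n == 0:
            return False
        i += n
    return True


print(is_well_formed_utf8(b"\xef\xbb\xbf\xe3\x81\x82"))  # BOM + U+3042
print(is_well_formed_utf8(b"\xf4\x90\x80\x80"))          # U+110000
```

Restricting only byte[1] is enough because every invalid or out-of-range case in the table is distinguished by its first two bytes; the remaining bytes are always 80-BF.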