
PHP: A function to get a Unicode code point

This function takes a single UTF-8 character and returns its Unicode code point in U+XXXX notation.

var_dump(toCodePoint("\0")); //=> string(6) "U+0000"

var_dump(toCodePoint("\n")); //=> string(6) "U+000A"
var_dump(toCodePoint("A")); //=> string(6) "U+0041"
var_dump(toCodePoint("a")); //=> string(6) "U+0061"
var_dump(toCodePoint("あ")); //=> string(6) "U+3042"
var_dump(toCodePoint("𠮷")); //=> string(7) "U+20BB7"

function toCodePoint(string $char): string
{
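    // Convert to UCS-4 (4 bytes, big-endian), read those bytes as one hex number,
    // then format it as at least four uppercase hex digits.
    return sprintf("U+%04X", hexdec(bin2hex(mb_convert_encoding($char, 'UCS-4', 'UTF-8'))));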
}
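
For comparison, here is a minimal sketch of the same idea using `mb_ord()`, assuming PHP 7.2 or later with the mbstring extension enabled. The name `toCodePointViaOrd` is hypothetical and not part of the original function; it is only meant as a cross-check, not a replacement.

```php
<?php
declare(strict_types=1);

// Hypothetical alternative, not from the original post.
// Assumes PHP 7.2+ with mbstring, where mb_ord() is available.
function toCodePointViaOrd(string $char): string
{
    // mb_ord() returns the code point of the first character as an int,
    // or false on failure.
    $codePoint = mb_ord($char, 'UTF-8');
    if ($codePoint === false) {
        throw new InvalidArgumentException('Could not read a UTF-8 character.');
    }

    // Same formatting as toCodePoint(): at least four uppercase hex digits.
    return sprintf('U+%04X', $codePoint);
}

var_dump(toCodePointViaOrd("A"));  //=> string(6) "U+0041"
var_dump(toCodePointViaOrd("あ")); //=> string(6) "U+3042"
var_dump(toCodePointViaOrd("𠮷")); //=> string(7) "U+20BB7"
```

One difference worth keeping in mind: `mb_ord()` only looks at the first character of its argument, whereas the UCS-4 approach converts the whole string, so both versions should be given exactly one character.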