Getting the UTF-8 and UTF-16 encodings of a Unicode code point in Go

Posted at 2017-03-05

Introduction

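The other day I needed to do what the title describes and, foolishly, computed the encodings by hand. On closer inspection, it turns out that Go's standard library can do this for you!
So that fewer people repeat my mistake, I am leaving the approach here. (Note that the code below does no error handling at all.)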

Environment

macOS 10.12.3, Go 1.8

Source

main.go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf16"
	"unicode/utf8"
)

func main() {
	// Code points as hex strings. In order: 9, ¢, あ, 𠀋, and 🚩 (the flag emoji)
	codepoints := []string{"39", "00A2", "3042", "2000B", "1F6A9"}

	// Convert a byte slice to a space-separated hex string
	bytesToStr := func(bytes []byte) string {
		var str string
		for _, b := range bytes {
			str += fmt.Sprintf("%02X ", b)
		}
		return strings.TrimSuffix(str, " ")
	}

	// Convert a slice of 16-bit code units to a space-separated hex string
	wordsToStr := func(words []uint16) string {
		var str string
		for _, w := range words {
			str += fmt.Sprintf("%04X ", w)
		}
		return strings.TrimSuffix(str, " ")
	}

	for _, code := range codepoints {
		char, _ := strconv.ParseUint(code, 16, 32)
		r := rune(char)

		// Code point -> UTF-8
		bytes := make([]byte, 4)
		size := utf8.EncodeRune(bytes, r)

		// Code point -> UTF-16
		words := utf16.Encode([]rune{r})

		// Format both encodings as hex strings and print them
		fmt.Println("char =", string(r), ", utf8 =", bytesToStr(bytes[:size]), ", utf16 =", wordsToStr(words))
	}
}
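
The listing above discards the error returned by strconv.ParseUint and never checks whether the parsed value is a legal code point. As a minimal sketch of how those checks might be added, the variant below wraps the same standard library calls in a helper. The name encodeCodePoint and its error messages are hypothetical, introduced only for illustration; they are not part of the original code.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf16"
	"unicode/utf8"
)

// encodeCodePoint is a hypothetical helper (not in the original listing).
// It parses a hex code point string and returns its UTF-8 bytes and
// UTF-16 code units, or an error if the input is unusable.
func encodeCodePoint(code string) ([]byte, []uint16, error) {
	v, err := strconv.ParseUint(code, 16, 32)
	if err != nil {
		return nil, nil, fmt.Errorf("parse %q: %v", code, err)
	}
	r := rune(v)
	// utf8.ValidRune rejects surrogates (U+D800..U+DFFF) and
	// values outside the range U+0000..U+10FFFF.
	if !utf8.ValidRune(r) {
		return nil, nil, fmt.Errorf("%q is not a valid Unicode code point", code)
	}
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	return buf[:n], utf16.Encode([]rune{r}), nil
}

func main() {
	// One valid input followed by three that should be rejected.
	for _, code := range []string{"3042", "D800", "110000", "xyz"} {
		b, w, err := encodeCodePoint(code)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("U+%s: utf8 = % X, utf16 = %04X\n", code, b, w)
	}
}
```

Without the utf8.ValidRune check, both utf8.EncodeRune and utf16.Encode silently substitute U+FFFD (the replacement character) for an invalid rune, so a bad input would produce plausible-looking but wrong output.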

Output

char = 9 , utf8 = 39 , utf16 = 0039
char = ¢ , utf8 = C2 A2 , utf16 = 00A2
char = あ , utf8 = E3 81 82 , utf16 = 3042
char = 𠀋 , utf8 = F0 A0 80 8B , utf16 = D840 DC0B
char = 🚩 , utf8 = F0 9F 9A A9 , utf16 = D83D DEA9