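Goal

I want to compute the width of a string. Here, "width" means the sum over all characters, counting each fullwidth character as 2 and every other character as 1.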

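Motivation

When printing a table as text with a library such as PrettyTable, fullwidth characters broke the column alignment, so I needed a way to measure the width correctly.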

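Method

East Asian Width is the Unicode property that specifies character width; it is defined in Unicode Standard Annex #11.

Following the information on Wikipedia, characters for which unicodedata.east_asian_width returns W, F, or A are treated as width 2, and all other characters as width 1.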

from unicodedata import east_asian_width


width_dict = {
  'F': 2,   # Fullwidth
  'H': 1,   # Halfwidth
  'W': 2,   # Wide
  'Na': 1,  # Narrow
  'A': 2,   # Ambiguous
  'N': 1    # Neutral
}

string = 'Ꚋんは天使でｱﾙ 😇'

chars = list(string)
print(chars)  # ['Ꚋ', 'ん', 'は', '天', '使', 'で', 'ｱ', 'ﾙ', ' ', '😇']

east_asian_width_list = [east_asian_width(char) for char in chars]
print(east_asian_width_list)  # ['N', 'W', 'W', 'W', 'W', 'W', 'H', 'H', 'Na', 'W']

# Use a distinct loop variable so the imported east_asian_width function is not shadowed
width_list = [width_dict[category] for category in east_asian_width_list]
print(width_list)  # [1, 2, 2, 2, 2, 2, 1, 1, 1, 2]

print(sum(width_list))  # 16
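
The steps above can be folded into a small reusable function, which is roughly what the table-alignment use case needs. The following is only a sketch under the same assumption as above (W, F, and A count as 2, everything else as 1). The names string_width and pad_to_width are hypothetical helpers introduced here for illustration and are not part of unicodedata or PrettyTable. Combining characters and control characters are not treated specially.

```python
from unicodedata import east_asian_width

# Assumption carried over from the text: W, F and A count as 2 columns, the rest as 1.
WIDE_CATEGORIES = {'W', 'F', 'A'}


def string_width(text):
    """Return the display width of text (hypothetical helper name)."""
    return sum(2 if east_asian_width(char) in WIDE_CATEGORIES else 1 for char in text)


def pad_to_width(text, width):
    """Left-align text by appending spaces until it spans `width` columns."""
    return text + ' ' * max(0, width - string_width(text))


# Each row is padded to 6 columns, so the closing bars should line up
# in a terminal that renders fullwidth characters as two columns.
for row in ['天使', 'ｱﾙ', 'abc']:
    print('|' + pad_to_width(row, 6) + '|')
```

Padding by computed width instead of len() is what keeps the right-hand border aligned when fullwidth and halfwidth characters are mixed in the same column.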
