
Converting Roman numeral characters (platform-dependent characters) to Latin letters in Ruby

Posted at 2019-01-17, last updated at 2019-01-17

Goal

For example, convert the single character Ⅶ (U+2166) into the three letters VII.
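To make the distinction concrete, here is a small sketch showing that Ⅶ is one code point while VII is three. The helper name `code_points` is hypothetical and only used for illustration.

```ruby
# Hypothetical helper: list the code points of a string in U+XXXX notation.
def code_points(str)
  str.codepoints.map { |cp| format('U+%04X', cp) }
end

code_points("\u2166")  # Ⅶ  => ["U+2166"]
code_points('VII')     # => ["U+0056", "U+0049", "U+0049"]
```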

Approaches

The naive approach

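def replace_roman_numerals_with_alphabets(str)
  conversions = {
    'Ⅰ' => 'I', 'Ⅱ' => 'II', 'Ⅲ' => 'III', 'Ⅳ' => 'IV', 'Ⅴ' => 'V',
    'Ⅵ' => 'VI', 'Ⅶ' => 'VII', 'Ⅷ' => 'VIII', 'Ⅸ' => 'IX', 'Ⅹ' => 'X',
    'Ⅺ' => 'XI', 'Ⅻ' => 'XII'
  }.freeze

  # Join the keys into one string so the character class holds only the
  # Roman numeral characters. Interpolating the Array itself would also put
  # quotes, commas, and spaces into the class, and gsub would delete them.
  str.gsub(/[#{conversions.keys.join}]/, conversions)
end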

replace_roman_numerals_with_alphabets('ﾌｧｲﾅﾙﾌｧﾝﾀｼﾞｰⅦ')
# => "ﾌｧｲﾅﾙﾌｧﾝﾀｼﾞｰVII"

Preparing the conversion rules by hand is tedious.
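One detail worth keeping in mind with a table-based approach: when String#gsub receives a Hash, a match that is not a key of the Hash is replaced with an empty string, so the pattern should match the keys and nothing else. The following is a minimal sketch of that behavior, assuming a shortened two-entry table; the method names are hypothetical, and Regexp.union is just one way to build a keys-only pattern.

```ruby
# Hypothetical, shortened conversion table for illustration.
def sample_conversions
  { 'Ⅳ' => 'IV', 'Ⅶ' => 'VII' }
end

# Regexp.union builds /Ⅳ|Ⅶ/, which matches the keys and nothing else.
def replace_with_keys_only(str)
  str.gsub(Regexp.union(sample_conversions.keys), sample_conversions)
end

# This pattern also matches commas and spaces. They have no entry in the
# Hash, so gsub replaces them with an empty string.
def replace_with_broad_pattern(str)
  str.gsub(/[ⅣⅦ, ]/, sample_conversions)
end

replace_with_keys_only('Ⅳ, Ⅶ')      # => "IV, VII"
replace_with_broad_pattern('Ⅳ, Ⅶ')  # => "IVVII"
```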

The smarter approach

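def replace_roman_numerals_with_alphabets(str)
  # The Roman numerals Ⅰ through ⅿ occupy U+2160..U+217F in Unicode
  # (U+2160..U+216F uppercase, U+2170..U+217F lowercase).
  roman_numerals_pattern = /[\u2160-\u217F]/
  # NFKD replaces each matched character with its compatibility decomposition.
  str.gsub(roman_numerals_pattern) { |char| char.unicode_normalize(:nfkd) }
end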

replace_roman_numerals_with_alphabets('ﾌｧｲﾅﾙﾌｧﾝﾀｼﾞｰⅦ')
# => "ﾌｧｲﾅﾙﾌｧﾝﾀｼﾞｰVII"

# Incidentally, normalizing the whole string also converts the half-width katakana to full-width:
'ﾌｧｲﾅﾙﾌｧﾝﾀｼﾞｰⅦ'.unicode_normalize(:nfkd)
# => "ファイナルファンタジーVII"

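Unicode normalization in NFKD form (or NFKC form) applies compatibility decomposition, which breaks each Roman numeral character down into the corresponding Latin letters.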
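As a rough sketch of how the two forms compare: for Roman numerals they give the same result, but they differ for characters such as half-width katakana with a voiced sound mark, where NFKD leaves a separate combining mark and NFKC composes it. The helper name `compatibility_forms` is hypothetical.

```ruby
# Hypothetical helper: return both compatibility normalization forms.
def compatibility_forms(str)
  { nfkd: str.unicode_normalize(:nfkd), nfkc: str.unicode_normalize(:nfkc) }
end

# Roman numerals: both forms yield the same letters.
compatibility_forms("\u2166")  # Ⅶ => {:nfkd=>"VII", :nfkc=>"VII"}
compatibility_forms("\u2176")  # ⅶ (small numeral) => {:nfkd=>"vii", :nfkc=>"vii"}

# Half-width ｼﾞ (U+FF7C U+FF9E): NFKD keeps a combining mark, NFKC composes.
forms = compatibility_forms("\uFF7C\uFF9E")
forms[:nfkd].codepoints.map { |cp| cp.to_s(16) }  # => ["30b7", "3099"]
forms[:nfkc].codepoints.map { |cp| cp.to_s(16) }  # => ["30b8"]
```

Note that the small Roman numerals in U+2170..U+217F become lowercase letters, so a caller that needs uppercase output would have to upcase the result separately.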

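References

[Wikipedia] Unicode正規化 (Unicode normalization)
[Ruby] String#gsub
- A versatile method: it accepts a Hash as the second argument, or a block.
[Qiita] Python 3 での Unicode に関する備忘録 (Notes on Unicode in Python 3)
- An article by the same author; it touches on Unicode normalization in Python.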
