Handling exceptions raised by specific characters during character encoding conversion in Rails

More than 1 year has passed since last update.

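When running a CSV download, an exception was raised while converting from UTF-8 to SJIS.

These are my notes on how to deal with the so-called wave dash problem.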
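To make the failure concrete, here is a minimal sketch in plain Ruby (no Rails) of the kind of exception involved. It assumes the target encoding is Windows-31J (CP932), which is what Ruby's `Encoding::SJIS` constant aliases; the variable names are only illustrative and do not come from the application code.

```ruby
wave_dash       = "\u301C" # WAVE DASH: has no mapping in Windows-31J
fullwidth_tilde = "\uFF5E" # FULLWIDTH TILDE: the look-alike that Windows-31J does define

error_message = nil
begin
  wave_dash.encode(Encoding::Windows_31J)
rescue Encoding::UndefinedConversionError => e
  # This is the exception class raised for characters the target encoding lacks.
  error_message = e.message
end
puts error_message

# The look-alike converts without error.
converted = fullwidth_tilde.encode(Encoding::Windows_31J)
puts converted.bytes.map { |b| format("%02X", b) }.join(" ")
```

Swapping the problematic character for its look-alike before encoding is the idea behind the helper in the next section.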

Environment

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-darwin15]

$ rails -v
Rails 5.0.0.1

Implementing the helper

Add a conversion method to application_helper.

app/helpers/application_helper.rb
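module ApplicationHelper
  # Character lists used for the conversion (same order in both lists)
  FROM_CHR_UTF8 = "\u{301C 2212 00A2 00A3 00AC 2013 2014 2016 203E 00A0 00F8 203A}".freeze
  # 〜−¢£¬–—‖‾ ø›  (the blank is U+00A0 NO-BREAK SPACE)
  TO_CHR_SJIS   = "\u{FF5E FF0D FFE0 FFE1 FFE2 FF0D 2015 2225 FFE3 0020 03A6 3009}".freeze
  # ~-¢£¬-―∥ ̄ Φ〉

  # Before a UTF-8 to SJIS conversion, replace characters that cannot be
  # converted with similar characters that can.
  def sjisable(str)
    return if str.blank?

    # Use the non-destructive tr so the caller's string is not modified
    # (tr! would also raise on a frozen string).
    replaced = str.tr(FROM_CHR_UTF8, TO_CHR_SJIS)

    # Any unconvertible characters missing from the lists above are replaced with "?".
    replaced.encode(Encoding::SJIS, Encoding::UTF_8, invalid: :replace, undef: :replace).encode(Encoding::UTF_8, Encoding::SJIS)
  end
end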

Usage

sjisable(string)

Verification

str = '〜−¢£¬–—‖‾ ø›'
p str
p sjisable(str)
# => 〜−¢£¬–—‖‾ ø›
# => ~-¢£¬-―∥ ̄ Φ〉

p sjisable('ä啟')
# => ??

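Postscript

  • Rather than passing a single string, a method that accepts a whole array would probably be more useful for handling CSV, and less likely to introduce bugs when columns are added.
  • Holding the conversion pairs in a hash would probably be easier to maintain.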
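Both ideas can be sketched together as follows. This is only one possible shape, not the implementation used above: the module name `SjisSafe` and the methods `convert` and `convert_row` are hypothetical, and the mapping simply restates the pairs from the two constants in the helper.

```ruby
# Hypothetical module; the names are illustrative and not from the helper above.
module SjisSafe
  # UTF-8 character => look-alike that Windows-31J (CP932) can represent.
  REPLACEMENTS = {
    "\u301C" => "\uFF5E", # WAVE DASH            -> FULLWIDTH TILDE
    "\u2212" => "\uFF0D", # MINUS SIGN           -> FULLWIDTH HYPHEN-MINUS
    "\u00A2" => "\uFFE0", # CENT SIGN            -> FULLWIDTH CENT SIGN
    "\u00A3" => "\uFFE1", # POUND SIGN           -> FULLWIDTH POUND SIGN
    "\u00AC" => "\uFFE2", # NOT SIGN             -> FULLWIDTH NOT SIGN
    "\u2013" => "\uFF0D", # EN DASH              -> FULLWIDTH HYPHEN-MINUS
    "\u2014" => "\u2015", # EM DASH              -> HORIZONTAL BAR
    "\u2016" => "\u2225", # DOUBLE VERTICAL LINE -> PARALLEL TO
    "\u203E" => "\uFFE3", # OVERLINE             -> FULLWIDTH MACRON
    "\u00A0" => " ",      # NO-BREAK SPACE       -> SPACE
    "\u00F8" => "\u03A6", # LATIN SMALL O WITH STROKE -> GREEK CAPITAL PHI
    "\u203A" => "\u3009"  # SINGLE RIGHT ANGLE QUOTE  -> RIGHT ANGLE BRACKET
  }.freeze

  PATTERN = Regexp.union(REPLACEMENTS.keys)

  module_function

  # Converts one cell. Non-string values (nil, numbers) are passed through as-is.
  def convert(value)
    return value unless value.is_a?(String)

    value.gsub(PATTERN, REPLACEMENTS)
         .encode(Encoding::Windows_31J, Encoding::UTF_8, invalid: :replace, undef: :replace)
         .encode(Encoding::UTF_8, Encoding::Windows_31J)
  end

  # Converts every cell of a CSV row in one call.
  def convert_row(row)
    row.map { |cell| convert(cell) }
  end
end

row = SjisSafe.convert_row(["1\u301C2", nil, 3, "caf\u00E9"])
p row
```

With a hash, each pair sits on one line next to its comment, so adding or removing a mapping cannot shift the two lists out of alignment. Mapping over the row means a newly added column is covered without any extra call.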
