More than 5 years have passed since last update.

Exploring ways to measure the display width of a string

This article explores ways to compute the width of a string as displayed, counting each full-width character as two half-width characters.

Method 1. Check each character by hand

# Count code points below 256 as width 1 and everything else as width 2.
w = -> s { s.codepoints.inject(0) { |a, e| a + (e < 256 ? 1 : 2) } }
w["12"]                             # => 2
w["あ"]                             # => 2
w["\a"]                             # => 1

This approach looks surprisingly adaptable.
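As one illustration of that adaptability, the per-character rule can be swapped out. The following is only a sketch under an assumption that is not in the original: half-width katakana (U+FF61 to U+FF9F) should count as width 1, whereas the `e < 256` rule counts them as 2. The method names `narrow_codepoint?` and `width_with_narrow_kana` are hypothetical.

```ruby
# Hypothetical variant of method 1 (not the author's code).
# A code point is treated as narrow (width 1) if it is below 256 or lies in
# the half-width katakana range U+FF61..U+FF9F; everything else is width 2.
def narrow_codepoint?(cp)
  cp < 256 || (0xFF61..0xFF9F).cover?(cp)
end

def width_with_narrow_kana(s)
  s.codepoints.sum { |cp| narrow_codepoint?(cp) ? 1 : 2 }
end

width_with_narrow_kana("12")       # => 2
width_with_narrow_kana("\u3042")   # => 2  (full-width hiragana "あ")
width_with_narrow_kana("\uFF71")   # => 1  (half-width katakana "ｱ")
```

Other rules, such as a lookup table of ranges, could be dropped into `narrow_codepoint?` in the same way.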

Method 2. encode("EUC-JP").bytesize

"12".encode("EUC-JP").bytesize      # => 2
"あ".encode("EUC-JP").bytesize      # => 2
"\a".encode("EUC-JP").bytesize      # => 1

I can never remember whether it is EUC-JP or EUC_JP.
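For reference, both spellings appear in Ruby, in different places: the encoding's name uses a hyphen, while the constant under `Encoding` uses an underscore. A small sketch that uses only core `Encoding` and `String#encode` calls:

```ruby
# The encoding name is spelled with a hyphen; the Ruby constant uses an underscore.
Encoding::EUC_JP.name                         # => "EUC-JP"
Encoding.find("EUC-JP") == Encoding::EUC_JP   # => true

# String#encode accepts either the name or the Encoding object.
"\u3042".encode("EUC-JP").bytesize            # => 2
"\u3042".encode(Encoding::EUC_JP).bytesize    # => 2
```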

Method 3. toeuc.bytesize

require "kconv"
"12".toeuc.bytesize                 # => 2
"あ".toeuc.bytesize                 # => 2
"\a".toeuc.bytesize                 # => 1

Simple.

Method 4. Use the unicode-display_width gem

require "unicode/display_width"
require "unicode/display_width/string_ext"
"12".display_width                  # => 2
"あ".display_width                  # => 2
"\a".display_width                  # => 0

Apparently the bell character is given a width of 0.
The gem is quite sophisticated.

Verifying each method

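# 5 hiragana + 26 ASCII letters, shuffled and repeated 100 times.
# Expected width: (5 * 2 + 26 * 1) * 100 = 3600.
str = [*"あいうえお".chars, *"a".."z"].shuffle.join * 100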
w[str]                              # => 3600
str.encode("EUC-JP").bytesize       # => 3600
str.toeuc.bytesize                  # => 3600
str.display_width                   # => 3600

All of them work correctly.
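The test string contains only hiragana and ASCII letters, so it does not exercise characters that EUC-JP cannot represent. As a hedged sketch of that boundary for method 2: `String#encode` raises `Encoding::UndefinedConversionError` for such characters, for example an emoji. Passing `undef: :replace` with a two-byte replacement is one possible workaround. Counting such characters as width 2 is an assumption made here, not something stated in the article, and the name `euc_width` is hypothetical.

```ruby
# Hypothetical wrapper around method 2 (not the author's code).
# Characters with no EUC-JP mapping are replaced by "??" so that each one
# contributes 2 bytes; treating them as width 2 is an assumption.
def euc_width(s)
  s.encode("EUC-JP", undef: :replace, replace: "??").bytesize
end

emoji = "\u{1F600}"

begin
  emoji.encode("EUC-JP")            # no EUC-JP mapping, so this raises
rescue Encoding::UndefinedConversionError => e
  puts "cannot convert: #{e.message}"
end

euc_width("\u3042")   # => 2
euc_width(emoji)      # => 2
```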

While we're at it, a speed comparison

require "active_support/core_ext/benchmark"
# Run the block 2000 times and return the elapsed time formatted in milliseconds.
def _; "%7.2f ms" % Benchmark.ms { 2000.times { yield } } end
_ { w[str]                        } # => " 543.65 ms"
_ { str.encode("EUC-JP").bytesize } # => "  77.91 ms"
_ { str.toeuc.bytesize            } # => " 353.93 ms"
_ { str.display_width             } # => "4792.27 ms"
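The measurement above relies on ActiveSupport's `Benchmark.ms`. If ActiveSupport is not available, a similar measurement can be sketched with the `benchmark` library alone, using `Benchmark.realtime` (which returns seconds) and multiplying by 1000. The helper name `elapsed_ms` is hypothetical, and timings will differ by machine and Ruby version.

```ruby
require "benchmark"

# Hypothetical helper: run the given block `n` times and return the
# elapsed wall-clock time in milliseconds as a Float.
def elapsed_ms(n = 2000)
  Benchmark.realtime { n.times { yield } } * 1000
end

# Same kind of test string as in the article: hiragana and ASCII letters.
str = [*"あいうえお".chars, *("a".."z")].shuffle.join * 100

puts "%7.2f ms" % elapsed_ms { str.encode("EUC-JP").bytesize }
```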