I wrote a Ruby script that shows the bytes of a string in various encodings

Posted at 2019-02-21

TL;DR

  • Ever wanted to know the hexadecimal bytes of a string after it has been converted to another encoding?
  • I had Ruby do it.
  • Sample code included.

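Introduction

  • Sometimes I want to know which character encoding the contents (the raw bytes) of a text file are written in.
  • A library that detects the encoding automatically would also work, but knowing the bytes of one specific string is enough.
  • It seems likely to come in handy.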
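
The idea in the bullets above (identify a file's encoding from the bytes of one known string) can be sketched as follows. This is only an illustration under assumptions: `candidate_encodings` and the `CANDIDATES` list are hypothetical names that are not part of the script in this article, and a match is evidence for an encoding, not proof.

```ruby
# Hypothetical default list; adjust it to the encodings you actually expect.
CANDIDATES = %w[Shift_JIS UTF-8 EUC-JP UTF-16LE UTF-16BE].freeze

# Returns the candidate encodings in which `known_text`, once encoded,
# appears as a byte sequence inside `raw`.
def candidate_encodings(raw, known_text, candidates = CANDIDATES)
  haystack = raw.b # compare as raw bytes, ignoring the string's encoding label
  candidates.select do |name|
    begin
      haystack.include?(known_text.encode(name).b)
    rescue EncodingError
      false # the text cannot be represented in this encoding
    end
  end
end

# Example: bytes that were written as Shift_JIS.
raw = "\u3042\u3044\u3046".encode("Shift_JIS") # "あいう"
p candidate_encodings(raw, "\u3042")            # "あ" => ["Shift_JIS"]
```

For a real file, `raw` would typically come from `File.binread(path)`, which returns the contents as a binary string.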

Code

  • Save the following code to a file with a .rb extension.
get_binary_from.rb
# frozen_string_literal: true

# Width of the longest encoding name Ruby knows; used to align the output.
NAME_WIDTH = Encoding.name_list.map(&:length).max

# Encodings tried when no second argument is given.
ENCODING_TYPES = %w[ascii Shift_JIS utf-8 utf-16 utf-16le utf-16be utf-32le utf-32be eucjp].freeze

# Prints the bytes of `input`, converted to `encode_type`, as uppercase hex.
# Unknown encoding names and unconvertible characters are reported, not raised.
def display_binary(input, encode_type)
  hex = input.encode(Encoding.find(encode_type)).bytes.map { |b| b.to_s(16).upcase }
  puts "get binary when: #{encode_type.ljust(NAME_WIDTH)} => #{hex}"
rescue StandardError => e
  puts "error occurred when encoding to #{encode_type} => #{e}"
end

input = ARGV[0]
encode_type = ARGV[1]

if input.nil?
  puts "Please pass the string you'd like to encode as the 1st argument. To see only one encoding, pass its name as the 2nd argument."
  return
end

if encode_type
  # A single encoding was requested; display_binary reports unknown names itself.
  display_binary(input, encode_type)
  return
end

ENCODING_TYPES.each { |type| display_binary(input, type) }
  • Add an alias to .bashrc or .zshrc so that the script can be called from anywhere.
alias get_binary_from="ruby $HOME/scripts/get_binary_from.rb"
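  • Usage.
    • The first argument is required: the string whose hexadecimal bytes you want to see.
    • The second argument is optional: the encoding you want to see.
      • Pass a name that appears in Encoding.name_list, a class method of Encoding.
      • If the second argument is omitted, the nine default encodings are shown.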
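
As a rough sketch of how a name for the second argument can be checked beforehand: `Encoding.find` looks a name up (case-insensitively) and raises `ArgumentError` when Ruby does not know it. The helper name `lookup_encoding` below is hypothetical and is not part of the script.

```ruby
# Hypothetical helper: returns the Encoding for `name`,
# or nil if Ruby does not know that name.
def lookup_encoding(name)
  Encoding.find(name)
rescue ArgumentError
  nil
end

p lookup_encoding("utf-8")            # the UTF-8 Encoding object
p lookup_encoding("no-such-encoding") # => nil

# List the names Ruby accepts that start with "UTF-".
puts Encoding.name_list.grep(/\AUTF-/i)
```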
[shoutatani@mac] ~ % get_binary_from "あい う"
error occurred when encoding to ascii => U+3042 from UTF-8 to US-ASCII
get binary when: Shift_JIS                  => ["82", "A0", "82", "A2", "20", "82", "A4"]
get binary when: utf-8                      => ["E3", "81", "82", "E3", "81", "84", "20", "E3", "81", "86"]
get binary when: utf-16                     => ["FE", "FF", "30", "42", "30", "44", "0", "20", "30", "46"]
get binary when: utf-16le                   => ["42", "30", "44", "30", "20", "0", "46", "30"]
get binary when: utf-16be                   => ["30", "42", "30", "44", "0", "20", "30", "46"]
get binary when: utf-32le                   => ["42", "30", "0", "0", "44", "30", "0", "0", "20", "0", "0", "0", "46", "30", "0", "0"]
get binary when: utf-32be                   => ["0", "0", "30", "42", "0", "0", "30", "44", "0", "0", "0", "20", "0", "0", "30", "46"]
get binary when: eucjp                      => ["A4", "A2", "A4", "A4", "20", "A4", "A6"]

[shoutatani@mac] ~ % get_binary_from "あい う" "utf-8"
get binary when: utf-8                      => ["E3", "81", "82", "E3", "81", "84", "20", "E3", "81", "86"]
[shoutatani@mac] ~ % get_binary_from "あい う" "ISO-2022-JP"
get binary when: ISO-2022-JP                => ["1B", "24", "42", "24", "22", "24", "24", "1B", "28", "42", "20", "1B", "24", "42", "24", "26", "1B", "28", "42"]
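
In the output above, a zero byte is printed as "0" because `to_s(16)` does not pad. If two digits per byte are preferred, a variant along these lines could be used. It is a minimal sketch, and `padded_hex_bytes` is a hypothetical name rather than part of the script.

```ruby
# Hypothetical variant: always print two hex digits per byte.
def padded_hex_bytes(text, encoding_name)
  text.encode(encoding_name).bytes.map { |b| format("%02X", b) }
end

p padded_hex_bytes("\u3042 ", "UTF-16BE")              # "あ " => ["30", "42", "00", "20"]
puts padded_hex_bytes("\u3042 ", "UTF-16LE").join(" ") # => 42 30 20 00
```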

Closing

  • In any language, playing around with encodings is fun. It's exciting.