
How Ruby Handles Whitespace

Posted at 2016-06-28

I recently came across code like this:

puts 'hello world!'      .gsub('hello', 'good evening')

My first reaction: no way, that is far too many spaces, this has to raise an error.

And yet it ran just fine.

The code above produces this result:

puts 'hello world!'      .gsub('hello', 'good evening')
# => good evening world!

What is going on here? This is not the Ruby I know. So I ran the code through lexical analysis and parsing with Ripper.

require 'ripper'
require 'pp'

code =<<EOF
puts 'hello world!'      .gsub('hello', 'good evenning')
EOF

puts '=====lexer====='
pp Ripper.lex(code)
puts '=====parser====='
pp Ripper.sexp(code)

Here is the result:

=====lexer=====
[[[1, 0], :on_ident, "puts"],
 [[1, 4], :on_sp, " "],
 [[1, 5], :on_tstring_beg, "'"],
 [[1, 6], :on_tstring_content, "hello world!"],
 [[1, 18], :on_tstring_end, "'"],
 [[1, 19], :on_sp, "      "],
 [[1, 25], :on_period, "."],
 [[1, 26], :on_ident, "gsub"],
 [[1, 30], :on_lparen, "("],
 [[1, 31], :on_tstring_beg, "'"],
 [[1, 32], :on_tstring_content, "hello"],
 [[1, 37], :on_tstring_end, "'"],
 [[1, 38], :on_comma, ","],
 [[1, 39], :on_sp, " "],
 [[1, 40], :on_tstring_beg, "'"],
 [[1, 41], :on_tstring_content, "good evenning"],
 [[1, 54], :on_tstring_end, "'"],
 [[1, 55], :on_rparen, ")"],
 [[1, 56], :on_nl, "\n"]]
=====parser=====
[:program,
 [[:command,
   [:@ident, "puts", [1, 0]],
   [:args_add_block,
    [[:method_add_arg,
      [:call,
       [:string_literal,
        [:string_content, [:@tstring_content, "hello world!", [1, 6]]]],
       :".",
       [:@ident, "gsub", [1, 26]]],
      [:arg_paren,
       [:args_add_block,
        [[:string_literal,
          [:string_content, [:@tstring_content, "hello", [1, 32]]]],
         [:string_literal,
          [:string_content, [:@tstring_content, "good evenning", [1, 41]]]]],
        false]]]],
    false]]]]

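The part that matters is probably the second on_sp, which is the sixth token in the lexer output (index 5 when counting from zero).

During lexical analysis, Ruby bundles all the spaces between 'hello world!' and . into a single on_sp token.
In the parse result that follows, there is no sign that on_sp was turned into anything else.
(On line 8 of the parser output, 'hello world!' is parsed as :string_content, and on line 10 gsub is parsed as :@ident, an identifier.)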
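As a rough check of this observation, the sketch below drops every on_sp token from the Ripper.lex output and compares what is left for the spaced and the compact version of the line. The helper name significant_tokens is made up for this illustration and is not part of Ripper. It reads only the event name and the text of each token, so it should not matter whether a given Ruby version adds extra fields to each entry.

```ruby
require 'ripper'

# Hypothetical helper (not part of Ripper): return [event, text] pairs
# for every token except whitespace (:on_sp), ignoring positions.
def significant_tokens(src)
  Ripper.lex(src)
        .reject { |tok| tok[1] == :on_sp }
        .map    { |tok| [tok[1], tok[2]] }
end

spaced  = "puts 'hello world!'      .gsub('hello', 'good evening')\n"
compact = "puts 'hello world!'.gsub('hello', 'good evening')\n"

# Once whitespace tokens are removed, both lines should yield the same
# sequence of token types and texts.
tokens_match = significant_tokens(spaced) == significant_tokens(compact)
puts tokens_match
```

If the two sequences match, the only thing the extra spaces contributed was a longer on_sp token.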

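In other words, my guess is that, at least in a case like this one, Ruby uses the spaces only to separate one token from the next during lexical analysis. (Apologies if I have this wrong.)
So whether the spaces are there or not, the parse result ends up like this: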

require 'ripper'
require 'pp'

code =<<EOF
puts 'hello world!'.gsub('hello', 'good evenning')
EOF

puts '=====lexer====='
pp Ripper.lex(code)
puts '=====parser====='
pp Ripper.sexp(code)
=====lexer=====
[[[1, 0], :on_ident, "puts"],
 [[1, 4], :on_sp, " "],
 [[1, 5], :on_tstring_beg, "'"],
 [[1, 6], :on_tstring_content, "hello world!"],
 [[1, 18], :on_tstring_end, "'"],
 [[1, 19], :on_period, "."],
 [[1, 20], :on_ident, "gsub"],
 [[1, 24], :on_lparen, "("],
 [[1, 25], :on_tstring_beg, "'"],
 [[1, 26], :on_tstring_content, "hello"],
 [[1, 31], :on_tstring_end, "'"],
 [[1, 32], :on_comma, ","],
 [[1, 33], :on_sp, " "],
 [[1, 34], :on_tstring_beg, "'"],
 [[1, 35], :on_tstring_content, "good evenning"],
 [[1, 48], :on_tstring_end, "'"],
 [[1, 49], :on_rparen, ")"],
 [[1, 50], :on_nl, "\n"]]
=====parser=====
[:program,
 [[:command,
   [:@ident, "puts", [1, 0]],
   [:args_add_block,
    [[:method_add_arg,
      [:call,
       [:string_literal,
        [:string_content, [:@tstring_content, "hello world!", [1, 6]]]],
       :".",
       [:@ident, "gsub", [1, 20]]],
      [:arg_paren,
       [:args_add_block,
        [[:string_literal,
          [:string_content, [:@tstring_content, "hello", [1, 26]]]],
         [:string_literal,
          [:string_content, [:@tstring_content, "good evenning", [1, 35]]]]],
        false]]]],
    false]]]]

The structure is the same. Only the column positions differ.

[With lots of spaces]

[:program,
 [[:command,
   [:@ident, "puts", [1, 0]],
   [:args_add_block,
    [[:method_add_arg,
      [:call,
       [:string_literal,
        [:string_content, [:@tstring_content, "hello world!", [1, 6]]]],
       :".",
       [:@ident, "gsub", [1, 26]]],
      [:arg_paren,
       [:args_add_block,
        [[:string_literal,
          [:string_content, [:@tstring_content, "hello", [1, 32]]]],
         [:string_literal,
          [:string_content, [:@tstring_content, "good evenning", [1, 41]]]]],
        false]]]],
    false]]]]

[Without spaces]

[:program,
 [[:command,
   [:@ident, "puts", [1, 0]],
   [:args_add_block,
    [[:method_add_arg,
      [:call,
       [:string_literal,
        [:string_content, [:@tstring_content, "hello world!", [1, 6]]]],
       :".",
       [:@ident, "gsub", [1, 20]]],
      [:arg_paren,
       [:args_add_block,
        [[:string_literal,
          [:string_content, [:@tstring_content, "hello", [1, 26]]]],
         [:string_literal,
          [:string_content, [:@tstring_content, "good evenning", [1, 35]]]]],
        false]]]],
    false]]]]

That would explain why the earlier code

puts 'hello world!'      .gsub('hello', 'good evening')

ran without any problem.
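The side-by-side comparison above can also be done mechanically. The following is a minimal sketch that removes the [line, column] position from every scanner-event node (the ones whose name starts with @) and then compares the two trees. strip_positions is a hypothetical helper written for this article, and it assumes scanner events have the shape [:@name, text, position] as in the output shown above.

```ruby
require 'ripper'

# Hypothetical helper: walk a Ripper.sexp tree and drop the position
# from every scanner-event node such as [:@ident, "puts", [1, 0]].
def strip_positions(node)
  return node unless node.is_a?(Array)

  if node.first.is_a?(Symbol) && node.first.to_s.start_with?('@')
    node[0, 2] # keep only the event name and the token text
  else
    node.map { |child| strip_positions(child) }
  end
end

spaced  = "puts 'hello world!'      .gsub('hello', 'good evening')\n"
compact = "puts 'hello world!'.gsub('hello', 'good evening')\n"

# With positions removed, the two trees should be identical.
trees_match = strip_positions(Ripper.sexp(spaced)) ==
              strip_positions(Ripper.sexp(compact))
puts trees_match
```

Comparing the raw Ripper.sexp results directly would report a difference, because the column numbers after the spaces shift by six.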

Experiment

I also analyzed a case where whitespace is required.
Analyzing the following code gives:

[1,2].each do |e|
end
=====lexer=====
[[[1, 0], :on_lbracket, "["],
 [[1, 1], :on_int, "1"],
 [[1, 2], :on_comma, ","],
 [[1, 3], :on_int, "2"],
 [[1, 4], :on_rbracket, "]"],
 [[1, 5], :on_period, "."],
 [[1, 6], :on_ident, "each"],
 [[1, 10], :on_sp, " "],
 [[1, 11], :on_kw, "do"],
 [[1, 13], :on_sp, " "],
 [[1, 14], :on_op, "|"],
 [[1, 15], :on_ident, "e"],
 [[1, 16], :on_op, "|"],
 [[1, 17], :on_ignored_nl, "\n"],
 [[2, 0], :on_kw, "end"],
 [[2, 3], :on_nl, "\n"]]
=====parser=====
[:program,
 [[:method_add_block,
   [:call,
    [:array, [[:@int, "1", [1, 1]], [:@int, "2", [1, 3]]]],
    :".",
    [:@ident, "each", [1, 6]]],
   [:do_block,
    [:block_var,
     [:params, [[:@ident, "e", [1, 15]]], nil, nil, nil, nil, nil, nil],
     false],
    [[:void_stmt]]]]]]

The seventh element of the lexer output is a proper each token, and the code is converted into an AST as well.

But if you leave out the space between each and do...

=====lexer=====
[[[1, 0], :on_lbracket, "["],
 [[1, 1], :on_int, "1"],
 [[1, 2], :on_comma, ","],
 [[1, 3], :on_int, "2"],
 [[1, 4], :on_rbracket, "]"],
 [[1, 5], :on_period, "."],
 [[1, 6], :on_ident, "eachdo"],
 [[1, 12], :on_sp, " "],
 [[1, 13], :on_op, "|"],
 [[1, 14], :on_ident, "e"],
 [[1, 15], :on_op, "|"],
 [[1, 16], :on_ignored_nl, "\n"],
 [[2, 0], :on_kw, "end"],
 [[2, 3], :on_nl, "\n"]]
=====parser=====
nil

each and do are fused into a single identifier, eachdo, and parsing appears to fail as a result.
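The failure can be reproduced in a few lines. This sketch relies on Ripper.sexp returning nil when the source does not parse, as in the output above. The helpers parses? and identifiers are hypothetical names introduced only for this example.

```ruby
require 'ripper'

# Hypothetical helper: true when Ripper can build a tree for the source.
def parses?(src)
  !Ripper.sexp(src).nil?
end

# Hypothetical helper: the text of every identifier token in the source.
def identifiers(src)
  Ripper.lex(src).select { |tok| tok[1] == :on_ident }.map { |tok| tok[2] }
end

with_space    = "[1,2].each do |e|\nend\n"
without_space = "[1,2].eachdo |e|\nend\n"

puts parses?(with_space)    # the version with a space should parse
puts parses?(without_space) # the fused version should not

# Without the space, the lexer sees one identifier instead of
# an identifier followed by the keyword do.
p identifiers(without_space)
```

Here the space is doing real work: without it, the lexer has no way to tell where each ends and do begins.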
