
Heroku SearchBox Elasticsearch does not work with Rails Searchkick's default settings

Posted at 2020-05-06

In an app I run in production, creating new indexes started failing around 2020/2/13, so I investigated.

Error

app/models/post.rb
class Post < ApplicationRecord
  searchkick
end

Sample in the non-working state (GitHub)

Running reindex raises an error.

irb(main):001:0> Post.reindex
Traceback (most recent call last):
        1: from (irb):1
Elasticsearch::Transport::Transport::Errors::BadRequest ([400] {"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[oin-1][10.0.24.93:9300][indices:admin/create]"}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400})

The error appears to be raised in create_index.
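
The two bracketed numbers in the message are the useful part: the limit the cluster enforces (1) and the difference that was requested (49, which matches a max_gram of 50 minus a min_gram of 1). As a minimal sketch, they can be pulled out of the error text with a regular expression. `parse_diff_limit` is a hypothetical helper name used only for illustration; it is not part of Searchkick or the Elasticsearch client.

```ruby
# Hypothetical helper: extracts the enforced limit and the requested value
# from an Elasticsearch "must be less than or equal to" error message.
# Returns nil when the message does not contain that pattern.
def parse_diff_limit(message)
  match = message.match(/less than or equal to: \[(\d+)\] but was \[(\d+)\]/)
  return nil unless match

  { limit: match[1].to_i, actual: match[2].to_i }
end

# Message text copied from the error above.
message = "The difference between max_gram and min_gram in NGram Tokenizer " \
          "must be less than or equal to: [1] but was [49]. This limit can " \
          "be set by changing the [index.max_ngram_diff] index level setting."

result = parse_diff_limit(message)
puts "limit=#{result[:limit]} actual=#{result[:actual]}" # prints "limit=1 actual=49"
```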

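Fix

Change the Searchkick index settings:

  • Set max_shingle_size to 4
  • Make the difference between min_gram and max_gram equal to 1

Note: Adjust the values to suit your own environment and requirements.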

app/models/post.rb
class Post < ApplicationRecord
  # Smallest n-gram size; max_gram is MIN_GRAM + 1 so the difference stays at 1.
  MIN_GRAM = 1
  searchkick settings: {
    analysis: {
      filter: {
        searchkick_suggest_shingle: {
          max_shingle_size: 4
        },
        searchkick_ngram: {
          min_gram: MIN_GRAM,
          max_gram: MIN_GRAM + 1
        }
      }
    }
  }
end
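
The settings above name only the keys that change, yet the filter `type` and the other default filters remain in place. This is consistent with the index_options output shown in the investigation, where a custom index.max_ngram_diff appears next to Searchkick's own values. The following is a minimal sketch of that merge behavior, assuming Searchkick deep-merges the `settings:` option into its defaults. The `deep_merge` helper is a plain-Ruby stand-in written for illustration, not Searchkick's code, and the defaults are abridged from the output in this article.

```ruby
# Recursively merges override into base: nested hashes are merged key by key,
# and any other value in override replaces the value in base.
def deep_merge(base, override)
  base.merge(override) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      deep_merge(old_value, new_value)
    else
      new_value
    end
  end
end

# Defaults as reported by Post.searchkick_index.index_options (abridged).
defaults = {
  analysis: {
    filter: {
      searchkick_suggest_shingle: { type: "shingle", max_shingle_size: 5 },
      searchkick_ngram: { type: "ngram", min_gram: 1, max_gram: 50 }
    }
  }
}

# The overrides from the model above.
overrides = {
  analysis: {
    filter: {
      searchkick_suggest_shingle: { max_shingle_size: 4 },
      searchkick_ngram: { min_gram: 1, max_gram: 2 }
    }
  }
}

merged = deep_merge(defaults, overrides)
ngram = merged[:analysis][:filter][:searchkick_ngram]
puts "#{ngram[:type]}: #{ngram[:min_gram]}..#{ngram[:max_gram]}" # prints "ngram: 1..2"
```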

Sample (GitHub)

Investigation

Setting index.max_ngram_diff to 1

The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting.

The message seems to say that index.max_ngram_diff has to be 1, so I set it explicitly.

app/models/post.rb
class Post < ApplicationRecord
  searchkick settings: {
    index: { max_ngram_diff: 1 }
  }
end

Result

The error did not change.

irb(main):001:0> Post.reindex
Traceback (most recent call last):
        1: from (irb):1
Elasticsearch::Transport::Transport::Errors::BadRequest ([400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400})

Checking the index settings

irb(main):001:0> Post.searchkick_index.index_options
{:settings=>
  {:analysis=>
    {:analyzer=>{},
     :filter=>
      {:searchkick_index_shingle=>{:type=>"shingle", :token_separator=>""},
       :searchkick_search_shingle=>
        {:type=>"shingle",
         :token_separator=>"",
         :output_unigrams=>false,
         :output_unigrams_if_no_shingles=>true},
       :searchkick_suggest_shingle=>{:type=>"shingle", :max_shingle_size=>5},
       :searchkick_edge_ngram=>
        {:type=>"edge_ngram", :min_gram=>1, :max_gram=>50},
       :searchkick_ngram=>{:type=>"ngram", :min_gram=>1, :max_gram=>50},
       :searchkick_stemmer=>{:type=>"snowball", :language=>"English"}},
     :char_filter=>{:ampersand=>{:type=>"mapping", :mappings=>["&=> and "]}}},
   :index=>{:max_ngram_diff=>1, :max_shingle_diff=>4}},
# ... (omitted)

max_ngram_diff is now 1, but the actual difference between min_gram and max_gram appears to be the problem:

:searchkick_ngram=>{:type=>"ngram", :min_gram=>1, :max_gram=>50},
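
A quick local check for this kind of violation can be sketched as follows. `ngram_filters_over_limit` is a hypothetical helper written for this article, and it only looks at filters of type "ngram": the configuration that eventually succeeds leaves searchkick_edge_ngram at 1..50, which suggests the limit was not applied to edge_ngram filters here.

```ruby
# Hypothetical helper: returns the names of "ngram" filters whose
# max_gram - min_gram exceeds the given limit.
def ngram_filters_over_limit(filters, limit)
  offending = filters.select do |_name, config|
    config[:type] == "ngram" && (config[:max_gram] - config[:min_gram]) > limit
  end
  offending.keys
end

# Filter values copied from the index_options output above (abridged).
filters = {
  searchkick_edge_ngram: { type: "edge_ngram", min_gram: 1, max_gram: 50 },
  searchkick_ngram: { type: "ngram", min_gram: 1, max_gram: 50 }
}

p ngram_filters_over_limit(filters, 1) # prints [:searchkick_ngram]
```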

Setting analysis.filter.searchkick_ngram.max_gram to 2

app/models/post.rb
class Post < ApplicationRecord
  searchkick settings: {
    analysis: {
      filter: {
        searchkick_ngram: {
          max_gram: 2
        }
      }
    }
  }
end

Result

The error changed.

irb(main):001:0> Post.reindex
Traceback (most recent call last):
        1: from (irb):1
Elasticsearch::Transport::Transport::Errors::BadRequest ([400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"In Shingle TokenFilter the difference between max_shingle_size and min_shingle_size (and +1 if outputting unigrams) must be less than or equal to: [3] but was [4]. This limit can be set by changing the [index.max_shingle_diff] index level setting."}],"type":"illegal_argument_exception","reason":"In Shingle TokenFilter the difference between max_shingle_size and min_shingle_size (and +1 if outputting unigrams) must be less than or equal to: [3] but was [4]. This limit can be set by changing the [index.max_shingle_diff] index level setting."},"status":400})

This time the message seems to say that max_shingle_diff has to be 3.

Setting index.max_shingle_diff to 3

app/models/post.rb
class Post < ApplicationRecord
  searchkick settings: {
    index: {
      max_shingle_diff: 3
    },
    analysis: {
      filter: {
        searchkick_ngram: {
          max_gram: 2
        }
      }
    }
  }
end

Result

The error did not change.

irb(main):001:0> Post.reindex
Traceback (most recent call last):
        1: from (irb):1
Elasticsearch::Transport::Transport::Errors::BadRequest ([400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"In Shingle TokenFilter the difference between max_shingle_size and min_shingle_size (and +1 if outputting unigrams) must be less than or equal to: [3] but was [4]. This limit can be set by changing the [index.max_shingle_diff] index level setting."}],"type":"illegal_argument_exception","reason":"In Shingle TokenFilter the difference between max_shingle_size and min_shingle_size (and +1 if outputting unigrams) must be less than or equal to: [3] but was [4]. This limit can be set by changing the [index.max_shingle_diff] index level setting."},"status":400})

As with max_ngram_diff, the actual value appears to be the problem.
:searchkick_suggest_shingle=>{:type=>"shingle", :max_shingle_size=>5}, so max_shingle_size is currently 5.
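
The "[3] but was [4]" in the message can be reproduced by hand from the formula it quotes: max_shingle_size minus min_shingle_size, plus 1 if unigrams are output. The sketch below assumes the shingle filter defaults when a key is absent (min_shingle_size 2, max_shingle_size 2, output_unigrams true); `shingle_diff` is a hypothetical helper name, not an Elasticsearch or Searchkick API.

```ruby
# Computes the value compared against index.max_shingle_diff, following the
# formula quoted in the error message.
# Assumed defaults when a key is absent: min_shingle_size 2,
# max_shingle_size 2, output_unigrams true.
def shingle_diff(config)
  min = config.fetch(:min_shingle_size, 2)
  max = config.fetch(:max_shingle_size, 2)
  unigrams = config.fetch(:output_unigrams, true)
  (max - min) + (unigrams ? 1 : 0)
end

puts shingle_diff({ type: "shingle", max_shingle_size: 5 }) # prints 4
puts shingle_diff({ type: "shingle", max_shingle_size: 4 }) # prints 3
```

Under these assumptions, a max_shingle_size of 5 gives 4, which exceeds the limit of 3, while a max_shingle_size of 4 gives exactly 3.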

Setting analysis.filter.searchkick_suggest_shingle.max_shingle_size to 4

app/models/post.rb
class Post < ApplicationRecord
  searchkick settings: {
    analysis: {
      filter: {
        searchkick_suggest_shingle: {
          max_shingle_size: 4
        },
        searchkick_ngram: {
          max_gram: 2
        }
      }
    }
  }
end

Result

reindex succeeded.

irb(main):001:0> Post.reindex
D, [2020-05-06T04:53:35.428208 #4] DEBUG -- :   Post Load (632.2ms)  SELECT "posts".* FROM "posts" ORDER BY "posts"."id" ASC LIMIT $1  [["LIMIT", 1000]]
D, [2020-05-06T04:53:35.532908 #4] DEBUG -- :   Post Import (84.4ms)  {"count":3}
=> true

Note: This article is reposted from my own blog.
https://akinov.hatenablog.com/entry/2020/05/06/141639
