
Summary of what I did to add Elasticsearch to a Rails app's search

Last updated at 2019-06-25, posted at 2015-08-23

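Introduction

These are my notes from replacing the search in a web service written in Rails, moving from MySQL's InnoDB FTS to Elasticsearch.

I give a broad, shallow walkthrough of the whole flow: installing Elasticsearch with chef, testing it with serverspec, checking Elasticsearch on its own, integrating it into the Rails app, and testing with RSpec. Each step stays at a very introductory level, but I think there is value in having the whole flow in one place, so I wrote it up.

Incidentally, the web service in question is this one.

commit-m: a service for searching example GitHub commit messages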
http://commit-m.minamijoyo.com/

At this scale the system does not need Elasticsearch at all. But as a way to try Elasticsearch for the first time, a setup this simple worked well, because the minimum you have to do stays compact and easy to follow.

For the full source code used in the explanations, see the following branches.

Rails app
https://github.com/minamijoyo/commit-m/tree/change-fts-to-es

chef configuration and so on
https://github.com/minamijoyo/commit-infra/tree/change-fts-to-es

I used Elasticsearch 1.5.0 and Rails 4.2.0.

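Installing Elasticsearch

If java is installed, installing Elasticsearch only takes downloading the zip distributed on the official site and extracting it.
On some platforms it is also distributed as a package such as an rpm. There are plenty of guides for those steps on the web, so I'll skip them.

There is not much information out there on installing Elasticsearch with chef, so I'll write that part down.

First, add elasticsearch to the Berksfile.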

commit-infra/Berksfile

source "https://supermarket.chef.io"

(omitted)
cookbook 'elasticsearch', '~>1.0.0'

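The way you write things for the elasticsearch cookbook changed between the 0.3 series and the 1.0 series, so be careful when copying samples from the web. Basically, it moved away from recipes and now provides most of its functionality as LWRPs.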
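As a rough illustration of what the '~> 1.0.0' constraint allows, here is a small sketch using RubyGems' own version classes. Berkshelf resolves versions with its own solver, so this is only an analogy, on the assumption that `~>` follows the same pessimistic-constraint rule there. The version numbers in the loop are samples picked for illustration, not actual cookbook releases.

```ruby
require 'rubygems'

# '~> 1.0.0' is the pessimistic operator: >= 1.0.0 and < 1.1.0.
COOKBOOK_CONSTRAINT = Gem::Requirement.new('~> 1.0.0')

# True if the given version string satisfies the constraint.
def allowed?(version)
  COOKBOOK_CONSTRAINT.satisfied_by?(Gem::Version.new(version))
end

# Sample version numbers, chosen only for illustration.
%w[0.3.0 1.0.0 1.0.5 1.1.0].each do |version|
  puts "#{version}: #{allowed?(version)}"
end
```

With this constraint a 0.3-series cookbook is never picked up, which is what protects you from mixing the two styles.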

Fetch the cookbooks with berks.

$ bundle exec berks vendor cookbooks

Create an es role that gives a node the Elasticsearch server role, and have it run both the Elasticsearch installation and the customization recipe.

commit-infra/roles/es.json

{
  "name": "es",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {
    "java": {
      "install_flavor": "openjdk",
      "jdk_version": "8"
    }
  },
  "override_attributes": {},
  "run_list": [
    "recipe[java]",
    "recipe[elasticsearch]",
    "recipe[commitm-elasticsearch]"
  ]
}

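One point to note: Elasticsearch depends on java, and I explicitly specify OpenJDK 8 as the jdk version. With jdk 7, installing the head plugin failed with an SSL certificate error on the CentOS 6.5 machine I use for development, and I lost some time on that.

recipe[elasticsearch] performs the standard installation, so I create an additional site-cookbook to customize the setup, for example to install plugins.

$ bundle exec knife cookbook create -o site-cookbooks commitm-elasticsearch

Add the dependency to metadata.rb.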

commit-infra/site-cookbooks/commitm-elasticsearch/metadata.rb

name             'commitm-elasticsearch'
(omitted)
depends          'elasticsearch', '~> 1.0.0'

Add the plugin installation and the service startup settings.
Here I install head, a web UI console for Elasticsearch. If you need to handle Japanese text, search for something like kuromoji.

commit-infra/site-cookbooks/commitm-elasticsearch/recipes/default.rb

elasticsearch_plugin 'mobz/elasticsearch-head'

service 'elasticsearch' do
  action :start
end

Add the es role you created to the node's run_list.

commit-infra/nodes/commitm-ap.json

{
  "environment": "production",
  "run_list": [
    "role[base]",
    "role[ap]",
    "role[db]",
    "role[es]"
  ]
}

Once everything is ready, cook with knife solo.
(Leaving aside the "still on chef-solo? It's chef-zero these days" objection.)

$ bundle exec knife solo cook commitm-ap

Testing with serverspec

Add a serverspec test for the es role.

commit-infra/spec/es/elasticsearch_spec.rb

require 'spec_helper'

describe "elasticsearch spec" do
  # package
  describe package('java-1.8.0-openjdk') do
    it { should be_installed }
  end

  # command
  describe command('which elasticsearch') do
    let(:disable_sudo) { true }
    its(:exit_status) { should eq 0 }
  end

  # service
  describe service('elasticsearch') do
    it { should be_enabled }
    it { should be_running }
  end

  # port
  describe port("9200") do
    it { should be_listening }
  end

  # plugin
  describe command('curl http://127.0.0.1:9200/_plugin/head/ -o /dev/null -w "%{http_code}\n" -s') do
    its(:stdout) { should match /^200$/ }
  end
end

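This is a simple test that checks that OpenJDK 1.8 is installed, that the elasticsearch command exists, that the elasticsearch service is enabled at boot and running, that the port is listening, and that the head plugin responds.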
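To make the port check a little more concrete, here is a minimal sketch of a similar check in plain Ruby. Note that it is a different technique from the serverspec matcher: it simply tries to open a TCP connection from the client side, and `port_listening?` is a hypothetical helper name, not part of serverspec.

```ruby
require 'socket'

# Returns true if something accepts TCP connections on host:port.
# Hypothetical helper; serverspec's be_listening inspects the target host instead.
def port_listening?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  # Connection refused, timed out, or the host name could not be resolved.
  false
end

# 9200 is the default Elasticsearch HTTP port mentioned in this article.
puts port_listening?('127.0.0.1', 9200)
```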

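I deliberately manage the serverspec roles separately so that they do not depend on chef, so add the es role here as well. I wrote a blog post a while ago on how to manage target IPs and roles with serverspec, so please refer to that too.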
http://d.hatena.ne.jp/minamijoyo/20150301/p1

commit-infra/hosts.json

[
  (omitted)
  {
    "name": "commitm-ap",
    "host_name": "<%= ENV['TARGET_IP'] %>",
    "user": "ec2-user",
    "port": 22,
    "keys":  "<%= ENV['TARGET_SSH_KEYPATH'] %>",
    "roles":["base", "ap", "db", "es"]
  }
]
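A hosts file like the one above mixes ERB tags into JSON, so it has to be rendered before it is parsed. The following is a minimal sketch of that idea, not the actual Rakefile from the repository: `load_hosts` and `hosts_with_role` are hypothetical helper names, and the template is a shortened copy of the entry above.

```ruby
require 'erb'
require 'json'

# Expand ERB tags (e.g. <%= ENV['TARGET_IP'] %>) and parse the result as JSON.
def load_hosts(template)
  JSON.parse(ERB.new(template).result)
end

# Names of the hosts that carry the given role.
def hosts_with_role(hosts, role)
  hosts.select { |host| host['roles'].include?(role) }
       .map { |host| host['name'] }
end

# Shortened copy of hosts.json, kept inline so the sketch is self-contained.
HOSTS_TEMPLATE = <<~'HOSTS_JSON'
  [
    {
      "name": "commitm-ap",
      "host_name": "<%= ENV['TARGET_IP'] %>",
      "roles": ["base", "ap", "db", "es"]
    }
  ]
HOSTS_JSON

hosts = load_hosts(HOSTS_TEMPLATE)
puts hosts_with_role(hosts, 'es').inspect
```

Keeping the IP address and key path in environment variables means the same file can be used against different machines without editing it.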

Once that is ready, run serverspec as well.

$ bundle exec rake serverspec:commitm-ap

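Basic Elasticsearch usage

Now that Elasticsearch is set up, let's check that it works on its own before wiring it into the Rails app. Elasticsearch itself can be driven with plain HTTP requests from curl. Once you have a rough picture of how it is used this way, what remains is the question of how to do the same thing from Rails, so understanding the basics makes everything else go more smoothly.

Elasticsearch listens on port 9200 by default. Let's hit it with curl. At the root path it responds with information such as the Elasticsearch version number, like this.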

$ curl http://localhost:9200/
{
  "status" : 200,
  "name" : "commitm-dev",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.5.0",
    "build_hash" : "544816042d40151d3ce4ba4f95399d7860dc2e92",
    "build_timestamp" : "2015-03-23T14:30:58Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

Next, let's create an Elasticsearch index.
An Elasticsearch index is roughly what a database is in a relational database system.
The API is RESTful, so to create an index called commitm, for example, you PUT like this.

$ curl -XPUT http://localhost:9200/commitm/

{"acknowledged":true}

Next, let's define a mapping called commit. It is roughly what a table's column type definition is in a relational database.

$ curl -XPUT http://localhost:9200/commitm/commit/_mapping -d '{
  "commit": {
    "properties": {
      "id": { "type": "integer", "index": "not_analyzed" },
      "repo_full_name": { "type": "string" },
      "sha": { "type": "string", "index": "not_analyzed" },
      "message": { "type": "string" }
    }
  }
}'

{"acknowledged":true}

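The integer and string values for type are type definitions and should need no explanation. not_analyzed means the field is not analyzed; you specify it for fields that you want to match exactly rather than partially when searching. The data here is simple, space-separated English text, so I'll skip the explanation of tokenizers and analyzers.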
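The difference between an analyzed and a not_analyzed field can be pictured with a toy model in Ruby. This is only a rough imitation under simplifying assumptions (lowercase, then split on anything that is not a letter or digit); the real analyzers in Elasticsearch do considerably more, and the method names here are made up for illustration.

```ruby
# Very rough stand-in for an analyzer: lowercase the text, then split it
# on every run of characters that are not letters or digits.
def analyze(text)
  text.downcase.split(/[^[:alnum:]]+/).reject(&:empty?)
end

# Analyzed field: the query term hits if it equals one of the tokens.
def analyzed_match?(field_value, term)
  analyze(field_value).include?(term.downcase)
end

# not_analyzed field: the whole value is kept as a single term,
# so only an exact match hits.
def not_analyzed_match?(field_value, term)
  field_value == term
end

message = 'Merge pull request #15762 from twbs/twitter-handle'
sha = '9e1e73f9dcfdf20305dcb6a83e77e67efe1948c5'

puts analyze(message).inspect
puts analyzed_match?(message, 'merge')
puts not_analyzed_match?(sha, '9e1e73f')
puts not_analyzed_match?(sha, sha)
```

In this model a single word finds the message, while the sha only matches when the full value is given.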

Let's register one record as a trial. You load actual data with a PUT like this.

$ curl -XPUT http://localhost:9200/commitm/commit/1 -d '{
  "id": 1,
  "repo_full_name": "twbs/bootstrap",
  "sha": "9e1e73f9dcfdf20305dcb6a83e77e67efe1948c5",
  "message": "Merge pull request #15762 from twbs/twitter-handle"
}'

{"_index":"commitm","_type":"commit","_id":"1","_version":1,"created":true}

To search, send a query with GET. Let's search for documents whose message contains the keyword merge.
If the output is long, adding pretty=true formats it for you.

$ curl -XGET 'http://localhost:9200/commitm/commit/_search?pretty=true' -d '{
  "query": {
    "match": {
      "message": "merge"
    }
  }
}'

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.095891505,
    "hits" : [ {
      "_index" : "commitm",
      "_type" : "commit",
      "_id" : "1",
      "_score" : 0.095891505,
      "_source":{
  "id": 1,
  "repo_full_name": "twbs/bootstrap",
  "sha": "9e1e73f9dcfdf20305dcb6a83e77e67efe1948c5",
  "message": "Merge pull request #15762 from twbs/twitter-handle"
}
    } ]
  }
}
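The request body and the response above are plain JSON, so they are easy to handle from Ruby as well. Here is a small sketch using only the standard json library; `match_query` and `hit_sources` are hypothetical helper names, and the sample response is a trimmed copy of the output above.

```ruby
require 'json'

# Build the same match query that was passed to curl with -d.
def match_query(field, keyword)
  { query: { match: { field => keyword } } }
end

# Pull the original documents (_source) out of a search response body.
def hit_sources(response_body)
  JSON.parse(response_body).fetch('hits').fetch('hits').map { |hit| hit['_source'] }
end

# Trimmed copy of the search response shown above.
SAMPLE_RESPONSE = <<~'RESPONSE'
  {
    "hits": {
      "total": 1,
      "hits": [
        {
          "_id": "1",
          "_source": {
            "id": 1,
            "repo_full_name": "twbs/bootstrap",
            "message": "Merge pull request #15762 from twbs/twitter-handle"
          }
        }
      ]
    }
  }
RESPONSE

puts JSON.generate(match_query('message', 'merge'))
puts hit_sources(SAMPLE_RESPONSE).map { |source| source['message'] }.inspect
```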

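Are you starting to get a feel for bare Elasticsearch?

It is now clearer what Elasticsearch does for us, but reading and writing JSON by hand feels a bit painful. So let's make it usable from the Rails app.

Integrating Elasticsearch into the Rails app

To integrate Elasticsearch into the Rails app, add the following gems to the Gemfile. They make the Elasticsearch API pleasant to use from Rails.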

commit-m/Gemfile

(omitted)
gem 'elasticsearch-rails', '~> 0.1.7'
gem 'elasticsearch-model', '~> 0.1.7'

Install them with bundle.

$ bundle install

Next, build the search-related logic of the commit model as a concern.

commit-m/app/models/concerns/commit/searchable.rb

require 'active_support/concern'
module Commit::Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model

    index_name "commitm"

    settings index: {
      number_of_shards: 1,
      number_of_replicas: 0
    } do
      mapping _source: { enabled: true } do
        indexes :id, type: 'integer', index: 'not_analyzed'
        indexes :repo_full_name, type: 'string'
        indexes :sha, type: 'string', index: 'not_analyzed'
        indexes :message, type: 'string'
      end
    end
  end

  module ClassMethods
    def create_index!(options={})
      client = __elasticsearch__.client
      client.indices.delete index: "commitm" rescue nil if options[:force]
      client.indices.create index: "commitm",
        body: {
          settings: settings.to_hash,
          mappings: mappings.to_hash
        }
    end
  end
end

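Inside the module, include Elasticsearch::Model to pull in a set of convenient methods.

index_name is the index name, and settings holds the index settings. number_of_shards and number_of_replicas configure shards and replicas and relate to fault tolerance and performance, but the requirements here are modest, so I'll set them aside for now.

In mapping, write much the same thing as the mapping definition shown earlier. Think of it as writing a migration for a Rails model.

create_index! is a helper that actually creates the index. I'll run it from the Rails console later. __elasticsearch__.client gives you the Elasticsearch client object, and you can perform various operations through it.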
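The control flow of the create_index! helper, in particular what the force option and the trailing rescue do, can be tried out without a running cluster. The following sketch uses a hypothetical `FakeIndices` class that only records calls in place of the real client, and a standalone `create_index` method that mirrors the helper. It is an illustration, not the gem's implementation.

```ruby
# Hypothetical stand-in for client.indices; it only records the calls it gets.
class FakeIndices
  attr_reader :calls

  def initialize(existing: false)
    @existing = existing
    @calls = []
  end

  # Raises when the index does not exist, like deleting a missing index would.
  def delete(index:)
    @calls << [:delete, index]
    raise 'index not found' unless @existing
    @existing = false
  end

  def create(index:, body:)
    @calls << [:create, index]
    @existing = true
  end
end

# Mirrors the helper: with force, try to delete first and ignore a failure.
def create_index(indices, name, force: false)
  (indices.delete(index: name) rescue nil) if force
  indices.create(index: name, body: {})
end

indices = FakeIndices.new
create_index(indices, 'commitm', force: true)
puts indices.calls.inspect
```

Because the failed delete is swallowed, force: true is safe to use even on the very first run, when no index exists yet.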

Include the module you created in the model.

commit-m/app/models/commit.rb

class Commit < ActiveRecord::Base
  include Commit::Searchable
  def self.search_message(keyword)
    if keyword.present?
      query = {
        "query": {
          "match": {
            "message": keyword
          }
        }
      }
      Commit.__elasticsearch__.search(query)
    else
      Commit.none
    end
  end
end

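This builds a search query from the given keyword and passes it to Commit.__elasticsearch__.search. It is a little surprising that __elasticsearch__.search has sprouted on the Commit model without you noticing, but elasticsearch-rails and elasticsearch-model take care of sending the query to Elasticsearch.

On the controller side, the only thing to worry about this time is pagination. From will_paginate's point of view, the gems appear to smooth over the differences so that it behaves the same as it did with ActiveRecord.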

commit-m/app/controllers/commits_controller.rb

class CommitsController < ApplicationController
  def index
    @commits = []
    @keyword = ""
  end

  def search
    @keyword = params[:keyword]
    @commits = Commit.search_message(@keyword).paginate(page: params[:page])
  end
end

The one thing they did not smooth over: in the view, I had to use @commits.total_entries instead of @commits.count to get the total number of results.

commit-m/app/views/commits/search.html.erb

<%= render 'search_form' %>
<hr>
<% unless @commits.nil? %>
    <%= pluralize(@commits.total_entries, "result") %>.
<% end %>
<% if @commits.any? %>
  <table class="table table-hover">
  (omitted)
  </table>
  <%= will_paginate @commits, :params => { :keyword => @keyword} %>
<% end %>
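The difference between count and total_entries mentioned above comes from pagination: a paginated result only holds the current page. Here is a toy model of that in plain Ruby. The `Page` class and `paginate` method are hypothetical and are not will_paginate's implementation.

```ruby
# Hypothetical page object: `items` is the current page only,
# `total_entries` is the number of hits across all pages.
class Page
  attr_reader :items, :total_entries

  def initialize(items, total_entries)
    @items = items
    @total_entries = total_entries
  end

  # Number of items on this page, not the overall total.
  def count
    items.size
  end
end

# Slice one page out of the full list of hits.
def paginate(all_hits, page:, per_page:)
  offset = (page - 1) * per_page
  Page.new(all_hits[offset, per_page] || [], all_hits.size)
end

hits = (1..45).to_a
second_page = paginate(hits, page: 2, per_page: 30)
puts second_page.count
puts second_page.total_entries
```

In this model the second page holds 15 items while the total is 45, which is why the view has to show total_entries.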

Once everything is ready, load the data from the Rails console and try an actual search.

$ bundle exec rails c
rails> Commit.create_index!
rails> Commit.import

The create_index! helper defined earlier creates the index, and import loads data into Elasticsearch based on what is in the DB.

While we're here, let's check that searching works from the Rails console too.

rails> Commit.__elasticsearch__.search(
  {
    "query": {
      "match": {
        "message": "merge"
      }
    }
  }
).records.to_a

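Running something like this returns the records found by the search query collected into an array.

Finally, check from the web page: enter a search keyword, and if results come back, you're done.
It works! Happily ever after. If that is enough for you, feel free to stop reading here.

Writing tests for Elasticsearch

Most introductory articles end at "It works! Happily ever after." and say nothing about tests, so I'll add some notes on testing with RSpec as well.

Add the elasticsearch-extensions gem to the Gemfile.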

commit-m/Gemfile

group :test do
    (omitted)
    gem 'elasticsearch-extensions', '~> 0.0.18'
end

$ bundle install

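Load it from the spec helper. Set things up so that Elasticsearch is started and stopped before and after tests tagged with :elasticsearch. Adjust the helper's location to suit your project. Also create helpers for creating the index and importing data, and for deleting the index.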

commit-m/spec/rails_helper.rb

(omitted)
Spork.prefork do
(omitted)
  require 'elasticsearch/extensions/test/cluster'
(omitted)
  RSpec.configure do |config|
  (omitted)
    # Elasticsearch test setting
    config.before(:all, :elasticsearch) do
      Elasticsearch::Extensions::Test::Cluster.start(nodes: 1) unless Elasticsearch::Extensions::Test::Cluster.running?
    end

    config.after(:all, :elasticsearch) do
      Elasticsearch::Extensions::Test::Cluster.stop if Elasticsearch::Extensions::Test::Cluster.running?
    end
  end

  def elasticsearch_create_index_and_import
    Commit.__elasticsearch__.create_index! force: true
    Commit.import
    # Wait a moment so the imported documents become searchable
    sleep 1
  end

  def elasticsearch_delete_index
    Commit.__elasticsearch__.client.indices.delete index: Commit.index_name
  end
end
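The fixed sleep 1 in the helper above is the simplest way to wait until the imported documents are searchable. As one possible alternative, and not something the original code does, a small polling helper can wait only as long as needed. `wait_until` is a hypothetical name, and the block passed to it would be whatever check tells you the data is visible.

```ruby
# Polls the block until it returns a truthy value or the timeout expires.
# Returns true on success and false on timeout.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Stand-in condition for the demo: becomes true on the third check.
attempts = 0
puts wait_until(timeout: 1, interval: 0.01) { (attempts += 1) >= 3 }
```

The trade-off is a little more code in exchange for tests that neither wait longer than necessary nor fail when one second turns out to be too short.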

After that, write the actual RSpec tests.

commit-m/spec/requests/commits_spec.rb

require 'rails_helper'

RSpec.describe "Commits", type: :request do
  subject { page }

  describe "Root Page" do
    before { visit root_path }

    (omitted)
    describe "Search form", :elasticsearch do
      before do
        3.times { FactoryGirl.create(:commit) }
        elasticsearch_create_index_and_import
      end

      after do
        elasticsearch_delete_index
        Commit.delete_all
      end

      describe 'Click Search button' do
        before do
          fill_in "keyword", with:"Message"
          click_button "Search"
        end
        it { should have_content('3 results.') }
      end

    end
  end
  (omitted)
end

Tagging the describe with :elasticsearch controls starting and stopping Elasticsearch. elasticsearch_create_index_and_import creates the Elasticsearch index and loads the data, and elasticsearch_delete_index removes the data by deleting the index.

$ bundle exec rake spec

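And with that, tests that use Elasticsearch work too.
This time for real: happily ever after.

Conclusion

For a very simple model, integrating Elasticsearch into a Rails app was not that hard, thanks to the gems. As models and queries get more complex, things will not stay this simple, but I'd like to share more once I have picked up that know-how.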
