Search in Riak 2.0

Last updated at 2015-10-13, posted at 2014-09-08

These are notes from trying out the Yokozuna-based Search (Solr internally) that became an official feature in Riak 2.0.
See the official Riak documentation for details.

Preparation

This assumes Riak 2.0 or later is already set up.
The examples here use the Ruby client, so it also needs to be available. The Ruby client is taken from the latest master on GitHub.

$ git clone https://github.com/basho/riak-ruby-client.git
$ cd riak-ruby-client
$ bundle install --path vendor/bundle
$ bundle exec irb
>

That completes the test environment.

Configuration

The only configuration needed is to set search to on in riak.conf.

riak.conf

search = on

Restart Riak after changing the setting.

Creating an index

Run the following from irb.

irb

require "riak"

client = Riak::Client.new(nodes:[{:host => "127.0.0.1"}])
client.create_search_index("famous")

Creating a bucket type

Run the following from the shell to create and activate the bucket type animals.

$ riak-admin bucket-type create animals '{"props": {}}'
$ riak-admin bucket-type activate animals

Enabling the index

Next, configure the bucket cats under animals so that its objects are indexed.

$ curl -XPUT http://localhost:8098/types/animals/buckets/cats/props \
       -H 'Content-Type:application/json' \
       -d '{"props": {"search_index": "famous"}}'
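The same request can be put together from Ruby using only the standard library. The following is a minimal sketch, assuming the same host, port, and path as the curl command above; the helper name `build_index_props_request` is hypothetical and is not part of the Riak Ruby client. It only builds the request, and the commented lines show how it could be sent to a running node.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) the PUT request that
# associates a search index with one bucket under a bucket type.
def build_index_props_request(bucket_type, bucket, index_name,
                              base_url = "http://localhost:8098")
  uri = URI.parse("#{base_url}/types/#{bucket_type}/buckets/#{bucket}/props")
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate("props" => { "search_index" => index_name })
  [uri, request]
end

uri, request = build_index_props_request("animals", "cats", "famous")

# Sending it requires a running Riak node:
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
#   puts response.code
```

Keeping request construction separate from sending makes the path and JSON body easy to inspect before anything touches the cluster.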

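Incidentally, to index every bucket under a bucket type in the same way, the documentation suggests setting the index as a property when creating the bucket type rather than configuring each bucket individually.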

$ riak-admin bucket-type create animals '{"props": {"search_index":"famous"}}'
$ riak-admin bucket-type activate animals

Storing and indexing data

Let's store some data through the Ruby interface.

irb

require "riak"

client = Riak::Client.new(nodes:[{:host => "127.0.0.1"}])
bucket = client.bucket("cats")

cat = bucket.get_or_new("liono")
cat.data = {"name_s" => "Lion-o", "age_i" => 30, "leader_b" => true}
cat.store(type:"animals")

cat = bucket.get_or_new("cheetara")
cat.data = {"name_s" => "Cheetara", "age_i" => 28, "leader_b" => false}
cat.store(type:"animals")

cat = bucket.get_or_new("snarf")
cat.data = {"name_s" => "Snarf", "age_i" => 43}
cat.store(type:"animals")

cat = bucket.get_or_new("panthro")
cat.data = {"name_s" => "Panthro", "age_i" => 36}
cat.store(type:"animals")

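Passing type:"animals" to store specifies the bucket type for which indexing was enabled.

As a feature of Riak 2.0 Search, when _yz_default is specified as the schema, the type of each field appears to be inferred automatically from the suffix of its name and indexed accordingly. This time the index was created with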

client.create_search_index("famous")

but this is equivalent to

client.create_search_index("famous", "_yz_default")

which means _yz_default was specified as the schema, so the types seem to be inferred from the field names.

The following suffixes appear to be available for field names.

_s: string
_i: integer
_b: boolean (true/false values)

To store a list as a value, it seems sufficient to append an extra s to the suffix.
For example, a list of strings uses _ss.

json

{"people_ss":["Ryan", "Eric", "Brett"]}
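To make the suffix convention concrete, here is a minimal Ruby sketch that guesses how a field would be typed from its name alone. It covers only the three suffixes listed above; the helper `guess_field_type` is hypothetical and not part of Riak or the Ruby client, and the multi-valued forms other than `_ss` (such as `_is` and `_bs`) are assumed to follow the same pattern.

```ruby
# Suffix letter => type, limited to the suffixes mentioned in this article.
SUFFIX_TYPES = {
  "s" => :string,
  "i" => :integer,
  "b" => :boolean
}.freeze

# Hypothetical helper: returns the guessed type and whether the field holds
# a list, or nil when the name carries none of the known suffixes.
def guess_field_type(field_name)
  match = field_name.match(/_([sib])(s?)\z/)
  return nil unless match

  { type: SUFFIX_TYPES[match[1]], multi_valued: match[2] == "s" }
end

guess_field_type("name_s")    # => {:type=>:string, :multi_valued=>false}
guess_field_type("people_ss") # => {:type=>:string, :multi_valued=>true}
guess_field_type("name")      # => nil
```

This mirrors the naming rule only; the actual typing is done by the schema on the Riak side.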

In addition, the following fields are added automatically.

_yz_rk: key
_yz_rt: bucket type
_yz_rb: bucket
_yz_err: error

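Queries

The data was indexed at the moment it was stored in the previous section, so let's follow the examples in the official documentation and run queries through the Ruby interface to retrieve it.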

irb

results = client.search("famous", "name_s:Lion*")

This example searches for objects whose name_s field value starts with Lion.

A numeric range search:

irb

results = client.search("famous", "age_i:[30 TO *]")

Multiple conditions are combined with AND.

irb

results = client.search("famous", "leader_b:true AND age_i:[30 TO *]")
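When queries are assembled from several conditions, small string helpers keep the syntax in one place. The following is a minimal sketch; `range_clause` and `all_of` are hypothetical helpers, not part of the Riak Ruby client, and they do no escaping of special characters.

```ruby
# Hypothetical helper: builds an inclusive range clause, open-ended by default.
def range_clause(field, from, to = "*")
  "#{field}:[#{from} TO #{to}]"
end

# Hypothetical helper: joins clauses so that all of them must match.
def all_of(*clauses)
  clauses.join(" AND ")
end

query = all_of("leader_b:true", range_clause("age_i", 30))
# => "leader_b:true AND age_i:[30 TO *]"
```

The resulting string can be passed as the second argument to client.search, exactly like the literal query above.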

The search results can be taken from ["docs"] of the return value as an array.

irb

results["docs"]

[{"score"=>"2.32468770000000013454e+00",
  "_yz_rb"=>"cats",
  "_yz_rt"=>"animals",
  "_yz_rk"=>"liono",
  "_yz_id"=>"1*animals*cats*liono*10",
  "name_s"=>"Lion-o", "age_i"=>"30", "leader_b"=>"true"
}]
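Note that the field values in this output come back as strings ("30", "true"). A minimal sketch of converting them back to Ruby values based on the field suffix follows; `cast_doc` is a hypothetical helper, and it handles only the _i and _b suffixes used in this article.

```ruby
# Hypothetical helper: converts string values of a result document back to
# Ruby values according to the field-name suffix.
def cast_doc(doc)
  doc.each_with_object({}) do |(field, value), typed|
    typed[field] =
      case field
      when /_i\z/ then Integer(value)
      when /_b\z/ then value == "true"
      else value
      end
  end
end

# Sample document copied from the output above.
doc = {
  "score" => "2.32468770000000013454e+00",
  "_yz_rb" => "cats",
  "_yz_rt" => "animals",
  "_yz_rk" => "liono",
  "_yz_id" => "1*animals*cats*liono*10",
  "name_s" => "Lion-o", "age_i" => "30", "leader_b" => "true"
}

typed = cast_doc(doc)
typed["age_i"]    # => 30
typed["leader_b"] # => true
```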

The full return value looked like this.

irb

results

{"max_score"=>2.3246877193450928,
 "num_found"=>1,
 "docs"=>[{"score"=>"2.32468770000000013454e+00", 
           "_yz_rb"=>"cats",
           "_yz_rt"=>"animals",
           "_yz_rk"=>"liono",
           "_yz_id"=>"1*animals*cats*liono*10",
           "name_s"=>"Lion-o", "age_i"=>"30", "leader_b"=>"true"
         }]
}
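Since each document carries _yz_rt, _yz_rb, and _yz_rk, a result set can be reduced to the locations of the matching objects. The following is a minimal sketch working on a hash shaped like the return value above; `object_locations` is a hypothetical helper, not part of the Riak Ruby client.

```ruby
# Hypothetical helper: extracts bucket type, bucket, and key for every
# document in a search result.
def object_locations(results)
  results.fetch("docs", []).map do |doc|
    { type: doc["_yz_rt"], bucket: doc["_yz_rb"], key: doc["_yz_rk"] }
  end
end

# Sample result copied from the output above (unrelated fields omitted).
results = {
  "max_score" => 2.3246877193450928,
  "num_found" => 1,
  "docs" => [{ "_yz_rb" => "cats", "_yz_rt" => "animals", "_yz_rk" => "liono" }]
}

locations = object_locations(results)
# => [{:type=>"animals", :bucket=>"cats", :key=>"liono"}]
```

Each entry identifies one stored object by bucket type, bucket, and key, which is what is needed to read the full object back.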

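There are more exciting features, such as pagination and MapReduce integrated with Search, but this is getting long, so that is all for now.
Together with Data Types, 2.0 adds more application-oriented features, which I found helpful in many ways.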