##1. download & install
bin/elasticsearch
A must-have plugin for elasticsearch
bin/plugin install mobz/elasticsearch-head
A plugin that does morphological analysis (Japanese tokenization)
bin/plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.5.0
##2. setting kuromoji
The lazy way: set it as the default analyzer for every index (restart elasticsearch afterwards)
config/elasticsearch.yml
index.analysis.analyzer.default.type: custom
index.analysis.analyzer.default.tokenizer: kuromoji_tokenizer
To configure it per index instead
curl -XPUT http://localhost:9200/index1/ -d '
{
"index": {
"analysis": {
"tokenizer": {
"kuromoji": {
"type": "kuromoji_tokenizer"
}
},
"analyzer": {
"analyzer": {
"type": "custom",
"tokenizer":"kuromoji"
}
}
}
}
}'
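The same per-index settings could also be sent from Python using only the standard library. This is a minimal sketch, assuming the node from step 1 listens on http://localhost:9200; the helper names `build_kuromoji_settings` and `create_index` are made up for illustration, and nothing is sent until `create_index` is called.

```python
import json
import urllib.request


def build_kuromoji_settings(analyzer_name="analyzer", tokenizer_name="kuromoji"):
    """Build the same settings body as the curl command above."""
    return {
        "index": {
            "analysis": {
                "tokenizer": {tokenizer_name: {"type": "kuromoji_tokenizer"}},
                "analyzer": {
                    analyzer_name: {"type": "custom", "tokenizer": tokenizer_name}
                },
            }
        }
    }


def create_index(index_name, base_url="http://localhost:9200"):
    """PUT the settings to create the index (needs a running node)."""
    body = json.dumps(build_kuromoji_settings()).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index_name}/",
        data=body,
        method="PUT",
        # Assumption: newer Elasticsearch versions insist on this header;
        # the version used in this note does not need it.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, `create_index("index1")` should do the same thing as the curl command.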
##3. check
Check the analyzer (the sample text これはペンです means "This is a pen")
curl -XPOST 'http://localhost:9200/index1/_analyze?analyzer=analyzer&pretty' -d 'これはペンです'
{
"tokens": [
{
"token": "これ",
"start_offset": 0,
"end_offset": 2,
"type": "word",
"position": 1
},
{
"token": "は",
"start_offset": 2,
"end_offset": 3,
"type": "word",
"position": 2
},
{
"token": "ペン",
"start_offset": 3,
"end_offset": 5,
"type": "word",
"position": 3
},
{
"token": "です",
"start_offset": 5,
"end_offset": 7,
"type": "word",
"position": 4
}
]
}
Good. Looks like it's working properly.
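To check the output by something better than eyeballing, the response can be parsed and compared against the input text. A rough sketch, with the response above pasted in as a string; `extract_tokens` and `offsets_match` are hypothetical helper names, not part of any Elasticsearch API.

```python
import json

# The _analyze response shown above, pasted in as-is.
SAMPLE_RESPONSE = """
{"tokens": [
  {"token": "これ", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1},
  {"token": "は", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2},
  {"token": "ペン", "start_offset": 3, "end_offset": 5, "type": "word", "position": 3},
  {"token": "です", "start_offset": 5, "end_offset": 7, "type": "word", "position": 4}
]}
"""


def extract_tokens(analyze_response):
    """Return the token strings ordered by their position."""
    ordered = sorted(analyze_response["tokens"], key=lambda t: t["position"])
    return [t["token"] for t in ordered]


def offsets_match(text, analyze_response):
    """Check that every token equals text[start_offset:end_offset]."""
    return all(
        text[t["start_offset"]:t["end_offset"]] == t["token"]
        for t in analyze_response["tokens"]
    )


response = json.loads(SAMPLE_RESPONSE)
print(extract_tokens(response))                    # ['これ', 'は', 'ペン', 'です']
print(offsets_match("これはペンです", response))    # True
```

The sentence is split into four words, which is what you want from a morphological analyzer: a query for ペン can then match that token exactly.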
Register sample 1 (これはパンです, "This is bread")
curl -XPUT http://localhost:9200/index1/type1/1 -d '{"text":"これはパンです"}'
Register sample 2 (これはペンです, "This is a pen")
curl -XPUT http://localhost:9200/index1/type1/2 -d '{"text":"これはペンです"}'
Search!
curl -XGET http://localhost:9200/index1/type1/_search -d '{"query": {"match": {"text": "ペン"}}}'
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.15342641,
"hits": [
{
"_index": "index1",
"_type": "type1",
"_id": "2",
"_score": 0.15342641,
"_source": {
"text": "これはペンです"
}
}
]
}
}
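Only document 2 comes back: ペン is a token of "これはペンです" but not of "これはパンです". Pulling the useful parts out of the response could look like the sketch below. It works on a trimmed copy of the response above, and `build_match_query` and `summarize_hits` are names invented here for illustration.

```python
def build_match_query(field, text):
    """Build the same match query body as the curl command above."""
    return {"query": {"match": {field: text}}}


# Trimmed copy of the search response shown above.
SAMPLE_SEARCH_RESPONSE = {
    "took": 5,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 0.15342641,
        "hits": [
            {
                "_index": "index1",
                "_type": "type1",
                "_id": "2",
                "_score": 0.15342641,
                "_source": {"text": "これはペンです"},
            }
        ],
    },
}


def summarize_hits(search_response):
    """Return (id, score, text) tuples for each hit."""
    return [
        (hit["_id"], hit["_score"], hit["_source"]["text"])
        for hit in search_response["hits"]["hits"]
    ]


print(build_match_query("text", "ペン"))
print(summarize_hits(SAMPLE_SEARCH_RESPONSE))
```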
##4. python client
$ pip install elasticsearch
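Once the package is installed, the search from step 3 might look roughly like this from Python. Treat it as a sketch under assumptions: the node is at http://localhost:9200, `index1` already holds the two sample documents, and `search_text` and `build_query` are hypothetical helpers. The constructor and keyword arguments of the `elasticsearch` package have changed between client versions, so check the docs for the version you installed.

```python
def build_query(keyword, field="text"):
    """Same match query as the curl search in step 3."""
    return {"query": {"match": {field: keyword}}}


def search_text(keyword, index_name="index1", host="http://localhost:9200"):
    """Run the match query and return the matched texts (needs a running node)."""
    # Imported here so the file still loads before `pip install elasticsearch`.
    from elasticsearch import Elasticsearch

    es = Elasticsearch([host])
    # Assumption: this client version accepts the request body via `body=`.
    response = es.search(index=index_name, body=build_query(keyword))
    return [hit["_source"]["text"] for hit in response["hits"]["hits"]]
```

With the samples from step 3 registered, `search_text("ペン")` should return only the text of document 2.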