TL;DR
- For a quick and simple approach, use the match_all query
- For large amounts of data, use the Scroll API
Preparation
Set up an Elasticsearch server:
https://qiita.com/tobita_yoshiki/items/9de9cfc4beed26b792c1
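Fetching all documents with the match_all query
You can fetch documents with a match_all query as shown below.
Note, however, that the size parameter defaults to 10, so at most 10 hits are returned unless you set it explicitly.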
[~]$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200/my-index-000001/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match_all": {}
}
}
'
{"took":132,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"my-index-000001","_id":"pck94o8BH2JNu7nWA_Fi","_score":1.0,"_source":{"k": "v"}}]}}
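The maximum value of the size parameter is the index's max_result_window setting (index.max_result_window, 10,000 by default). More precisely, from + size must not exceed it.
If size is set above that limit, the request is rejected with a 400 error.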
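As a rough illustration of the limit described above, the following Python sketch builds a match_all request body and refuses a size the server would reject. The helper name `build_match_all_body` and the constant `DEFAULT_MAX_RESULT_WINDOW` are hypothetical, and the value 10,000 is only the assumed default; check your own index settings for the actual limit.

```python
import json

# Assumed default of index.max_result_window; your index may override it.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def build_match_all_body(size, from_=0, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return a JSON match_all body, or raise if from + size exceeds the window."""
    if size < 0 or from_ < 0:
        raise ValueError("size and from must be non-negative")
    if from_ + size > max_result_window:
        raise ValueError(
            f"from + size ({from_ + size}) exceeds max_result_window ({max_result_window})"
        )
    return json.dumps({"from": from_, "size": size, "query": {"match_all": {}}})


if __name__ == "__main__":
    # Prints a body that could be passed to curl with -d.
    print(build_match_all_body(100))
```

Checking on the client side only saves a round trip; the server still enforces the limit and returns the 400 error on its own.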
Scroll API
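A plain match_all search cannot return more documents than the index's max_result_window, but the Scroll API lets you retrieve results without that cap.
Fetching 3 documents, 2 at a time
Below is the result of scrolling through the data two documents at a time with size=2.
The first request passes the optional scroll parameter to the Search API, so the response contains both a scroll ID (_scroll_id) and the first page of hits. The value of scroll (1d here) is how long Elasticsearch keeps the search context alive.
The second and later requests use the Scroll API to fetch the remaining data.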
$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X POST "https://localhost:9200/my-index-000001/_search?size=2&scroll=1d" -H 'Content-Type: application/json' -d '
{
"query": {
"match_all": {}
}
}
'
{"_scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn","took":6,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"my-index-000001","_id":"vMmsNJABH2JNu7nW7vGZ","_score":1.0,"_source":{"k": 1}},{"_index":"my-index-000001","_id":"vcmsNJABH2JNu7nW9_Hd","_score":1.0,"_source":{"k": 2}}]}}
$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X POST "https://localhost:9200/_search/scroll?scroll=1d" -H 'Content-Type: application/json' -d '
{
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn"
}
'
{"_scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn","took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"my-index-000001","_id":"vsmtNJABH2JNu7nWAPEh","_score":1.0,"_source":{"k": 3}}]}}
$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X POST "https://localhost:9200/_search/scroll?scroll=1d" -H 'Content-Type: application/json' -d '
{
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn"
}
'
{"_scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn","took":4,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":1.0,"hits":[]}}
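Note that although every request used the same scroll ID, each response returned different data, because the search context on the server keeps track of the position. Once every document has been returned, hits comes back empty.
Clearing the scroll
The Clear scroll API deletes a scroll. After it has been cleared, a request with the same scroll ID fails with a 404 error, as the second request below shows.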
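The request sequence above can be sketched as a loop in Python: run the initial search, then keep calling the Scroll API until a page comes back with no hits. This is a minimal sketch, not a client implementation. `scroll_all`, its two callables, and `FakeScrollServer` are hypothetical names introduced here, and the fake server is an in-memory stand-in for Elasticsearch so that the example runs without a cluster.

```python
from typing import Callable, Dict, Iterator, List


def scroll_all(search: Callable[[], Dict], next_page: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every hit, following scroll pages until an empty page arrives.

    `search` performs the initial request (Search API with `scroll` set);
    `next_page` calls the Scroll API with the given scroll ID.
    Both are hypothetical callables supplied by the caller.
    """
    page = search()
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Pass along the scroll ID taken from the latest response.
        page = next_page(page["_scroll_id"])


class FakeScrollServer:
    """In-memory stand-in for Elasticsearch, for demonstration only."""

    def __init__(self, docs: List[Dict], size: int) -> None:
        self._docs = docs
        self._size = size
        self._pos = 0  # server-side cursor, playing the role of the search context

    def _page(self) -> Dict:
        hits = self._docs[self._pos:self._pos + self._size]
        self._pos += len(hits)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    def search(self) -> Dict:
        self._pos = 0
        return self._page()

    def next_page(self, scroll_id: str) -> Dict:
        assert scroll_id == "fake-scroll-id"
        return self._page()


# Three documents fetched two at a time, mirroring the curl session.
server = FakeScrollServer([{"_source": {"k": i}} for i in (1, 2, 3)], size=2)
values = [hit["_source"]["k"] for hit in scroll_all(server.search, server.next_page)]
print(values)  # [1, 2, 3]
```

Passing the two requests in as callables keeps the loop independent of any particular HTTP library; in real use they would wrap the two curl calls shown in the session.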
$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X DELETE "https://localhost:9200/_search/scroll" -H 'Content-Type: application/json' -d '
{
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn"
}
'
{"succeeded":true,"num_freed":1}
$ curl --cacert http_ca.crt -u elastic:$ELASTIC_PASSWORD -X POST "https://localhost:9200/_search/scroll?scroll=1d" -H 'Content-Type: application/json' -d '
{
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjRZQUV4MXpfUTFDcnJucFBOTzBqMWcAAAAAAAAADhZzdUptOEhXSlFfcUpJLWJDcXhpcGdn"
}
'
{"error":{"root_cause":[{"type":"search_context_missing_exception","reason":"No search context found for id [14]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [14]"}}],"caused_by":{"type":"search_context_missing_exception","reason":"No search context found for id [14]"}},"status":404}
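For reference, the Clear scroll call can also be expressed with Python's standard library. The sketch below only builds the DELETE request and does not send it. The function name `clear_scroll_request` and the scroll ID `example-scroll-id` are placeholders made up for this example, and the authentication and CA certificate handling that the curl commands do with -u and --cacert is left out.

```python
import json
import urllib.request


def clear_scroll_request(base_url: str, scroll_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for the Clear scroll API."""
    body = json.dumps({"scroll_id": scroll_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_search/scroll",
        data=body,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )


# Placeholder values; a real scroll ID comes from a search response.
req = clear_scroll_request("https://localhost:9200", "example-scroll-id")
print(req.get_method(), req.full_url)
```

To actually send it, you would pass the request to `urllib.request.urlopen` together with credentials and an SSL context that trusts the cluster's CA certificate.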