
Implementing an Elasticsearch Client in Python

Last updated at 2021-10-10 / Posted at 2021-10-09

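Introduction

Elasticsearch (ES) provides client APIs for several languages. This article uses the Python client and explains how to use it by walking through a working implementation.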

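Prerequisites

Elasticsearch 7.15.0 installed
Python 3.9.7 installed
Kibana 7.15.0 installed (optional; Postman or any other HTTP client you prefer works too)
CentOS 8.3 (Linux is not required)

Note: Elasticsearch and Kibana must be the same version, otherwise Kibana will not work.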

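Installing the Elasticsearch package

# python -m pip install "elasticsearch>=7.15,<8"

The version is pinned to 7.x to match the Elasticsearch 7.15.0 server above. An unpinned install pulls the latest major version of the client, whose API differs from the examples in this article.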

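Connecting the client

To load data into ES, first connect to the ES server.
The sample below assumes a three-node ES cluster. http_auth is needed when the cluster requires basic authentication; for SSL/TLS connections, also pass use_ssl=True.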

from elasticsearch import Elasticsearch

es = Elasticsearch(
    [
        {"host": "172.x.x.x", "port": 9200}, 
        {"host": "172.x.x.x", "port": 9200},
        {"host": "172.x.x.x", "port": 9200}
    ],
    http_auth=("username", "secret"), 
    timeout=3600
)

Calling es = Elasticsearch() with no arguments connects to port 9200 on localhost.
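Constructing the client does not by itself prove that the cluster is reachable. The sketch below polls the client's ping() method, which returns True or False, until it succeeds. The helper name wait_for_cluster and its retries and delay parameters are assumptions made for illustration, not part of the library.

```python
import time


def wait_for_cluster(client, retries=5, delay=1.0):
    """Return True as soon as client.ping() succeeds, False if every attempt fails."""
    for attempt in range(retries):
        if client.ping():
            return True
        # Sleep between attempts, but not after the last one
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

With the es object created above, this could be used as wait_for_cluster(es, retries=3) before sending any requests.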

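Creating an index

# Create an index named my-index
result = es.indices.create(index='my-index', ignore=400)
print(result)

In real code you need to deal with errors, for example by passing ignore=400.
With ignore=400, the client does not raise an exception when an index with the same name already exists; the error response is returned instead.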
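As an alternative to ignore=400, the index can be checked for explicitly before it is created. The following is a minimal sketch, assuming a 7.x client object; ensure_index and its mappings parameter are hypothetical names introduced here. It relies only on indices.exists() and indices.create().

```python
def ensure_index(client, name, mappings=None):
    """Create the index only if it is missing; return True when it was created."""
    if client.indices.exists(index=name):
        return False
    # Only send a request body when mappings were supplied
    body = {"mappings": mappings} if mappings else None
    client.indices.create(index=name, body=body)
    return True
```

For example, ensure_index(es, "my-index", {"properties": {"title": {"type": "text"}, "date": {"type": "date"}}}) would create the index with explicit field types. Another process could still create the index between the two calls, so ignore=400 remains useful when several writers start at the same time.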

Deleting an index

result = es.indices.delete(index='my-index', ignore=[400, 404])
print(result)

Deleting an index that does not exist results in a 404 error.
Passing 404 in ignore tells the client to ignore that error.

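Inserting data

Use the index() method. It is comparable to inserting a row into a database table and takes index, doc_type, and body as arguments. Note that doc_type is deprecated as of Elasticsearch 7 and can be omitted.

from datetime import datetime

# Document to send as the body
doc = {
    'level': 'info',
    'timestamp': datetime.now(),
    'detail': 'connect',
}
# Index the document into my-index (by default the index is created if it does not exist)
res = es.index(index="my-index", doc_type="log", id="100001", body=doc)
# Print the result
print(res['result'])

The id is optional for index(). If it is omitted, an id is generated automatically.

create() can also insert data. With create(), the id is required, and the call fails if a document with that id already exists.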

Inserting a list of documents

datas = [
    {
        'title': 'hello morning',
        'date': '2021-11-16'
    },
    {
        'title': 'hello morning2',
        'date': '2021-12-16'
    },
    {
        'title': 'hello morning3',
        'date': '2021-12-17'
    },
    {
        'title': 'hello morning4',
        'date': '2021-12-18'
    }
]

for data in datas:
    es.index(index='my-index', doc_type='log', body=data)
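The loop above sends one HTTP request per document. For larger lists, the client package ships elasticsearch.helpers.bulk(), which accepts an iterable of action dicts and sends them in batches. The sketch below only builds those actions; to_bulk_actions is a hypothetical helper name used for illustration.

```python
def to_bulk_actions(docs, index):
    """Yield one bulk action per document in the dict form accepted by helpers.bulk()."""
    for doc in docs:
        # No "_id" is set, so Elasticsearch generates an id for each document
        yield {"_index": index, "_source": doc}
```

Assuming the es client and the datas list from above, the actions could be sent with helpers.bulk(es, to_bulk_actions(datas, "my-index")) after from elasticsearch import helpers.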

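Updating data

doc = {
    'level': 'info',
    'timestamp': datetime.now(),
    'detail': 'connect',
}
# update() expects the changed fields under the 'doc' key; the id must belong to an existing document
result = es.update(index='my-index', doc_type='log', body={'doc': doc}, id='100001')
print(result)

Specify the id and call update() to update the document.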
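The Update API does not take the raw document as its body: the changed fields go under a "doc" key, and setting "doc_as_upsert" inserts the document when the id does not exist yet. The sketch below builds such a body; build_update_body is a hypothetical helper name, not a library function.

```python
def build_update_body(changes, upsert=False):
    """Wrap the changed fields in the body format expected by the Update API."""
    body = {"doc": dict(changes)}
    if upsert:
        # Insert the document if no document with the given id exists
        body["doc_as_upsert"] = True
    return body
```

A call could then look like es.update(index="my-index", id="100001", body=build_update_body({"level": "warn"})), which changes only the level field and leaves the other fields untouched.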

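Deleting data

result = es.delete(index='my-index', doc_type='log', id=1)
print(result)

Delete a document by specifying its id. If no document has that id, the client raises NotFoundError unless ignore=404 is passed.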

Retrieving a document

res = es.get(index="my-index", id='100001')
print(res['_source'])
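get() raises an exception when the id does not exist. A minimal sketch of a tolerant lookup follows, assuming the 7.x client's ignore parameter as used earlier in this article; get_source_or_none is a hypothetical helper name.

```python
def get_source_or_none(client, index, doc_id):
    """Return the document's _source, or None when the document is not found."""
    # With ignore=404 the client returns the response body instead of raising
    res = client.get(index=index, id=doc_id, ignore=404)
    if not res.get("found"):
        return None
    return res["_source"]
```

For example, get_source_or_none(es, "my-index", "100001") would return the stored fields, while an unknown id would yield None.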

Search

# Search (first 100 hits)
query = {
  "size": 100, 
  "query": {
    "match_all": {}
  }
}
res = es.search(index="my-index", body=query)
for hit in res['hits']['hits']:
    print(hit['_source'])

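res['hits']['hits'] holds the first page of matching documents.

res['hits']['total'] holds the number of matches. In Elasticsearch 7.x it is an object such as {'value': 4, 'relation': 'eq'}, so the count itself is res['hits']['total']['value'].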
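To read pages beyond the first, the search body accepts a "from" offset next to "size". The sketch below builds a match_all query for a given page and extracts the hit count from a response in either the 7.x object form or the older integer form. build_page_query and total_hits are hypothetical helper names.

```python
def build_page_query(page, page_size=100):
    """Build a match_all search body for a 1-based page number."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,
        "query": {"match_all": {}},
    }


def total_hits(res):
    """Return the hit count from a search response."""
    total = res["hits"]["total"]
    # 7.x returns {"value": n, "relation": ...}; older versions return a plain int
    if isinstance(total, dict):
        return total["value"]
    return total
```

A second page could be fetched with es.search(index="my-index", body=build_page_query(2)). By default, "from" plus "size" may not exceed 10,000 (the index.max_result_window setting), so this approach suits shallow paging only.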
