[May 2019 edition] Firestore data maintenance is really easy with Jupyter notebook (Python 3)

Last updated at 2019-05-20
Posted at 2019-05-20

Trying out the Python Firestore client

Firestore is usually operated from JavaScript clients such as Cloud Functions or firebase-admin, but for one-off batch jobs like data maintenance, using Python from a Jupyter notebook is really easy!

Surprisingly, though, the navigation on the official Firebase help site is poor, so finding the documentation for the Python Firestore client is harder than it should be.

So, without further ado, here is the documentation for the Python Firestore client.

Python Client for Google Cloud Firestore

The official help site only touches on usage briefly, so it is a bit annoying that you have to search again to reach the documentation above...

Now let's get started with the setup.

Preparing a service account

First, download the JSON authentication key for a service account that has Firestore editor permissions.
Place it somewhere convenient.
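Before connecting, it can help to sanity-check the downloaded key file. The following is a minimal sketch using only the standard library; the helper name `check_service_account_key` is hypothetical, and the list of keys it checks is an assumption based on what service account key files normally contain.

```python
import json

# Keys that a service account key file is assumed to contain.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return the project_id if the file looks like a service account key.

    Raises ValueError when a required key is missing or the type is wrong.
    """
    with open(path, encoding="utf-8") as f:
        info = json.load(f)

    missing = [key for key in REQUIRED_KEYS if key not in info]
    if missing:
        raise ValueError(f"missing keys in key file: {missing}")
    if info["type"] != "service_account":
        raise ValueError(f"unexpected key type: {info['type']!r}")

    return info["project_id"]
```

Calling `check_service_account_key('/path/to/service_account.json')` also tells you which project the key belongs to, which is handy when you keep several keys around.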

Once that is ready, let's try it from a notebook.

Installing the Python client

This time, let's install it from the Jupyter notebook with !pip.

notebook

!pip install google-cloud-firestore

That is all it takes to install the client!

Fetching some data first

It really is almost anticlimactically simple: code like the following fetches the data.

notebook

from google.cloud import firestore

# Path to the service account key JSON downloaded earlier
service_account_json = '/path/to/service_account.json'

# Create a client authenticated as the service account
client = firestore.Client.from_service_account_json(service_account_json)

mycollection = client.collection('Sample_Collection')

# All documents
for mycol in mycollection.stream():
    item = mycol.to_dict()
    ...

# Up to 500 documents
for mycol in mycollection.limit(500).stream():
    item = mycol.to_dict()
    ...

# Up to 500 documents matching a filter
# (filters use where() with the '==' operator; select() only picks fields)
for mycol in mycollection.where('target', '==', '0').limit(500).stream():
    item = mycol.to_dict()
    ...

# Up to 500 documents matching a filter, sorted by name
for mycol in mycollection.where('target', '==', '0').order_by('name').limit(500).stream():
    item = mycol.to_dict()
    ...

Once you pass in the service account JSON, the client is ready to connect.

After that, fetch any collection you like with stream() and call to_dict() on each document, and you can process the data however you want.

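Filtering and sorting may require indexes to be set up in advance (for example, a composite index when you filter on one field and sort on another).
You can also paginate by combining offset, start_after, and similar methods.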

The details of these operations are also covered in the official guide.

Paginate data with query cursors
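The paging idea mentioned above can be sketched roughly as follows. This is only an illustration, not the official pattern: the helper name `iter_pages` is hypothetical, and it assumes the query you pass in is already ordered (for example with `order_by('name')`) so that the last snapshot of a page can be handed to `start_after()` as the cursor.

```python
def iter_pages(query, page_size):
    """Yield lists of document snapshots, at most page_size per list.

    `query` is assumed to be an ordered Firestore query object that
    provides limit(), start_after(), and stream().
    """
    cursor = None  # last snapshot of the previous page
    while True:
        page_query = query.limit(page_size)
        if cursor is not None:
            # Resume right after the last document already seen
            page_query = page_query.start_after(cursor)

        page = list(page_query.stream())
        if not page:
            return
        yield page

        if len(page) < page_size:
            # A short page means there is nothing left to fetch
            return
        cursor = page[-1]
```

With the client from the earlier cell, usage would look something like `for page in iter_pages(mycollection.order_by('name'), 500): ...`, handling each page before the next one is requested.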

Saving data

Next up is writing...

notebook

mycollection = client.collection('Sample_Collection')
target = client.collection('Target_Collection')

# Add documents with auto-generated IDs
for mycol in mycollection.stream():
    target.add(mycol.to_dict())

# Write documents with an explicit ID (here, the source document's ID)
for mycol in mycollection.stream():
    item = mycol.to_dict()
    target.document(mycol.id).set(item)

# Update a specific field on a document with a known ID
# (update() fails if the target document does not exist)
for mycol in mycollection.stream():
    item = mycol.to_dict()
    target.document(item['name']).update({'updateDate': firestore.SERVER_TIMESTAMP})

The great thing is that you can pass a dict straight to add or set.
The dict values can be plain Python types, so values such as True and datetime are converted appropriately.
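To make that concrete, here is a small sketch of a dict built only from plain Python values. The function name `build_sample_item` and every field name in it are made up for illustration; the point is just that booleans, numbers, datetimes, lists, nested dicts, and None can all sit in the same dict.

```python
from datetime import datetime, timezone


def build_sample_item():
    """Return a dict of plain Python values suitable for add() or set()."""
    return {
        'name': 'sample',                       # str
        'active': True,                         # bool
        'count': 3,                             # int
        'score': 12.5,                          # float
        # A timezone-aware datetime avoids ambiguity about the stored time
        'createdAt': datetime(2019, 5, 20, tzinfo=timezone.utc),
        'tags': ['maintenance', 'notebook'],    # list
        'profile': {'city': 'Tokyo'},           # nested dict
        'note': None,                           # None
    }
```

With the `target` collection from the cell above, something like `target.document('sample-id').set(build_sample_item())` would write it as a single document.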

Loading data across Firebase projects

Because we connect with a service account JSON, operations that span projects, such as migrating data from staging to production or backing up data, are just as easy, as shown below.

notebook

from google.cloud import firestore

# One service account key per project
staging_sa_json = '/path/to/staging/service_account.json'
prod_sa_json = '/path/to/prod/service_account.json'

# One client per project
stagingProj = firestore.Client.from_service_account_json(staging_sa_json)
prodProj = firestore.Client.from_service_account_json(prod_sa_json)

stagingCollection = stagingProj.collection('Sample_Collection')
prodCollection = prodProj.collection('Sample_Collection')

# Copy every staging document to production under the same ID
for mycol in stagingCollection.stream():
    item = mycol.to_dict()
    prodCollection.document(mycol.id).set(item)

The code above copies all the data, keeping the IDs as they are.
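For larger collections, one possible variation is to group the writes into batches instead of calling set() once per document. The sketch below is only an assumption about how that might look: the helper name `copy_in_batches` is hypothetical, it relies on the client's `batch()` writer with its `set()` and `commit()` methods, and the default batch size of 500 is a commonly used value that should be checked against the current Firestore limits.

```python
def copy_in_batches(source_docs, target_collection, client, batch_size=500):
    """Copy document snapshots into target_collection using batched writes.

    `client` must be the client that owns target_collection.
    Returns the number of documents written.
    """
    batch = client.batch()
    pending = 0  # writes queued in the current batch
    total = 0    # writes queued overall

    for snapshot in source_docs:
        # Keep the same document ID in the target collection
        batch.set(target_collection.document(snapshot.id), snapshot.to_dict())
        pending += 1
        total += 1
        if pending == batch_size:
            batch.commit()
            batch = client.batch()
            pending = 0

    if pending:
        # Flush the final, partially filled batch
        batch.commit()
    return total
```

With the clients defined above, a call might look like `copy_in_batches(stagingCollection.stream(), prodCollection, prodProj)`. Each document still counts as one read and one write; batching only reduces the number of round trips.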

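Also, Firestore documents can have quite deeply nested structures.
When you wonder how to address a nested field from a dict, you will probably end up using FieldPath.

Period-separated (.) names seem to work in most cases.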
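As a rough illustration of the period-separated form, the sketch below flattens a nested dict into dotted field paths that can be passed to update(). The helper name `to_field_paths` is hypothetical, and it assumes simple field names; names that themselves contain periods or other special characters would need proper FieldPath handling instead.

```python
def to_field_paths(data, prefix=''):
    """Flatten a nested dict into {'a.b.c': value} style field paths."""
    paths = {}
    for key, value in data.items():
        path = f'{prefix}.{key}' if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested dicts
            paths.update(to_field_paths(value, path))
        else:
            # Leaf values (and empty dicts) are kept as-is
            paths[path] = value
    return paths
```

For example, `to_field_paths({'profile': {'address': {'city': 'Tokyo'}}})` gives `{'profile.address.city': 'Tokyo'}`. Passing that to `update()` on a document reference changes only the nested `city` field, whereas passing the nested dict directly would replace the whole `profile` map.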

Working with the data from Python is very easy.
It is really handy when you just want to pull some data and tinker with it!

Watch the costs!

However, since these are client operations, every read and write is billed...
Use it responsibly!
