LoginSignup
5
5

More than 5 years have passed since last update.

PostgreSQL JSONB #> 演算子の性能

Last updated at Posted at 2016-10-12

はじめに

例によって役に立つかどうかは分からないけど、JSONBに対する検索で何が性能に影響を与えるのか、気になったので調べてみた。

JSONBの検索に影響を与えそうなもの

単純に考えると以下の3つ。

  • 階層数
  • 同一階層のキー/値の組の数
  • 配列要素数

とりあえず、どれが一番影響しそうかを調べてみた。

測定パターン

測定環境は手持ちのノートPC上のVM。
なので数値の絶対値にそれほど意味はない。

階層数の測定

検索対象のデータ

以下のようなJSONデータを100万件挿入する。

{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":{"key00":"dummy", "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}, "key01":"dummy","key02":"dummy","key03":"dummy","key04":"dummy","key05":"dummy","key06":"dummy","key07":"dummy","key08":"dummy","key09":"dummy"}

長くてわかりづらいけど、10個の階層を持つJSON文書である。

検索パターン

これに対して、以下のようなクエリをそれぞれ5回ずつ発行して、そのacural timeの平均をとる。

EXPLAIN ANALYZE SELECT data #> '{key01}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key00,key00,key01}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key00,key00,key00,key00,key01}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key00,key00,key00,key00,key00,key00,key00,key00,key00,key01}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key01}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key00,key01}' FROM test;

測定結果

jsonb-depth.png

階層数は明らかに検索性能に影響していそうだ。
これが演算子に指定した配列数が主要因なのか、JSONBの内部構造のツリートラバース時間自体が主要因なのかまでは調査できていないが・・・。

まあ、それ以前に「そんな深い階層を持つようなJSON文書なんてつかわねーよ!」という説もあるが。

同一階層の項目数

同一階層に多数の項目数が合った場合、指定したキーによって性能に影響があるのか確認してみる。

検索対象のデータ

以下のようなJSONデータを100万件挿入する。

{ "key00":"val00" ,"key01":"val01" ,"key02":"val02" ,"key03":"val03" ,"key04":"val04" ,"key05":"val05" ,"key06":"val06" ,"key07":"val07" ,"key08":"val08" ,"key09":"val09" ,"key10":"val10" ,"key11":"val11" ,"key12":"val12" ,"key13":"val13" ,"key14":"val14" ,"key15":"val15" ,"key16":"val16" ,"key17":"val17" ,"key18":"val18" ,"key19":"val19" ,"key20":"val20" ,"key21":"val21" ,"key22":"val22" ,"key23":"val23" ,"key24":"val24" ,"key25":"val25" ,"key26":"val26" ,"key27":"val27" ,"key28":"val28" ,"key29":"val29" ,"key30":"val30" ,"key31":"val31" ,"key32":"val32" ,"key33":"val33" ,"key34":"val34" ,"key35":"val35" ,"key36":"val36" ,"key37":"val07" ,"key38":"val38" ,"key39":"val39" ,"key40":"val40" ,"key41":"val41" ,"key42":"val42" ,"key43":"val43" ,"key44":"val44" ,"key45":"val45" ,"key46":"val46" ,"key47":"val47" ,"key48":"val48" ,"key49":"val49" }

長くてわかりづらいけど、1つの階層に50個のキー/値のペアを持つJSON文書である。

検索パターン

これに対して、以下のようなクエリをそれぞれ5回ずつ発行して、そのacural timeの平均をとる。

EXPLAIN ANALYZE SELECT data #> '{key00}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key10}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key20}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key30}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key49}' FROM test;

測定結果

jsonb-items.png

同一階層で50個程度のキー/値のペアがある場合、どのキーを指定しても、性能への影響はなさそうだ。

配列要素番号の指定

配列内に多数の要素があった場合、指定した配列番号によって性能に影響があるのか確認してみる。

検索対象のデータ

以下のようなJSONデータを100万件挿入する。

{ "key":[ "val00", "val01", "val02", "val03", "val04", "val05", "val06", "val07", "val08", "val09", "val10", "val11", "val12", "val13", "val14", "val15", "val16", "val17", "val18", "val19", "val20", "val21", "val22", "val23", "val24", "val25", "val26", "val27", "val28", "val29", "val30", "val31", "val32", "val33", "val34", "val35", "val36", "val37", "val38", "val39", "val40", "val41", "val42", "val43", "val44", "val35", "val46", "val47", "val48", "val49" ]}

長くてわかりづらいけど、1つの配列内に50個の要素を持つJSON文書である。

検索パターン

これに対して、以下のようなクエリをそれぞれ5回ずつ発行して、そのacural timeの平均をとる。

EXPLAIN ANALYZE SELECT data #> '{key, 0}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key,09}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key,19}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key,29}' FROM test;
EXPLAIN ANALYZE SELECT data #> '{key,49}' FROM test;

測定結果

jsonb-array.png

50個程度の配列であれば、どの要素番号を指定しても、性能への影響はなさそうだ。

まとめ

  • JSONBの検索に影響を一番与えそうなのは階層の深さ。
  • キー/値の組の数や、配列要素数は、それほど影響はなさげ。

おわりに

この測定が役に立つのかどうかは知らないが、自分の想定どおりだったので安心して眠れる。

5
5
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
5
5