More than 3 years have passed since last update.

postgresの実行計画を実際に読み解いてみた

PostgreSQL

Posted at 2022-01-27

前提

basketball_teamテーブル、friendsテーブルが存在する。

select * from basketball_team;
 member_id |  position  | profile_id
-----------+------------+------------
         1 | center     |          1
         2 | foward     |          2
         3 | guard      |          3
         4 | center     |          4
         5 | foward     |       1000
(5 rows)

select * from friends;
 profile_id |   color    |      name       | age
------------+------------+-----------------+-----
          1 | yellow     | yuto            |  24
          2 | red        | tosa            |  41
          6 | red        | herry           |  23
          5 | black      |                 |  20
          4 | black      | deji            |  20
          3 | yellow     | ilji            |  24
(6 rows)

select * from customer;
 profile_id |  first_name  |  last_name   |   age
------------+--------------+--------------+----------
          1 | 田中1        | 太郎1        |        1
          2 | 田中2        | 太郎2        |        2
          3 | 田中3        | 太郎3        |        3
          4 | 田中4        | 太郎4        |        4
          5 | 田中5        | 太郎5        |        5
          6 | 田中6        | 太郎6        |        6
          7 | 田中7        | 太郎7        |        7
          8 | 田中8        | 太郎8        |        8
・
・
・
   99999999 | 田中99999999 | 太郎9999999  |  99999999
  10000000  | 田中10000000 | 太郎10000000 |  10000000

概要

cost(a..b): aは最初の行を返すコスト。bは最後の行を返すコスト
width: 取得される行の平均サイズ
actual time：処理時間
rows：実行した結果、戻ってきた行数
loops：ステップの実行回数
Planning time：解析されたクエリから実行計画を生成し、最適化するまでの時間
Execution time：実行時間
Buffers: shared hit キャッシュから取得データの量→時間が早い
Buffers: shared read ディスクから取得データの量→時間がかかる

実行される順番

ツリーの深いものから順番に実行
同じ深さなら上にあるものから実行

内部結合の実行計画

explain (BUFFERS,ANALYZE) 
select * 
from basketball_team
inner join friends
on basketball_team.profile_id=friends.profile_id
where friends.age > 30;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
# ④
 Hash Join  (cost=19.34..75.85 rows=954 width=168) (actual time=0.061..0.100 rows=1 loops=1)
   # JOIN句のONの部分
   Hash Cond: (basketball_team.profile_id = friends.profile_id)
   Buffers: shared hit=2
    # ① 
    # basketball_teamテーブルフルスキャン
   ->  Seq Scan on basketball_team  (cost=0.00..20.20 rows=1020 width=52) (actual time=0.010..0.042 rows=5 loops=1)
        # バッファキャッシュからデータを読み取る
         Buffers: shared hit=1
    # ③
   ->  Hash  (cost=17.00..17.00 rows=187 width=116) (actual time=0.039..0.041 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         # バッファキャッシュからデータを読み取る
         Buffers: shared hit=1
        #  ②
        #  一番最初にfromでのfriendsのテーブルフルスキャン実行(sequence scan)
        # cost=初期処理の推定コスト。 出力段階が開始できるようになる前に消費される時間
        # rows=この計画ノードが出力する行の推定数。
        # width=この計画ノードが出力する行の（バイト単位での）推定平均幅。
         ->  Seq Scan on friends  (cost=0.00..17.00 rows=187 width=116) (actual time=0.034..0.036 rows=1 loops=1)
            # WHEREによる条件での絞り
               Filter: (age > 30)
            #    フィルタによって5行が削除
               Rows Removed by Filter: 5
            #   バッファキャッシュからデータを読み取る
               Buffers: shared hit=1
# 実行計画時間
 Planning Time: 0.131 ms
# 実行時間
 Execution Time: 0.137 ms
(14 rows)


select * from basketball_team
inner join friends
on basketball_team.profile_id=friends.profile_id
where friends.age > 30;
 member_id |  position  | profile_id | profile_id |   color    |      name       | age
-----------+------------+------------+------------+------------+-----------------+-----
         2 | foward     |          2 |          2 | red        | tosa            |  41

Hash Join（ハッシュ結合）

内側テーブルの結合キーでハッシュを作成し、ハッシュと外側テーブルの行を突き合わせて結合する。ハッシュはメモリーに作成するため、一度作成すれば、高速に結合可能。

小さなテーブルと大きなテーブルを結合する時によく使われる

index scanの実行計画

explain (BUFFERS,ANALYZE) select * from customer where first_name = '田中1000001';
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
# customer_first_name_idxというIndexを利用してcustomerテーブルの検索を行う
 Index Scan using customer_first_name_idx on customer  (cost=0.56..8.58 rows=1 width=38) (actual time=0.063..0.066 rows=1 loops=1)
    # インデックスの条件は以下の通り
   Index Cond: (first_name = '田中1000001'::bpchar)
    # バッファキャッシュからデータを読み取る
   Buffers: shared hit=5
# 実行計画時間
 Planning Time: 0.172 ms
#  実行時間
 Execution Time: 0.093 ms
(5 rows)

まとめ

実行計画を正確に読み解くことはチューニングをする中で一番重要と言っても過言ではないため、しっかりどこがボトルネックとなっているのか調べ、適切なチューニングを行うようにしよう

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up