こちらの記事の続きです。
「on AWS」固有の内容ではないのですが、(2) でロードしたサンプルデータを使い、別のクエリを流して HeatWave の効果を確認してみます。
2023/7/29 時点の情報です。
今後、結果などが変わる可能性があります。
試すクエリの内容
サンプルデータには複数のテーブルがありますが、そのうち
part
partsupp
の 2 つを使った集計クエリ(GROUP BY
してSUM()
)、および
supplier
を加えた 3 つのテーブルの単純な結合、の 2 種類のクエリを試します。
(1) の記事で作成した環境(HeatWave 付きの最小構成)で実行しています。
テーブル定義
mysql> DESC part;
+---------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| P_PARTKEY | int | NO | PRI | NULL | |
| P_NAME | varchar(55) | NO | | NULL | |
| P_MFGR | char(25) | NO | | NULL | |
| P_BRAND | char(10) | NO | | NULL | |
| P_TYPE | varchar(25) | NO | | NULL | |
| P_SIZE | int | NO | | NULL | |
| P_CONTAINER | char(10) | NO | | NULL | |
| P_RETAILPRICE | decimal(15,2) | NO | | NULL | |
| P_COMMENT | varchar(23) | NO | | NULL | |
+---------------+---------------+------+-----+---------+-------+
9 rows in set (0.00 sec)
mysql> DESC partsupp;
+---------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| PS_PARTKEY | int | NO | PRI | NULL | |
| PS_SUPPKEY | int | NO | PRI | NULL | |
| PS_AVAILQTY | int | NO | | NULL | |
| PS_SUPPLYCOST | decimal(15,2) | NO | | NULL | |
| PS_COMMENT | varchar(199) | NO | | NULL | |
+---------------+---------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
mysql> DESC supplier;
+-------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| S_SUPPKEY | int | NO | PRI | NULL | |
| S_NAME | char(25) | NO | | NULL | |
| S_ADDRESS | varchar(40) | NO | | NULL | |
| S_NATIONKEY | int | NO | | NULL | |
| S_PHONE | char(15) | NO | | NULL | |
| S_ACCTBAL | decimal(15,2) | NO | | NULL | |
| S_COMMENT | varchar(101) | NO | | NULL | |
+-------------+---------------+------+-----+---------+-------+
7 rows in set (0.01 sec)
テスト 1 : 2 テーブル結合でGROUP BY
を使った値の集計(SUM()
)
HeatWave
mysql> SELECT P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100;
+-----------+----------------------------------------------+------------+
| P_PARTKEY | P_NAME | total_cost |
+-----------+----------------------------------------------+------------+
| 13421 | burlywood green brown hot goldenrod | 3919.30 |
| 94418 | azure magenta steel beige deep | 3913.01 |
| 110193 | dark royal purple burnished saddle | 3888.39 |
| 180043 | dim cyan salmon firebrick seashell | 3881.98 |
| 39320 | azure almond drab rose aquamarine | 3881.55 |
(中略)
| 101055 | purple papaya chartreuse turquoise midnight | 3686.83 |
| 96608 | lime indian metallic coral honeydew | 3686.54 |
| 68787 | cornsilk ghost peru medium grey | 3685.65 |
| 24421 | blanched linen slate cornflower magenta | 3685.32 |
+-----------+----------------------------------------------+------------+
100 rows in set (0.53 sec)
mysql> EXPLAIN FORMAT=TREE SELECT P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: total_cost DESC, limit input to 100 row(s) per chunk (cost=442e+6..442e+6 rows=0)
-> Group aggregate: sum(partsupp.PS_SUPPLYCOST) (cost=442e+6..442e+6 rows=0)
-> Inner hash join (part.P_PARTKEY = partsupp.PS_PARTKEY) (cost=314e+6..314e+6 rows=0)
-> Table scan on part in secondary engine RAPID (cost=67.2e+6..67.2e+6 rows=0)
-> Hash
-> Table scan on partsupp in secondary engine RAPID (cost=0..0 rows=0)
1 row in set, 1 warning (0.03 sec)
セカンダリエンジン(HeatWave)で実行されていますね。
ここではGROUP BY P_PARTKEY, P_NAME
としていますが、このケースではP_PARTKEY
に対するP_NAME
の値が一意なのは自明なので、P_NAME
は省略可能です。
MySQL DB
ヒント句を使って HeatWave を無効にして実行してみます。
mysql> SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100;
+-----------+----------------------------------------------+------------+
| P_PARTKEY | P_NAME | total_cost |
+-----------+----------------------------------------------+------------+
| 13421 | burlywood green brown hot goldenrod | 3919.30 |
| 94418 | azure magenta steel beige deep | 3913.01 |
| 110193 | dark royal purple burnished saddle | 3888.39 |
| 180043 | dim cyan salmon firebrick seashell | 3881.98 |
| 39320 | azure almond drab rose aquamarine | 3881.55 |
(中略)
| 101055 | purple papaya chartreuse turquoise midnight | 3686.83 |
| 96608 | lime indian metallic coral honeydew | 3686.54 |
| 68787 | cornsilk ghost peru medium grey | 3685.65 |
| 24421 | blanched linen slate cornflower magenta | 3685.32 |
+-----------+----------------------------------------------+------------+
100 rows in set (3.70 sec)
mysql> EXPLAIN FORMAT=TREE SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY= PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Limit: 100 row(s)
-> Sort: total_cost DESC, limit input to 100 row(s) per chunk
-> Table scan on <temporary>
-> Aggregate using temporary table
-> Nested loop inner join (cost=149044 rows=790264)
-> Table scan on part (cost=20315 rows=198100)
-> Index lookup on partsupp using PRIMARY (PS_PARTKEY=part.P_PARTKEY) (cost=0.251 rows=3.99)
1 row in set (0.00 sec)
HeatWave の結果と比べると 7 倍近く時間が掛かっています。
テスト 2 : 3 テーブルを単純に結合
集計クエリではないケースでも高速化するか試してみます。
HeatWave
mysql> SELECT P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100;
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| P_PARTKEY | P_NAME | PS_SUPPLYCOST | S_SUPPKEY | S_NAME | S_ACCTBAL |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| 26924 | gainsboro white tomato lavender rose | 1000.00 | 1929 | Supplier#000001929 | 7580.71 |
| 31554 | drab almond rosy azure blanched | 1000.00 | 1555 | Supplier#000001555 | 5266.89 |
| 99907 | wheat pink slate orchid beige | 1000.00 | 7435 | Supplier#000007435 | 4711.16 |
| 185850 | sienna cyan light snow hot | 1000.00 | 5851 | Supplier#000005851 | 3179.49 |
| 142176 | gainsboro navy forest olive tan | 1000.00 | 7205 | Supplier#000007205 | 295.77 |
(中略)
| 155363 | rosy forest magenta wheat drab | 999.88 | 7879 | Supplier#000007879 | 1046.93 |
| 48609 | orange chiffon burnished drab slate | 999.88 | 8610 | Supplier#000008610 | 307.37 |
| 24175 | steel brown papaya almond firebrick | 999.88 | 6678 | Supplier#000006678 | 9.41 |
| 176526 | ivory red peru smoke chiffon | 999.88 | 9044 | Supplier#000009044 | -208.23 |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
100 rows in set (0.15 sec)
mysql> EXPLAIN FORMAT=TREE SELECT P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND
PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: partsupp.PS_SUPPLYCOST DESC, supplier.S_ACCTBAL DESC, limit input to 100 row(s) per chunk (cost=608e+6..608e+6 rows=0)
-> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY) (cost=608e+6..608e+6 rows=0)
-> Table scan on supplier in secondary engine RAPID (cost=1.92e+6..1.92e+6 rows=0)
-> Hash
-> Inner hash join (part.P_PARTKEY = partsupp.PS_PARTKEY) (cost=343e+6..343e+6 rows=0)
-> Table scan on part in secondary engine RAPID (cost=67.2e+6..67.2e+6 rows=0)
-> Hash
-> Table scan on partsupp in secondary engine RAPID (cost=0..0 rows=0)
1 row in set, 1 warning (0.03 sec)
集計クエリではありませんが、セカンダリエンジン(HeatWave)で実行されていますね。
MySQL DB
先ほどと同様、ヒント句を使って HeatWave を無効にして実行してみます。
mysql> SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100;
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| P_PARTKEY | P_NAME | PS_SUPPLYCOST | S_SUPPKEY | S_NAME | S_ACCTBAL |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| 26924 | gainsboro white tomato lavender rose | 1000.00 | 1929 | Supplier#000001929 | 7580.71 |
| 31554 | drab almond rosy azure blanched | 1000.00 | 1555 | Supplier#000001555 | 5266.89 |
| 99907 | wheat pink slate orchid beige | 1000.00 | 7435 | Supplier#000007435 | 4711.16 |
| 185850 | sienna cyan light snow hot | 1000.00 | 5851 | Supplier#000005851 | 3179.49 |
| 142176 | gainsboro navy forest olive tan | 1000.00 | 7205 | Supplier#000007205 | 295.77 |
(中略)
| 155363 | rosy forest magenta wheat drab | 999.88 | 7879 | Supplier#000007879 | 1046.93 |
| 48609 | orange chiffon burnished drab slate | 999.88 | 8610 | Supplier#000008610 | 307.37 |
| 24175 | steel brown papaya almond firebrick | 999.88 | 6678 | Supplier#000006678 | 9.41 |
| 176526 | ivory red peru smoke chiffon | 999.88 | 9044 | Supplier#000009044 | -208.23 |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
100 rows in set (2.01 sec)
mysql> EXPLAIN FORMAT=TREE SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Limit: 100 row(s)
-> Sort: partsupp.PS_SUPPLYCOST DESC, supplier.S_ACCTBAL DESC, limit input to 100 row(s) per chunk
-> Stream results (cost=425780 rows=790738)
-> Nested loop inner join (cost=425780 rows=790738)
-> Nested loop inner join (cost=149021 rows=790738)
-> Table scan on part (cost=20295 rows=197901)
-> Index lookup on partsupp using PRIMARY (PS_PARTKEY=part.P_PARTKEY) (cost=0.251 rows=4)
-> Single-row index lookup on supplier using PRIMARY (S_SUPPKEY=partsupp.PS_SUPPKEY) (cost=0.25 rows=1)
1 row in set (0.00 sec)
HeatWave の結果と比べると 13 倍以上時間が掛かっています。
まとめ
- テーブル結合を伴うクエリも HeatWave で高速化しうる
- 集計を伴わない単純な結合クエリも HeatWave で高速化しうる
その他
HeavWave では、MySQL DB で実行可能な全てのクエリが実行できるわけではありません。
公式リファレンスマニュアルのこちらのページに HeatWave の制限がまとめられています。
- 2.16 Limitations(MySQL HeatWave User Guide)
今後、OCI 上の HeatWave を使ってこのあたりの制限を意識しながらクエリを流し、
- 実行可否
- 実行結果の差異
- 性能面の効果
などを確認していく予定です。
すでに試した範囲では、sql_mode
が空、またはデフォルトの指定からONLY_FULL_GROUP_BY
だけを外した状態で、ONLY_FULL_GROUP_BY
に反するクエリが HeatWave で実行可能なことを確認しています。