2
1

More than 1 year has passed since last update.

東京リージョンにやってきた MySQL HeatWave on AWS を試す (4) サンプルデータを使って他のクエリを流してみる編

Last updated at Posted at 2023-07-28

こちらの記事の続きです。

「on AWS」固有の内容ではないのですが、(2) でロードしたサンプルデータを使い、別のクエリを流して HeatWave の効果を確認してみます。

2023/7/29 時点の情報です。
今後、結果などが変わる可能性があります。

試すクエリの内容

サンプルデータには複数のテーブルがありますが、そのうち

  • part
  • partsupp

の 2 つを使った集計クエリ(GROUP BYしてSUM())、および

  • supplier

を加えた 3 つのテーブルの単純な結合、の 2 種類のクエリを試します。

(1) の記事で作成した環境(HeatWave 付きの最小構成)で実行しています。

テーブル定義

mysql> DESC part;
+---------------+---------------+------+-----+---------+-------+
| Field         | Type          | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| P_PARTKEY     | int           | NO   | PRI | NULL    |       |
| P_NAME        | varchar(55)   | NO   |     | NULL    |       |
| P_MFGR        | char(25)      | NO   |     | NULL    |       |
| P_BRAND       | char(10)      | NO   |     | NULL    |       |
| P_TYPE        | varchar(25)   | NO   |     | NULL    |       |
| P_SIZE        | int           | NO   |     | NULL    |       |
| P_CONTAINER   | char(10)      | NO   |     | NULL    |       |
| P_RETAILPRICE | decimal(15,2) | NO   |     | NULL    |       |
| P_COMMENT     | varchar(23)   | NO   |     | NULL    |       |
+---------------+---------------+------+-----+---------+-------+
9 rows in set (0.00 sec)

mysql> DESC partsupp;
+---------------+---------------+------+-----+---------+-------+
| Field         | Type          | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| PS_PARTKEY    | int           | NO   | PRI | NULL    |       |
| PS_SUPPKEY    | int           | NO   | PRI | NULL    |       |
| PS_AVAILQTY   | int           | NO   |     | NULL    |       |
| PS_SUPPLYCOST | decimal(15,2) | NO   |     | NULL    |       |
| PS_COMMENT    | varchar(199)  | NO   |     | NULL    |       |
+---------------+---------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

mysql> DESC supplier;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| S_SUPPKEY   | int           | NO   | PRI | NULL    |       |
| S_NAME      | char(25)      | NO   |     | NULL    |       |
| S_ADDRESS   | varchar(40)   | NO   |     | NULL    |       |
| S_NATIONKEY | int           | NO   |     | NULL    |       |
| S_PHONE     | char(15)      | NO   |     | NULL    |       |
| S_ACCTBAL   | decimal(15,2) | NO   |     | NULL    |       |
| S_COMMENT   | varchar(101)  | NO   |     | NULL    |       |
+-------------+---------------+------+-----+---------+-------+
7 rows in set (0.01 sec)

テスト 1 : 2 テーブル結合でGROUP BYを使った値の集計(SUM()

HeatWave

mysql> SELECT P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100;
+-----------+----------------------------------------------+------------+
| P_PARTKEY | P_NAME                                       | total_cost |
+-----------+----------------------------------------------+------------+
|     13421 | burlywood green brown hot goldenrod          |    3919.30 |
|     94418 | azure magenta steel beige deep               |    3913.01 |
|    110193 | dark royal purple burnished saddle           |    3888.39 |
|    180043 | dim cyan salmon firebrick seashell           |    3881.98 |
|     39320 | azure almond drab rose aquamarine            |    3881.55 |
(中略)
|    101055 | purple papaya chartreuse turquoise midnight  |    3686.83 |
|     96608 | lime indian metallic coral honeydew          |    3686.54 |
|     68787 | cornsilk ghost peru medium grey              |    3685.65 |
|     24421 | blanched linen slate cornflower magenta      |    3685.32 |
+-----------+----------------------------------------------+------------+
100 rows in set (0.53 sec)

mysql> EXPLAIN FORMAT=TREE SELECT P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: total_cost DESC, limit input to 100 row(s) per chunk  (cost=442e+6..442e+6 rows=0)
    -> Group aggregate: sum(partsupp.PS_SUPPLYCOST)  (cost=442e+6..442e+6 rows=0)
        -> Inner hash join (part.P_PARTKEY = partsupp.PS_PARTKEY)  (cost=314e+6..314e+6 rows=0)
            -> Table scan on part in secondary engine RAPID  (cost=67.2e+6..67.2e+6 rows=0)
            -> Hash
                -> Table scan on partsupp in secondary engine RAPID  (cost=0..0 rows=0)

1 row in set, 1 warning (0.03 sec)

セカンダリエンジン(HeatWave)で実行されていますね。

ここではGROUP BY P_PARTKEY, P_NAMEとしていますが、このケースではP_PARTKEYに対するP_NAMEの値が一意なのは自明なので、P_NAMEは省略可能です。

MySQL DB

ヒント句を使って HeatWave を無効にして実行してみます。

mysql> SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY = PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100;
+-----------+----------------------------------------------+------------+
| P_PARTKEY | P_NAME                                       | total_cost |
+-----------+----------------------------------------------+------------+
|     13421 | burlywood green brown hot goldenrod          |    3919.30 |
|     94418 | azure magenta steel beige deep               |    3913.01 |
|    110193 | dark royal purple burnished saddle           |    3888.39 |
|    180043 | dim cyan salmon firebrick seashell           |    3881.98 |
|     39320 | azure almond drab rose aquamarine            |    3881.55 |
(中略)
|    101055 | purple papaya chartreuse turquoise midnight  |    3686.83 |
|     96608 | lime indian metallic coral honeydew          |    3686.54 |
|     68787 | cornsilk ghost peru medium grey              |    3685.65 |
|     24421 | blanched linen slate cornflower magenta      |    3685.32 |
+-----------+----------------------------------------------+------------+
100 rows in set (3.70 sec)

mysql> EXPLAIN FORMAT=TREE SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, SUM(PS_SUPPLYCOST) total_cost FROM part, partsupp WHERE P_PARTKEY= PS_PARTKEY GROUP BY P_PARTKEY, P_NAME ORDER BY total_cost DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Limit: 100 row(s)
    -> Sort: total_cost DESC, limit input to 100 row(s) per chunk
        -> Table scan on <temporary>
            -> Aggregate using temporary table
                -> Nested loop inner join  (cost=149044 rows=790264)
                    -> Table scan on part  (cost=20315 rows=198100)
                    -> Index lookup on partsupp using PRIMARY (PS_PARTKEY=part.P_PARTKEY)  (cost=0.251 rows=3.99)

1 row in set (0.00 sec)

HeatWave の結果と比べると 7 倍近く時間が掛かっています。

テスト 2 : 3 テーブルを単純に結合

集計クエリではないケースでも高速化するか試してみます。

HeatWave

mysql> SELECT P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100;
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| P_PARTKEY | P_NAME                                     | PS_SUPPLYCOST | S_SUPPKEY | S_NAME             | S_ACCTBAL |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
|     26924 | gainsboro white tomato lavender rose       |       1000.00 |      1929 | Supplier#000001929 |   7580.71 |
|     31554 | drab almond rosy azure blanched            |       1000.00 |      1555 | Supplier#000001555 |   5266.89 |
|     99907 | wheat pink slate orchid beige              |       1000.00 |      7435 | Supplier#000007435 |   4711.16 |
|    185850 | sienna cyan light snow hot                 |       1000.00 |      5851 | Supplier#000005851 |   3179.49 |
|    142176 | gainsboro navy forest olive tan            |       1000.00 |      7205 | Supplier#000007205 |    295.77 |
(中略)
|    155363 | rosy forest magenta wheat drab             |        999.88 |      7879 | Supplier#000007879 |   1046.93 |
|     48609 | orange chiffon burnished drab slate        |        999.88 |      8610 | Supplier#000008610 |    307.37 |
|     24175 | steel brown papaya almond firebrick        |        999.88 |      6678 | Supplier#000006678 |      9.41 |
|    176526 | ivory red peru smoke chiffon               |        999.88 |      9044 | Supplier#000009044 |   -208.23 |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
100 rows in set (0.15 sec)

mysql> EXPLAIN FORMAT=TREE SELECT P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND
PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: partsupp.PS_SUPPLYCOST DESC, supplier.S_ACCTBAL DESC, limit input to 100 row(s) per chunk  (cost=608e+6..608e+6 rows=0)
    -> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY)  (cost=608e+6..608e+6 rows=0)
        -> Table scan on supplier in secondary engine RAPID  (cost=1.92e+6..1.92e+6 rows=0)
        -> Hash
            -> Inner hash join (part.P_PARTKEY = partsupp.PS_PARTKEY)  (cost=343e+6..343e+6 rows=0)
                -> Table scan on part in secondary engine RAPID  (cost=67.2e+6..67.2e+6 rows=0)
                -> Hash
                    -> Table scan on partsupp in secondary engine RAPID  (cost=0..0 rows=0)

1 row in set, 1 warning (0.03 sec)

集計クエリではありませんが、セカンダリエンジン(HeatWave)で実行されていますね。

MySQL DB

先ほどと同様、ヒント句を使って HeatWave を無効にして実行してみます。

mysql> SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100;
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
| P_PARTKEY | P_NAME                                     | PS_SUPPLYCOST | S_SUPPKEY | S_NAME             | S_ACCTBAL |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
|     26924 | gainsboro white tomato lavender rose       |       1000.00 |      1929 | Supplier#000001929 |   7580.71 |
|     31554 | drab almond rosy azure blanched            |       1000.00 |      1555 | Supplier#000001555 |   5266.89 |
|     99907 | wheat pink slate orchid beige              |       1000.00 |      7435 | Supplier#000007435 |   4711.16 |
|    185850 | sienna cyan light snow hot                 |       1000.00 |      5851 | Supplier#000005851 |   3179.49 |
|    142176 | gainsboro navy forest olive tan            |       1000.00 |      7205 | Supplier#000007205 |    295.77 |
(中略)
|    155363 | rosy forest magenta wheat drab             |        999.88 |      7879 | Supplier#000007879 |   1046.93 |
|     48609 | orange chiffon burnished drab slate        |        999.88 |      8610 | Supplier#000008610 |    307.37 |
|     24175 | steel brown papaya almond firebrick        |        999.88 |      6678 | Supplier#000006678 |      9.41 |
|    176526 | ivory red peru smoke chiffon               |        999.88 |      9044 | Supplier#000009044 |   -208.23 |
+-----------+--------------------------------------------+---------------+-----------+--------------------+-----------+
100 rows in set (2.01 sec)

mysql> EXPLAIN FORMAT=TREE SELECT /*+ SET_VAR(use_secondary_engine=OFF) */ P_PARTKEY, P_NAME, PS_SUPPLYCOST, S_SUPPKEY, S_NAME, S_ACCTBAL FROM part, partsupp, supplier WHERE P_PARTKEY = PS_PARTKEY AND PS_SUPPKEY = S_SUPPKEY ORDER BY PS_SUPPLYCOST DESC, S_ACCTBAL DESC LIMIT 100\G
*************************** 1. row ***************************
EXPLAIN: -> Limit: 100 row(s)
    -> Sort: partsupp.PS_SUPPLYCOST DESC, supplier.S_ACCTBAL DESC, limit input to 100 row(s) per chunk
        -> Stream results  (cost=425780 rows=790738)
            -> Nested loop inner join  (cost=425780 rows=790738)
                -> Nested loop inner join  (cost=149021 rows=790738)
                    -> Table scan on part  (cost=20295 rows=197901)
                    -> Index lookup on partsupp using PRIMARY (PS_PARTKEY=part.P_PARTKEY)  (cost=0.251 rows=4)
                -> Single-row index lookup on supplier using PRIMARY (S_SUPPKEY=partsupp.PS_SUPPKEY)  (cost=0.25 rows=1)

1 row in set (0.00 sec)

HeatWave の結果と比べると 13 倍以上時間が掛かっています。

まとめ

  • テーブル結合を伴うクエリも HeatWave で高速化しうる
  • 集計を伴わない単純な結合クエリも HeatWave で高速化しうる

その他

HeavWave では、MySQL DB で実行可能な全てのクエリが実行できるわけではありません。

公式リファレンスマニュアルのこちらのページに HeatWave の制限がまとめられています。

今後、OCI 上の HeatWave を使ってこのあたりの制限を意識しながらクエリを流し、

  • 実行可否
  • 実行結果の差異
  • 性能面の効果

などを確認していく予定です。

すでに試した範囲では、sql_modeが空、またはデフォルトの指定からONLY_FULL_GROUP_BYだけを外した状態で、ONLY_FULL_GROUP_BYに反するクエリが HeatWave で実行可能なことを確認しています。

2
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
1