はじめに
Quick Start Guide for TiDB HTAPのマニュアルを参考にHTAPの動作確認を下記環境にて行いました。
[root@tisim ~]# tiup cluster display demo-cluster
tiup is checking updates for component cluster ...timeout(2s)!
Starting component `cluster`: /root/.tiup/components/cluster/v1.12.1/tiup-cluster display demo-cluster
Cluster type: tidb
Cluster name: demo-cluster
Cluster version: v7.0.0
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.3.171:2379/dashboard
Grafana URL: http://192.168.3.171:3000
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.3.171:3000 grafana 192.168.3.171 3000 linux/x86_64 Up - /tidb-deploy/grafana-3000
192.168.3.171:2379 pd 192.168.3.171 2379/2380 linux/x86_64 Up|L|UI /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.3.171:9090 prometheus 192.168.3.171 9090/12020 linux/x86_64 Up /tidb-data/prometheus-9090 /tidb-deploy/prometheus-9090
192.168.3.171:4000 tidb 192.168.3.171 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
192.168.3.171:9000 tiflash 192.168.3.171 9000/8123/3930/20170/20292/8234 linux/x86_64 Up /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000
192.168.3.171:20160 tikv 192.168.3.171 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
192.168.3.171:20161 tikv 192.168.3.171 20161/20181 linux/x86_64 Up /tidb-data/tikv-20161 /tidb-deploy/tikv-20161
192.168.3.171:20162 tikv 192.168.3.171 20162/20182 linux/x86_64 Up /tidb-data/tikv-20162 /tidb-deploy/tikv-20162
192.168.3.171:20163 tikv 192.168.3.171 20163/20183 linux/x86_64 Up /tidb-data/tikv-20163 /tidb-deploy/tikv-20163
Total nodes: 9
テストデータ作成
テストデータ生成ツールをインストールします。
[root@tisim ~]# tiup install bench
download https://tiup-mirrors.pingcap.com/bench-v1.12.0-linux-amd64.tar.gz 33.39 MiB / 33.39 MiB 100.00% 11.43 MiB/s
テストデータを生成します。
[root@tisim ~]# tiup bench tpch --sf=1 prepare
tiup is checking updates for component bench ...
Starting component `bench`: /root/.tiup/components/bench/v1.12.0/tiup-bench tpch --sf=1 prepare
creating nation
creating region
creating part
creating supplier
creating partsupp
creating customer
creating orders
creating lineitem
generating nation table
generate nation table done
generating region table
generate region table done
generating customers table
generate customers table done
generating suppliers table
generate suppliers table done
generating part/partsupplier tables
generate part/partsupplier tables done
generating orders/lineitem tables
generate orders/lineitem tables done
Finished
上記テーブルはtestデータベースに作成されるようです。作成されたテーブルとデータ量を下記に示します。
mysql> SELECT
-> CONCAT(table_schema,'.',table_name) AS 'Table Name',
-> table_rows AS 'Number of Rows',
-> FORMAT_BYTES(data_length) AS 'Data Size',
-> FORMAT_BYTES(index_length) AS 'Index Size',
-> FORMAT_BYTES(data_length+index_length) AS'Total'
-> FROM
-> information_schema.TABLES
-> WHERE
-> table_schema='test';
+---------------+----------------+------------+------------+------------+
| Table Name | Number of Rows | Data Size | Index Size | Total |
+---------------+----------------+------------+------------+------------+
| test.nation | 25 | 2.44 KiB | 0 bytes | 2.44 KiB |
| test.region | 5 | 416 bytes | 0 bytes | 416 bytes |
| test.part | 279168 | 33.40 MiB | 0 bytes | 33.40 MiB |
| test.supplier | 10000 | 1.41 MiB | 0 bytes | 1.41 MiB |
| test.partsupp | 1050880 | 156.88 MiB | 16.04 MiB | 172.92 MiB |
| test.customer | 150000 | 24.06 MiB | 0 bytes | 24.06 MiB |
| test.orders | 1500000 | 166.17 MiB | 0 bytes | 166.17 MiB |
| test.lineitem | 6183487 | 788.89 MiB | 94.35 MiB | 883.25 MiB |
+---------------+----------------+------------+------------+------------+
8 rows in set (0.01 sec)
下記のSELECT文を実行します。まだ、TiFlashにレプリケーションしていないため、行ベースのストレージエンジンが使用されます。
mysql> USE test
Database changed
mysql> SELECT
-> l_orderkey,
-> SUM(
-> l_extendedprice * (1 - l_discount)
-> ) AS revenue,
-> o_orderdate,
-> o_shippriority
-> FROM
-> customer,
-> orders,
-> lineitem
-> WHERE
-> c_mktsegment = 'BUILDING'
-> AND c_custkey = o_custkey
-> AND l_orderkey = o_orderkey
-> AND o_orderdate < DATE '1996-01-01'
-> AND l_shipdate > DATE '1996-02-01'
-> GROUP BY
-> l_orderkey,
-> o_orderdate,
-> o_shippriority
-> ORDER BY
-> revenue DESC,
-> o_orderdate
-> limit 10;
+------------+-------------+-------------+----------------+
| l_orderkey | revenue | o_orderdate | o_shippriority |
+------------+-------------+-------------+----------------+
| 5828933 | 386117.1688 | 1995-12-03 | 0 |
| 1648647 | 373096.8642 | 1995-12-06 | 0 |
| 1364641 | 352640.6056 | 1995-12-19 | 0 |
| 3949606 | 347750.4435 | 1995-12-23 | 0 |
| 4792161 | 347715.0509 | 1995-12-30 | 0 |
| 4340739 | 347490.5251 | 1995-12-06 | 0 |
| 1609574 | 342497.8886 | 1995-12-31 | 0 |
| 3076934 | 338202.3259 | 1995-12-24 | 0 |
| 3232933 | 337349.2536 | 1995-12-26 | 0 |
| 2345058 | 335142.6104 | 1995-12-31 | 0 |
+------------+-------------+-------------+----------------+
10 rows in set (1.95 sec)
実行計画も確認します。
mysql> EXPLAIN
-> SELECT
-> l_orderkey,
-> SUM(
-> l_extendedprice * (1 - l_discount)
-> ) AS revenue,
-> o_orderdate,
-> o_shippriority
-> FROM
-> customer,
-> orders,
-> lineitem
-> WHERE
-> c_mktsegment = 'BUILDING'
-> AND c_custkey = o_custkey
-> AND l_orderkey = o_orderkey
-> AND o_orderdate < DATE '1996-01-01'
-> AND l_shipdate > DATE '1996-02-01'
-> GROUP BY
-> l_orderkey,
-> o_orderdate,
-> o_shippriority
-> ORDER BY
-> revenue DESC,
-> o_orderdate
-> limit 10;
+------------------------------------------+------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------------+------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_14 | 10.00 | root | | test.lineitem.l_orderkey, Column#34, test.orders.o_orderdate, test.orders.o_shippriority |
| └─TopN_17 | 10.00 | root | | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 |
| └─HashAgg_22 | 604803.69 | root | | group by:Column#39, Column#40, Column#41, funcs:sum(Column#35)->Column#34, funcs:firstrow(Column#36)->test.orders.o_orderdate, funcs:firstrow(Column#37)->test.orders.o_shippriority, funcs:firstrow(Column#38)->test.lineitem.l_orderkey |
| └─Projection_77 | 1848809.61 | root | | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#35, test.orders.o_orderdate, test.orders.o_shippriority, test.lineitem.l_orderkey, test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority |
| └─IndexJoin_30 | 1848809.61 | root | | inner join, inner:TableReader_26, outer key:test.orders.o_orderkey, inner key:test.lineitem.l_orderkey, equal cond:eq(test.orders.o_orderkey, test.lineitem.l_orderkey) |
| ├─HashJoin_65(Build) | 458293.65 | root | | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)] |
| │ ├─TableReader_71(Build) | 30323.15 | root | | data:Selection_70 |
| │ │ └─Selection_70 | 30323.15 | cop[tikv] | | eq(test.customer.c_mktsegment, "BUILDING") |
| │ │ └─TableFullScan_69 | 150000.00 | cop[tikv] | table:customer | keep order:false |
| │ └─TableReader_68(Probe) | 907920.67 | root | | data:Selection_67 |
| │ └─Selection_67 | 907920.67 | cop[tikv] | | lt(test.orders.o_orderdate, 1996-01-01) |
| │ └─TableFullScan_66 | 1500000.00 | cop[tikv] | table:orders | keep order:false |
| └─TableReader_26(Probe) | 186323.41 | root | | data:Selection_25 |
| └─Selection_25 | 186323.41 | cop[tikv] | | gt(test.lineitem.l_shipdate, 1996-02-01) |
| └─TableRangeScan_24 | 458293.65 | cop[tikv] | table:lineitem | range: decided by [eq(test.lineitem.l_orderkey, test.orders.o_orderkey)], keep order:false |
+------------------------------------------+------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
15 rows in set (0.00 sec)
下記のSQLを実行し、TiFlashにレプリケートします。
mysql> ALTER TABLE test.customer SET TIFLASH REPLICA 1;
Query OK, 0 rows affected (0.10 sec)
mysql> ALTER TABLE test.orders SET TIFLASH REPLICA 1;
Query OK, 0 rows affected (0.10 sec)
mysql> ALTER TABLE test.lineitem SET TIFLASH REPLICA 1;
Query OK, 0 rows affected (0.09 sec)
information_schema.tiflash_replicaを確認します。
- availableが1の場合はTiFlashが使用可能となり、0は使用不可
- progressは進捗率を表しており、1の場合はTiFlashへのレプリケーションが完了している状態
mysql> SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'customer';
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| test | customer | 106 | 1 | | 1 | 1 |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
1 row in set (0.00 sec)
mysql> SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'orders';
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| test | orders | 108 | 1 | | 1 | 1 |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
1 row in set (0.00 sec)
mysql> SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem';
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| test | lineitem | 110 | 1 | | 0 | 0.38 |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
1 row in set (0.00 sec)
先ほどのSELECT文を再実行します。(実行結果のみ記載しています)
mpp[tiflash]となっており、TiFlashが選択された事が確認できます。
これはTiDBのオプティマイザが列指向のストレージエンジンであるTiFlashを使用するほうが効率が良いと判断した事を示しています。
+----------------------------------------------------------------+------------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------------------------------------+------------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_14 | 10.00 | root | | test.lineitem.l_orderkey, Column#34, test.orders.o_orderdate, test.orders.o_shippriority |
| └─TopN_18 | 10.00 | root | | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 |
| └─TableReader_140 | 10.00 | root | | MppVersion: 1, data:ExchangeSender_139 |
| └─ExchangeSender_139 | 10.00 | mpp[tiflash] | | ExchangeType: PassThrough |
| └─TopN_138 | 10.00 | mpp[tiflash] | | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 |
| └─Projection_134 | 604803.69 | mpp[tiflash] | | Column#34, test.orders.o_orderdate, test.orders.o_shippriority, test.lineitem.l_orderkey |
| └─HashAgg_135 | 604803.69 | mpp[tiflash] | | group by:test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority, funcs:sum(Column#43)->Column#34, funcs:firstrow(test.orders.o_orderdate)->test.orders.o_orderdate, funcs:firstrow(test.orders.o_shippriority)->test.orders.o_shippriority, funcs:firstrow(test.lineitem.l_orderkey)->test.lineitem.l_orderkey, stream_count: 2 |
| └─ExchangeReceiver_137 | 604803.69 | mpp[tiflash] | | stream_count: 2 |
| └─ExchangeSender_136 | 604803.69 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.lineitem.l_orderkey, collate: binary], [name: test.orders.o_orderdate, collate: binary], [name: test.orders.o_shippriority, collate: binary], stream_count: 2 |
| └─HashAgg_132 | 604803.69 | mpp[tiflash] | | group by:Column#48, Column#49, Column#50, funcs:sum(Column#47)->Column#43 |
| └─Projection_141 | 1848809.61 | mpp[tiflash] | | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#47, test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority |
| └─HashJoin_121 | 1848809.61 | mpp[tiflash] | | inner join, equal:[eq(test.orders.o_orderkey, test.lineitem.l_orderkey)], stream_count: 2 |
| ├─ExchangeReceiver_52(Build) | 458293.65 | mpp[tiflash] | | stream_count: 2 |
| │ └─ExchangeSender_51 | 458293.65 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.orders.o_orderkey, collate: binary], stream_count: 2 |
| │ └─HashJoin_44 | 458293.65 | mpp[tiflash] | | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)] |
| │ ├─ExchangeReceiver_48(Build) | 30323.15 | mpp[tiflash] | | |
| │ │ └─ExchangeSender_47 | 30323.15 | mpp[tiflash] | | ExchangeType: Broadcast |
| │ │ └─Selection_46 | 30323.15 | mpp[tiflash] | | eq(test.customer.c_mktsegment, "BUILDING") |
| │ │ └─TableFullScan_45 | 150000.00 | mpp[tiflash] | table:customer | keep order:false |
| │ └─Selection_50(Probe) | 907920.67 | mpp[tiflash] | | lt(test.orders.o_orderdate, 1996-01-01) |
| │ └─TableFullScan_49 | 1500000.00 | mpp[tiflash] | table:orders | keep order:false |
| └─ExchangeReceiver_56(Probe) | 2439848.03 | mpp[tiflash] | | |
| └─ExchangeSender_55 | 2439848.03 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.lineitem.l_orderkey, collate: binary] |
| └─Selection_54 | 2439848.03 | mpp[tiflash] | | gt(test.lineitem.l_shipdate, 1996-02-01) |
| └─TableFullScan_53 | 6001215.00 | mpp[tiflash] | table:lineitem | keep order:false |
+----------------------------------------------------------------+------------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
25 rows in set (0.01 sec)
通常はTiDBのオプティマイザが、最適なストレージエンジンを判断しますが、SQLヒントを使用して、ストレージを明示的に選択する事も可能です。
下記ではSQLヒントを使用して、TiKVを選択します。
mysql> EXPLAIN ANALYZE SELECT /*+ read_from_storage(tikv[customer]) */ COUNT(*) FROM customer;
+----------------------------+-----------+---------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+----------------------------+-----------+---------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------+------+
| HashAgg_13 | 1.00 | 1 | root | | time:56.3ms, loops:2, RRU:80.225594, WRU:0.000000, partial_worker:{wall_time:56.270019ms, concurrency:5, task_num:1, tot_wait:280.609892ms, tot_exec:3.096μs, tot_time:280.616945ms, max:56.124661ms, p95:56.124661ms}, final_worker:{wall_time:0s, concurrency:5, task_num:1, tot_wait:280.687013ms, tot_exec:6.461μs, tot_time:280.695207ms, max:56.145762ms, p95:56.145762ms} | funcs:count(Column#10)->Column#9 | 49.2 KB | N/A |
| └─TableReader_14 | 1.00 | 1 | root | | time:56.1ms, loops:2, cop_task: {num: 1, max: 56.2ms, proc_keys: 150000, tot_proc: 54.5ms, tot_wait: 525.4μs, rpc_num: 1, rpc_time: 56.1ms, copr_cache_hit_ratio: 0.00, build_task_duration: 6.73μs, max_distsql_concurrency: 1} | data:HashAgg_6 | 309 Bytes | N/A |
| └─HashAgg_6 | 1.00 | 1 | cop[tikv] | | tikv_task:{time:54ms, loops:147}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 4050000, total_keys: 150001, get_snapshot_time: 491.6μs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 1001}}} | funcs:count(1)->Column#10 | N/A | N/A |
| └─TableFullScan_12 | 150000.00 | 150000 | cop[tikv] | table:customer | tikv_task:{time:54ms, loops:147} | keep order:false | N/A | N/A |
+----------------------------+-----------+---------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------+------+
4 rows in set (0.06 sec)
下記は、TiFlashを選択させるSQLヒントです。
mysql> EXPLAIN ANALYZE SELECT /*+ read_from_storage(tiflash[customer]) */ COUNT(*) FROM customer;
+------------------------------+-----------+---------+--------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-----------+---------+--------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+------+
| HashAgg_18 | 1.00 | 1 | root | | time:19.1ms, loops:2, RRU:0.000000, WRU:0.000000, partial_worker:{wall_time:19.062334ms, concurrency:5, task_num:1, tot_wait:94.721098ms, tot_exec:3.266μs, tot_time:94.72863ms, max:18.947535ms, p95:18.947535ms}, final_worker:{wall_time:0s, concurrency:5, task_num:1, tot_wait:94.815685ms, tot_exec:6.762μs, tot_time:94.824181ms, max:18.970298ms, p95:18.970298ms} | funcs:count(Column#11)->Column#9 | 49.2 KB | N/A |
| └─TableReader_20 | 1.00 | 1 | root | | time:18.9ms, loops:2, cop_task: {num: 2, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00} | MppVersion: 1, data:ExchangeSender_19 | 282 Bytes | N/A |
| └─ExchangeSender_19 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{time:10.1ms, loops:1, threads:1} | ExchangeType: PassThrough | N/A | N/A |
| └─HashAgg_9 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{time:9.08ms, loops:1, threads:1} | funcs:count(test.customer.C_MKTSEGMENT)->Column#11 | N/A | N/A |
| └─TableFullScan_17 | 150000.00 | 150000 | mpp[tiflash] | table:customer | tiflash_task:{time:9.08ms, loops:3, threads:1}, tiflash_scan:{dtfile:{total_scanned_packs:19, total_skipped_packs:0, total_scanned_rows:150000, total_skipped_rows:0, total_rs_index_load_time: 0ms, total_read_time: 1ms}, total_create_snapshot_time: 0ms, total_local_region_num: 0, total_remote_region_num: 0} | keep order:false | N/A | N/A |
+------------------------------+-----------+---------+--------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+------+
5 rows in set (0.02 sec)
おわりに
本記事では分析系クエリを実行し、TiDBオプティマイザが最適なストレージエンジンを選択する動作を確認しました。