概要
MySQLで「過去データをサクッと消したい」という要望があります。
drop partitionを利用したことがないので「drop partitionは本当に早いのか」を検証します。そのメモです。
ついでにテスト用データのインポートがてらパーティションの有無でinsertの速度が変わるのかゆるく検証してみます。
環境
テーブル定義
CREATE TABLE `example_table` (
`user_id` varchar(36) NOT NULL,
`item_id` int(11) NOT NULL,
`batch_date` date NOT NULL,
PRIMARY KEY (`user_id`,`item_id`,`batch_date`),
KEY `item_id_batch_date_index` (`item_id`,`batch_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
パーティションの追加
alter table example_table
partition by range (to_days(batch_date))
(
partition p201812 values less than (to_days('2019-01-01')),
partition p201901 values less than (to_days('2019-02-01')),
partition p201902 values less than (to_days('2019-03-01')),
partition p201903 values less than (to_days('2019-04-01')),
partition p201904 values less than (to_days('2019-05-01')),
partition p201905 values less than (to_days('2019-06-01')),
partition p201906 values less than (to_days('2019-07-01')),
partition p201907 values less than (to_days('2019-08-01')),
partition p201908 values less than (to_days('2019-09-01')),
partition p201909 values less than (to_days('2019-10-01')),
partition p201910 values less than (to_days('2019-11-01')),
partition p201911 values less than (to_days('2019-12-01')),
partition p201912 values less than (to_days('2020-01-01')),
partition p202001 values less than (to_days('2020-02-01'))
);
追加後に show create table
した結果
CREATE TABLE `example_table` (
`user_id` varchar(36) NOT NULL,
`item_id` int(11) NOT NULL,
`batch_date` date NOT NULL,
PRIMARY KEY (`user_id`,`item_id`,`batch_date`),
KEY `item_id_batch_date_index` (`item_id`,`batch_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (to_days(batch_date))
(PARTITION p201812 VALUES LESS THAN (737425) ENGINE = InnoDB,
PARTITION p201901 VALUES LESS THAN (737456) ENGINE = InnoDB,
PARTITION p201902 VALUES LESS THAN (737484) ENGINE = InnoDB,
PARTITION p201903 VALUES LESS THAN (737515) ENGINE = InnoDB,
PARTITION p201904 VALUES LESS THAN (737545) ENGINE = InnoDB,
PARTITION p201905 VALUES LESS THAN (737576) ENGINE = InnoDB,
PARTITION p201906 VALUES LESS THAN (737606) ENGINE = InnoDB,
PARTITION p201907 VALUES LESS THAN (737637) ENGINE = InnoDB,
PARTITION p201908 VALUES LESS THAN (737668) ENGINE = InnoDB,
PARTITION p201909 VALUES LESS THAN (737698) ENGINE = InnoDB,
PARTITION p201910 VALUES LESS THAN (737729) ENGINE = InnoDB,
PARTITION p201911 VALUES LESS THAN (737759) ENGINE = InnoDB,
PARTITION p201912 VALUES LESS THAN (737790) ENGINE = InnoDB,
PARTITION p202001 VALUES LESS THAN (737821) ENGINE = InnoDB) */
検証で使うデータ
-
select count(*)
で 12,550,978 件 - 期間: 約4ヶ月分
- 1日あたり9万から11万件程度
例)
mysql> select batch_date, count(*)
from example_table
group by batch_date;
+------------+----------+
| batch_date | count(*) |
+------------+----------+
| 2018-12-31 | 105525 |
| 2019-01-01 | 98958 |
| 2019-01-02 | 91431 |
| 2019-01-03 | 101750 |
| 2019-01-04 | 95111 |
| 2019-01-05 | 93807 |
...
| 2019-04-28 | 100994 |
| 2019-04-29 | 92203 |
| 2019-04-30 | 103806 |
| 2019-05-01 | 95683 |
| 2019-05-02 | 104395 |
| 2019-05-03 | 90513 |
| 2019-05-04 | 99946 |
+------------+----------+
insert
load data infile
で空のテーブルにデータをinsert。
それにどのくらいかかるかゆるく計測
対象ファイルの行数は 12,550,978 行
結果
パーティションの有無で大きく時間が変わることはなかった
どちらもだいたい50分
削除の検証
delete文
delete前確認
mysql> select date_format(batch_date, '%Y-%m') d, count(*)
from example_table
group by d;
+---------+----------+
| d | count(*) |
+---------+----------+
| 2018-12 | 105525 |
| 2019-01 | 3061619 |
| 2019-02 | 2824514 |
| 2019-03 | 3141687 |
| 2019-04 | 3027096 |
| 2019-05 | 390537 |
+---------+----------+
6 rows in set (8.97 sec)
delete
mysql> delete from example_table
where batch_date between '2019-01-01' and '2019-01-31';
Query OK, 3061619 rows affected (55.01 sec)
今回のデータ量だと1分弱
delete後確認
mysql> select date_format(batch_date, '%Y-%m') d, count(*)
from example_table
group by d;
+---------+----------+
| d | count(*) |
+---------+----------+
| 2018-12 | 105525 |
| 2019-02 | 2824514 |
| 2019-03 | 3141687 |
| 2019-04 | 3027096 |
| 2019-05 | 390537 |
+---------+----------+
5 rows in set (5.74 sec)
1月だけ消えた
drop partition
drop前確認
mysql> select date_format(batch_date, '%Y-%m') d, count(*)
from example_table
group by d;
+---------+----------+
| d | count(*) |
+---------+----------+
| 2018-12 | 105525 |
| 2019-01 | 3061619 |
| 2019-02 | 2824514 |
| 2019-03 | 3141687 |
| 2019-04 | 3027096 |
| 2019-05 | 390537 |
+---------+----------+
6 rows in set (7.62 sec)
drop partition
mysql> alter table example_table drop partition p201901;
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
一瞬で消えた。
drop後確認
mysql> select date_format(batch_date, '%Y-%m') d, count(*)
from example_table
group by d;
+---------+----------+
| d | count(*) |
+---------+----------+
| 2018-12 | 105525 |
| 2019-02 | 2824514 |
| 2019-03 | 3141687 |
| 2019-04 | 3027096 |
| 2019-05 | 390537 |
+---------+----------+
5 rows in set (5.92 sec)
1月だけ消えた
まとめ
delete
drop partition
で一瞬で消せた
よかった
insert
真面目に検証してないけど、パーティションの有無でinsertの速度はそれほど変わらなそう
真面目に細かく確認したら差が出るのかな