More than 5 years have passed since last update.

PostgreSQL 11 宣言的パーティショニングを勉強してみた（メモ）

Last updated at 2020-08-17Posted at 2019-05-25

はじめに

PostgreSQL 11の宣言的パーティショニングを勉強してみた際のメモです。
パーティションの基本知識はある前提です。

主に以下のドキュメントを参考にしています。

PostgreSQL 11.1文書　5.10. テーブルのパーティショニング

宣言的パーティショニング

PostgreSQL9.6以前でパーティショニングを使用する場合は、継承・トリガーを使用したパーティショニングで、疑似パーティショニングのようなものでした。
PostgreSQL10以降では宣言的パーティショニングを使用します。

PostgreSQL11で使用できるパーティションは以下の3つです。
ハッシュパーティションはPostgreSQL11から使用できるようになりました。

レンジ（範囲）パーティション
リストパーティション
ハッシュパーティション

レンジパーティション作成

まずパーティションテーブルを作成します。
"PARTITION BY RANGE"の後に、パーティションキー（logdate）を指定します。

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

次にパーティションを作成する。
2019年1月、2月、3月の3つのパーティションを作成しています。

CREATE TABLE measurement_y2019m01 PARTITION OF measurement
    FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');

CREATE TABLE measurement_y2019m02 PARTITION OF measurement
    FOR VALUES FROM ('2019-02-01') TO ('2019-03-01');

CREATE TABLE measurement_y2019m03 PARTITION OF measurement
    FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');

キー列にインデックスを作成します。各パーティションでも自動でインデックスが作成されます。

CREATE INDEX ON measurement (logdate);

testdb=> SELECT tablename, indexname FROM pg_indexes;
        tablename        |                      indexname                      
-------------------------+-----------------------------------------------------
 measurement_y2019m01    | measurement_y2019m01_logdate_idx
 measurement_y2019m02    | measurement_y2019m02_logdate_idx
 measurement_y2019m03    | measurement_y2019m03_logdate_idx

"\d+"でテーブルの情報を確認します。

testdb=> \d+ measurement
                                 Table "public.measurement"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |              | 
 logdate   | date    |           | not null |         | plain   |              | 
 peaktemp  | integer |           |          |         | plain   |              | 
 unitsales | integer |           |          |         | plain   |              | 
Partition key: RANGE (logdate)
Indexes:
    "measurement_logdate_idx" btree (logdate)
Partitions: measurement_y2019m01 FOR VALUES FROM ('2019-01-01') TO ('2019-02-01'),
            measurement_y2019m02 FOR VALUES FROM ('2019-02-01') TO ('2019-03-01'),
            measurement_y2019m03 FOR VALUES FROM ('2019-03-01') TO ('2019-04-01')

\d+ではなく、SQLでパーティションの情報を表示するなら以下のようになります。

select pt.relnamespace::regnamespace::text as schema,
  base_tb.relname as parent_table_name, 
  pt.relname as table_name, 
  pg_get_partkeydef(base_tb.oid) as partition_key,
  pg_get_expr(pt.relpartbound, pt.oid, true) as partition_expression
from
  pg_class base_tb
  join pg_inherits i on i.inhparent = base_tb.oid 
  join pg_class pt on pt.oid = i.inhrelid;

 schema | parent_table_name |      table_name      |  partition_key  |               partition_expression               
--------+-------------------+----------------------+-----------------+--------------------------------------------------
 public | measurement       | measurement_y2019m01 | RANGE (logdate) | FOR VALUES FROM ('2019-01-01') TO ('2019-02-01')
 public | measurement       | measurement_y2019m02 | RANGE (logdate) | FOR VALUES FROM ('2019-02-01') TO ('2019-03-01')
 public | measurement       | measurement_y2019m03 | RANGE (logdate) | FOR VALUES FROM ('2019-03-01') TO ('2019-04-01')

システムカタログを確認すると以下のようになります。

select 
    par.relnamespace::regnamespace::text as schema, 
    par.relname as table_name, 
    partnatts as num_columns,
    column_index,
    col.column_name
from   
    (select
         partrelid,
         partnatts,
         case partstrat 
              when 'l' then 'list' 
              when 'r' then 'range' end as partition_strategy,
         unnest(partattrs) column_index
     from
         pg_partitioned_table) pt 
join   
    pg_class par 
on     
    par.oid = pt.partrelid
join
    information_schema.columns col
on  
    col.table_schema = par.relnamespace::regnamespace::text
    and col.table_name = par.relname
    and ordinal_position = pt.column_index;

 schema | table_name  | num_columns | column_index | column_name 
--------+-------------+-------------+--------------+-------------
 public | measurement |           1 |            2 | logdate
(1 row)

テーブル一覧は以下のように、テーブルとパーティションが表示される。

testdb=> \dt
                List of relations
 Schema |         Name         | Type  |  Owner   
--------+----------------------+-------+----------
 public | measurement          | table | testuser
 public | measurement_y2019m01 | table | testuser
 public | measurement_y2019m02 | table | testuser
 public | measurement_y2019m03 | table | testuser

各パーティションに適当にデータを入れてみます。

insert into measurement values(1, '2019-01-01', 35, 1);
insert into measurement values(1, '2019-02-01', 28, 1);
insert into measurement values(1, '2019-03-01', 33, 1);

SELECT

SELECTを確認してみます。
テーブルをSELECTすれば普通に挿入された3レコードが表示されます。

testdb=> select * from measurement;
 city_id |  logdate   | peaktemp | unitsales 
---------+------------+----------+-----------
       1 | 2019-01-01 |       35 |         1
       1 | 2019-02-01 |       28 |         1
       1 | 2019-03-01 |       33 |         1
(3 rows)

パーティションを絞って検索するには、fromにパーティションを指定します。
whereで条件つけるから、こういう使い方はあまりしないような気はしますが。

testdb=> select * from measurement_y2019m01;
 city_id |  logdate   | peaktemp | unitsales 
---------+------------+----------+-----------
       1 | 2019-01-01 |       35 |         1
(1 row)

SELECT(実行計画)

SELECT時の実行計画をexplainで表示してみます。
"Seq Scan on measurement_y2019m01"からパーティションが検索されていることが分かります。

testdb=> explain analyze select * from measurement where logdate between '2019-01-01' and '2019-02-01';
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..2.04 rows=2 width=16) (actual time=0.006..0.009 rows=2 loops=1)
   ->  Seq Scan on measurement_y2019m01  (cost=0.00..1.01 rows=1 width=16) (actual time=0.005..0.006 rows=1 loops=1)
         Filter: ((logdate >= '2019-01-01'::date) AND (logdate <= '2019-02-01'::date))
   ->  Seq Scan on measurement_y2019m02  (cost=0.00..1.01 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=1)
         Filter: ((logdate >= '2019-01-01'::date) AND (logdate <= '2019-02-01'::date))
 Planning Time: 0.275 ms
 Execution Time: 0.027 ms
(7 rows)

TRUNCATE

TRUNCATEはテーブル、パーティションの両方で実行可能です。

testdb=> truncate table measurement;
TRUNCATE TABLE
testdb=> select * from measurement;
 city_id | logdate | peaktemp | unitsales 
---------+---------+----------+-----------
(0 rows)

テーブルを削除

テーブル削除は普通の非パーティションテーブルと同様です。

testdb=> drop table measurement;
DROP TABLE

パーティションを削除

パーティションを削除するためには、テーブルの代わりにパーティションを指定するだけです。

testdb=> drop table measurement_y2019m01;
DROP TABLE
testdb=> \dt
                List of relations
 Schema |         Name         | Type  |  Owner   
--------+----------------------+-------+----------
 public | measurement          | table | testuser
 public | measurement_y2019m02 | table | testuser
 public | measurement_y2019m03 | table | testuser

ただしこの場合、親テーブルでACCESS EXCLUSIVEロックが取得される。
以下のようにテーブルから該当のパーティションテーブルをDETACHしてあげるのが良い（らしい）。
この後にDETACHしたパーティションテーブルをDROPする。

testdb=> ALTER TABLE measurement DETACH PARTITION measurement_y2019m02;
ALTER TABLE

インデックス再構成

パーティションに対してインデックス再構成を実行する。
パーティションテーブルに対しては実行できません。

testdb=> reindex index measurement_y2019m01_logdate_idx;
REINDEX

プライマリキーの作成

11からは全てのパーティションに跨るプライマリキーを作成することができます。
なお、パーティションキーを含む必要があります。

testdb=> alter table measurement add constraint pk_measurement primary key (city_id, logdate);
ALTER TABLE

VACUUM

データベースの不要領域を回収します。
パーティションもまとめてVACUUMが実行されています。

testdb=# vacuum verbose measurement;
INFO:  vacuuming "public.measurement_y2019m01"
INFO:  index "measurement_y2019m01_logdate_idx" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "measurement_y2019m01": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 5476446
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "public.measurement_y2019m02"
INFO:  index "measurement_y2019m02_logdate_idx" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "measurement_y2019m02": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 5476446
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "public.measurement_y2019m03"
INFO:  index "measurement_y2019m03_logdate_idx" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "measurement_y2019m03": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 5476446
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

アナライズ

テーブルに関する統計情報を集計する。
パーティションもまとめてアナライズされています。

testdb=> analyze verbose measurement;
INFO:  analyzing "public.measurement" inheritance tree
INFO:  "measurement_y2019m01": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  "measurement_y2019m02": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  "measurement_y2019m03": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  analyzing "public.measurement_y2019m01"
INFO:  "measurement_y2019m01": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  analyzing "public.measurement_y2019m02"
INFO:  "measurement_y2019m02": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  analyzing "public.measurement_y2019m03"
INFO:  "measurement_y2019m03": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE

COPY

COPY TOはパーティションテーブルには使用できません。
パーティション単位で実行します。

testdb=# copy measurement_y2019m01 to '/tmp/measurement_y2019m01.csv' with csv;
COPY 1

-bash-4.2$ cat /tmp/measurement_y2019m01.csv 
1,2019-01-01,35,1

COPY FROMも試してみます。COPY FROMはCOPY TOと違いパーティション、テーブルの両方に対して実行できます。

testdb=# truncate table measurement_y2019m01;
TRUNCATE TABLE
testdb=# select * from measurement_y2019m01;
 city_id | logdate | peaktemp | unitsales 
---------+---------+----------+-----------
(0 rows)

testdb=# copy measurement_y2019m01 from '/tmp/measurement_y2019m01.csv' with csv;
COPY 1
testdb=# select * from measurement_y2019m01;
 city_id |  logdate   | peaktemp | unitsales 
---------+------------+----------+-----------
       1 | 2019-01-01 |       35 |         1
(1 row)

デフォルトパーティション

デフォルトパーティションを指定することができます。
今回のテーブルは1月から3月までのパーティションを作成済みです。
これに4月のデータをINSERTするとERRORになります。

testdb=> insert into measurement values(1, '2019-04-01', 35, 1);
ERROR:  no partition of relation "measurement" found for row
DETAIL:  Partition key of the failing row contains (logdate) = (2019-04-01).

そのため、以下のように1月から3月まで以外のデータを格納するデフォルトパーティションを作成してみます。（今回は4月のパーティションを作るのでも可なので例が悪いですが）
最後に"DEFAULT"をつけています。

testdb=> CREATE TABLE measurement_default PARTITION OF measurement DEFAULT;
CREATE TABLE

testdb=> insert into measurement values(1, '2019-04-01', 35, 1);
INSERT 0 1
testdb=> select * from measurement_default;
 city_id |  logdate   | peaktemp | unitsales 
---------+------------+----------+-----------
       1 | 2019-04-01 |       35 |         1
(1 row)

ここで'2019-04-01'が入るパーティションを作成するとエラーになってしまいます。

CREATE TABLE measurement_y2019m04 PARTITION OF measurement
    FOR VALUES FROM ('2019-04-01') TO ('2019-05-01');

ERROR:  updated partition constraint for default partition "measurement_default" would be violated by some row

回避策としては、'2019-04-01'をSELECTして別テーブルにINSERTしておきます。
その上でデフォルトパーティションに入っていたレコードを削除します。
最後に新しいテーブルをパーティションとしてmeasurementにアタッチすればよいはずです。
と、以下のサイトに書いていました！

Default Partition - adopting the odds

リストパーティション

リストパーティションの場合は、"PARTITION BY LIST"をつけます。

CREATE TABLE measurement_list (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY LIST (city_id);

CREATE TABLE measurement_list_1 PARTITION OF measurement_list FOR VALUES IN (1);CREATE TABLE measurement_list_2 PARTITION OF measurement_list FOR VALUES IN (2);

testdb=> \d+ measurement_list
                               Table "public.measurement_list"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |              | 
 logdate   | date    |           | not null |         | plain   |              | 
 peaktemp  | integer |           |          |         | plain   |              | 
 unitsales | integer |           |          |         | plain   |              | 
Partition key: LIST (city_id)
Partitions: measurement_list_1 FOR VALUES IN (1),
            measurement_list_2 FOR VALUES IN (2)

ハッシュパーティション

ハッシュパーティションの場合は、"PARTITION BY HASH"をつけます。

CREATE TABLE measurement_hash (
     city_id         int not null,
     logdate         date not null,
     peaktemp        int,
     unitsales       int
 ) PARTITION BY HASH (city_id);

パーティションは"WITH (MODULUS 4, REMAINDER 0)"のように指定します。
MODULUSはハッシュパーティションの数です。REMAINDERはパーティションのIDみたいなものです。パーティションが4つなので、0から3までを指定します。

CREATE TABLE measurement_hash_p0 PARTITION OF measurement_hash FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE measurement_hash_p1 PARTITION OF measurement_hash FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE measurement_hash_p2 PARTITION OF measurement_hash FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE measurement_hash_p3 PARTITION OF measurement_hash FOR VALUES WITH (MODULUS 4, REMAINDER 3);


testdb=> \d+ measurement_hash
                               Table "public.measurement_hash"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |              | 
 logdate   | date    |           | not null |         | plain   |              | 
 peaktemp  | integer |           |          |         | plain   |              | 
 unitsales | integer |           |          |         | plain   |              | 
Partition key: HASH (city_id)
Partitions: measurement_hash_p0 FOR VALUES WITH (modulus 4, remainder 0),
            measurement_hash_p1 FOR VALUES WITH (modulus 4, remainder 1),
            measurement_hash_p2 FOR VALUES WITH (modulus 4, remainder 2),
            measurement_hash_p3 FOR VALUES WITH (modulus 4, remainder 3)

おわりに

今回は試していませんが、パーティションごとに表領域を変更することもできます。
2019年5月時点でPostgreSQL 11.3がリリースされていますが、リリースノートを見るとパーティショニング関連での修正が多いので、なるべくマイナーバージョンもあげていった方が良さそうです。

参考

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up