配列型や JSON 型によるデータサイズ削減
次の記事ではデータモデルがストレージに与える影響について記載されています。
中でも配列を使用することによるストレージ格納効率の向上については興味深い内容でした。
配列は取り扱いにくいもので、カラムの明瞭度も落ち、UPDATE 時のパフォーマンスの低下などありますが、アーカイブデータであれば、積極的に使って良いと感じました。
ブログの部分的な抜粋になりますが、実際に試してみました。
データベース作成
masami@masami-L ~/Desktop> sudo -i -u postgres
postgres@masami-L ~> createdb -U postgres sample
postgres@masami-L ~> psql -U postgres -d sample
psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
配列を使用しないベーシックなデータモデル
テーブル作成
sample=# CREATE TABLE parent (
id bigserial PRIMARY KEY,
pname text NOT NULL, pts timestamptz NOT NULL
);
CREATE UNIQUE INDEX parent_pname ON parent(pname, pts);
CREATE INDEX parent_pts ON parent(pts);
CREATE TABLE detail (
pid int8 NOT NULL REFERENCES parent(id),
elemname text NOT NULL,
elemval float8 NOT NULL,
PRIMARY KEY (pid, elemname)
);
CREATE INDEX detail_elemname ON detail(elemname);
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
データ登録
親テーブルに 100,000 行、詳細テーブルに 1,000 万行を作成
sample=# INSERT INTO parent (pname, pts)
SELECT
'pname ' || (g.i % 14197)::text,
now() + (g.i::text || ' seconds')::interval
FROM generate_series(1,100000) AS g(i);
INSERT INTO detail
SELECT p.id,
'elem_' || (
CASE WHEN g.i < 10 THEN '0' || g.i::text
ELSE g.i::text
END
),
(g.i % 10) + random()
FROM parent AS p, generate_series(0,99) AS g(i);
VACUUM FREEZE ANALYZE;
INSERT 0 100000
INSERT 0 10000000
VACUUM
sample=# select count(*) from parent;
count
--------
100000
(1 row)
sample=# select count(*) from detail;
count
----------
10000000
(1 row)
sample=# select * from parent limit 3;
id | pname | pts
----+---------+-------------------------------
1 | pname 1 | 2024-02-20 22:12:56.364079+09
2 | pname 2 | 2024-02-20 22:12:57.364079+09
3 | pname 3 | 2024-02-20 22:12:58.364079+09
(3 rows)
sample=# select * from detail limit 3;
pid | elemname | elemval
-----+----------+---------------------
1 | elem_00 | 0.8073799597017413
2 | elem_00 | 0.39073776262991444
3 | elem_00 | 0.4568458953944372
(3 rows)
2 つのテーブルの合計サイズ
sample=# SELECT pg_size_pretty(pg_total_relation_size('parent'::regclass) +
pg_total_relation_size('detail'::regclass));
pg_size_pretty
----------------
897 MB
(1 row)
クエリ実行速度計測関数の定義
CREATE OR REPLACE FUNCTION timeit(insql text)
RETURNS interval
AS $$
DECLARE
tgtpid bigint;
startts timestamp;
sumint interval = '0 seconds';
rec record;
i int; numiters int := 1000;
BEGIN
FOR i IN 1..numiters LOOP
tgtpid := round(100000 * random());
startts := clock_timestamp();
EXECUTE insql INTO rec using tgtpid;
sumint := sumint + (clock_timestamp() - startts)::interval;
END LOOP;
RETURN (sumint / numiters);
END;
$$ LANGUAGE plpgsql;
SELECTの実行時間を測定します
SELECT timeit(
$$
SELECT count(1) FROM parent p JOIN detail d ON d.pid = p.id WHERE p.id = $1
$$);
timeit
-----------------
00:00:00.000174
配列化されたデータモデル
テーブル作成
CREATE TABLE parentdetail (
id int8 PRIMARY KEY,
pname text NOT NULL,
pts timestamptz NOT NULL,
elemnames text[],
elemvals float8[]
);
CREATE UNIQUE INDEX parentdetail_pname ON parentdetail(pname, pts);
CREATE INDEX parentdetail_pts ON parentdetail(pts);
CREATE INDEX parentdetail_elemnames ON parentdetail USING GIN(elemnames);
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
データ登録
配列を使用しないテーブルからコピーします
INSERT INTO parentdetail
SELECT id, pname, pts, array_agg(elemname), array_agg(elemval)
FROM (
SELECT p.id, p.pname, p.pts, d.elemname, d.elemval
FROM parent p
JOIN detail d ON d.pid = p.id
ORDER BY p.id, p.pts, d.elemname
) AS ss
GROUP BY id, pname, pts;
INSERT 0 100000
sample=# VACUUM FREEZE ANALYZE parentdetail;
VACUUM
sample=# select count(*) from parentdetail;
count
--------
100000
(1 row)
テーブルの配列カラムには100個の要素があります
sample=# \x
Expanded display is on.
sample=# select * from parentdetail limit 3;
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 1
pname | pname 1
pts | 2024-02-20 22:34:59.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals | {0.5578424781174114,1.8688987233777858,2.532454784695549,3.551012583798954,4.781002846813838,5.692092965591744,6.052276109038051,7.038741584440093,8.274526448795289,9.290948370139265,0.9158197388739815,1.087443029095855,2.594387409923886,3.2410466068277834,4.938978012956088,5.2240230504643,6.6992087818430015,7.903430771245983,8.105917195818368,9.526474207024737,0.48151254197725635,1.9558553847637974,2.100327477776066,3.1551254830377253,4.298969593532522,5.924036394532411,6.83667297492568,7.8838922885450735,8.505649765495921,9.712901173484259,0.991441729144956,1.6315264593698409,2.9832718126358237,3.1378385203330232,4.534395708092891,5.9928662571163365,6.6724534155018524,7.4559486595736395,8.259820805209731,9.77008636917012,0.10396118042487856,1.8972932414822914,2.3177768397169736,3.5039479086979988,4.929960639517024,5.196145342116413,6.745347556617364,7.005048414648034,8.45526625067166,9.122239463314688,0.9978906266949004,1.885071851749533,2.1685913975634072,3.0120215326970516,4.480287646514828,5.088038610513056,6.273798389648139,7.299898903664687,8.078194138423644,9.421555507230412,0.2211484621629758,1.049490355400028,2.689289550549095,3.0880657839746597,4.23855177612629,5.4848760928667595,6.754527256800177,7.216099108600108,8.862817418337702,9.521847746809815,0.7893845568671374,1.3915215747423169,2.4709034013780844,3.299158373755379,4.915112676221472,5.1887857018180945,6.8111666636095585,7.709772456010885,8.389986766115928,9.7021804350255,0.5346130691754958,1.9321159016650178,2.1405559035845023,3.0770811453438434,4.726147408000518,5.961326044587711,6.292893091442441,7.041199629533683,8.435823395790496,9.140501644430149,0.7038452582862398,1.4358341101368275,2.210093674032553,3.6919528866247653,4.4922832584936465,5.747119235476337,6.623059809995059,7.248451382385245,8.570351872223657,9.63727607965652}
-[ RECORD 2 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 2
pname | pname 2
pts | 2024-02-20 22:35:00.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals | {0.5517032157172821,1.5748317008439585,2.2451892766674177,3.548539114875382,4.314386996179621,5.027413195081579,6.549060614637995,7.249279171226533,8.048426429311405,9.869324486208995,0.9994731068534861,1.863419108562482,2.705281995802636,3.681435844955274,4.86334743908202,5.650570652690419,6.093897086499567,7.62866295220573,8.718212207247973,9.907515939573784,0.7869480024046993,1.3662168990019268,2.358453917431973,3.874457515332267,4.027093954676786,5.9458541865416805,6.7015669858708975,7.641268336241808,8.63322706463083,9.722126726179056,0.6944037389578739,1.0537417687345965,2.9067223637380835,3.2586318394243676,4.21942441324228,5.619841589399073,6.5376877936260485,7.234262257944181,8.000947155429742,9.415951984979927,0.5112142060784777,1.3494561235613105,2.5113980223821386,3.9931275523780982,4.0136253630353735,5.2324469882548215,6.877560981117597,7.283343298650777,8.338727936592988,9.345513814160029,0.7868279088104941,1.505744641011404,2.794061509003825,3.891230973568497,4.393448592791458,5.072897406829668,6.6071485726766355,7.718718248928042,8.224780730547366,9.49703783363151,0.8317445301764224,1.2298908328427274,2.087539929721345,3.296322569099118,4.254087694224438,5.285709328072869,6.974049708112773,7.901078671070934,8.299149287698185,9.112429747535032,0.968491493677277,1.5600536075416471,2.936883443631416,3.2078535143020304,4.153154624526735,5.246665256662197,6.588477205214133,7.24757746160293,8.514686348929274,9.565251906739775,0.03260052579258854,1.991243832823045,2.850341825309272,3.657322256237819,4.311656268869285,5.188790281503941,6.674234122247356,7.1078054137760205,8.638030208102975,9.486699871343433,0.1235842184804028,1.8513008981303187,2.3003410273086615,3.334251076195585,4.864971002901541,5.078328638232904,6.569690320457596,7.889767780070713,8.000581274560147,9.141578973172216}
-[ RECORD 3 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 3
pname | pname 3
pts | 2024-02-20 22:35:01.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals | {0.011192177232583589,1.2261017701753012,2.6476683728557013,3.2989821410690006,4.423396950169131,5.039980834521867,6.705997676957946,7.986420148226198,8.13047389644667,9.455212986563748,0.936126589799283,1.5047769231057195,2.553468438619216,3.1469482631127725,4.441137887449354,5.808896106035018,6.172813206272036,7.04503640801202,8.77412655300905,9.498946044372797,0.30957803602164446,1.1152768721358193,2.7194497766105137,3.6016697925090106,4.906719971515809,5.140668813389748,6.073976955417134,7.351635111864862,8.310333263433545,9.502661096710636,0.4283396936235526,1.9724844708928053,2.050899369485119,3.9624022940665604,4.948181802455661,5.458185045076544,6.624536954412047,7.441990684456851,8.1554093001706,9.35427871693103,0.27425188998709515,1.8057242539120857,2.709440412056761,3.5391320760024705,4.771187255014279,5.6413506954940935,6.188455570433785,7.005186418868316,8.195873335328866,9.041317409295953,0.8706474146525629,1.6402077491372715,2.7194776237973706,3.3840215024419926,4.515470791095236,5.00853677744929,6.38505482031756,7.115059789087791,8.144892753175196,9.132338921475576,0.8887968318184498,1.528478790420177,2.8246425613370825,3.2728543059185817,4.3612827722603065,5.548024734657229,6.445461683056788,7.711647762512012,8.148728962634646,9.508393557048276,0.504353792841453,1.5318588300208198,2.5942389309642344,3.936480899873896,4.885834772229469,5.399451754241209,6.819053412303088,7.332632112445918,8.643182709790477,9.344465488000633,0.5038003487364726,1.9518922511074202,2.7796879011253672,3.542263691157796,4.669744889380905,5.585256079232732,6.029902848866282,7.094784730602651,8.46003939038415,9.340918067227431,0.640892262676612,1.8117916802564018,2.920973414925225,3.4255223925283467,4.153483059250998,5.992122321071502,6.783223733214395,7.925412939603557,8.183514362315332,9.204939141031655}
postgres=# select array_length(elemnames, 1) from parentdetail limit 1;
array_length
--------------
100
(1 row)
テーブルサイズを確認します
sample=# SELECT pg_size_pretty(pg_table_size('parentdetail'::regclass));
pg_size_pretty
----------------
130 MB
(1 row)
sample=# SELECT pg_size_pretty(pg_indexes_size('parentdetail'::regclass));
pg_size_pretty
----------------
28 MB
(1 row)
sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pkey'));
pg_size_pretty
----------------
2208 kB
(1 row)
sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pname'));
pg_size_pretty
----------------
4136 kB
(1 row)
sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pts'));
pg_size_pretty
----------------
2208 kB
(1 row)
sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_elemnames'));
pg_size_pretty
----------------
20 MB
(1 row)
sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail'::regclass));
pg_size_pretty
----------------
159 MB
(1 row)
sample=# SELECT timeit( $$ SELECT count(1) FROM parentdetail WHERE id = $1 $$);
timeit
-----------------
00:00:00.000106
(1 row)
元のテーブル(897 MB)から 8 割以上のサイズ削減が出来ました。
テーブルのストレージがextendedの項目はトーストにより圧縮されています。
sample-# \d+ parentdetail
テーブル"public.parentdetail"
列 | タイプ | 照合順序 | Null 値を許容 | デフォルト | ストレージ | 圧縮 | 統計目標 | 説明
-----------+--------------------------+----------+---------------+------------+------------+------+----------+------
id | bigint | | not null | | plain | | |
pname | text | | not null | | extended | | |
pts | timestamp with time zone | | not null | | plain | | |
elemnames | text[] | | | | extended | | |
elemvals | double precision[] | | | | extended | | |
インデックス:
"parentdetail_pkey" PRIMARY KEY, btree (id)
"parentdetail_elemnames" gin (elemnames)
"parentdetail_pname" UNIQUE, btree (pname, pts)
"parentdetail_pts" btree (pts)
アクセスメソッド: heap
トーストは PostgreSQL の内部メカニズムであり、デフォルトで 2040 バイトのしきい値を超える行のデータを圧縮しようとします。カラムを配列化することでしきい値を超えさせてトースト有効化しているわけです。
更にカラムを配列化すること隣接するデータが同じになる可能性が高くなり、高い圧縮率が期待できます。面白いです。
配列型ではなくJSONB型でもトーストによる圧縮効果が得られ、近いパフォーマンスが得られます。