LoginSignup
1
1

PostgreSQL 配列型やJSON 型によるデータサイズ削減

Last updated at Posted at 2024-03-18

配列型や JSON 型によるデータサイズ削減

次の記事ではデータモデルがストレージに与える影響について記載されています。

中でも配列を使用することによるストレージ格納効率の向上については興味深い内容でした。
配列は取り扱いにくいもので、カラムの明瞭度も落ち、UPDATE 時のパフォーマンスの低下などありますが、アーカイブデータであれば、積極的に使って良いと感じました。
ブログの部分的な抜粋になりますが、実際に試してみました。

データベース作成

masami@masami-L ~/Desktop> sudo -i -u postgres
postgres@masami-L ~> createdb -U postgres sample
postgres@masami-L ~> psql -U postgres -d sample
psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))

配列を使用しないベーシックなデータモデル

テーブル作成

sample=# CREATE TABLE parent (
    id bigserial PRIMARY KEY,
    pname text NOT NULL, pts timestamptz NOT NULL
);
CREATE UNIQUE INDEX parent_pname ON parent(pname, pts);
CREATE INDEX parent_pts ON parent(pts);

CREATE TABLE detail (
    pid int8 NOT NULL REFERENCES parent(id),
    elemname text NOT NULL,
    elemval float8 NOT NULL,
    PRIMARY KEY (pid, elemname)
);
CREATE INDEX detail_elemname ON detail(elemname);
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX

データ登録
親テーブルに 100,000 行、詳細テーブルに 1,000 万行を作成

sample=# INSERT INTO parent (pname, pts)
SELECT
    'pname ' || (g.i % 14197)::text,
    now() + (g.i::text || ' seconds')::interval
FROM generate_series(1,100000) AS g(i);

INSERT INTO detail
SELECT p.id,
    'elem_' || (
        CASE WHEN g.i < 10 THEN '0' || g.i::text
        ELSE g.i::text
        END
    ),
    (g.i % 10) + random()
FROM parent AS p, generate_series(0,99) AS g(i);

VACUUM FREEZE ANALYZE;
INSERT 0 100000
INSERT 0 10000000
VACUUM
sample=# select count(*) from parent;
 count
--------
 100000
(1 row)

sample=# select count(*) from detail;
  count
----------
 10000000
(1 row)
sample=# select * from parent limit 3;
 id |  pname  |              pts
----+---------+-------------------------------
  1 | pname 1 | 2024-02-20 22:12:56.364079+09
  2 | pname 2 | 2024-02-20 22:12:57.364079+09
  3 | pname 3 | 2024-02-20 22:12:58.364079+09
(3 rows)
sample=# select * from detail limit 3;
 pid | elemname |       elemval
-----+----------+---------------------
   1 | elem_00  |  0.8073799597017413
   2 | elem_00  | 0.39073776262991444
   3 | elem_00  |  0.4568458953944372
(3 rows)

2 つのテーブルの合計サイズ

sample=# SELECT pg_size_pretty(pg_total_relation_size('parent'::regclass) +
    pg_total_relation_size('detail'::regclass));
 pg_size_pretty 
----------------
 897 MB
(1 row)

クエリ実行速度計測関数の定義

CREATE OR REPLACE FUNCTION timeit(insql text)
RETURNS interval
AS $$
DECLARE
    tgtpid bigint;
    startts timestamp;
    sumint interval = '0 seconds';
    rec record;
    i int; numiters int := 1000;
BEGIN
    FOR i IN 1..numiters LOOP
        tgtpid := round(100000 * random());
        startts := clock_timestamp();
        EXECUTE insql INTO rec using tgtpid;
        sumint := sumint + (clock_timestamp() - startts)::interval;
    END LOOP;
    RETURN (sumint / numiters);
END;
$$ LANGUAGE plpgsql;

SELECTの実行時間を測定します

SELECT timeit(
$$
    SELECT count(1) FROM parent p JOIN detail d ON d.pid = p.id WHERE p.id = $1
$$);
     timeit      
-----------------
 00:00:00.000174

配列化されたデータモデル

テーブル作成


CREATE TABLE parentdetail (
    id int8 PRIMARY KEY,
    pname text NOT NULL,
    pts timestamptz NOT NULL,
    elemnames text[],
    elemvals float8[]
);
CREATE UNIQUE INDEX parentdetail_pname ON parentdetail(pname, pts);
CREATE INDEX parentdetail_pts ON parentdetail(pts);
CREATE INDEX parentdetail_elemnames ON parentdetail USING GIN(elemnames);

CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX

データ登録
配列を使用しないテーブルからコピーします

INSERT INTO parentdetail
SELECT id, pname, pts, array_agg(elemname), array_agg(elemval)
FROM (
    SELECT p.id, p.pname, p.pts, d.elemname, d.elemval
    FROM parent p
    JOIN detail d ON d.pid = p.id
    ORDER BY p.id, p.pts, d.elemname
) AS ss
GROUP BY id, pname, pts;

INSERT 0 100000

sample=# VACUUM FREEZE ANALYZE parentdetail;
VACUUM

sample=# select count(*) from parentdetail;
 count
--------
 100000
(1 row)

テーブルの配列カラムには100個の要素があります

sample=# \x
Expanded display is on.
sample=# select * from parentdetail limit 3;
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id        | 1
pname     | pname 1
pts       | 2024-02-20 22:34:59.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals  | {0.5578424781174114,1.8688987233777858,2.532454784695549,3.551012583798954,4.781002846813838,5.692092965591744,6.052276109038051,7.038741584440093,8.274526448795289,9.290948370139265,0.9158197388739815,1.087443029095855,2.594387409923886,3.2410466068277834,4.938978012956088,5.2240230504643,6.6992087818430015,7.903430771245983,8.105917195818368,9.526474207024737,0.48151254197725635,1.9558553847637974,2.100327477776066,3.1551254830377253,4.298969593532522,5.924036394532411,6.83667297492568,7.8838922885450735,8.505649765495921,9.712901173484259,0.991441729144956,1.6315264593698409,2.9832718126358237,3.1378385203330232,4.534395708092891,5.9928662571163365,6.6724534155018524,7.4559486595736395,8.259820805209731,9.77008636917012,0.10396118042487856,1.8972932414822914,2.3177768397169736,3.5039479086979988,4.929960639517024,5.196145342116413,6.745347556617364,7.005048414648034,8.45526625067166,9.122239463314688,0.9978906266949004,1.885071851749533,2.1685913975634072,3.0120215326970516,4.480287646514828,5.088038610513056,6.273798389648139,7.299898903664687,8.078194138423644,9.421555507230412,0.2211484621629758,1.049490355400028,2.689289550549095,3.0880657839746597,4.23855177612629,5.4848760928667595,6.754527256800177,7.216099108600108,8.862817418337702,9.521847746809815,0.7893845568671374,1.3915215747423169,2.4709034013780844,3.299158373755379,4.915112676221472,5.1887857018180945,6.8111666636095585,7.709772456010885,8.389986766115928,9.7021804350255,0.5346130691754958,1.9321159016650178,2.1405559035845023,3.0770811453438434,4.726147408000518,5.961326044587711,6.292893091442441,7.041199629533683,8.435823395790496,9.140501644430149,0.7038452582862398,1.4358341101368275,2.210093674032553,3.6919528866247653,4.4922832584936465,5.747119235476337,6.623059809995059,7.248451382385245,8.570351872223657,9.63727607965652}
-[ RECORD 2 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id        | 2
pname     | pname 2
pts       | 2024-02-20 22:35:00.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals  | {0.5517032157172821,1.5748317008439585,2.2451892766674177,3.548539114875382,4.314386996179621,5.027413195081579,6.549060614637995,7.249279171226533,8.048426429311405,9.869324486208995,0.9994731068534861,1.863419108562482,2.705281995802636,3.681435844955274,4.86334743908202,5.650570652690419,6.093897086499567,7.62866295220573,8.718212207247973,9.907515939573784,0.7869480024046993,1.3662168990019268,2.358453917431973,3.874457515332267,4.027093954676786,5.9458541865416805,6.7015669858708975,7.641268336241808,8.63322706463083,9.722126726179056,0.6944037389578739,1.0537417687345965,2.9067223637380835,3.2586318394243676,4.21942441324228,5.619841589399073,6.5376877936260485,7.234262257944181,8.000947155429742,9.415951984979927,0.5112142060784777,1.3494561235613105,2.5113980223821386,3.9931275523780982,4.0136253630353735,5.2324469882548215,6.877560981117597,7.283343298650777,8.338727936592988,9.345513814160029,0.7868279088104941,1.505744641011404,2.794061509003825,3.891230973568497,4.393448592791458,5.072897406829668,6.6071485726766355,7.718718248928042,8.224780730547366,9.49703783363151,0.8317445301764224,1.2298908328427274,2.087539929721345,3.296322569099118,4.254087694224438,5.285709328072869,6.974049708112773,7.901078671070934,8.299149287698185,9.112429747535032,0.968491493677277,1.5600536075416471,2.936883443631416,3.2078535143020304,4.153154624526735,5.246665256662197,6.588477205214133,7.24757746160293,8.514686348929274,9.565251906739775,0.03260052579258854,1.991243832823045,2.850341825309272,3.657322256237819,4.311656268869285,5.188790281503941,6.674234122247356,7.1078054137760205,8.638030208102975,9.486699871343433,0.1235842184804028,1.8513008981303187,2.3003410273086615,3.334251076195585,4.864971002901541,5.078328638232904,6.569690320457596,7.889767780070713,8.000581274560147,9.141578973172216}
-[ RECORD 3 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id        | 3
pname     | pname 3
pts       | 2024-02-20 22:35:01.266191+09
elemnames | {elem_00,elem_01,elem_02,elem_03,elem_04,elem_05,elem_06,elem_07,elem_08,elem_09,elem_10,elem_11,elem_12,elem_13,elem_14,elem_15,elem_16,elem_17,elem_18,elem_19,elem_20,elem_21,elem_22,elem_23,elem_24,elem_25,elem_26,elem_27,elem_28,elem_29,elem_30,elem_31,elem_32,elem_33,elem_34,elem_35,elem_36,elem_37,elem_38,elem_39,elem_40,elem_41,elem_42,elem_43,elem_44,elem_45,elem_46,elem_47,elem_48,elem_49,elem_50,elem_51,elem_52,elem_53,elem_54,elem_55,elem_56,elem_57,elem_58,elem_59,elem_60,elem_61,elem_62,elem_63,elem_64,elem_65,elem_66,elem_67,elem_68,elem_69,elem_70,elem_71,elem_72,elem_73,elem_74,elem_75,elem_76,elem_77,elem_78,elem_79,elem_80,elem_81,elem_82,elem_83,elem_84,elem_85,elem_86,elem_87,elem_88,elem_89,elem_90,elem_91,elem_92,elem_93,elem_94,elem_95,elem_96,elem_97,elem_98,elem_99}
elemvals  | {0.011192177232583589,1.2261017701753012,2.6476683728557013,3.2989821410690006,4.423396950169131,5.039980834521867,6.705997676957946,7.986420148226198,8.13047389644667,9.455212986563748,0.936126589799283,1.5047769231057195,2.553468438619216,3.1469482631127725,4.441137887449354,5.808896106035018,6.172813206272036,7.04503640801202,8.77412655300905,9.498946044372797,0.30957803602164446,1.1152768721358193,2.7194497766105137,3.6016697925090106,4.906719971515809,5.140668813389748,6.073976955417134,7.351635111864862,8.310333263433545,9.502661096710636,0.4283396936235526,1.9724844708928053,2.050899369485119,3.9624022940665604,4.948181802455661,5.458185045076544,6.624536954412047,7.441990684456851,8.1554093001706,9.35427871693103,0.27425188998709515,1.8057242539120857,2.709440412056761,3.5391320760024705,4.771187255014279,5.6413506954940935,6.188455570433785,7.005186418868316,8.195873335328866,9.041317409295953,0.8706474146525629,1.6402077491372715,2.7194776237973706,3.3840215024419926,4.515470791095236,5.00853677744929,6.38505482031756,7.115059789087791,8.144892753175196,9.132338921475576,0.8887968318184498,1.528478790420177,2.8246425613370825,3.2728543059185817,4.3612827722603065,5.548024734657229,6.445461683056788,7.711647762512012,8.148728962634646,9.508393557048276,0.504353792841453,1.5318588300208198,2.5942389309642344,3.936480899873896,4.885834772229469,5.399451754241209,6.819053412303088,7.332632112445918,8.643182709790477,9.344465488000633,0.5038003487364726,1.9518922511074202,2.7796879011253672,3.542263691157796,4.669744889380905,5.585256079232732,6.029902848866282,7.094784730602651,8.46003939038415,9.340918067227431,0.640892262676612,1.8117916802564018,2.920973414925225,3.4255223925283467,4.153483059250998,5.992122321071502,6.783223733214395,7.925412939603557,8.183514362315332,9.204939141031655}

postgres=# select array_length(elemnames, 1) from parentdetail limit 1;
 array_length
--------------
          100
(1 row)

テーブルサイズを確認します

sample=# SELECT pg_size_pretty(pg_table_size('parentdetail'::regclass));
 pg_size_pretty
----------------
 130 MB
(1 row)

sample=# SELECT pg_size_pretty(pg_indexes_size('parentdetail'::regclass));
 pg_size_pretty
----------------
 28 MB
(1 row)

sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pkey'));
 pg_size_pretty
----------------
 2208 kB
(1 row)

sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pname'));
 pg_size_pretty
----------------
 4136 kB
(1 row)

sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_pts'));
 pg_size_pretty
----------------
 2208 kB
(1 row)

sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail_elemnames'));
 pg_size_pretty
----------------
 20 MB
(1 row)

sample=# SELECT pg_size_pretty(pg_total_relation_size('parentdetail'::regclass));
 pg_size_pretty
----------------
 159 MB
(1 row)

sample=# SELECT timeit( $$ SELECT count(1) FROM parentdetail WHERE id = $1 $$);
     timeit
-----------------
 00:00:00.000106
(1 row)

元のテーブル(897 MB)から 8 割以上のサイズ削減が出来ました。

テーブルのストレージがextendedの項目はトーストにより圧縮されています。

sample-# \d+  parentdetail
                                           テーブル"public.parentdetail"
         |          タイプ          | 照合順序 | Null 値を許容 | デフォルト | ストレージ | 圧縮 | 統計目標 | 説明 
-----------+--------------------------+----------+---------------+------------+------------+------+----------+------
 id        | bigint                   |          | not null      |            | plain      |      |          | 
 pname     | text                     |          | not null      |            | extended   |      |          | 
 pts       | timestamp with time zone |          | not null      |            | plain      |      |          | 
 elemnames | text[]                   |          |               |            | extended   |      |          | 
 elemvals  | double precision[]       |          |               |            | extended   |      |          | 
インデックス:
    "parentdetail_pkey" PRIMARY KEY, btree (id)
    "parentdetail_elemnames" gin (elemnames)
    "parentdetail_pname" UNIQUE, btree (pname, pts)
    "parentdetail_pts" btree (pts)
アクセスメソッド: heap

トーストは PostgreSQL の内部メカニズムであり、デフォルトで 2040 バイトのしきい値を超える行のデータを圧縮しようとします。カラムを配列化することでしきい値を超えさせてトースト有効化しているわけです。
更にカラムを配列化すること隣接するデータが同じになる可能性が高くなり、高い圧縮率が期待できます。面白いです。

配列型ではなくJSONB型でもトーストによる圧縮効果が得られ、近いパフォーマンスが得られます。

1
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
1