More than 5 years have passed since last update.

実例で学ぶ、PostgreSQLだとJOINが必要ないケース

Posted at 2015-07-12

クソリプです

上記の記事は内容も正しく解説も丁寧で大変素晴らしいのですが、例示されているユースケースであれば、PostgreSQLでは配列型とGINインデックスを使うことでJOINそのものをなくせます。

実例

テーブル定義

CREATE TABLE articles (
  id serial PRIMARY KEY,
  created_at timestamp DEFAULT current_timestamp,
  tags int[]
);

CREATE INDEX articles_created_at ON articles (created_at);
-- 記事のタグリストに対するインデックス
CREATE INDEX articles_tags ON articles USING gin(tags);

サンプルデータ投入

INSERT INTO articles(created_at, tags)
SELECT
    current_timestamp + interval '1 minute' * i,
    array_uniq(ARRAY[0, i%11, i%13])
FROM
    generate_series(1,100000) as t(i);

/*
array_uniqはこんなの
CREATE OR REPLACE FUNCTION array_uniq(anyarray) RETURNS anyarray AS $$
SELECT
    array_agg(DISTINCT e)
FROM
    unnest($1) as t(e);
$$ LANGUAGE sql STRICT IMMUTABLE;
*/

ある記事（ID=1とする）のタグ一覧

test=# SELECT unnest(tags) FROM articles WHERE id = 1;
 unnest
--------
      0
      1
(2 rows)

test=# EXPLAIN SELECT unnest(tags) FROM articles WHERE id = 1;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using articles_pkey on articles  (cost=0.42..8.93 rows=100 width=32)
   Index Cond: (id = 1)
(2 rows)

あるタグの新着記事10件

test=# SELECT * FROM articles where tags @> ARRAY[1] ORDER BY created_at DESC LIMIT 10;
   id   |         created_at         |   tags
--------+----------------------------+----------
 199997 | 2015-09-20 09:51:21.284207 | {0,1,7}
  99997 | 2015-09-20 09:51:11.733148 | {0,1,7}
 199991 | 2015-09-20 09:45:21.284207 | {0,1,8}
  99991 | 2015-09-20 09:45:11.733148 | {0,1,8}
 199984 | 2015-09-20 09:38:21.284207 | {0,1,5}
  99984 | 2015-09-20 09:38:11.733148 | {0,1,5}
 199980 | 2015-09-20 09:34:21.284207 | {0,1,10}
  99980 | 2015-09-20 09:34:11.733148 | {0,1,10}
 199971 | 2015-09-20 09:25:21.284207 | {0,1,3}
  99971 | 2015-09-20 09:25:11.733148 | {0,1,3}
(10 rows)

test=# EXPLAIN SELECT * FROM articles where tags @> ARRAY[1] ORDER BY created_at DESC LIMIT 10;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..4.22 rows=10 width=44)
   ->  Index Scan Backward using articles_created_at on articles  (cost=0.42..12277.11 rows=32293 width=44)
         Filter: (tags @> '{1}'::integer[])
(3 rows)

記事に対するN:MのようなJOINは、上記のように配列型を利用することでJOINを減らし効率よく検索できる可能性があります。

デメリットとしては外部参照による制約が利用できるなくなる、更新の処理が重くなる（かも）などが挙げられます。配列型の利用は、元記事で言及されている、非正規化したキャッシュの一種だと考えられるでしょう。

万能ではないので選択肢の一つとして頭の片隅においておくといいと思います。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up

実例で学ぶ、PostgreSQLだとJOINが必要ないケース

実例

テーブル定義

サンプルデータ投入

ある記事（ID=1とする）のタグ一覧

あるタグの新着記事10件

コメント