0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

続・PostgreSQL全文検索のざっくり性能比較 - pg_bigmとpg_trgmとPGroonga(Wikipediaの本文を検索する)

Last updated at Posted at 2025-07-09

再検証した

この記事だと各々が本気を出せてなかったので、再検証しました!

記録としてこの記事も残しますが、↑の方をご覧ください。

概要

pg_bigmとpg_trgmのPostgreSQL全文検索インデックスの性能比較 という記事を拝見しました。

私はPGroongaというPostgreSQLで高速に日本語の全文検索ができる拡張機能の開発をしているので、↑の記事を参考にPGroongaの実力を確認してみる記事を書いたのですが、PGroonga的にちょっと残念な記事だったので、もう少し長いテキストでPGroongaの実力を確認してみる記事です。
参考: https://pgroonga.github.io/ja/
参考: PostgreSQL全文検索のざっくり性能比較 - pg_bigmとpg_trgmとPGroonga(Wikipediaのタイトルを検索する)

  • pg_bigm
  • pg_trgm
  • PGroonga

それぞれのざっくり性能を比較します。
厳密に計測したわけではないので、参考程度にご覧ください。

さっそく結論: 測定結果

インデックスの作成時間とサイズ

名前 create index index size
PGroonga 約9.5分 約21 GB
pg_bigm 約37.5分 5758 MB
pg_trgm 約1時間14.5分 8768 MB

PGroongaはインデックスの作成がかなりはやいけど、サイズはかなりでかい。

日本語の検索

名前 東京 技術者 ロケット データベース
PGroonga 357.335 ms 307.343 ms 78.145 ms 79.588 ms 118.523 ms
pg_bigm 16542.671 ms 10633.977 ms 1024.921 ms 2833.574 ms 2224.894 ms
pg_trgm 35241.176 ms 29368.434 ms 931.636 ms 1645.658 ms 2055.854 ms

PGroonga頑張った。だいぶはやい。

アルファベットで検索

名前 DB test PostgreSQL
PGroonga 121.415 ms 53.101 ms 21.639 ms
pg_bigm 65.533 ms 29906.619 ms 48.832 ms
pg_trgm 30557.069 ms 17207.107 ms 21.954 ms

PGroonga頑張った。だいぶはやい。

まとめ

長めのテキストだとPGroongaがだいぶはやいです!インデックスのサイズがかなりでかいのが気になりますが…。
短めのテキストだとpg_bigmやpg_trgmの方がはやいですし、要件に応じて使い分けると良いのではないでしょうか!

pg_bigmやpg_trgmとだいぶ差があるので、もっとより良いインデックスの作り方があるとか、使い方が間違っているとかあればご指摘ください。


以降にどのように確認したか掲載します。

環境

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.2 LTS"
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi       215Mi        61Gi       139Mi       1.3Gi        62Gi
Swap:          2.0Gi          0B       2.0Gi
$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 183
model name	: 13th Gen Intel(R) Core(TM) i9-13900
stepping	: 1
microcode	: 0x12f
cpu MHz		: 800.000
cache size	: 36864 KB
physical id	: 0
siblings	: 32
core id		: 0
cpu cores	: 24
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 32
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect user_shstk avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr ibt flush_l1d arch_capabilities
vmx flags	: vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs ept_mode_based_exec tsc_scaling usr_wait_pause
bugs		: spectre_v1 spectre_v2 spec_store_bypass swapgs eibrs_pbrsb rfds bhi
bogomips	: 3993.60
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

...(略)

準備

PostgreSQLとcontrib(pg_trgm)

sudo apt install -y -V ca-certificates lsb-release wget
sudo wget -O /usr/share/keyrings/pgdg.asc https://www.postgresql.org/media/keys/ACCC4CF8.asc
(echo "Types: deb"; \
   echo "URIs: http://apt.postgresql.org/pub/repos/apt"; \
   echo "Suites: $(lsb_release --codename --short)-pgdg"; \
   echo "Components: main"; \
   echo "Signed-By: /usr/share/keyrings/pgdg.asc") | \
    sudo tee /etc/apt/sources.list.d/pgdg.sources
sudo apt update
sudo apt install -y -V postgresql-17 postgresql-contrib

pg_bigm

sudo apt install -y -V gcc make postgresql-server-dev-17 wget
wget https://github.com/pgbigm/pg_bigm/archive/refs/tags/v1.2-20240606.tar.gz
tar zxf ./v1.2-20240606.tar.gz
cd pg_bigm-1.2-20240606
make USE_PGXS=1 PG_CONFIG=/usr/bin/pg_config
sudo make USE_PGXS=1 PG_CONFIG=/usr/bin/pg_config install

設定ファイル(/etc/postgresql/17/main/postgresql.conf)の更新:

shared_preload_libraries = 'pg_bigm'

参考: https://github.com/pgbigm/pg_bigm/blob/07269d7e72dd8bace781887bc94316da70da6a2d/docs/pg_bigm.md

PGroonga

sudo apt install -y -V ca-certificates lsb-release wget
wget https://packages.groonga.org/ubuntu/groonga-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt install -y -V ./groonga-apt-source-latest-$(lsb_release --codename --short).deb
rm -f ./groonga-apt-source-latest-$(lsb_release --codename --short).deb
$ sudo apt update
$ sudo apt install -y -V postgresql-17-pgdg-pgroonga

参考: https://pgroonga.github.io/ja/install/ubuntu.html

PostgreSQL起動

sudo systemctl start postgresql

各拡張機能のバージョン

$ sudo -u postgres psql \
  --command 'drop database if exists test_articles' \
  --command 'create database test_articles'
NOTICE:  database "test_articles" does not exist, skipping
DROP DATABASE
CREATE DATABASE
$ sudo -u postgres psql -d test_articles \
  --command "create extension if not exists pg_bigm;" \
  --command "create extension if not exists pg_trgm;" \
  --command "create extension if not exists pgroonga;" \
  --command "select * from pg_extension;"
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
   oid   | extname  | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition 
---------+----------+----------+--------------+----------------+------------+-----------+--------------
   13649 | plpgsql  |       10 |           11 | f              | 1.0        |           | 
 1026533 | pg_bigm  |       10 |         2200 | t              | 1.2        |           | 
 1026556 | pg_trgm  |       10 |         2200 | t              | 1.6        |           | 
 1026637 | pgroonga |       10 |         2200 | f              | 4.0.1      |           | 
(4 rows)

使うデータのダウンロードと加工

これ( https://dumps.wikimedia.org/jawiki/latest/jawiki-latest-pages-articles.xml.bz2 )を使いますが、XMLなのとサイズがでかくて扱うのが大変なので、これ( https://github.com/groonga/wikipedia-search )を使ってインポート用のSQLを準備します。

(インポート用のSQLが作れれば何を使っても良いです。)

git clone https://github.com/groonga/wikipedia-search.git
cd wikipedia-search/
bundle install # これで失敗するならよしなに対応してください。
rake data:convert:sql:ja:all

config/sql/schema.postgresql.sql にテーブル定義が、data/sql/ja-all-pages.sql にSQLが出力されるのでインポートする。

sudo -u postgres psql -d test_articles < config/sql/schema.postgresql.sql
sudo -u postgres psql -d test_articles < data/sql/ja-all-pages.sql

サイズ情報:
今回試したタイミングではXMLのファイルサイズが18GBくらい、SQL( data/sql/ja-all-pages.sql )にしたら、15GBくらいでした。

ざっくり測定

実行したコマンドと結果を記載します。

PGroonga

#!/usr/bin/bash

function run_select() {
  query=$1
  result=$(sudo -u postgres psql -d test_articles \
    --command "set max_parallel_workers_per_gather = 0" \
    --command "\timing" \
    --command "\pset pager off" \
    --command "explain analyze select * from wikipedia where text &@ '${query}';")
  echo "${result}" 1>&2
  time=$(echo "${result}" | grep '^Time:')
  echo "Query: ${query} ${time}"
}

sudo -u postgres psql -d test_articles \
  --command 'drop index if exists wikipedia_index' \
  --command "\timing" \
  --command "create index wikipedia_index on wikipedia using pgroonga (title, text);"

run_select "象"
run_select "東京"
run_select "技術者"
run_select "ロケット"
run_select "データベース"
run_select "DB"
run_select "test"
run_select "PostgreSQL"

sudo -u postgres psql -d test_articles --command "select pg_size_pretty(pg_relation_size('wikipedia_index'));"

db_oid=$(sudo -u postgres psql -d test_articles \
  --no-align \
  --tuples-only \
  --command "SELECT oid FROM pg_database where datname = 'test_articles';")

sudo -u postgres psql -d test_articles --command "vacuum;"
ls -l /var/lib/postgresql/17/main/base/${db_oid}/pgrn* | awk '{ size += $5 } END { print size }'
$ ./run-pgroonga.sh 2> log
DROP INDEX
Timing is on.
CREATE INDEX
Time: 577768.354 ms (09:37.768)
Query: 象 Time: 357.335 ms
Query: 東京 Time: 307.343 ms
Query: 技術者 Time: 78.145 ms
Query: ロケット Time: 79.588 ms
Query: データベース Time: 118.523 ms
Query: DB Time: 121.415 ms
Query: test Time: 53.101 ms
Query: PostgreSQL Time: 21.639 ms
 pg_size_pretty 
----------------
 0 bytes
(1 row)

VACUUM
23021187072

PGroongaの場合は pg_relation_size() でインデックスのサイズはわかりません。
PGDATAにあるpgrnで始まるファイルのサイズ = インデックスのサイズと考えてください。
今回の場合はだいたい21GBほどです。でかい。

log

特にアルファベットでの検索について、pg_bigmやpg_trgmと扱い方が違っており、ヒット件数に差がありますが、ざっくり性能比較ということでご容赦ください。

SET
Timing is on.
Pager usage is off.
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using wikipedia_index on wikipedia  (cost=0.00..958.39 rows=1464 width=491) (actual time=27.208..318.136 rows=192783 loops=1)
   Index Cond: (text &@ '象'::text)
 Planning Time: 33.531 ms
 Execution Time: 321.578 ms
(4 rows)

Time: 357.335 ms
SET
Timing is on.
Pager usage is off.
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using wikipedia_index on wikipedia  (cost=0.00..391636.46 rows=1464 width=491) (actual time=18.600..266.914 rows=371644 loops=1)
   Index Cond: (text &@ '東京'::text)
 Planning Time: 19.322 ms
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.119 ms (Deform 0.051 ms), Inlining 0.000 ms, Optimization 0.000 ms, Emission 0.000 ms, Total 0.119 ms
 Execution Time: 287.516 ms
(8 rows)

Time: 307.343 ms
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=0.37..61147.19 rows=1464 width=491) (actual time=12.066..59.930 rows=19974 loops=1)
   Recheck Cond: (text &@ '技術者'::text)
   Heap Blocks: exact=16545
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..0.00 rows=18767 width=0) (actual time=10.422..10.422 rows=19974 loops=1)
         Index Cond: (text &@ '技術者'::text)
 Planning Time: 17.170 ms
 Execution Time: 60.559 ms
(7 rows)

Time: 78.145 ms
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=0.37..94737.68 rows=1464 width=491) (actual time=22.385..61.757 rows=18083 loops=1)
   Recheck Cond: (text &@ 'ロケット'::text)
   Heap Blocks: exact=14982
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..0.00 rows=34310 width=0) (actual time=21.000..21.000 rows=18083 loops=1)
         Index Cond: (text &@ 'ロケット'::text)
 Planning Time: 16.742 ms
 Execution Time: 62.325 ms
(7 rows)

Time: 79.588 ms
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=0.37..83187.60 rows=1464 width=491) (actual time=31.797..95.591 rows=39439 loops=1)
   Recheck Cond: (text &@ 'データベース'::text)
   Heap Blocks: exact=28991
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..0.00 rows=28435 width=0) (actual time=29.123..29.123 rows=39439 loops=1)
         Index Cond: (text &@ 'データベース'::text)
 Planning Time: 21.558 ms
 Execution Time: 96.591 ms
(7 rows)

Time: 118.523 ms
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=0.37..113750.43 rows=1464 width=491) (actual time=14.756..91.316 rows=45440 loops=1)
   Recheck Cond: (text &@ 'DB'::text)
   Heap Blocks: exact=31857
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..0.00 rows=45442 width=0) (actual time=11.562..11.562 rows=45440 loops=1)
         Index Cond: (text &@ 'DB'::text)
 Planning Time: 17.624 ms
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.087 ms (Deform 0.039 ms), Inlining 0.000 ms, Optimization 0.000 ms, Emission 0.000 ms, Total 0.087 ms
 Execution Time: 103.286 ms
(11 rows)

Time: 121.415 ms
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=0.37..34041.16 rows=1464 width=491) (actual time=6.246..29.698 rows=9149 loops=1)
   Recheck Cond: (text &@ 'test'::text)
   Heap Blocks: exact=8306
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..0.00 rows=9151 width=0) (actual time=5.515..5.516 rows=9149 loops=1)
         Index Cond: (text &@ 'test'::text)
 Planning Time: 22.545 ms
 Execution Time: 30.092 ms
(7 rows)

Time: 53.101 ms
SET
Timing is on.
Pager usage is off.
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using wikipedia_index on wikipedia  (cost=0.00..1050.62 rows=1464 width=491) (actual time=2.464..3.856 rows=260 loops=1)
   Index Cond: (text &@ 'PostgreSQL'::text)
 Planning Time: 17.216 ms
 Execution Time: 4.132 ms
(4 rows)

Time: 21.639 ms

pg_bigm

#!/usr/bin/bash

function run_select() {
  query=$1
  result=$(sudo -u postgres psql -d test_articles \
    --command "set max_parallel_workers_per_gather = 0" \
    --command "\timing" \
    --command "\pset pager off" \
    --command "explain analyze select * from wikipedia where text like '%${query}%';")
  echo "${result}" 1>&2
  time=$(echo "${result}" | grep '^Time:')
  echo "Query: ${query} ${time}"
}

sudo -u postgres psql -d test_articles \
  --command 'drop index if exists wikipedia_index' \
  --command "\timing" \
  --command "create index wikipedia_index on wikipedia using gin (title gin_bigm_ops, text gin_bigm_ops);"

run_select "象"
run_select "東京"
run_select "技術者"
run_select "ロケット"
run_select "データベース"
run_select "DB"
run_select "test"
run_select "PostgreSQL"

sudo -u postgres psql -d test_articles --command "select pg_size_pretty(pg_relation_size('wikipedia_index'));"
$ ./run-pg_bigm.sh 2> log
DROP INDEX
Timing is on.
CREATE INDEX
Time: 2254135.144 ms (37:34.135)
Query: 象 Time: 16542.671 ms (00:16.543)
Query: 東京 Time: 10633.977 ms (00:10.634)
Query: 技術者 Time: 1024.921 ms (00:01.025)
Query: ロケット Time: 2833.574 ms (00:02.834)
Query: データベース Time: 2224.894 ms (00:02.225)
Query: DB Time: 65.533 ms
Query: test Time: 29906.619 ms (00:29.907)
Query: PostgreSQL Time: 48.832 ms
 pg_size_pretty 
----------------
 5758 MB
(1 row)

log
SET
Timing is on.
Pager usage is off.
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=190.38..39822.23 rows=14788 width=491) (actual time=33.641..16536.705 rows=192783 loops=1)
   Recheck Cond: (text ~~ '%象%'::text)
   Rows Removed by Index Recheck: 470347
   Heap Blocks: exact=39863 lossy=33170
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..186.68 rows=14788 width=0) (actual time=29.617..29.618 rows=133358 loops=1)
         Index Cond: (text ~~ '%象%'::text)
 Planning Time: 1.077 ms
 Execution Time: 16540.694 ms
(8 rows)

Time: 16542.671 ms (00:16.543)
SET
Timing is on.
Pager usage is off.
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=427.05..89513.07 rows=59154 width=491) (actual time=26.591..10627.197 rows=371644 loops=1)
   Recheck Cond: (text ~~ '%東京%'::text)
   Rows Removed by Index Recheck: 375152
   Heap Blocks: exact=56189 lossy=33046
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..412.26 rows=59154 width=0) (actual time=20.313..20.313 rows=371644 loops=1)
         Index Cond: (text ~~ '%東京%'::text)
 Planning Time: 0.526 ms
 Execution Time: 10633.144 ms
(8 rows)

Time: 10633.977 ms (00:10.634)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=56.38..625.21 rows=146 width=491) (actual time=7.170..1022.913 rows=19974 loops=1)
   Recheck Cond: (text ~~ '%技術者%'::text)
   Rows Removed by Index Recheck: 411
   Heap Blocks: exact=16842
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..56.34 rows=146 width=0) (actual time=5.370..5.370 rows=20385 loops=1)
         Index Cond: (text ~~ '%技術者%'::text)
 Planning Time: 0.830 ms
 Execution Time: 1023.573 ms
(8 rows)

Time: 1024.921 ms (00:01.025)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=82.12..650.95 rows=146 width=491) (actual time=11.921..2831.820 rows=18082 loops=1)
   Recheck Cond: (text ~~ '%ロケット%'::text)
   Rows Removed by Index Recheck: 7485
   Heap Blocks: exact=20186
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..82.08 rows=146 width=0) (actual time=9.300..9.300 rows=25567 loops=1)
         Index Cond: (text ~~ '%ロケット%'::text)
 Planning Time: 0.839 ms
 Execution Time: 2832.323 ms
(8 rows)

Time: 2833.574 ms (00:02.834)
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=133.61..702.44 rows=146 width=491) (actual time=18.241..2222.587 rows=39438 loops=1)
   Recheck Cond: (text ~~ '%データベース%'::text)
   Rows Removed by Index Recheck: 536
   Heap Blocks: exact=29278
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..133.57 rows=146 width=0) (actual time=14.047..14.047 rows=39974 loops=1)
         Index Cond: (text ~~ '%データベース%'::text)
 Planning Time: 0.826 ms
 Execution Time: 2223.643 ms
(8 rows)

Time: 2224.894 ms (00:02.225)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=30.64..599.47 rows=146 width=491) (actual time=5.678..63.752 rows=31892 loops=1)
   Recheck Cond: (text ~~ '%DB%'::text)
   Heap Blocks: exact=24942
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..30.60 rows=146 width=0) (actual time=3.531..3.531 rows=31892 loops=1)
         Index Cond: (text ~~ '%DB%'::text)
 Planning Time: 0.783 ms
 Execution Time: 64.344 ms
(7 rows)

Time: 65.533 ms
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=82.12..650.95 rows=146 width=491) (actual time=84.103..29904.063 rows=41398 loops=1)
   Recheck Cond: (text ~~ '%test%'::text)
   Rows Removed by Index Recheck: 1159258
   Heap Blocks: exact=63578 lossy=33026
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..82.08 rows=146 width=0) (actual time=75.905..75.906 rows=1036230 loops=1)
         Index Cond: (text ~~ '%test%'::text)
 Planning Time: 0.767 ms
 Execution Time: 29905.393 ms
(8 rows)

Time: 29906.619 ms (00:29.907)
SET
Timing is on.
Pager usage is off.
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=240.70..809.53 rows=146 width=491) (actual time=14.756..47.517 rows=259 loops=1)
   Recheck Cond: (text ~~ '%PostgreSQL%'::text)
   Rows Removed by Index Recheck: 46
   Heap Blocks: exact=304
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..240.66 rows=146 width=0) (actual time=14.632..14.632 rows=305 loops=1)
         Index Cond: (text ~~ '%PostgreSQL%'::text)
 Planning Time: 0.806 ms
 Execution Time: 47.582 ms
(8 rows)

Time: 48.832 ms

pg_trgm

#!/usr/bin/bash

function run_select() {
  query=$1
  result=$(sudo -u postgres psql -d test_articles \
    --command "set max_parallel_workers_per_gather = 0" \
    --command "\timing" \
    --command "\pset pager off" \
    --command "explain analyze select * from wikipedia where text like '%${query}%';")
  echo "${result}" 1>&2
  time=$(echo "${result}" | grep '^Time:')
  echo "Query: ${query} ${time}"
}

sudo -u postgres psql -d test_articles \
  --command 'drop index if exists wikipedia_index' \
  --command "\timing" \
  --command "create index wikipedia_index on wikipedia using gin (title gin_bigm_ops, text gin_trgm_ops);"

run_select "象"
run_select "東京"
run_select "技術者"
run_select "ロケット"
run_select "データベース"
run_select "DB"
run_select "test"
run_select "PostgreSQL"

sudo -u postgres psql -d test_articles --command "select pg_size_pretty(pg_relation_size('wikipedia_index'));"
$ ./run-pg_trgm.sh 2> log
DROP INDEX
Timing is on.
CREATE INDEX
Time: 4461013.327 ms (01:14:21.013)
Query: 象 Time: 35241.176 ms (00:35.241)
Query: 東京 Time: 29368.434 ms (00:29.368)
Query: 技術者 Time: 931.636 ms
Query: ロケット Time: 1645.658 ms (00:01.646)
Query: データベース Time: 2055.854 ms (00:02.056)
Query: DB Time: 30557.069 ms (00:30.557)
Query: test Time: 17207.107 ms (00:17.207)
Query: PostgreSQL Time: 21.954 ms
 pg_size_pretty 
----------------
 8768 MB
(1 row)
log
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on wikipedia  (cost=0.00..115280.73 rows=14788 width=491) (actual time=2.770..35221.392 rows=192783 loops=1)
   Filter: (text ~~ '%象%'::text)
   Rows Removed by Filter: 1271275
 Planning Time: 2.726 ms
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.088 ms (Deform 0.042 ms), Inlining 0.000 ms, Optimization 0.135 ms, Emission 1.949 ms, Total 2.172 ms
 Execution Time: 35236.999 ms
(9 rows)

Time: 35241.176 ms (00:35.241)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on wikipedia  (cost=0.00..115280.73 rows=59154 width=491) (actual time=2.832..29350.074 rows=371644 loops=1)
   Filter: (text ~~ '%東京%'::text)
   Rows Removed by Filter: 1092414
 Planning Time: 0.830 ms
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.085 ms (Deform 0.040 ms), Inlining 0.000 ms, Optimization 0.131 ms, Emission 1.755 ms, Total 1.971 ms
 Execution Time: 29367.337 ms
(9 rows)

Time: 29368.434 ms (00:29.368)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=34.89..603.72 rows=146 width=491) (actual time=4.442..929.478 rows=19974 loops=1)
   Recheck Cond: (text ~~ '%技術者%'::text)
   Rows Removed by Index Recheck: 1
   Heap Blocks: exact=16546
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..34.85 rows=146 width=0) (actual time=2.692..2.693 rows=19975 loops=1)
         Index Cond: (text ~~ '%技術者%'::text)
 Planning Time: 1.032 ms
 Execution Time: 930.088 ms
(8 rows)

Time: 931.636 ms
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=64.89..633.72 rows=146 width=491) (actual time=5.156..1643.670 rows=18082 loops=1)
   Recheck Cond: (text ~~ '%ロケット%'::text)
   Rows Removed by Index Recheck: 1349
   Heap Blocks: exact=15982
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..64.85 rows=146 width=0) (actual time=3.243..3.244 rows=19431 loops=1)
         Index Cond: (text ~~ '%ロケット%'::text)
 Planning Time: 0.983 ms
 Execution Time: 1644.119 ms
(8 rows)

Time: 1645.658 ms (00:01.646)
SET
Timing is on.
Pager usage is off.
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=124.88..693.71 rows=146 width=491) (actual time=13.672..2053.427 rows=39438 loops=1)
   Recheck Cond: (text ~~ '%データベース%'::text)
   Rows Removed by Index Recheck: 21
   Heap Blocks: exact=29004
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..124.85 rows=146 width=0) (actual time=9.814..9.814 rows=39459 loops=1)
         Index Cond: (text ~~ '%データベース%'::text)
 Planning Time: 1.096 ms
 Execution Time: 2054.338 ms
(8 rows)

Time: 2055.854 ms (00:02.056)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on wikipedia  (cost=0.00..115280.73 rows=146 width=491) (actual time=4.931..30541.758 rows=31892 loops=1)
   Filter: (text ~~ '%DB%'::text)
   Rows Removed by Filter: 1432166
 Planning Time: 0.986 ms
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.111 ms (Deform 0.054 ms), Inlining 0.000 ms, Optimization 0.147 ms, Emission 2.235 ms, Total 2.492 ms
 Execution Time: 30555.601 ms
(9 rows)

Time: 30557.069 ms (00:30.557)
SET
Timing is on.
Pager usage is off.
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=64.89..633.72 rows=146 width=491) (actual time=25.858..17204.716 rows=41398 loops=1)
   Recheck Cond: (text ~~ '%test%'::text)
   Rows Removed by Index Recheck: 594149
   Heap Blocks: exact=33208 lossy=33143
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..64.85 rows=146 width=0) (actual time=20.320..20.320 rows=151466 loops=1)
         Index Cond: (text ~~ '%test%'::text)
 Planning Time: 0.763 ms
 Execution Time: 17205.999 ms
(8 rows)

Time: 17207.107 ms (00:17.207)
SET
Timing is on.
Pager usage is off.
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on wikipedia  (cost=244.87..813.70 rows=146 width=491) (actual time=8.389..20.330 rows=259 loops=1)
   Recheck Cond: (text ~~ '%PostgreSQL%'::text)
   Rows Removed by Index Recheck: 2
   Heap Blocks: exact=260
   ->  Bitmap Index Scan on wikipedia_index  (cost=0.00..244.84 rows=146 width=0) (actual time=8.284..8.284 rows=261 loops=1)
         Index Cond: (text ~~ '%PostgreSQL%'::text)
 Planning Time: 1.038 ms
 Execution Time: 20.472 ms
(8 rows)

Time: 21.954 ms
0
0
1

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?