ＡＲアドバンストテクノロジ株式会社（ARI）

Elasticsearch Distributed Processing Performance Verification (3): Performance Comparison by Node Count and Shard Count

Last updated at 2023-02-09 / Posted at 2023-02-08

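Introduction

Series table of contents

Parts 1 and 2 are complete; this is Part 3, "Performance comparison by node count and shard count".

Elasticsearch cluster architecture recap (3-node example)

Test scenario

We run the Rally benchmark against the following two configurations and compare their performance.

Configuration 1: 1 node / 1 shard / no replicas
Configuration 2: 2 nodes / 2 shards / no replicas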

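Test environment

Amazon EC2
- Number of instances: 2
- Instance type: t3.medium
  - CPU: 2 cores
  - Memory: 4GB
  - Storage: 8GiB, gp2
- OS image
  - Amazon Linux 2 Kernel 5.10 AMI 2.0.20230119.1 x86_64 HVM gp2
Elasticsearch cluster
- Number of nodes: 2
  - Because the goal is a performance comparison, a 2-node cluster is used (to keep costs down)
- Note: a 2-node cluster has limited fault tolerance and should not be used in production
  - Elastic's guidance on 2-node clusters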

Benchmark procedure for configuration 1: 1 node / 1 shard / no replicas

Stop Elasticsearch on node 2 (simulating a node failure)

sudo systemctl stop elasticsearch

Measure resource usage with the vmstat command

CPU, memory, and disk I/O are sampled at 15-second intervals.

Node 1

$ vmstat 15 |
  awk '{print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush()}' |
  tee -a vmstat_node1_n1s1r0.log

Run the Rally benchmark on node 1

Track: --track=percolator (chosen because its data set is small)
Target host is node 1 only: --target-hosts=node1:9200
Number of shards is 1: number_of_shards:1

$ esrally race --track=percolator --pipeline=benchmark-only \
  --track-params="number_of_shards:1,number_of_replicas:0" --target-hosts=node1:9200 \
  --client-options="timeout:60,use_ssl:true,verify_certs:false,basic_auth_user:'elastic',basic_auth_password:'xxxxxx'" \
  --kill-running-processes --report-file=result_n1s1r0.md

Benchmark results (excerpt: Throughput, i.e. operations processed per second)

Metric	Task	Value	Unit
Min Throughput	index	7796.37	docs/s
Mean Throughput	index	18240.7	docs/s
Median Throughput	index	18730.3	docs/s
Max Throughput	index	27302.9	docs/s
Min Throughput	percolator_with_content_president_bush	17.81	ops/s
Mean Throughput	percolator_with_content_president_bush	20.02	ops/s
Median Throughput	percolator_with_content_president_bush	20.2	ops/s
Max Throughput	percolator_with_content_president_bush	22.06	ops/s
Min Throughput	percolator_with_content_saddam_hussein	48.7	ops/s
Mean Throughput	percolator_with_content_saddam_hussein	48.86	ops/s
Median Throughput	percolator_with_content_saddam_hussein	48.86	ops/s
Max Throughput	percolator_with_content_saddam_hussein	49.01	ops/s
Min Throughput	percolator_with_content_hurricane_katrina	49.33	ops/s
Mean Throughput	percolator_with_content_hurricane_katrina	49.42	ops/s
Median Throughput	percolator_with_content_hurricane_katrina	49.42	ops/s
Max Throughput	percolator_with_content_hurricane_katrina	49.51	ops/s
Min Throughput	percolator_with_content_google	9.09	ops/s
Mean Throughput	percolator_with_content_google	9.24	ops/s
Median Throughput	percolator_with_content_google	9.23	ops/s
Max Throughput	percolator_with_content_google	9.42	ops/s
Min Throughput	percolator_no_score_with_content_google	97.59	ops/s
Mean Throughput	percolator_no_score_with_content_google	97.59	ops/s
Median Throughput	percolator_no_score_with_content_google	97.59	ops/s
Max Throughput	percolator_no_score_with_content_google	97.59	ops/s
Min Throughput	percolator_with_highlighting	48.85	ops/s
Mean Throughput	percolator_with_highlighting	49	ops/s
Median Throughput	percolator_with_highlighting	49	ops/s
Max Throughput	percolator_with_highlighting	49.14	ops/s
Min Throughput	percolator_with_content_ignore_me	0.08	ops/s
Mean Throughput	percolator_with_content_ignore_me	0.08	ops/s
Median Throughput	percolator_with_content_ignore_me	0.08	ops/s
Max Throughput	percolator_with_content_ignore_me	0.08	ops/s
Min Throughput	percolator_no_score_with_content_ignore_me	15.03	ops/s
Mean Throughput	percolator_no_score_with_content_ignore_me	15.04	ops/s
Median Throughput	percolator_no_score_with_content_ignore_me	15.04	ops/s
Max Throughput	percolator_no_score_with_content_ignore_me	15.06	ops/s

Check the shards

Only the queries index used by the percolator track is shown.
It holds 2 million documents, all stored in a single shard.

[ec2-user@ip-172-31-1-200 ~]$ curl -k https://elastic:elastic@localhost:9200/_cat/shards?v
index            shard prirep state      docs   store ip           node
queries          0     p      STARTED 2000000 114.4mb 172.31.1.200 ip-172-31-1-200.ec2.internal

Stop the vmstat command

Stop the vmstat command running on node 1.

Benchmark procedure for configuration 2: 2 nodes / 2 shards / no replicas

Restart Elasticsearch

Run the following command on both node 1 and node 2.

sudo systemctl restart elasticsearch

Measure resource usage with the vmstat command

CPU, memory, and disk I/O are sampled at 15-second intervals.

Node 1

$ vmstat 15 |
  awk '{print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush()}' |
  tee -a vmstat_node1_n2s2r0.log

Node 2

$ vmstat 15 |
  awk '{print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush()}' |
  tee -a vmstat_node2_n2s2r0.log

Run the Rally benchmark on node 1

Track: percolator
Target hosts are node 1 and node 2: --target-hosts=node1:9200,node2:9200
Number of shards is 2: number_of_shards:2

$ esrally race --track=percolator --pipeline=benchmark-only \
  --track-params="number_of_shards:2,number_of_replicas:0" --target-hosts=node1:9200,node2:9200 \
  --client-options="timeout:60,use_ssl:true,verify_certs:false,basic_auth_user:'elastic',basic_auth_password:'xxxxxx'" \
  --kill-running-processes --report-file=result_n2s2r0.md

Benchmark results (excerpt: Throughput, i.e. operations processed per second)

Metric	Task	Value	Unit
Min Throughput	index	10623.1	docs/s
Mean Throughput	index	22417.1	docs/s
Median Throughput	index	22907.3	docs/s
Max Throughput	index	33243.5	docs/s
Min Throughput	percolator_with_content_president_bush	20.73	ops/s
Mean Throughput	percolator_with_content_president_bush	23.51	ops/s
Median Throughput	percolator_with_content_president_bush	23.79	ops/s
Max Throughput	percolator_with_content_president_bush	25.99	ops/s
Min Throughput	percolator_with_content_saddam_hussein	48.8	ops/s
Mean Throughput	percolator_with_content_saddam_hussein	48.96	ops/s
Median Throughput	percolator_with_content_saddam_hussein	48.96	ops/s
Max Throughput	percolator_with_content_saddam_hussein	49.11	ops/s
Min Throughput	percolator_with_content_hurricane_katrina	49.6	ops/s
Mean Throughput	percolator_with_content_hurricane_katrina	49.66	ops/s
Median Throughput	percolator_with_content_hurricane_katrina	49.66	ops/s
Max Throughput	percolator_with_content_hurricane_katrina	49.71	ops/s
Min Throughput	percolator_with_content_google	17.09	ops/s
Mean Throughput	percolator_with_content_google	17.51	ops/s
Median Throughput	percolator_with_content_google	17.57	ops/s
Max Throughput	percolator_with_content_google	17.76	ops/s
Min Throughput	percolator_no_score_with_content_google	56.89	ops/s
Mean Throughput	percolator_no_score_with_content_google	56.89	ops/s
Median Throughput	percolator_no_score_with_content_google	56.89	ops/s
Max Throughput	percolator_no_score_with_content_google	56.89	ops/s
Min Throughput	percolator_with_highlighting	47.76	ops/s
Mean Throughput	percolator_with_highlighting	48.04	ops/s
Median Throughput	percolator_with_highlighting	48.04	ops/s
Max Throughput	percolator_with_highlighting	48.32	ops/s
Min Throughput	percolator_with_content_ignore_me	0.08	ops/s
Mean Throughput	percolator_with_content_ignore_me	0.08	ops/s
Median Throughput	percolator_with_content_ignore_me	0.08	ops/s
Max Throughput	percolator_with_content_ignore_me	0.09	ops/s
Min Throughput	percolator_no_score_with_content_ignore_me	15.01	ops/s
Mean Throughput	percolator_no_score_with_content_ignore_me	15.01	ops/s
Median Throughput	percolator_no_score_with_content_ignore_me	15.01	ops/s
Max Throughput	percolator_no_score_with_content_ignore_me	15.02	ops/s

Check the cluster health

$ curl -k https://elastic:elastic@localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "mycluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 3,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Check the cluster nodes

$ curl -k https://elastic:elastic@localhost:9200/_cat/nodes?v
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
172.31.1.200           69          85  50    1.01    0.99     0.67 cdfhilmrstw -      ip-172-31-1-200.ec2.internal
172.31.6.170           11          91   3    0.00    0.02     0.04 cdfhilmrstw *      ip-172-31-6-170.ec2.internal

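Check the shards

Only the queries index used by the percolator track is shown.
The document count is unchanged at 2 million, and the documents are distributed almost evenly across the two shards, one on each node.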

$ curl -k https://elastic:elastic@localhost:9200/_cat/shards?v
index            shard prirep state      docs   store ip           node
queries          0     p      STARTED  999798 57.2mb 172.31.1.200 ip-172-31-1-200.ec2.internal
queries          1     p      STARTED 1000202 57.9mb 172.31.6.170 ip-172-31-6-170.ec2.internal
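
To double-check that no documents were lost when the index was split, the docs column can be summed per index. The following is a minimal sketch, assuming the `_cat/shards?v` output keeps the column order shown above (prirep in the third column, docs in the fifth); `sum_docs` is a hypothetical helper name, not part of Elasticsearch or Rally.

```shell
# sum_docs INDEX: read `_cat/shards?v` output on stdin and print the total
# number of documents held by the primary shards of INDEX.
# Assumption: columns are index, shard, prirep, state, docs, ... as shown above.
sum_docs() {
  awk -v idx="$1" '
    $1 == idx && $3 == "p" { total += $5 }   # count primaries only, so replicas are not double-counted
    END { printf "%d\n", total }
  '
}
```

Piping the curl output above into `sum_docs queries` would be one way to use it; for the two shard rows listed here, 999798 + 1000202 gives 2000000.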

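Stop the vmstat command

Stop the vmstat commands running on both nodes.

Compare the performance of the two configurations

Check the Rally race IDs

Note the race IDs of the benchmarks run against the two configurations.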

$ esrally list races

Recent races:

Race ID                               Race Timestamp    Track       Challenge            Car       ES Version    Revision                                  Rally Version    Track Revision    Team Revision    User Tags
------------------------------------  ----------------  ----------  -------------------  --------  ------------  ----------------------------------------  ---------------  ----------------  ---------------  -----------
29a78535-db7b-4d98-8f3e-aa67c674ed40  20230208T031833Z  percolator  append-no-conflicts  external  8.6.1         180c9830da956993e59e2cd70eb32b5e383ea42c  2.7.0            b2f887a
a0822ef8-007f-4b18-9fa1-d70d26d17d77  20230208T022705Z  percolator  append-no-conflicts  external  8.6.1         180c9830da956993e59e2cd70eb32b5e383ea42c  2.7.0            b2f887a

Use Rally's compare command to compare the two race results

$ esrally compare --baseline={baseline race ID} --contender={contender race ID} --report-file={report file}

$ esrally compare --baseline=a0822ef8-007f-4b18-9fa1-d70d26d17d77 --contender=29a78535-db7b-4d98-8f3e-aa67c674ed40 --report-file=compare_n1s1r0_n2s2r0.md

From the comparison results, only Throughput (operations processed per second) is excerpted below.

Metric	Task	Baseline	Contender	Diff	Unit	Diff %
Min Throughput	index	7796.37	10623.2	2826.78	docs/s	+36.26%
Mean Throughput	index	18240.7	22417.1	4176.4	docs/s	+22.90%
Median Throughput	index	18730.3	22907.3	4176.98	docs/s	+22.30%
Max Throughput	index	27302.9	33243.5	5940.63	docs/s	+21.76%
Min Throughput	percolator_with_content_president_bush	17.8078	20.7333	2.92549	ops/s	+16.43%
Mean Throughput	percolator_with_content_president_bush	20.0201	23.5053	3.48526	ops/s	+17.41%
Median Throughput	percolator_with_content_president_bush	20.1961	23.7918	3.5957	ops/s	+17.80%
Max Throughput	percolator_with_content_president_bush	22.0563	25.9909	3.93459	ops/s	+17.84%
Min Throughput	percolator_with_content_saddam_hussein	48.7043	48.7985	0.09421	ops/s	+0.19%
Mean Throughput	percolator_with_content_saddam_hussein	48.857	48.9555	0.09847	ops/s	+0.20%
Median Throughput	percolator_with_content_saddam_hussein	48.857	48.9555	0.09847	ops/s	+0.20%
Max Throughput	percolator_with_content_saddam_hussein	49.0096	49.1124	0.10273	ops/s	+0.21%
Min Throughput	percolator_with_content_hurricane_katrina	49.3263	49.6039	0.27763	ops/s	+0.56%
Mean Throughput	percolator_with_content_hurricane_katrina	49.4193	49.6567	0.23738	ops/s	+0.48%
Median Throughput	percolator_with_content_hurricane_katrina	49.4193	49.6567	0.23738	ops/s	+0.48%
Max Throughput	percolator_with_content_hurricane_katrina	49.5123	49.7094	0.19712	ops/s	+0.40%
Min Throughput	percolator_with_content_google	9.09016	17.0904	8.00021	ops/s	+88.01%
Mean Throughput	percolator_with_content_google	9.24156	17.5053	8.26378	ops/s	+89.42%
Median Throughput	percolator_with_content_google	9.2259	17.5677	8.34181	ops/s	+90.42%
Max Throughput	percolator_with_content_google	9.42384	17.7616	8.33776	ops/s	+88.48%
Min Throughput	percolator_no_score_with_content_google	97.5881	56.8924	-40.6957	ops/s	-41.70%
Mean Throughput	percolator_no_score_with_content_google	97.5881	56.8924	-40.6957	ops/s	-41.70%
Median Throughput	percolator_no_score_with_content_google	97.5881	56.8924	-40.6957	ops/s	-41.70%
Max Throughput	percolator_no_score_with_content_google	97.5881	56.8924	-40.6957	ops/s	-41.70%
Min Throughput	percolator_with_highlighting	48.8527	47.7591	-1.09362	ops/s	-2.24%
Mean Throughput	percolator_with_highlighting	48.9988	48.0386	-0.96017	ops/s	-1.96%
Median Throughput	percolator_with_highlighting	48.9988	48.0386	-0.96017	ops/s	-1.96%
Max Throughput	percolator_with_highlighting	49.1448	48.3181	-0.82672	ops/s	-1.68%
Min Throughput	percolator_with_content_ignore_me	0.0806263	0.0836467	0.00302	ops/s	+3.75%
Mean Throughput	percolator_with_content_ignore_me	0.081156	0.084154	0.003	ops/s	+3.69%
Median Throughput	percolator_with_content_ignore_me	0.0812006	0.0839049	0.0027	ops/s	+3.33%
Max Throughput	percolator_with_content_ignore_me	0.081277	0.0865769	0.0053	ops/s	+6.52%
Min Throughput	percolator_no_score_with_content_ignore_me	15.0302	15.0076	-0.02258	ops/s	-0.15%
Mean Throughput	percolator_no_score_with_content_ignore_me	15.0413	15.0108	-0.03047	ops/s	-0.20%
Median Throughput	percolator_no_score_with_content_ignore_me	15.0375	15.0113	-0.02626	ops/s	-0.17%
Max Throughput	percolator_no_score_with_content_ignore_me	15.0569	15.0155	-0.04144	ops/s	-0.28%
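
The Diff % column is consistent with the relative change from baseline to contender, that is, (contender - baseline) / baseline × 100. As a rough sanity check, the values can be recomputed with a small awk helper; `pct_diff` is a hypothetical name introduced here for illustration and is not a Rally command.

```shell
# pct_diff BASELINE CONTENDER: print the relative change of CONTENDER
# against BASELINE as a signed percentage with two decimals.
# Assumption: this mirrors how the Diff % column above was derived.
pct_diff() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%+.2f%%\n", (c - b) / b * 100 }'
}

# Mean index throughput: baseline 18240.7, contender 22417.1
pct_diff 18240.7 22417.1
# percolator_no_score_with_content_google: baseline 97.5881, contender 56.8924
pct_diff 97.5881 56.8924
```

Run against the rows in the table, this is a quick way to confirm which direction a change went and that baseline and contender were not swapped.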

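Discussion

Indexing performance
- Mean throughput improved by about 23% (minimum throughput by about 36%)
Search performance
- Throughput improved or stayed roughly level in most cases
  - The gain was most pronounced for percolator_with_content_google, at about 90%
- One concern is percolator_no_score_with_content_google, whose throughput dropped by about 40% (the cause is still to be investigated and is left as follow-up work).
System resource measurements from vmstat
- In the 2-node configuration, CPU and disk I/O load can be seen to be spread across the two nodes

Configuration	CPU usage (peak)	Memory usage (peak)	Disk I/O (bo value during indexing)
Configuration 1 (1 node / 1 shard)	50% (node 1; one core fully occupied)	70% (node 1)	6000 (node 1)
Configuration 2 (2 nodes / 2 shards)	30% (node 1 and node 2 each)	60% (node 1 and node 2 each)	3000 (node 1 and node 2 each)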
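
Peak values like those in the table can be pulled out of the timestamped vmstat logs with a short awk script. The sketch below is one possible approach, not necessarily how the figures above were obtained. It assumes the default vmstat column layout with the date and time prepended as the first two fields, so that bo is field 12 and id (idle CPU) is field 17; `vmstat_peaks` is a hypothetical helper name. Memory usage is not covered because vmstat reports free memory in kilobytes rather than a percentage.

```shell
# vmstat_peaks LOGFILE: print the peak CPU usage (100 - idle) and the peak
# bo (blocks written out per second) found in a timestamped vmstat log.
# Assumed fields: $1=date $2=time $3=r ... $12=bo ... $17=id
vmstat_peaks() {
  awk '
    $3 ~ /^[0-9]+$/ {                 # data rows only; repeated header rows are skipped
      n++
      if (n == 1) next                # the first vmstat sample is an average since boot
      cpu = 100 - $17
      if (cpu + 0 > max_cpu + 0) max_cpu = cpu
      if ($12 + 0 > max_bo + 0) max_bo = $12
    }
    END { printf "peak_cpu=%d%% peak_bo=%d\n", max_cpu, max_bo }
  ' "$1"
}
```

For example, `vmstat_peaks vmstat_node1_n1s1r0.log` would summarize the node 1 log from configuration 1.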

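Conclusion

We compared Elasticsearch cluster performance across different node counts and shard counts.
Next time, we plan to compare performance across different replica counts.
Stay tuned.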
