More than 3 years have passed since last update.

Can OpenStack Keystone be made faster?

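Keystone is too slow

Whenever you use any OpenStack service, Keystone's authentication process runs behind the scenes. If Keystone itself could be made faster, the whole OpenStack environment should feel more responsive, so I tried a number of things.
I did not reach a solid conclusion.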

Base environment

A Vagrantfile for running Ubuntu 20.04 on libvirt.
Simple preparation is handled by a provisioning script.

# Provisioning script: update the system and install the packages used below.
$script = <<END
apt update
apt upgrade -y
apt install memcached keystone python3-openstackclient -y
END

Vagrant.configure("2") do |config|
  config.vm.define "master" do |host|
    host.vm.box = "generic/ubuntu2004"
    host.vm.provision :shell, inline: $script
    # libvirt provider: 8 GB of RAM and 8 vCPUs for the guest.
    host.vm.provider "libvirt" do |vb|
      vb.memory = 8192
      vb.cpus = 8
    end
  end
end

Software versions

vagrant@ubuntu2004:~$ dpkg -l | grep -P "(mysql)|(memcache)|(keystone)"
ii  keystone                             2:17.0.0-0ubuntu0.20.04.1         all          OpenStack identity service - Daemons
ii  mysql-server-8.0                     8.0.22-0ubuntu0.20.04.2           amd64        MySQL database server binaries and system database setup
ii  memcached                            1.5.22-2ubuntu0.1                 amd64        High-performance in-memory object caching system

First, I installed Keystone by following the documentation.
https://docs.openstack.org/keystone/ussuri/install/keystone-install-ubuntu.html
Note: memcached is not used at this stage.

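Where does the time go?

Adding the --timing option shows the time taken by each API call.
The two requests to /v3/auth/tokens take roughly 0.3 seconds each, so they alone account for about 0.65 of the 0.73 seconds in total.
(I wonder why tokens is called twice.)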

root@ubuntu2004:~# openstack project list --timing
+----------------------------------+-------+
| ID                               | Name  |
+----------------------------------+-------+
| 973e028a13184bf585915d1dbcb8bd69 | admin |
+----------------------------------+-------+

+-------------------------------------------+--------------------+
| URL                                       |            Seconds |
+-------------------------------------------+--------------------+
| GET http://localhost:5000/v3              |           0.003235 |
| POST http://localhost:5000/v3/auth/tokens |           0.333022 |
| POST http://localhost:5000/v3/auth/tokens |           0.313014 |
| GET http://localhost:5000/v3/projects     |           0.079824 |
| Total                                     | 0.7290949999999999 |
+-------------------------------------------+--------------------+
root@ubuntu2004:~#
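
To time the token request in isolation, outside the openstack client, something like the following minimal sketch could be used. It assumes password authentication scoped to a project; the helper names `build_password_auth_body` and `time_token_request` are made up for this illustration, and the user, password and project values passed in are placeholders.

```python
import json
import time
import urllib.request


def build_password_auth_body(user, password, project, domain_id="default"):
    """Build a Keystone v3 password-auth request body scoped to a project."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": user,
                        "domain": {"id": domain_id},
                        "password": password,
                    }
                },
            },
            "scope": {
                "project": {"name": project, "domain": {"id": domain_id}}
            },
        }
    }


def time_token_request(url, body):
    """POST the body to the tokens endpoint; return (HTTP status, seconds)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()
        status = response.status
    return status, time.perf_counter() - start
```

Calling `time_token_request("http://localhost:5000/v3/auth/tokens", body)` against the environment above should give a figure comparable to the POST rows in the --timing table, without the client's own overhead.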

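Using JMeter, I sent 100 requests to /tokens over 10 seconds.
The machine running JMeter is underpowered, so treat the numbers as rough reference values, but the average response time is 263 ms (0.263 seconds).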

Starting the test @ Sat Oct 31 12:26:14 JST 2020 (1604114774995)
Waiting for possible shutdown message on port 4445
summary =    100 in    12s =    8.6/s Avg:   263 Min:   249 Max:   313 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 12:26:26 JST 2020 (1604114786675)
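
Since the same JMeter summary line is compared across several runs in this article, a small parser can make the comparison less error-prone. This is only a sketch: the regular expression is written against the summary format shown above, and `parse_summary` is a name invented here.

```python
import re

# Matches the cumulative "summary =" line; the incremental "summary +"
# lines are deliberately ignored. Avg/Min/Max are in milliseconds.
SUMMARY_RE = re.compile(
    r"summary\s*=\s*(?P<count>\d+)\s+in\s+(?P<elapsed>[\d.]+)s\s*=\s*"
    r"(?P<rate>[\d.]+)/s\s+Avg:\s*(?P<avg>\d+)\s+Min:\s*(?P<min>\d+)\s+"
    r"Max:\s*(?P<max>\d+)\s+Err:\s*(?P<err>\d+)"
)


def parse_summary(line):
    """Return the fields of a cumulative JMeter summary line, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "count": int(match["count"]),
        "elapsed_s": float(match["elapsed"]),
        "rate_per_s": float(match["rate"]),
        "avg_ms": int(match["avg"]),
        "min_ms": int(match["min"]),
        "max_ms": int(match["max"]),
        "errors": int(match["err"]),
    }


line = ("summary =    100 in    12s =    8.6/s "
        "Avg:   263 Min:   249 Max:   313 Err:     0 (0.00%)")
print(parse_summary(line)["avg_ms"])
```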

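Trying memcached

I added settings to make Keystone use memcached.
I was not sure how much memcached would help with token issuance, but it looks slightly improved (0.231 seconds on average).
The cache backend setting in keystone.conf was the following.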
backend = dogpile.cache.memcached

Starting the test @ Sat Oct 31 12:31:27 JST 2020 (1604115087396)
Waiting for possible shutdown message on port 4445
summary +     19 in     3s =    7.2/s Avg:   232 Min:   222 Max:   284 Err:     0 (0.00%) Active: 2 Started: 3 Finished: 1
summary +     81 in     9s =    9.3/s Avg:   231 Min:   218 Max:   238 Err:     0 (0.00%) Active: 0 Started: 10 Finished: 10
summary =    100 in  11.3s =    8.8/s Avg:   231 Min:   218 Max:   284 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 12:31:38 JST 2020 (1604115098736)

I also tried the following backend setting.
backend = oslo_cache.memcache_pool
The results look almost identical to dogpile.cache.memcached.

Starting the test @ Sat Oct 31 12:36:02 JST 2020 (1604115362429)
Waiting for possible shutdown message on port 4445
summary +      1 in   0.5s =    2.2/s Avg:   280 Min:   280 Max:   280 Err:     0 (0.00%) Active: 1 Started: 1 Finished: 0
summary +     99 in    11s =    9.1/s Avg:   230 Min:   219 Max:   239 Err:     0 (0.00%) Active: 0 Started: 10 Finished: 10
summary =    100 in  11.3s =    8.9/s Avg:   230 Min:   219 Max:   280 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 12:36:13 JST 2020 (1604115373754)
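
For reference, the two backend values tried above belong in the [cache] section of keystone.conf. The sketch below renders such a section with Python's configparser. Only the `backend` values come from this article; `enabled = true` and `memcache_servers = localhost:11211` are assumptions about a typical single-host setup, and whether `enabled` must be set explicitly depends on the defaults of the deployed version.

```python
import configparser
import io


def render_cache_section(backend, servers="localhost:11211"):
    """Render a [cache] section for keystone.conf as text."""
    config = configparser.ConfigParser()
    config["cache"] = {
        "enabled": "true",            # assumption: caching switched on explicitly
        "backend": backend,           # value taken from the article
        "memcache_servers": servers,  # assumption: memcached on the same host
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_cache_section("dogpile.cache.memcached"))
print(render_cache_section("oslo_cache.memcache_pool"))
```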

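Difference between dogpile.cache.memcached and oslo_cache.memcache_pool

The configuration file documentation contains the description quoted below.

Except for small environments, memcache_pool appears to be the recommended choice.
As the name suggests, it presumably pools connections to memcached.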

Cache backend module. For eventlet-based or environments with hundreds of threaded servers,
Memcache with pooling (oslo_cache.memcache_pool) is recommended.
For environments with less than 100 threaded servers,
Memcached (dogpile.cache.memcached) or Redis (dogpile.cache.redis) is recommended. 
Test environments with a single instance of the server can use the dogpile.cache.memory backend.

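Incidentally, the size of the connection pool can apparently be controlled with a pool max-size setting (memcache_pool_maxsize in the [cache] section), so in some environments changing this value may help. (In this article's test environment only Keystone is running, though, so the goal here is to speed up a single request rather than to gain from MySQL or memcached pool sizes.)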
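
To make the idea of a bounded connection pool concrete, here is a generic sketch. It is not how oslo.cache implements its pool; the `ConnectionPool` class and its methods are invented for illustration, and the "connections" are plain objects.

```python
import queue
import threading


class ConnectionPool:
    """Hand out at most `maxsize` connections and reuse released ones."""

    def __init__(self, factory, maxsize):
        self._factory = factory
        self._maxsize = maxsize
        self._idle = queue.LifoQueue(maxsize)
        self._created = 0
        self._lock = threading.Lock()

    def acquire(self, timeout=None):
        # Prefer an idle connection if one is available.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        # Otherwise create a new one while still under the limit.
        with self._lock:
            if self._created < self._maxsize:
                self._created += 1
                return self._factory()
        # Pool exhausted: wait for a release (raises queue.Empty on timeout).
        return self._idle.get(timeout=timeout)

    def release(self, connection):
        self._idle.put_nowait(connection)


pool = ConnectionPool(factory=object, maxsize=2)
first = pool.acquire()
pool.release(first)
print(pool.acquire() is first)  # the released connection is reused
```

With a single request in flight, as in this article's measurements, such a pool never fills up, which is consistent with leaving the pool size alone here.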

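Access to MySQL

I do not know which part of the token issuance code takes the time, but I started with MySQL because it looks like the easiest thing to tune.
Watching with tcpdump and Wireshark shows that MySQL is accessed when a token is issued, so I looked at which queries are sent. They are all SELECT statements, so improving read speed should help. Several of them also sort with ORDER BY, so there may be room to improve sorting performance as well.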

SELECT user.enabled AS user_enabled, user.id AS user_id, user.domain_id AS user_domain_id, user.extra AS user_extra, user.default_project_id AS user_default_project_id, user.created_at AS user_created_at, user.last_active_at AS user_last_active_at, password_1.created_at AS password_1_created_at, password_1.expires_at AS password_1_expires_at, password_1.id AS password_1_id, password_1.local_user_id AS password_1_local_user_id, password_1.password_hash AS password_1_password_hash, password_1.created_at_int AS password_1_created_at_int, password_1.expires_at_int AS password_1_expires_at_int, password_1.self_service AS password_1_self_service, local_user_1.id AS local_user_1_id, local_user_1.user_id AS local_user_1_user_id, local_user_1.domain_id AS local_user_1_domain_id, local_user_1.name AS local_user_1_name, local_user_1.failed_auth_count AS local_user_1_failed_auth_count, local_user_1.failed_auth_at AS local_user_1_failed_auth_at, federated_user_1.id AS federated_user_1_id, federated_user_1.user_id AS federated_user_1_user_id, federated_user_1.idp_id AS federated_user_1_idp_id, federated_user_1.protocol_id AS federated_user_1_protocol_id, federated_user_1.unique_id AS federated_user_1_unique_id, federated_user_1.display_name AS federated_user_1_display_name, nonlocal_user_1.domain_id AS nonlocal_user_1_domain_id, nonlocal_user_1.name AS nonlocal_user_1_name, nonlocal_user_1.user_id AS nonlocal_user_1_user_id 
FROM user LEFT OUTER JOIN local_user AS local_user_1 ON user.id = local_user_1.user_id AND user.domain_id = local_user_1.domain_id LEFT OUTER JOIN password AS password_1 ON local_user_1.id = password_1.local_user_id LEFT OUTER JOIN federated_user AS federated_user_1 ON user.id = federated_user_1.user_id LEFT OUTER JOIN nonlocal_user AS nonlocal_user_1 ON user.domain_id = nonlocal_user_1.domain_id AND user.id = nonlocal_user_1.user_id 
WHERE user.id = 'e901f4f3d5544817bf89c1946e4ed419' ORDER BY password_1.created_at_int

SELECT user_option.user_id AS user_option_user_id, user_option.option_id AS user_option_option_id, user_option.option_value AS user_option_option_value, anon_1.user_id AS anon_1_user_id 
FROM (SELECT user.id AS user_id 
FROM user 
WHERE user.id = 'e901f4f3d5544817bf89c1946e4ed419') AS anon_1 INNER JOIN user_option ON anon_1.user_id = user_option.user_id ORDER BY anon_1.user_id

SELECT user.enabled AS user_enabled, user.id AS user_id, user.domain_id AS user_domain_id, user.extra AS user_extra, user.default_project_id AS user_default_project_id, user.created_at AS user_created_at, user.last_active_at AS user_last_active_at, password_1.created_at AS password_1_created_at, password_1.expires_at AS password_1_expires_at, password_1.id AS password_1_id, password_1.local_user_id AS password_1_local_user_id, password_1.password_hash AS password_1_password_hash, password_1.created_at_int AS password_1_created_at_int, password_1.expires_at_int AS password_1_expires_at_int, password_1.self_service AS password_1_self_service, local_user_1.id AS local_user_1_id, local_user_1.user_id AS local_user_1_user_id, local_user_1.domain_id AS local_user_1_domain_id, local_user_1.name AS local_user_1_name, local_user_1.failed_auth_count AS local_user_1_failed_auth_count, local_user_1.failed_auth_at AS local_user_1_failed_auth_at, federated_user_1.id AS federated_user_1_id, federated_user_1.user_id AS federated_user_1_user_id, federated_user_1.idp_id AS federated_user_1_idp_id, federated_user_1.protocol_id AS federated_user_1_protocol_id, federated_user_1.unique_id AS federated_user_1_unique_id, federated_user_1.display_name AS federated_user_1_display_name, nonlocal_user_1.domain_id AS nonlocal_user_1_domain_id, nonlocal_user_1.name AS nonlocal_user_1_name, nonlocal_user_1.user_id AS nonlocal_user_1_user_id 
FROM user LEFT OUTER JOIN local_user AS local_user_1 ON user.id = local_user_1.user_id AND user.domain_id = local_user_1.domain_id LEFT OUTER JOIN password AS password_1 ON local_user_1.id = password_1.local_user_id LEFT OUTER JOIN federated_user AS federated_user_1 ON user.id = federated_user_1.user_id LEFT OUTER JOIN nonlocal_user AS nonlocal_user_1 ON user.domain_id = nonlocal_user_1.domain_id AND user.id = nonlocal_user_1.user_id 
WHERE user.id = 'e901f4f3d5544817bf89c1946e4ed419' ORDER BY password_1.created_at_int

SELECT user_option.user_id AS user_option_user_id, user_option.option_id AS user_option_option_id, user_option.option_value AS user_option_option_value, anon_1.user_id AS anon_1_user_id 
FROM (SELECT user.id AS user_id 
FROM user 
WHERE user.id = 'e901f4f3d5544817bf89c1946e4ed419') AS anon_1 INNER JOIN user_option ON anon_1.user_id = user_option.user_id ORDER BY anon_1.user_id

SELECT revocation_event.id AS revocation_event_id, revocation_event.domain_id AS revocation_event_domain_id, revocation_event.project_id AS revocation_event_project_id, revocation_event.user_id AS revocation_event_user_id, revocation_event.role_id AS revocation_event_role_id, revocation_event.trust_id AS revocation_event_trust_id, revocation_event.consumer_id AS revocation_event_consumer_id, revocation_event.access_token_id AS revocation_event_access_token_id, revocation_event.issued_before AS revocation_event_issued_before, revocation_event.expires_at AS revocation_event_expires_at, revocation_event.revoked_at AS revocation_event_revoked_at, revocation_event.audit_id AS revocation_event_audit_id, revocation_event.audit_chain_id AS revocation_event_audit_chain_id 
FROM revocation_event 
WHERE revocation_event.issued_before >= '2020-10-31 09:13:17' AND (revocation_event.user_id IS NULL OR revocation_event.user_id = 'e901f4f3d5544817bf89c1946e4ed419') AND (revocation_event.project_id IS NULL OR revocation_event.project_id = '973e028a13184bf585915d1dbcb8bd69') AND (revocation_event.audit_id IS NULL OR revocation_event.audit_id = '8QLRph58TVuNZ-DwS_m1Qw')

SELECT project.id AS project_id, project.name AS project_name, project.domain_id AS project_domain_id, project.description AS project_description, project.enabled AS project_enabled, project.extra AS project_extra, project.parent_id AS project_parent_id, project.is_domain AS project_is_domain 
FROM project 
WHERE project.id != '<<keystone.domain.root>>' AND project.is_domain = false

SELECT project_tag.project_id AS project_tag_project_id, project_tag.name AS project_tag_name, anon_1.project_id AS anon_1_project_id 
FROM (SELECT project.id AS project_id 
FROM project 
WHERE project.id != '<<keystone.domain.root>>' AND project.is_domain = false) AS anon_1 INNER JOIN project_tag ON project_tag.project_id = anon_1.project_id ORDER BY anon_1.project_id

SELECT project_option.project_id AS project_option_project_id, project_option.option_id AS project_option_option_id, project_option.option_value AS project_option_option_value, anon_1.project_id AS anon_1_project_id 
FROM (SELECT project.id AS project_id 
FROM project 
WHERE project.id != '<<keystone.domain.root>>' AND project.is_domain = false) AS anon_1 INNER JOIN project_option ON anon_1.project_id = project_option.project_id ORDER BY anon_1.project_id
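
The observations about the captured queries (all SELECTs, several with ORDER BY) can be checked mechanically once the statements are saved as strings. The sketch below uses heavily shortened stand-ins for the queries above, and `summarize_queries` is a name made up for this example.

```python
import re
from collections import Counter


def summarize_queries(queries):
    """Count statements by leading SQL verb and count those using ORDER BY."""
    verbs = Counter()
    with_order_by = 0
    for query in queries:
        words = query.split()
        if not words:
            continue
        verbs[words[0].upper()] += 1
        if re.search(r"\bORDER\s+BY\b", query, re.IGNORECASE):
            with_order_by += 1
    return verbs, with_order_by


# Shortened stand-ins for the captured statements, not the full text.
captured = [
    "SELECT user.id FROM user WHERE user.id = 'x' ORDER BY password_1.created_at_int",
    "SELECT revocation_event.id FROM revocation_event WHERE revocation_event.user_id IS NULL",
    "SELECT project.id FROM project WHERE project.is_domain = false",
]
verbs, with_order_by = summarize_queries(captured)
print(dict(verbs), with_order_by)
```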

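MySQL performance

As of MySQL 8.0 the query cache is no longer available, and MySQL will not start if query cache settings are present.
https://yakst.com/ja/posts/4612

As an alternative, some people apparently put ProxySQL between the client and MySQL to get caching. Given that it is one more piece of software to manage and brings a new learning cost, it is not very appealing. For now I will try a simple approach using mysqltuner.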

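Performance diagnosis with mysqltuner

There is a lot of information online about improving MySQL performance, but the topic looks deep (and the learning cost looks high...). To keep things simple, I ran a quick diagnosis with mysqltuner.
It is in the Ubuntu repositories, so it can be installed easily with apt.

It printed a lot of output; the following are the points it flagged as having room for improvement.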

-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    Control warning line(s) into /var/log/mysql/error.log file
    Control error line(s) into /var/log/mysql/error.log file
    MySQL was started within the last 24 hours - recommendations may be inaccurate
    Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
    Before changing innodb_log_file_size and/or innodb_log_files_in_group read this: https://bit.ly/2TcGgtU
Variables to adjust:
    innodb_log_file_size should be (=16M) if possible, so InnoDB total log files size equals to 25% of buffer pool size.
root@ubuntu2004:~#
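
The 16M figure in the last recommendation can be reproduced with simple arithmetic, assuming the MySQL 8.0 defaults of a 128 MiB buffer pool (innodb_buffer_pool_size) and two redo log files (innodb_log_files_in_group). The actual values in this environment were not recorded, so treat both numbers as assumptions; `recommended_log_file_size` is a name invented for this sketch.

```python
MIB = 1024 * 1024


def recommended_log_file_size(buffer_pool_bytes, files_in_group, ratio=0.25):
    """Per-file size so that all log files total `ratio` of the buffer pool."""
    total_log_bytes = int(buffer_pool_bytes * ratio)
    return total_log_bytes // files_in_group


# Assumed defaults: 128 MiB buffer pool, 2 log files in the group.
size = recommended_log_file_size(128 * MIB, 2)
print(size // MIB)  # 16
```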

First, the easy one: turning off name resolution. I did not expect much, and indeed nothing changed.

Starting the test @ Sat Oct 31 18:55:19 JST 2020 (1604138119110)
Waiting for possible shutdown message on port 4445
summary +     98 in    11s =    9.0/s Avg:   233 Min:   223 Max:   286 Err:     0 (0.00%) Active: 1 Started: 10 Finished: 9
summary +      2 in   0.5s =    4.4/s Avg:   224 Min:   222 Max:   226 Err:     0 (0.00%) Active: 0 Started: 10 Finished: 10
summary =    100 in  11.3s =    8.8/s Avg:   232 Min:   222 Max:   286 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 18:55:30 JST 2020 (1604138130464)
... end of run

Next, I set innodb_log_file_size to 16 MB. No change here either.

Starting the test @ Sat Oct 31 19:19:25 JST 2020 (1604139565449)
Waiting for possible shutdown message on port 4445
summary +     39 in     5s =    8.4/s Avg:   235 Min:   227 Max:   279 Err:     0 (0.00%) Active: 2 Started: 5 Finished: 3
summary +     61 in     7s =    9.1/s Avg:   234 Min:   227 Max:   243 Err:     0 (0.00%) Active: 0 Started: 10 Finished: 10
summary =    100 in  11.4s =    8.8/s Avg:   235 Min:   227 Max:   279 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 19:19:36 JST 2020 (1604139576840)

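ProxySQL

In the end I tried ProxySQL as well.

I followed the article below for the setup. Incidentally, it seems better to delete the configuration directly with rm; I had trouble because the keystone user was not recognized.
https://qiita.com/bringer1092/items/7f2729ac83df92541e29

After that, I changed the database port in keystone.conf to 6033 (going through ProxySQL) and measured again.
The results are below.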

Starting the test @ Sat Oct 31 20:33:14 JST 2020 (1604143994240)
Waiting for possible shutdown message on port 4445
summary =    100 in  11.4s =    8.8/s Avg:   235 Min:   224 Max:   286 Err:     0 (0.00%)
Tidying up ...    @ Sat Oct 31 20:33:25 JST 2020 (1604144005631)

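Since this simply adds one more hop, I did not expect it to get much faster, but it did not get slower either. For a realistic setup with a clustered MySQL backend, it might be a better choice than HAProxy?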
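
The port change mentioned above amounts to editing the database connection URL in keystone.conf. A small sketch of that rewrite follows; the URL is the placeholder form used in the install guide rather than the one from this environment, `with_port` is a hypothetical helper, and IPv6 hosts are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def with_port(connection_url, port):
    """Return the connection URL with its port set (or replaced) to `port`."""
    parts = urlsplit(connection_url)
    userinfo, _, hostport = parts.netloc.rpartition("@")
    host = hostport.split(":", 1)[0]
    netloc = f"{userinfo}@{host}:{port}" if userinfo else f"{host}:{port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Placeholder credentials and host, as in the install guide.
direct = "mysql+pymysql://keystone:KEYSTONE_DBPASS@controller/keystone"
print(with_port(direct, 6033))
```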

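Summary

Of the things that are easy to try, memcached had a visible effect (from about 263 ms to about 231 ms on average). memcached is probably already in place in most deployments, so tuning MySQL or memcached next seems the most likely to pay off. Depending on what is causing requests to be handled slowly, the following may also have a clear effect:

  • increasing the number of Keystone processes
  • increasing the connection pool size
  • placing a proxy server such as ProxySQL in front of the database

Speeding up a single request looks difficult without adding debug logging inside the program to find in finer detail where the time is spent.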
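
As a starting point for that kind of instrumentation, a timing context manager along the following lines could be wrapped around suspect sections. This is a generic sketch, not Keystone code: the `timed` helper, the logger name and the step label are all made up for illustration.

```python
import logging
import time
from contextlib import contextmanager

LOG = logging.getLogger("timing")


@contextmanager
def timed(label, results=None):
    """Log how long the wrapped block took; optionally record it in a dict."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        LOG.debug("%s took %.1f ms", label, elapsed * 1000)
        if results is not None:
            results[label] = elapsed


logging.basicConfig(level=logging.DEBUG)
timings = {}
with timed("example_step", timings):
    time.sleep(0.01)  # stand-in for the code being measured
```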