この記事では2025/03/01に発生したGitLabのバックグラウンドジョブ滞留問題の障害対応内容を記載しています。
事象
利用者への影響
- 以下のGitLab機能が動作しない
- Merge Request作成、マージ
- ProjectのFork
- Pipeline実行
原因
- Redisに2025-02-29という存在しない日時のデータが保存されており、参照しているSidekiqジョブで日付の処理エラーが発生している模様
対応
- ログの調査
- /var/log/gitlab/sidekiq/currentの内容を調査
- 以下のログが続けて出力されている。
- invalid dateが出力されている場合は本事象と判断して良い。
{"severity":"INFO","time":"2025-03-03T09:40:01.312Z","message":"Starting cluster with 1 processes"}
{"severity":"INFO","time":"2025-03-03T09:40:01.320Z","message":"Starting metrics server on port 8082"}
2025-03-03T09:40:17.307Z pid=343709 tid=80zt INFO: {:message=>"started", :memwd_handler_class=>"Gitlab::Memory::Watchdog::Handlers::SidekiqHandler", :memwd_sleep_time_s=>3, :pid=>343709, :worker_id=>"sidekiq_0", :memwd_rss_bytes=>430551040}
{"severity":"INFO","time":"2025-03-03T09:40:18.702Z","message":"Listening on queues [\"default,1\", \"mailers,1\"]"}
{"severity":"INFO","time":"2025-03-03T09:40:20.301Z","message":"GitLab reliable fetch activated!"}
{"severity":"INFO","time":"2025-03-03T09:40:20.301Z","message":"Sidekiq 7.1.6 connecting to Redis with options {:size=\u003e25, :pool_name=\u003e\"internal\", :path=\u003e\"/var/opt/gitlab/redis/redis.socket\", :custom=\u003e{:instrumentation_class=\u003e\"Queues\"}, :url=\u003enil}"}
invalid date
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:631:in `parse'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:631:in `rescue in parse_enqueue_time'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:628:in `parse_enqueue_time'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:290:in `initialize'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:228:in `new'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:228:in `block in all'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:226:in `collect'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:226:in `all'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:534:in `destroy_removed_jobs'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:177:in `load_from_hash!'
/opt/gitlab/embedded/service/gitlab-rails/config/initializers/sidekiq.rb:9:in `load_cron_jobs!'
/opt/gitlab/embedded/service/gitlab-rails/config/initializers/sidekiq.rb:120:in `block in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/vendor/gems/sidekiq-7.1.6/lib/sidekiq.rb:98:in `configure_server'
/opt/gitlab/embedded/service/gitlab-rails/config/initializers/sidekiq.rb:46:in `<top (required)>'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:667:in `load'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:667:in `block in load_config_initializer'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/activesupport-7.0.8.4/lib/active_support/notifications.rb:208:in `instrument'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:666:in `load_config_initializer'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:620:in `block (2 levels) in <class:Engine>'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:619:in `each'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/engine.rb:619:in `block in <class:Engine>'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:32:in `instance_exec'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:32:in `run'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:61:in `block in run_initializers'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:228:in `block in tsort_each'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:350:in `block (2 levels) in each_strongly_connected_component'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:422:in `block (2 levels) in each_strongly_connected_component_from'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:431:in `each_strongly_connected_component_from'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:421:in `block in each_strongly_connected_component_from'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:50:in `each'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:50:in `tsort_each_child'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:415:in `call'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:415:in `each_strongly_connected_component_from'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:349:in `block in each_strongly_connected_component'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:347:in `each'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:347:in `call'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:347:in `each_strongly_connected_component'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:226:in `tsort_each'
/opt/gitlab/embedded/lib/ruby/3.1.0/tsort.rb:205:in `tsort_each'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/initializable.rb:60:in `run_initializers'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/application.rb:372:in `initialize!'
/opt/gitlab/embedded/service/gitlab-rails/config/environment.rb:7:in `<top (required)>'
<internal:/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
<internal:/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
/opt/gitlab/embedded/service/gitlab-rails/vendor/gems/sidekiq-7.1.6/lib/sidekiq/cli.rb:303:in `boot_application'
/opt/gitlab/embedded/service/gitlab-rails/vendor/gems/sidekiq-7.1.6/lib/sidekiq/cli.rb:42:in `run'
/opt/gitlab/embedded/service/gitlab-rails/vendor/gems/sidekiq-7.1.6/bin/sidekiq:31:in `<top (required)>'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/bin/sidekiq:25:in `load'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/bin/sidekiq:25:in `<top (required)>'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli/exec.rb:58:in `load'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli/exec.rb:58:in `kernel_load'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli/exec.rb:23:in `run'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli.rb:455:in `exec'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/vendor/thor/lib/thor/command.rb:28:in `run'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/vendor/thor/lib/thor.rb:527:in `dispatch'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli.rb:35:in `dispatch'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/vendor/thor/lib/thor/base.rb:584:in `start'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/cli.rb:29:in `start'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/bundler-2.5.11/exe/bundle:28:in `block in <top (required)>'
/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/bundler/friendly_errors.rb:117:in `with_friendly_errors'
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/bundler-2.5.11/exe/bundle:20:in `<top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'
{"severity":"INFO","time":"2025-03-03T09:40:20.330Z","message":"stopped","memwd_reason":"background task stopped","memwd_handler_class":"Gitlab::Memory::Watchdog::Handlers::SidekiqHandler","memwd_sleep_time_s":3,"pid":343709,"worker_id":"sidekiq_0","memwd_rss_bytes":488927232,"retry":0}
{"severity":"INFO","time":"2025-03-03T09:40:21.323Z","message":"A worker terminated, shutting down the cluster"}
- rails consoleでSidekiq::Cron::Job.allを実行し、同じエラーが出ることを確認する。
[root@dcvmn01]:[/etc/gitlab]# gitlab-rails console
--------------------------------------------------------------------------------
Ruby: ruby 3.1.5p253 (2024-04-023 revision 1945f8dc0e) [x86_64-linux]
GitLab: 17.3.1-ee (df01858216e) EE
GitLab Shell: 14.38.0
PostgreSQL: 14.11
------------------------------------------------------------[ booted in 42.06s ]
WARNING: Active Record does not support composite primary key.
security_findings has composite primary key. Composite primary key is ignored.
Loading production environment (Rails 7.0.8.4)
irb(main):001:0> Sidekiq::Cron::Job.all
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:631:in `parse': invalid date (Date::Error)
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:631:in `rescue in parse_enqueue_time'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:628:in `parse_enqueue_time'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:290:in `initialize'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:228:in `new'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:228:in `block in all'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:226:in `collect'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/sidekiq-cron-1.12.0/lib/sidekiq/cron/job.rb:226:in `all'
from (irb):1:in `<main>'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/commands/console/console_command.rb:74:in `start'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/commands/console/console_command.rb:19:in `start'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/commands/console/console_command.rb:106:in `perform'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/thor-1.3.1/lib/thor/command.rb:28:in `run'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/thor-1.3.1/lib/thor/invocation.rb:127:in `invoke_command'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/thor-1.3.1/lib/thor.rb:527:in `dispatch'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/command/base.rb:87:in `perform'
from /opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/railties-7.0.8.4/lib/rails/command.rb:48:in `invoke'
... 4 levels...
- ソースコードの内容から参照しているRedisを特定しRedis Cliを使用してジョブ一覧を取得
[root@dcvmn01]:[/root]# /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket
redis /var/opt/gitlab/redis/redis.socket> SMEMBERS cron_jobs
1) "cron_job:ensure_merge_requests_prepared_worker"
省略
151) "cron_job:loose_foreign_keys_cleanup_worker"
- 上記の151個のジョブのlast_enqueue_timeを取得(以下cron_job:analytics_cycle_analytics_consistency_workerのlast_enqueue_timeが対象)
redis /var/opt/gitlab/redis/redis.socket>
[root@dcvmn01]:[/etc/gitlab]# sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket
redis /var/opt/gitlab/redis/redis.socket> HGET cron_job:analytics_cycle_analytics_consistency_worker last_enqueue_time
"2025-02-28 15:00:08 +0000"
redis /var/opt/gitlab/redis/redis.socket>
- ExcelでSMEMBERS cron_jobsの結果を元にジョブ名からHGETコマンドとHSETコマンドを生成する
- ↑の結果が2025-02-29になっているものを確認
- 確認した結果NGだったものに対して“2025-02-28 15:00:08 +0000"でフィールドを更新するためのHSETを実行
[root@dcvmn01]:[/etc/gitlab]# sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket
redis /var/opt/gitlab/redis/redis.socket> HSET cron_job:search_elastic_metrics_update_cron_worker last_enqueue_time "2025-02-28 15:00:08 +0000"
(integer) 0
redis /var/opt/gitlab/redis/redis.socket> HGET cron_job:search_elastic_metrics_update_cron_worker last_enqueue_time
"2025-02-28 15:00:08 +0000"
redis /var/opt/gitlab/redis/redis.socket> HSET cron_job:search_elastic_metrics_update_cron_worker last_enqueue_time "2025-02-28 15:00:08 +0000"
(integer) 0
redis /var/opt/gitlab/redis/redis.socket> HGET cron_job:stuck_ci_jobs_worker last_enqueue_time
"2025-02-29 15:00:08 +0000"
<以下省略>
- 稼働確認としてrails_consoleからSidekiq::Cron::Job.allを再実行しinvalid dateのエラーが表示されなければOK
irb(main):003:0> Sidekiq::Cron::Job.all
=>
[#<Sidekiq::Cron::Job:0x00007ff17601f440
@active_job=false,
@active_job_queue_name_delimiter="",
@active_job_queue_name_prefix="",
@args=[],
@cron="0 */4 * * *",
@date_as_argument=false,
@description="",
@fetch_missing_args=false,
@klass="BulkImports::StaleImportWorker",
@last_enqueue_time=2025-02-28 12:00:06 UTC,
@message="{\"retry\":false,\"queue\":\"default\",\"version\":0,\"store\":null,\"queue_namespace\":\"cronjob\",\"class\":\"BulkImports::StaleImportWorker\",\"args\":[]}",
@name="bulk_imports_stale_import_worker",
@queue="default",
@queue_name_with_prefix="default",
@source="dynamic",
@status="enabled",
@symbolize_args=false>,
<以下省略>
-
ジョブの再生成は定期的に実行されるため、しばらく待機してから前述の事象が解決していることを確認する。待機状態のジョブの件数が減り始めればOK。
-
2025/3/1の発生時はGitLab再起動も実施していたので上記で解決しない場合は再起動を実施
GitLab Supportへの問い合わせ
サポートに問い合わせた結果Sidekiqの障害に起因しているとのこと。
この障害が解消されない限り毎年同様の問題が発生すると思われる。
https://gitlab.com/gitlab-org/gitlab/-/issues/523081#top






