I got stuck trying to parallelize Optuna with SQLite

Posted at 2024-03-07

The problem

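I wanted to parallelize a machine learning hyperparameter search with Optuna by running several processes at the same time, but the following implementation raised an error when I ran it.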

main.py (excerpt)

study = optuna.create_study(
    study_name="my_study",
    storage=args.optuna_db,  # a string such as sqlite:///path/to/study.db
    load_if_exists=True,  # if a study with this name already exists in the storage above, load it and run in parallel
)

Run script

source venv/bin/activate;
# Run three processes at once to search
python main.py --optuna_db sqlite:///result/study.db &
python main.py --optuna_db sqlite:///result/study.db &
python main.py --optuna_db sqlite:///result/study.db &
wait

Error

Traceback (most recent call last):
  File "/path/to/venv/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1960, in _exec_single_context
    self.dialect.do_execute(
  File "/path/to/venv/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: table alembic_version already exists

I specified load_if_exists=True, so why...?
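The last line of the traceback is the message SQLite gives when CREATE TABLE is issued for a table that already exists. The sketch below reproduces only that message, in a single process, with the standard-library sqlite3 module. It is not Optuna's or Alembic's code, and the column definition is a placeholder chosen for illustration. It hints that more than one process tried to create the same table, but does not prove it.

```python
import sqlite3

# In-memory database, so no file is touched.
conn = sqlite3.connect(":memory:")

# Placeholder schema: only the table name is taken from the traceback.
create_sql = "CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)"

conn.execute(create_sql)  # first creation succeeds

try:
    conn.execute(create_sql)  # second creation of the same table fails
    message = None
except sqlite3.OperationalError as exc:
    message = str(exc)

print(message)
conn.close()
```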

Environment

optuna==3.5.0
SQLAlchemy==2.0.27

Solution

Inserting a sleep before launching the second and subsequent processes solved it.

Run script

source venv/bin/activate;
# Run three processes at once to search
python main.py --optuna_db sqlite:///result/study.db &
sleep 3;  # added
python main.py --optuna_db sqlite:///result/study.db &
python main.py --optuna_db sqlite:///result/study.db &
wait

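It seems that a migration runs when a new DB file is created. My guess is that while it is in progress, the DB file is not opened correctly even with load_if_exists=True, perhaps because it is locked.
So waiting a few seconds for the migration in the first process to finish before starting the second and subsequent processes is probably what resolved it.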
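A fixed sleep depends on guessing how long the initialization takes. One possible alternative is to retry the failing call after a short wait. The following is a minimal sketch of that idea, not something verified against this setup: `call_with_retry` is a hypothetical helper, not part of Optuna, and the commented usage assumes the failure reaches the caller as `sqlalchemy.exc.OperationalError`, which has not been confirmed for optuna==3.5.0.

```python
import time


def call_with_retry(func, exceptions, attempts=5, delay=1.0):
    """Call func(); if it raises one of `exceptions`, wait and try again.

    Hypothetical helper for illustration only.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)


# Possible use in main.py (assumption: the error surfaces as
# sqlalchemy.exc.OperationalError; untested):
#
# study = call_with_retry(
#     lambda: optuna.create_study(
#         study_name="my_study",
#         storage=args.optuna_db,
#         load_if_exists=True,
#     ),
#     exceptions=(sqlalchemy.exc.OperationalError,),
# )
```

With this approach the three processes could be started together, and whichever ones lose the race would try again once the first has finished setting up the DB file, provided the guess about the cause above is right.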
