How to Save Optuna Search Results in JSON Format

Posted at 2024-08-04

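Introduction

Optuna is a very useful library for making hyperparameter tuning of machine learning models more efficient. This article uses concrete examples to show how to save Optuna search results in JSON format.

Basic Usage of Optuna

First, a brief look at basic Optuna usage. The example here uses Optuna to solve a simple minimization problem.

Example: A Simple Minimization Problem

The following code uses Optuna to minimize the function $f(x) = (x - 2)^2$, whose minimum value is 0 at $x = 2$.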

import optuna

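# Define the objective function
def objective(trial):
    # suggest_float replaces suggest_uniform, which is deprecated since Optuna 3.0
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

# Create the study and run the optimization
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)

print(f'Best value: {study.best_value}')
print(f'Best params: {study.best_params}')

This code uses the suggest_float method to sample the parameter x from the search range [-10, 10]. Older code used suggest_uniform for this, which has been deprecated since Optuna 3.0. The study.optimize method then runs 100 trials to find the best parameter.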

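Saving in JSON Format

Next, here is how to save the search results in JSON format. The following code takes each trial from study.trials, collects its parameters into a list, and writes that list to a JSON file.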

import json

# Collect each trial's parameters as a JSON-serializable list
trials = study.trials
trials_json = [trial.params for trial in trials]

# Save the list as a JSON file
with open('optuna_trials.json', 'w') as f:
    json.dump(trials_json, f, indent=2)
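
The snippet above stores only the parameters, so the objective value of each trial is not written to the file. A minimal sketch of a fuller record follows, assuming each trial object exposes `number`, `value`, `params`, and `state` the way Optuna's `FrozenTrial` does. The helper `trial_to_record` and the stand-in trials built with `SimpleNamespace` are hypothetical and exist only so the example runs without a study; in practice you would pass `study.trials`.

```python
import json
from types import SimpleNamespace


def trial_to_record(trial):
    """Convert a trial-like object into a JSON-serializable dict (hypothetical helper)."""
    return {
        "number": trial.number,
        "value": trial.value,       # objective value; None if the trial did not complete
        "params": dict(trial.params),
        "state": trial.state.name,  # e.g. "COMPLETE", "PRUNED", "FAIL"
    }


# Stand-in trials for illustration only; with Optuna you would use study.trials.
fake_trials = [
    SimpleNamespace(number=0, value=9.0, params={"x": -1.0},
                    state=SimpleNamespace(name="COMPLETE")),
    SimpleNamespace(number=1, value=0.25, params={"x": 2.5},
                    state=SimpleNamespace(name="COMPLETE")),
]

records = [trial_to_record(t) for t in fake_trials]
serialized = json.dumps(records, indent=2)
print(serialized)
```

Keeping the value and state next to the parameters means the file can be sorted or filtered later without rerunning the objective function.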

Loading the Saved Results

The saved JSON file can be read back with the following code.

# Load the results from the JSON file
with open('optuna_trials.json', 'r') as f:
    loaded_trials = json.load(f)

print(loaded_trials)
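
Because the file written earlier holds only parameter dictionaries, ranking the loaded trials requires re-evaluating the objective. The sketch below does that for the toy function $f(x) = (x - 2)^2$. The sample data and the temporary file location are assumptions made so the example runs on its own.

```python
import json
import os
import tempfile


def objective_value(params):
    """Re-evaluate f(x) = (x - 2)^2 for one saved parameter dict."""
    return (params["x"] - 2) ** 2


# Sample data in the same shape as optuna_trials.json (a list of param dicts).
sample_trials = [{"x": -3.0}, {"x": 1.5}, {"x": 6.0}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "optuna_trials.json")
    with open(path, "w") as f:
        json.dump(sample_trials, f, indent=2)

    with open(path, "r") as f:
        loaded_trials = json.load(f)

# Pick the parameter set with the smallest objective value.
best = min(loaded_trials, key=objective_value)
print(best, objective_value(best))
```

For an expensive objective such as model training, re-evaluation is costly, so storing the objective value alongside the parameters at save time is usually the better choice.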

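(Bonus 1) Saving Only the Best Parameters in JSON Format

This section shows how to save only the best parameters from the search in JSON format. Doing so keeps the file small and makes the most important information quick to retrieve.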

# Get the best parameters
best_params = study.best_params
print(f'Best params: {best_params}')

# Save the best parameters as a JSON file
with open('best_params.json', 'w') as f:
    json.dump(best_params, f, indent=2)

# Load the best parameters from the JSON file
with open('best_params.json', 'r') as f:
    loaded_best_params = json.load(f)

print(f'Loaded best params: {loaded_best_params}')
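
The best parameters alone do not tell you how good the result was. One possible extension, sketched below, is to store the best value and the trial number in the same file. It assumes the study object exposes `best_trial`, `best_value`, and `best_params` as an Optuna `Study` does. The helper `best_summary` and the stand-in study are hypothetical, and the numbers are made up.

```python
import json
from types import SimpleNamespace


def best_summary(study):
    """Collect the best trial's number, value, and params (hypothetical helper)."""
    return {
        "best_trial_number": study.best_trial.number,
        "best_value": study.best_value,
        "best_params": dict(study.best_params),
    }


# Stand-in for a finished study, for illustration only.
fake_study = SimpleNamespace(
    best_trial=SimpleNamespace(number=42),
    best_value=0.0004,
    best_params={"x": 2.02},
)

summary = best_summary(fake_study)
summary_text = json.dumps(summary, indent=2)
print(summary_text)
```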

(Bonus 2) Hyperparameter Tuning for a Random Forest

As a concrete example, this section tunes the hyperparameters of a random forest and saves the results in JSON format.

import json

import optuna
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Load the dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Define the objective function
def objective(trial):
    rf_n_estimators = trial.suggest_int('rf_n_estimators', 10, 200)
    rf_max_depth = trial.suggest_int('rf_max_depth', 2, 32)
    classifier_obj = RandomForestClassifier(n_estimators=rf_n_estimators, max_depth=rf_max_depth)

    # Mean accuracy over 3-fold cross-validation
    score = cross_val_score(classifier_obj, X, y, n_jobs=-1, cv=3)
    accuracy = score.mean()
    return accuracy

# Create the study and run the optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

# Save the parameters of every trial
trials = study.trials
trials_json = [trial.params for trial in trials]

with open('rf_optuna_trials.json', 'w') as f:
    json.dump(trials_json, f, indent=2)

# Save the best parameters
best_params = study.best_params
with open('best_params.json', 'w') as f:
    json.dump(best_params, f, indent=2)

This code optimizes the random forest hyperparameters n_estimators and max_depth. Saving the optimization results in JSON format makes them easy to reuse later.
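
Note that the saved keys are the names passed to `suggest_int` (`rf_n_estimators`, `rf_max_depth`), not the keyword arguments of `RandomForestClassifier`. One possible way to reuse the file is to strip the `rf_` prefix after loading, as sketched below. The helper `strip_prefix` and the sample values are hypothetical.

```python
import json


def strip_prefix(params, prefix="rf_"):
    """Return a copy of params with the prefix removed from matching keys (hypothetical helper)."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in params.items()
    }


# Sample content in the shape of best_params.json; the numbers are made up.
saved_text = json.dumps({"rf_n_estimators": 120, "rf_max_depth": 8})

loaded_best_params = json.loads(saved_text)
rf_kwargs = strip_prefix(loaded_best_params)
print(rf_kwargs)

# With scikit-learn installed, the model could then be rebuilt as:
# classifier = RandomForestClassifier(**rf_kwargs)
```

An alternative is to name the parameters `n_estimators` and `max_depth` in the objective function from the start, so the loaded dictionary can be unpacked directly.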

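Summary

This article used concrete examples to show how to save Optuna search results in JSON format, and how to save only the best parameters in JSON format. Saving in JSON format makes results easier to share and reuse. If you use Optuna, give this approach a try.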

Reference Links

Optuna official documentation
https://optuna.readthedocs.io/en/stable/
Notes on how to use Optuna
https://qiita.com/mikka/items/5d81410b30cfe12c1c34
Introduction to Optuna
https://qiita.com/studio_haneya/items/2dc3ba9d7cafa36ddffa
How to always keep the best model saved during an Optuna parameter search
https://qiita.com/hasesho/items/c3870df397e777585387
