An Overview of Optuna and Hyperparameter Optimization in PyTorch

Posted at 2025-01-29

Optuna Basics

Overview of Optuna

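Optuna is a hyperparameter optimization library. It searches with sampling algorithms such as TPE (the default for single-objective studies) and NSGA-II (used for multi-objective studies).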

Installation and Running Basic Code

Optuna is available from PyPI and can be installed by running the following.

$ pip install optuna

For example, running the following minimizes the function $y=(x-2)^{2}$, whose minimum is at $x=2$.

import optuna

def objective(trial):
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective, n_trials=100)

print(study.best_params)
print(study.best_params['x'])

・Output

Trial 0 finished with value: 21.326323753205866 and parameters: {'x': -2.6180432818679673}. Best is trial 0 with value: 21.326323753205866.
Trial 1 finished with value: 2.6087324813104438 and parameters: {'x': 0.3848428926849117}. Best is trial 1 with value: 2.6087324813104438.
Trial 2 finished with value: 24.326855996945703 and parameters: {'x': 6.9322262718721355}. Best is trial 1 with value: 2.6087324813104438.
...
Trial 9 finished with value: 112.09664765340895 and parameters: {'x': -8.58757043204006}. Best is trial 1 with value: 2.6087324813104438.
Trial 10 finished with value: 0.5343605234582444 and parameters: {'x': 2.730999674047974}. Best is trial 10 with value: 0.5343605234582444.
...
Trial 19 finished with value: 0.07875673527265561 and parameters: {'x': 2.2806363042670275}. Best is trial 19 with value: 0.07875673527265561.
Trial 20 finished with value: 39.84726213307998 and parameters: {'x': -4.312468782741027}. Best is trial 19 with value: 0.07875673527265561.
Trial 21 finished with value: 0.06736154429417497 and parameters: {'x': 1.7404589737745206}. Best is trial 21 with value: 0.06736154429417497.
...
Trial 27 finished with value: 0.03468694752118066 and parameters: {'x': 1.813755677882034}. Best is trial 27 with value: 0.03468694752118066.
...
Trial 32 finished with value: 0.020789077305940862 and parameters: {'x': 2.1441841784175395}. Best is trial 32 with value: 0.020789077305940862.
...
Trial 38 finished with value: 0.013396194111183715 and parameters: {'x': 2.115741928924585}. Best is trial 38 with value: 0.013396194111183715.
...
Trial 41 finished with value: 0.004804307453818769 and parameters: {'x': 1.930686888297965}. Best is trial 41 with value: 0.004804307453818769.
...
Trial 51 finished with value: 5.0554755004907705e-05 and parameters: {'x': 1.9928898132932455}. Best is trial 51 with value: 5.0554755004907705e-05.
...
Trial 99 finished with value: 55.818369789377705 and parameters: {'x': 9.471169238437696}. Best is trial 51 with value: 5.0554755004907705e-05.

{'x': 1.9928898132932455}
1.9928898132932455

To retrieve the optimized result, use study.best_params, which returns the best parameters as a dict.

Defining the Search Space

The search space in Optuna is defined with the following methods.

・optuna.trial.Trial.suggest_categorical(): for categorical parameters
・optuna.trial.Trial.suggest_int(): for int parameters
・optuna.trial.Trial.suggest_float(): for float parameters

Each of these is called inside the objective function as follows.

import optuna

def objective(trial):
    # Categorical parameter
    optimizer = trial.suggest_categorical("optimizer", ["MomentumSGD", "Adam"])

    # Integer parameter
    num_layers = trial.suggest_int("num_layers", 1, 3)

    # Integer parameter (log)
    num_channels = trial.suggest_int("num_channels", 32, 512, log=True)

    # Integer parameter (discretized)
    num_units = trial.suggest_int("num_units", 10, 100, step=5)

    # Floating point parameter
    dropout_rate = trial.suggest_float("dropout_rate", 0.0, 1.0)

    # Floating point parameter (log)
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)

    # Floating point parameter (discretized)
    drop_path_rate = trial.suggest_float("drop_path_rate", 0.0, 1.0, step=0.1)
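
To make the log=True and step options more concrete, here is a small standard-library sketch of the two sampling ideas. It only illustrates the concept with plain random sampling and is not how Optuna's samplers are implemented; the helper names sample_log_uniform and sample_stepped are hypothetical.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a float whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_stepped(rng, low, high, step):
    """Draw from the grid low, low + step, ... that does not exceed high."""
    n_steps = math.floor((high - low) / step + 1e-9)
    return low + step * rng.randint(0, n_steps)


rng = random.Random(0)
learning_rates = [sample_log_uniform(rng, 1e-5, 1e-2) for _ in range(3000)]
num_units = [sample_stepped(rng, 10, 100, 5) for _ in range(3000)]

# With log scaling, each of the three decades between 1e-5 and 1e-2
# receives roughly one third of the samples.
below_1e4 = sum(lr < 1e-4 for lr in learning_rates) / len(learning_rates)
print(f"share of learning rates below 1e-4: {below_1e4:.2f}")
print(f"smallest num_units values seen: {sorted(set(num_units))[:4]}")
```

Sampling on a log scale is the usual choice for values such as learning rates, where the order of magnitude matters more than the absolute difference.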

Hyperparameter Optimization for Deep Learning with Optuna

Below, we walk through pytorch_simple.py, an example of hyperparameter optimization for deep learning (PyTorch) with Optuna.

Running the Program

$ python pytorch_simple.py

Running the command above produces output like the following.

Study statistics: 
  Number of finished trials:  100
  Number of pruned trials:  62
  Number of complete trials:  38
Best trial:
  Value:  0.84921875
  Params: 
    n_layers: 1
    n_units_l0: 117
    dropout_l0: 0.26561837045237663
    optimizer: Adam
    lr: 0.0055972211613692325

The Overall Structure of the Implementation

pytorch_simple.py

if __name__ == "__main__":
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=100, timeout=600)

    pruned_trials = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
    complete_trials = study.get_trials(deepcopy=False, states=[TrialState.COMPLETE])

    print("Study statistics: ")
    print("  Number of finished trials: ", len(study.trials))
    print("  Number of pruned trials: ", len(pruned_trials))
    print("  Number of complete trials: ", len(complete_trials))

    print("Best trial:")
    trial = study.best_trial

    print("  Value: ", trial.value)

    print("  Params: ")
    for key, value in trial.params.items():
        print("    {}: {}".format(key, value))

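Running pytorch_simple.py executes the code above. Start with optuna.create_study on line 2 and study.optimize on line 3: direction="maximize" makes the study maximize the value returned by the objective, and the optimization stops after 100 trials or 600 seconds, whichever comes first. From line 8 onward, the script prints a summary of the optimization results.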

The objective Function

In pytorch_simple.py, the objective function is defined as follows.

pytorch_simple.py

def objective(trial):
    # Generate the model.
    model = define_model(trial).to(DEVICE)

    # Generate the optimizers.
    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"])
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    optimizer = getattr(optim, optimizer_name)(model.parameters(), lr=lr)

    # Get the FashionMNIST dataset.
    train_loader, valid_loader = get_mnist()

    # Training of the model.
    for epoch in range(EPOCHS):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            # Limiting training data for faster epochs.
            if batch_idx * BATCHSIZE >= N_TRAIN_EXAMPLES:
                break

            data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)

            optimizer.zero_grad()
            output = model(data)
            loss = F.nll_loss(output, target)
            loss.backward()
            optimizer.step()

        # Validation of the model.
        model.eval()
        correct = 0
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(valid_loader):
                # Limiting validation data.
                if batch_idx * BATCHSIZE >= N_VALID_EXAMPLES:
                    break
                data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)
                output = model(data)
                # Get the index of the max log-probability.
                pred = output.argmax(dim=1, keepdim=True)
                correct += pred.eq(target.view_as(pred)).sum().item()

        accuracy = correct / min(len(valid_loader.dataset), N_VALID_EXAMPLES)

        trial.report(accuracy, epoch)

        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return accuracy

When reading the code above, first note that hyperparameters are handled by trial.suggest_categorical on line 6 and trial.suggest_float on line 7. Because the function ends with return accuracy, the objective value is the validation accuracy.
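
The listing also calls trial.report and trial.should_prune after every epoch. When no pruner is passed to optuna.create_study, Optuna uses MedianPruner. The standard-library sketch below illustrates the idea behind a median rule in a deliberately simplified form: it ignores startup trials and warmup steps and compares only the current value, so it is not Optuna's actual implementation. The function name should_prune_median and the accuracy figures are hypothetical.

```python
from statistics import median


def should_prune_median(history, current_value, step, maximize=True):
    """Simplified median rule.

    history: list of per-trial lists of intermediate values (earlier trials).
    Returns True when current_value at `step` is worse than the median of
    the values earlier trials reported at that same step.
    """
    peers = [values[step] for values in history if len(values) > step]
    if not peers:
        return False  # nothing to compare against yet
    if maximize:
        return current_value < median(peers)
    return current_value > median(peers)


# Hypothetical validation accuracies per epoch from three earlier trials.
history = [
    [0.60, 0.70, 0.78],
    [0.55, 0.66, 0.74],
    [0.40, 0.45],        # this trial stopped after two epochs
]

print(should_prune_median(history, 0.50, step=0))  # median 0.55 -> True
print(should_prune_median(history, 0.72, step=1))  # median 0.66 -> False
```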

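Besides optimizer and lr, the output shown in the previous section included the hyperparameters n_layers, n_units_l0, and dropout_l0. These are defined in the define_model function below, which is called on line 3 of the objective function above.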

pytorch_simple.py

def define_model(trial):
    # We optimize the number of layers, hidden units and dropout ratio in each layer.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    layers = []

    in_features = 28 * 28
    for i in range(n_layers):
        out_features = trial.suggest_int("n_units_l{}".format(i), 4, 128)
        layers.append(nn.Linear(in_features, out_features))
        layers.append(nn.ReLU())
        p = trial.suggest_float("dropout_l{}".format(i), 0.2, 0.5)
        layers.append(nn.Dropout(p))

        in_features = out_features
    layers.append(nn.Linear(in_features, CLASSES))
    layers.append(nn.LogSoftmax(dim=1))

    return nn.Sequential(*layers)
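
Because the parameter names n_units_l{i} and dropout_l{i} are generated inside the loop, the set of hyperparameters in a trial depends on the sampled n_layers. As a rough check, the sketch below rebuilds the shapes of the nn.Linear layers from a params dict without importing PyTorch. The helper layer_shapes is hypothetical, and classes=10 is an assumption, since the value of CLASSES is not shown in the excerpt.

```python
def layer_shapes(params, in_features=28 * 28, classes=10):
    """Return (in_features, out_features) for each nn.Linear in the model."""
    shapes = []
    for i in range(params["n_layers"]):
        out_features = params["n_units_l{}".format(i)]
        shapes.append((in_features, out_features))
        in_features = out_features
    shapes.append((in_features, classes))  # final classification layer
    return shapes


# Best parameters copied from the output shown earlier.
best_params = {
    "n_layers": 1,
    "n_units_l0": 117,
    "dropout_l0": 0.26561837045237663,
    "optimizer": "Adam",
    "lr": 0.0055972211613692325,
}
print(layer_shapes(best_params))  # [(784, 117), (117, 10)]
```

With n_layers set to 1, only the l0 parameters exist, which matches the best trial in the output shown earlier.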
