

Using Bayesian Optimization for Hyperparameter Tuning of GTM (Generative Topographic Mapping)

Posted at 2018-11-05

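The GTM (Generative Topographic Mapping) code published by Prof. Kaneko comes with an example that tunes the hyperparameters by grid search. I wanted to tune them with Bayesian optimization instead, so I wrote some test code. It assumes that GTM has already been installed from here.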

The first part is the same as the original.

import matplotlib.figure as figure
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris

from gtm import gtm
from k3nerror import k3nerror

# load an iris dataset
iris = load_iris()
input_dataset = iris.data
color = iris.target

# autoscaling
input_dataset = (input_dataset - input_dataset.mean(axis=0)) / input_dataset.std(axis=0, ddof=1)
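As a quick check of what the autoscaling step does, the sketch below standardizes a small made-up table column by column using only the standard library. The toy values and the helper name `autoscale` are assumptions for illustration and are not part of the original code; `statistics.stdev` is the sample standard deviation, which corresponds to `ddof=1` above.

```python
import statistics

def autoscale(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # sample std (ddof=1)
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# made-up data: 4 samples, 2 variables
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 60.0]]
scaled = autoscale(toy)

# each column should now have mean ~0 and sample std ~1
for col in zip(*scaled):
    print(round(statistics.mean(col), 6), round(statistics.stdev(col), 6))
```

After this step every variable contributes on the same scale, which matters because GTM and the k3n-error both rely on distances between samples.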

From here on, I rewrote the code for Bayesian optimization.

For the Bayesian optimization part, I referred to this article.

import GPy, GPyOpt

k_in_k3nerror = 10

# search ranges of the variables
bounds = [
    {'name': 'shape_of_map_grid', 'type': 'discrete', 'domain': np.arange(30, 31, dtype=int)},
    {'name': 'shape_of_rbf_centers_grid', 'type': 'discrete', 'domain': np.arange(2, 22, 2, dtype=int)},
    {'name': 'variance_of_rbfs_grid', 'type': 'discrete', 'domain': np.arange(-5, 4, 2, dtype=float)},  
    {'name': 'lambda_in_em_algorithm_grid', 'type': 'discrete', 'domain': np.arange(-4, 0, dtype=float)},  
    {'name': 'number_of_iterations', 'type': 'discrete', 'domain': np.arange(300, 301, dtype=float)},
]
def gtmf(x):
    # x arrives as a 2-D array; read the candidate values from its first row
    shape_of_map_grid = int(x[0, 0])
    shape_of_rbf_centers_grid = int(x[0, 1])
    # the next two values are exponents of 2 (see the gtm() call below)
    variance_of_rbfs_grid = int(x[0, 2])
    lambda_in_em_algorithm_grid = int(x[0, 3])
    number_of_iterations = int(x[0, 4])
    display_flag = 0
    model = gtm([shape_of_map_grid, shape_of_map_grid],
                [shape_of_rbf_centers_grid, shape_of_rbf_centers_grid],
                2 ** variance_of_rbfs_grid, 2 ** lambda_in_em_algorithm_grid,
                number_of_iterations, display_flag)
    model.fit(input_dataset)
    if model.success_flag:
        # calculate responsibilities
        responsibilities = model.responsibility(input_dataset)
        # calculate the mean of responsibilities
        means = responsibilities.dot(model.map_grids)
        # calculate k3n-error (smaller is better)
        k3nerror_of_gtm = k3nerror(input_dataset, means, k_in_k3nerror)
    else:
        # fitting failed: return a huge error so this point is never chosen
        k3nerror_of_gtm = 10 ** 100
    return k3nerror_of_gtm
myBopt = GPyOpt.methods.BayesianOptimization(f=gtmf, domain=bounds)
myBopt.run_optimization(max_iter=300)
print(myBopt.x_opt)
# the objective (k3n-error) is minimized, so fx_opt is printed without a sign flip
print(myBopt.fx_opt)

[ 30.   2.   3.  -4. 300.]
0.6821886073347678

# convert the best point back to GTM hyperparameters
shape_of_map = [int(myBopt.x_opt[0]), int(myBopt.x_opt[0])]
shape_of_rbf_centers = [int(myBopt.x_opt[1]), int(myBopt.x_opt[1])]
variance_of_rbfs = 2 ** myBopt.x_opt[2]
lambda_in_em_algorithm = 2 ** myBopt.x_opt[3]
number_of_iterations = int(myBopt.x_opt[4])
display_flag = 0
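
To make the encoding explicit: the optimizer searches over exponents, and the actual RBF variance and regularization values are powers of two. The sketch below decodes the best point printed above. `decode` is a hypothetical helper written for illustration; it is not part of GTM or GPyOpt.

```python
def decode(x_opt):
    """Map one search-space point to GTM hyperparameters (illustrative helper)."""
    map_size, rbf_size, log2_variance, log2_lambda, n_iter = x_opt
    return {
        "shape_of_map": [int(map_size), int(map_size)],
        "shape_of_rbf_centers": [int(rbf_size), int(rbf_size)],
        "variance_of_rbfs": 2.0 ** log2_variance,
        "lambda_in_em_algorithm": 2.0 ** log2_lambda,
        "number_of_iterations": int(n_iter),
    }

# the best point reported by the optimizer in this article
params = decode([30.0, 2.0, 3.0, -4.0, 300.0])
print(params["variance_of_rbfs"])        # 8.0
print(params["lambda_in_em_algorithm"])  # 0.0625
```

Searching over exponents rather than raw values keeps the candidates evenly spaced on a log scale, which is the same spacing the grid search candidates use.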

After this, the code is the same as the original again.

# construct GTM model
model = gtm(shape_of_map, shape_of_rbf_centers, variance_of_rbfs, lambda_in_em_algorithm, number_of_iterations,
            display_flag)
model.fit(input_dataset)
# calculate responsibilities
responsibilities = model.responsibility(input_dataset)
# plot the mean of responsibilities
means = responsibilities.dot(model.map_grids)
plt.figure(figsize=figure.figaspect(1))
plt.scatter(means[:, 0], means[:, 1], c=color)
plt.ylim(-1.1, 1.1)
plt.xlim(-1.1, 1.1)
plt.xlabel("z1 (mean)")
plt.ylabel("z2 (mean)")
plt.grid()
plt.show()

[Figure: scatter plot of the mean of responsibilities, z1 (mean) vs. z2 (mean), colored by iris class]

print("Optimized hyperparameters")
print("Optimal map size: {0}, {1}".format(shape_of_map[0], shape_of_map[1]))
print("Optimal shape of RBF centers: {0}, {1}".format(shape_of_rbf_centers[0], shape_of_rbf_centers[1]))
print("Optimal variance of RBFs: {0}".format(variance_of_rbfs))
print("Optimal lambda in EM algorithm: {0}".format(lambda_in_em_algorithm))

Optimized hyperparameters
Optimal map size: 30, 30
Optimal shape of RBF centers: 2, 2
Optimal variance of RBFs: 8.0
Optimal lambda in EM algorithm: 0.0625

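Optimization history

The optimization history of the objective function can be checked as follows. I set max_iter=300, but it looks like the run actually stopped after about 31 evaluations.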

plt.plot(myBopt.Y)
plt.ylim([0, 2])
plt.show()

[Figure: optimization history of the objective function (myBopt.Y)]
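
Because the raw history contains every evaluated point, a best-so-far curve is often easier to read. A minimal sketch, using made-up objective values in place of `myBopt.Y`:

```python
from itertools import accumulate

# made-up k3n-error values standing in for the evaluation history
history = [1.40, 0.95, 1.10, 0.72, 0.80, 0.68, 0.68]

# running minimum: the best value found up to each evaluation
best_so_far = list(accumulate(history, min))
print(best_so_far)  # [1.4, 0.95, 0.95, 0.72, 0.72, 0.68, 0.68]
```

With a real run, the flattened history (for example `myBopt.Y.ravel()`, assuming `Y` holds one objective value per row) could presumably be passed in the same way and plotted with `plt.plot`.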

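Results

  • The results were almost the same as those of the original (grid search version).
  • Computation took about 1 min 50 s with Bayesian optimization, compared with about 9 min 30 s for the original.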
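
For a rough sense of scale, the sketch below counts the candidate combinations in the search space defined in `bounds` and converts the two timings into a speed-up factor. It assumes that an exhaustive grid search would evaluate every combination of those same candidates, which is an assumption about the original script rather than something measured here.

```python
from math import prod

# candidate values, mirroring the np.arange calls in `bounds`
candidates = {
    "shape_of_map_grid": range(30, 31),
    "shape_of_rbf_centers_grid": range(2, 22, 2),
    "variance_of_rbfs_grid": range(-5, 4, 2),
    "lambda_in_em_algorithm_grid": range(-4, 0),
    "number_of_iterations": range(300, 301),
}

# number of points an exhaustive grid search would have to evaluate
n_grid_points = prod(len(values) for values in candidates.values())
print(n_grid_points)  # 200

grid_search_seconds = 9 * 60 + 30  # about 9 min 30 s
bayes_opt_seconds = 1 * 60 + 50    # about 1 min 50 s
speedup = grid_search_seconds / bayes_opt_seconds
print(round(speedup, 1))  # 5.2
```

A speed-up of roughly 5x is plausible if the optimizer stopped after a few dozen evaluations instead of visiting all 200 grid points, although part of each run is also spent fitting the surrogate model.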