
Gaussian Processes (Part 1)

The references are listed here1 (there may be mistakes, so corrections are welcome).

1. What is a Gaussian process?

"Let $F$ be a set of random variables. $F$ is called a Gaussian process if, for every natural number $N$, the $N$ random variables $\{f(x_1),\dots,f(x_N)\}$ chosen from $F$ follow a (joint) Gaussian distribution."

Concretely, taking the mean function values $\{m(x_1),\dots,m(x_N)\}$ as the mean vector and the kernel matrix $K$ of a kernel function $k(x,x')$ as the covariance matrix, we can write $f\sim {\rm GP}(m,k)$, meaning $(f(x_1),\dots,f(x_N))\sim\mathcal{N}(m(\mathbf{x}),K)$.
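Spelled out, this is just the definition above with the kernel matrix made explicit, where $K_{ij}$ denotes the $(i,j)$ entry of $K$:

$(f(x_1),\dots,f(x_N))^{\top}\sim\mathcal{N}\bigl((m(x_1),\dots,m(x_N))^{\top},\,K\bigr),\qquad K_{ij}=k(x_i,x_j).$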

1.1. Advantages of Gaussian processes

Once a covariance matrix (kernel function) has been prepared, no explicit parametric model of the function is required.
For this reason GPs are described as nonparametric Bayesian models, and they can express uncertainty.

1.2. On accuracy

To assess how well a kernel function fits, one can simply pick the kernel with the larger log marginal likelihood (or ELBO). This value can be computed directly from the definition of the multivariate Gaussian distribution, and libraries such as GPyTorch make it easy to obtain.
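For reference, under a zero mean function and Gaussian observation noise $\sigma_n^2$ (this is the standard textbook form, not written out in the original article), the log marginal likelihood of the training targets $\mathbf{y}$ at inputs $X$ is

$\log p(\mathbf{y}\mid X)= -\frac{1}{2}\mathbf{y}^{\top}\bigl(K+\sigma_n^2 I\bigr)^{-1}\mathbf{y} -\frac{1}{2}\log\bigl|K+\sigma_n^2 I\bigr| -\frac{N}{2}\log 2\pi,$

which is exactly the log density of an $N$-dimensional Gaussian evaluated at $\mathbf{y}$.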

2. Experiments

We consider a regression problem.

The Gaussian process considered here uses a constant mean, and for the kernel function we use $k(x,y)=\sigma^2\exp\left(-\frac{(x-y)^2}{2l^2}\right)$ (an RBF kernel with outputscale $\sigma^2$ and lengthscale $l$). The hyperparameters $\sigma,l$ are therefore estimated, starting from prior distributions, by gradient descent, variational inference, or similar; the objective is designed to maximize the log marginal likelihood (or, for the variational model, the ELBO, which is a lower bound on it).
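As a quick sanity check of this kernel (a minimal sketch of my own, not part of the original article), we can compare a hand-computed kernel matrix against the one produced by GPyTorch's ScaleKernel/RBFKernel, whose outputscale and lengthscale play the roles of $\sigma^2$ and $l$:

import torch
import gpytorch

# Sketch: compare k(x, y) = sigma^2 * exp(-(x - y)^2 / (2 l^2)) computed by hand
# with GPyTorch's ScaleKernel(RBFKernel()).
x = torch.linspace(0, 1, 5).reshape(5, 1)   # 5 one-dimensional inputs

kernel = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
sigma2 = kernel.outputscale                 # sigma^2 (outputscale)
l = kernel.base_kernel.lengthscale          # l (lengthscale)

K_gpytorch = kernel(x, x).evaluate()        # kernel (Gram) matrix evaluated by GPyTorch
K_manual = sigma2 * torch.exp(-(x - x.T) ** 2 / (2 * l ** 2))

print(torch.allclose(K_gpytorch, K_manual, atol=1e-5))  # expected: True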

The code below is based on the following references (accessed 2021/08/11):

- GPyTorchでガウス過程を見てみよう (Hatena Blog, by s0sem0y)
- cornellius-gp/gpytorch (official GitHub)
- GPyTorch's documentation
- GPyTorch Regression Tutorial
- Stochastic Variational GP Regression

Environment setup (Google Colaboratory)
!pip install gpytorch

import gpytorch
import torch
import numpy as np
import matplotlib.pyplot as plt
# The following are used by the variational (approximate) GP model in section 2.2
import tqdm.notebook
from torch.utils.data import TensorDataset, DataLoader
from gpytorch.models import ApproximateGP
from gpytorch.variational import CholeskyVariationalDistribution
from gpytorch.variational import VariationalStrategy

2.1.1 Building the model (exact GP inference)

# We will use the simplest form of GP model, exact inference
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, 
                 lengthscale_prior=None, outputscale_prior=None):
        super(ExactGPModel, self).__init__(train_x,
                                      train_y,
                                      likelihood,
                                      )
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(lengthscale_prior=lengthscale_prior),
            outputscale_prior=outputscale_prior
            )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

2.1.2 Creating a trainer class

num_epoch=2000
class Trainer(object):

    def __init__(self, gpr, likelihood, optimizer, mll):
        self.gpr = gpr
        self.likelihood = likelihood
        self.optimizer = optimizer
        self.mll = mll

    def update_hyperparameter(self, epochs):
        # Find optimal model hyperparameters
        self.gpr.train()
        self.likelihood.train()
        for epoch in range(epochs):
            # Zero gradients from previous iteration
            self.optimizer.zero_grad()
            # Output from model
            output = self.gpr(self.gpr.train_inputs[0])

            # Calc loss and backprop gradients
            loss = - self.mll(output, self.gpr.train_targets)
            loss.backward()
            self.optimizer.step()

            if epoch % (epochs//10) == 0:
              print('Iter %d/%d - Loss: %.3f   lengthscale: %.3f   noise: %.3f' % (
                          epoch + 1, epochs, loss.item(),
                          self.gpr.covar_module.base_kernel.lengthscale.item(),
                          self.gpr.likelihood.noise.item()
                      ))

2.1.3. Example: exact GP model

Set the initial state.

# Fix the seed for reproducibility of the model initialization and training data
torch.manual_seed(0)
# Training data is 10 points in [0,1] inclusive regularly spaced
## GP inputs are assumed to be multi-dimensional, so the shape is (num_data, dim)
train_inputs = torch.linspace(0, 1, 10).reshape(10, 1)

# True function is sin(2*pi*x) with Gaussian noise
## Outputs, on the other hand, are assumed one-dimensional, so use the shape (num_data)
train_targets = torch.sin(2*np.pi*train_inputs).reshape(10) + 0.3*torch.randn(10)

# initialize likelihood and model
likelihood = gpytorch.likelihoods.GaussianLikelihood()
l_prior = gpytorch.priors.NormalPrior(loc=torch.tensor(1.), 
                                      scale=torch.tensor(10.))
s_prior = gpytorch.priors.NormalPrior(loc=torch.tensor(1.), 
                                      scale=torch.tensor(10.))
gpr = ExactGPModel(train_inputs, train_targets, likelihood, 
              lengthscale_prior=l_prior, outputscale_prior=s_prior)

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, gpr)

# Use the RMSprop optimizer
optimizer = torch.optim.RMSprop(params=gpr.parameters(),
                                lr=1e-2)

Training

trainer = Trainer(gpr, likelihood, optimizer, mll)
trainer.update_hyperparameter(num_epoch)
Iter 1/2000 - Loss: 1.089   lengthscale: 0.644   noise: 0.644
Iter 201/2000 - Loss: 0.706   lengthscale: 0.227   noise: 0.110
Iter 401/2000 - Loss: 0.706   lengthscale: 0.227   noise: 0.107
Iter 601/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 801/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 1001/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 1201/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 1401/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 1601/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106
Iter 1801/2000 - Loss: 0.706   lengthscale: 0.228   noise: 0.106

Display the results

test_inputs = torch.linspace(-0.2, 1.2, 100)

gpr.eval()
likelihood.eval()
with torch.no_grad():
    predicts = likelihood(gpr(test_inputs))
    predicts_mean = predicts.mean
    predicts_std = predicts.stddev

plt.plot(test_inputs.numpy(), predicts_mean.numpy())
plt.fill_between(test_inputs.numpy(), 
                 predicts_mean.numpy() - 0.9*predicts_std.numpy(), 
                 predicts_mean.numpy() + 0.9*predicts_std.numpy(), alpha=0.4)
plt.plot(train_inputs.numpy(), train_targets.numpy(), "ro")

likelihood = gpytorch.likelihoods.GaussianLikelihood()
s=torch.tensor(0.)
for i in likelihood(gpr(train_inputs)):
  s+=i
print(s) # MultivariateNormal(loc: -0.8138006329536438)

(Figure: en.png — predictive mean with uncertainty band for the exact GP, plotted with the training points.)

Looks good (if you make the prior extreme, you get an extreme result).
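As a hypothetical illustration of that remark (the names and values below are mine, not from the original article), one could re-fit the same ExactGPModel with a very concentrated lengthscale prior and observe how strongly it pulls the fit:

# Hypothetical example: an intentionally extreme (very tight, very short) lengthscale prior
tight_l_prior = gpytorch.priors.NormalPrior(loc=torch.tensor(0.01),
                                            scale=torch.tensor(0.001))
likelihood_tight = gpytorch.likelihoods.GaussianLikelihood()
gpr_tight = ExactGPModel(train_inputs, train_targets, likelihood_tight,
                         lengthscale_prior=tight_l_prior,
                         outputscale_prior=s_prior)
mll_tight = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood_tight, gpr_tight)
optimizer_tight = torch.optim.RMSprop(params=gpr_tight.parameters(), lr=1e-2)
Trainer(gpr_tight, likelihood_tight, optimizer_tight, mll_tight).update_hyperparameter(num_epoch)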

2.2.1 Building the model (approximate GP inference, variational)

(Details will be added in a future update.)
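For reference (this is the standard SVGP objective that gpytorch.mlls.VariationalELBO optimizes, not spelled out in the original article): with inducing-point function values $\mathbf{u}$ and variational distribution $q$, the ELBO maximized below is

$\mathcal{L}=\sum_{i=1}^{N}\mathbb{E}_{q(f(x_i))}\bigl[\log p(y_i\mid f(x_i))\bigr]-\mathrm{KL}\bigl(q(\mathbf{u})\,\|\,p(\mathbf{u})\bigr)\le\log p(\mathbf{y}).$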

# We will use the simple form of GP model, variational inference

class GPModel(ApproximateGP):
    def __init__(self, inducing_points,
                 lengthscale_prior=None, outputscale_prior=None):
        variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
        variational_strategy = VariationalStrategy(self, inducing_points, variational_distribution, learn_inducing_locations=True)
        super(GPModel, self).__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(lengthscale_prior=lengthscale_prior),
            outputscale_prior=outputscale_prior
            )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

num_epochs = 2000

class V_Trainer(object):
    def __init__(self, gpr, likelihood, optimizer, mll):
        self.gpr = gpr
        self.likelihood = likelihood
        self.optimizer = optimizer
        self.mll = mll

    def update_hyperparameter(self, epochs):
        # Find optimal model hyperparameters
        self.gpr.train()
        self.likelihood.train()
        epochs_iter = tqdm.notebook.tqdm(range(epochs), desc="Epoch")
        for i in epochs_iter:
            # Within each iteration, we will go over each minibatch of data
            # (train_loader is the DataLoader defined below)
            minibatch_iter = tqdm.notebook.tqdm(train_loader, desc="Minibatch", leave=False)
            for x_batch, y_batch in minibatch_iter:
                self.optimizer.zero_grad()
                output = self.gpr(x_batch)
                loss = -self.mll(output, y_batch)
                minibatch_iter.set_postfix(loss=loss.item())
                loss.backward()
                self.optimizer.step()

torch.manual_seed(0)
# Training data is 10 points in [0,1] inclusive regularly spaced
## GP inputs are assumed to be multi-dimensional, so the shape is (num_data, dim)
train_inputs = torch.linspace(0, 1, 10).reshape(10, 1)

# True function is sin(2*pi*x) with Gaussian noise
## Outputs, on the other hand, are assumed one-dimensional, so use the shape (num_data)
train_targets = torch.sin(2*np.pi*train_inputs).reshape(10) + 0.3*torch.randn(10)

train_dataset = TensorDataset(train_inputs, train_targets)

train_loader = DataLoader(train_dataset, batch_size=10, shuffle=True)

# initialize likelihood and model
likelihood = gpytorch.likelihoods.GaussianLikelihood()
l_prior = gpytorch.priors.NormalPrior(loc=torch.tensor(1.), 
                                      scale=torch.tensor(10.))
s_prior = gpytorch.priors.NormalPrior(loc=torch.tensor(1.), 
                                      scale=torch.tensor(10.))
inducing_points=train_inputs[:2]
model = GPModel(inducing_points=inducing_points,
                lengthscale_prior=l_prior, outputscale_prior=s_prior)

# Our loss object. We're using the VariationalELBO
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=train_inputs.size(0))

optimizer = torch.optim.Adam([
    {'params': model.parameters()},
    {'params': likelihood.parameters()},
], lr=0.01)

trainer = V_Trainer(model, likelihood, optimizer, mll)
trainer.update_hyperparameter(num_epochs)

test_inputs = torch.linspace(-0.2, 1.2, 100)

# Get into evaluation (predictive posterior) mode
model.eval()
likelihood.eval()
# Test points are regularly spaced along [0,1]
# Make predictions by feeding model through likelihood
with torch.no_grad():
    predicts = likelihood(model(test_inputs))
    predicts_mean = predicts.mean
    predicts_std = predicts.stddev

plt.plot(test_inputs.numpy(), predicts_mean.numpy())
plt.fill_between(test_inputs.numpy(), 
                 predicts_mean.numpy() - 0.9*predicts_std.numpy(), 
                 predicts_mean.numpy() + 0.9*predicts_std.numpy(), alpha=0.4)
plt.plot(train_inputs.numpy(), train_targets.numpy(), "ro")

likelihood = gpytorch.likelihoods.GaussianLikelihood()
s=torch.tensor(0.)
for i in likelihood(model(train_inputs)):
  s+=i
print(s)

(Figure: v_en.png — predictive mean with uncertainty band for the variational GP, plotted with the training points.)

Next

I would like to cover GPLVM (a nonparametric unsupervised learning model).
