
Trying out various optimization algorithms on a neural network

Posted at 2015-10-03

The implementation is in dnn/optimizers.py in **this GitHub repository**.

#Introduction
Using MNIST as a benchmark, I tried the following learning-rate optimization methods on a simple neural network and compared the accuracy each achieves on the evaluation dataset.

  • SGD
  • Momentum SGD
  • AdaGrad
  • RMSprop
  • AdaDelta
  • Adam

#Implementation
The implementation uses Theano.
This time I have uploaded the source code to my GitHub repository.

For the learning-rate optimization methods, I referred mainly to the following resources:
https://gist.github.com/SnippyHolloW/67effa81dd1cd5a488b4
https://gist.github.com/skaae/ae7225263ca8806868cb
http://chainer.readthedocs.org/en/stable/reference/optimizers.html?highlight=optimizers
http://qiita.com/skitaoka/items/e6afbe238cd69c899b2a

In the code below, params (or self.params) holds the weights and biases of the entire network. Since we train with stochastic gradient descent, we need the derivative of the loss value loss with respect to params; gparams (or self.gparams) holds exactly that (computed with T.grad(loss, param)). Everything up to computing the gradients is implemented in an Optimizer class, and SGD, Momentum SGD, and the rest are implemented as subclasses of it.

```python:optimizers.py
from collections import OrderedDict

import numpy as np
import theano
import theano.tensor as T


class Optimizer(object):
	def __init__(self, params=None):
		if params is None:
			raise NotImplementedError()
		self.params = params

	def updates(self, loss=None):
		if loss is None:
			raise NotImplementedError()

		# Maps each shared variable to the expression that replaces it.
		self.updates = OrderedDict()
		# Gradients of the loss with respect to every parameter.
		self.gparams = [T.grad(loss, param) for param in self.params]
```

Incidentally, the self.updates dict here is what we use when actually updating the weights and other parameters.
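
For reference, here is a minimal sketch of how the returned dict would be handed to `theano.function`; `model`, `x`, `y`, and `loss` are illustrative names, not part of optimizers.py:

```python
import theano

# Hypothetical setup: x and y are symbolic input variables, and loss is a
# scalar Theano expression built from model.params (all illustrative names).
optimizer = SGD(learning_rate=0.01, params=model.params)

train = theano.function(
	inputs=[x, y],
	outputs=loss,
	updates=optimizer.updates(loss),  # OrderedDict: shared variable -> new value
)
```

Each call to `train(...)` then applies one update step to every parameter.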

##SGD
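
In update-rule form, the class below steps each parameter against its gradient with a fixed learning rate (writing `g` for the gradient of the loss with respect to the parameter):

```math
\theta \leftarrow \theta - \eta \, g
```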

```python:optimizers.py
class SGD(Optimizer):
	def __init__(self, learning_rate=0.01, params=None):
		super(SGD, self).__init__(params=params)
		self.learning_rate = learning_rate

	def updates(self, loss=None):
		super(SGD, self).updates(loss=loss)

		for param, gparam in zip(self.params, self.gparams):
			self.updates[param] = param - self.learning_rate * gparam

		return self.updates
```

##Momentum SGD
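
Momentum SGD smooths successive gradients with a velocity term; the code below implements

```math
v \leftarrow \mu v - \eta \, g, \qquad \theta \leftarrow \theta + v
```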

```python:optimizers.py
class MomentumSGD(Optimizer):
	def __init__(self, learning_rate=0.01, momentum=0.9, params=None):
		super(MomentumSGD, self).__init__(params=params)
		self.learning_rate = learning_rate
		self.momentum = momentum
		# One velocity per parameter, initialized to zero.
		self.vs = [build_shared_zeros(t.shape.eval(), 'v') for t in self.params]

	def updates(self, loss=None):
		super(MomentumSGD, self).updates(loss=loss)

		for v, param, gparam in zip(self.vs, self.params, self.gparams):
			_v = v * self.momentum
			_v = _v - self.learning_rate * gparam
			self.updates[param] = param + _v
			self.updates[v] = _v

		return self.updates
```
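
MomentumSGD is the first class above that keeps per-parameter state (`self.vs`) via `build_shared_zeros`. That helper is defined in the repository rather than in these snippets; a minimal sketch of what such a helper presumably looks like, assuming a plain zero-initialized shared variable:

```python
import numpy as np
import theano


def build_shared_zeros(shape, name):
	# Zero-initialized Theano shared variable used as optimizer state.
	return theano.shared(
		value=np.zeros(shape, dtype=theano.config.floatX),
		name=name,
		borrow=True,
	)
```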

##AdaGrad
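
AdaGrad divides the learning rate by the square root of the accumulated sum of squared gradients, so parameters that have already received large updates take smaller steps:

```math
r \leftarrow r + g^2, \qquad \theta \leftarrow \theta - \frac{\eta}{\sqrt{r + \varepsilon}} \, g
```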

```python:optimizers.py
class AdaGrad(Optimizer):
	def __init__(self, learning_rate=0.01, eps=1e-6, params=None):
		super(AdaGrad, self).__init__(params=params)

		self.learning_rate = learning_rate
		self.eps = eps
		# Accumulated squared gradients, one per parameter.
		self.accugrads = [build_shared_zeros(t.shape.eval(), 'accugrad') for t in self.params]

	def updates(self, loss=None):
		super(AdaGrad, self).updates(loss=loss)

		for accugrad, param, gparam \
		in zip(self.accugrads, self.params, self.gparams):
			agrad = accugrad + gparam * gparam
			dx = - (self.learning_rate / T.sqrt(agrad + self.eps)) * gparam
			self.updates[param] = param + dx
			self.updates[accugrad] = agrad

		return self.updates
```

##RMSprop
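
RMSprop replaces AdaGrad's ever-growing sum with an exponential moving average, so the effective learning rate can recover even after large gradients:

```math
r \leftarrow \alpha r + (1 - \alpha) g^2, \qquad \theta \leftarrow \theta - \frac{\eta}{\sqrt{r + \varepsilon}} \, g
```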

```python:optimizers.py
class RMSprop(Optimizer):
	def __init__(self, learning_rate=0.001, alpha=0.99, eps=1e-8, params=None):
		super(RMSprop, self).__init__(params=params)

		self.learning_rate = learning_rate
		self.alpha = alpha
		self.eps = eps

		# Exponential moving averages of squared gradients.
		self.mss = [build_shared_zeros(t.shape.eval(), 'ms') for t in self.params]

	def updates(self, loss=None):
		super(RMSprop, self).updates(loss=loss)

		for ms, param, gparam in zip(self.mss, self.params, self.gparams):
			_ms = ms * self.alpha
			_ms += (1 - self.alpha) * gparam * gparam
			self.updates[ms] = _ms
			self.updates[param] = param - self.learning_rate * gparam / T.sqrt(_ms + self.eps)

		return self.updates
```

##AdaDelta
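
AdaDelta requires no global learning rate: each step is scaled by the ratio between a running average of past squared updates (accudelta below, written s) and one of past squared gradients (accugrad, written r):

```math
r \leftarrow \rho r + (1 - \rho) g^2, \qquad \Delta\theta = -\sqrt{\frac{s + \varepsilon}{r + \varepsilon}} \, g
```

```math
s \leftarrow \rho s + (1 - \rho) (\Delta\theta)^2, \qquad \theta \leftarrow \theta + \Delta\theta
```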

```python:optimizers.py
class AdaDelta(Optimizer):
	def __init__(self, rho=0.95, eps=1e-6, params=None):
		super(AdaDelta, self).__init__(params=params)

		self.rho = rho
		self.eps = eps
		# Running averages of squared gradients and of squared updates.
		self.accugrads = [build_shared_zeros(t.shape.eval(), 'accugrad') for t in self.params]
		self.accudeltas = [build_shared_zeros(t.shape.eval(), 'accudelta') for t in self.params]

	def updates(self, loss=None):
		super(AdaDelta, self).updates(loss=loss)

		for accugrad, accudelta, param, gparam \
		in zip(self.accugrads, self.accudeltas, self.params, self.gparams):
			agrad = self.rho * accugrad + (1 - self.rho) * gparam * gparam
			dx = - T.sqrt((accudelta + self.eps) / (agrad + self.eps)) * gparam
			self.updates[accudelta] = self.rho * accudelta + (1 - self.rho) * dx * dx
			self.updates[param] = param + dx
			self.updates[accugrad] = agrad

		return self.updates
```

##Adam
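
Adam maintains bias-corrected estimates of the gradient's first and second moments; this implementation additionally decays beta1 over time by the factor gamma, as in early versions of the Adam paper:

```math
\beta_{1,t} = \beta_1 \gamma^{t-1}, \qquad m \leftarrow \beta_{1,t} m + (1 - \beta_{1,t}) g, \qquad v \leftarrow \beta_2 v + (1 - \beta_2) g^2
```

```math
\hat{m} = \frac{m}{1 - \beta_1^{\,t}}, \qquad \hat{v} = \frac{v}{1 - \beta_2^{\,t}}, \qquad \theta \leftarrow \theta - \alpha \frac{\hat{m}}{\sqrt{\hat{v}} + \varepsilon}
```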

```python:optimizers.py
class Adam(Optimizer):
	def __init__(self, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8, gamma=1-1e-8, params=None):
		super(Adam, self).__init__(params=params)

		self.alpha = alpha
		self.b1 = beta1
		self.b2 = beta2
		self.gamma = gamma
		# Time step, shared so it can be incremented as part of the updates.
		self.t = theano.shared(np.float32(1))
		self.eps = eps

		# First- and second-moment estimates, one pair per parameter.
		self.ms = [build_shared_zeros(t.shape.eval(), 'm') for t in self.params]
		self.vs = [build_shared_zeros(t.shape.eval(), 'v') for t in self.params]

	def updates(self, loss=None):
		super(Adam, self).updates(loss=loss)
		# beta1 decayed over time (the lambda decay from early versions of Adam).
		self.b1_t = self.b1 * self.gamma ** (self.t - 1)

		for m, v, param, gparam \
		in zip(self.ms, self.vs, self.params, self.gparams):
			_m = self.b1_t * m + (1 - self.b1_t) * gparam
			_v = self.b2 * v + (1 - self.b2) * gparam ** 2

			# Bias-corrected moment estimates.
			m_hat = _m / (1 - self.b1 ** self.t)
			v_hat = _v / (1 - self.b2 ** self.t)

			self.updates[param] = param - self.alpha * m_hat / (T.sqrt(v_hat) + self.eps)
			self.updates[m] = _m
			self.updates[v] = _v
		self.updates[self.t] = self.t + 1.0

		return self.updates
```

#Experiments
Using MNIST, I trained for 20 epochs per run and averaged the results over 30 random seeds. For the learning rates and the detailed network settings, please see my GitHub repository.
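
In outline, the protocol looks like the following sketch (`optimizers_to_compare`, `build_model`, `train_one_epoch`, and `evaluate` are hypothetical stand-ins for the repository's actual training script):

```python
import numpy as np

mean_curves = {}
for name, make_opt in optimizers_to_compare.items():  # hypothetical mapping
	curves = []
	for seed in range(30):
		np.random.seed(seed)                   # fresh initialization per seed
		model = build_model()                  # hypothetical constructor
		optimizer = make_opt(params=model.params)
		accs = []
		for epoch in range(20):
			train_one_epoch(model, optimizer)  # hypothetical
			accs.append(evaluate(model))       # accuracy on the eval set
		curves.append(accs)
	mean_curves[name] = np.mean(curves, axis=0)    # 30-seed mean per epoch
```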

#Results

[Figure: evaluation accuracy per epoch for each optimizer.]

Hmm, the upper part of the plot is too cluttered to make out; let's zoom in.

[Figure: zoomed view of the accuracy curves.]

SGD has vanished from the zoomed view.

#Conclusion
I should have recorded the loss values as well....

I plan to add convolutional neural networks, Stacked Denoising Autoencoders, and more to this GitHub repository.

If you notice anything odd, I would appreciate it if you pointed it out.
