
Analyzing the num_units parameter of RNN and LSTM cells in TensorFlow

Last updated at 2018-02-20 / Posted at 2018-02-20

When you implement an RNN or LSTM in TensorFlow with BasicRNNCell or BasicLSTMCell, you pass "num_units" as a parameter.

import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x

# RNN
# num_units: int, The number of units in the RNN cell.
rnn = tf.contrib.rnn.BasicRNNCell(num_units)
# LSTM
# num_units: int, The number of units in the LSTM cell.
lstm = tf.contrib.rnn.BasicLSTMCell(num_units)

By analogy with a feed-forward neural network, the num_units of an RNN can be understood as the number of units (or nodes) in its hidden layer.
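To make the analogy concrete, here is a minimal NumPy sketch of one step of a plain tanh RNN. It is not the TensorFlow implementation; the function name `basic_rnn_step` and the sizes are hypothetical, chosen only to show that the hidden state has one value per unit, so its last dimension equals num_units.

```python
import numpy as np

def basic_rnn_step(x, h, kernel, bias):
    """One step of a plain tanh RNN: new_h = tanh([x, h] @ kernel + bias)."""
    return np.tanh(np.concatenate([x, h], axis=1) @ kernel + bias)

# Hypothetical sizes, for illustration only.
batch_size, input_depth, num_units = 2, 3, 5

rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, input_depth))   # current input
h = np.zeros((batch_size, num_units))                # previous hidden state
kernel = rng.standard_normal((input_depth + num_units, num_units))
bias = np.zeros(num_units)

new_h = basic_rnn_step(x, h, kernel, bias)
print(new_h.shape)  # (2, 5): batch_size x num_units
```

Whatever the input depth is, the state that comes out always has num_units columns, just like the width of a hidden layer in a feed-forward network.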
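In a basic LSTM, the internal computation would not work if the layers inside the cell had different sizes, so every layer uses the same number of units. In other words, the following four layers all have num_units units (nodes):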

input gate
new input
forget gate
output gate

Below is an excerpt from the TensorFlow LSTM source code.

class BasicLSTMCell(LayerRNNCell):
  ...
  def build(self, inputs_shape): 
    ...
    self._kernel = self.add_variable( 
        _WEIGHTS_VARIABLE_NAME, 
        shape=[input_depth + h_depth, 4 * self._num_units]) 
    self._bias = self.add_variable( 
        _BIAS_VARIABLE_NAME, 
        shape=[4 * self._num_units], 
        initializer=init_ops.zeros_initializer(dtype=self.dtype)) 
    ...
  def call(self, inputs, state):
    ...
    # The new gate inputs are computed here
    gate_inputs = math_ops.matmul( 
        array_ops.concat([inputs, h], 1), self._kernel) 
    gate_inputs = nn_ops.bias_add(gate_inputs, self._bias) 
    ...
    # i = input_gate, j = new_input, f = forget_gate, o = output_gate 
    i, j, f, o = array_ops.split( 
        value=gate_inputs, num_or_size_splits=4, axis=one) 
    ...

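As you can see, the weights are created with 4 × num_units columns when the layer is built, and when the cell is called the result is split evenly among the four layers.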
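The same build-then-split pattern can be sketched in NumPy. This is an illustration under assumptions, not the TensorFlow code: the sizes are hypothetical, the state update is the standard LSTM formula (the excerpt above elides it), and the `forget_bias=1.0` default is my recollection of BasicLSTMCell's default and should be checked against the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def basic_lstm_step(x, c, h, kernel, bias, forget_bias=1.0):
    """One LSTM step using a single fused kernel of width 4 * num_units."""
    # One matmul produces the pre-activations of all four layers at once.
    gate_inputs = np.concatenate([x, h], axis=1) @ kernel + bias
    # i = input_gate, j = new_input, f = forget_gate, o = output_gate
    i, j, f, o = np.split(gate_inputs, 4, axis=1)
    # Standard LSTM update (assumed; not shown in the excerpt above).
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_c, new_h

# Hypothetical sizes, for illustration only.
batch_size, input_depth, num_units = 2, 3, 5

rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, input_depth))
c = np.zeros((batch_size, num_units))
h = np.zeros((batch_size, num_units))
kernel = rng.standard_normal((input_depth + num_units, 4 * num_units))
bias = np.zeros(4 * num_units)

new_c, new_h = basic_lstm_step(x, c, h, kernel, bias)
print(kernel.shape)               # (8, 20)
print(new_c.shape, new_h.shape)   # (2, 5) (2, 5)
```

Fusing the four layers into one kernel means a single matrix multiplication per step; the even split only works because each layer has exactly num_units columns.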

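In LSTM examples online, I occasionally see people build their own weight and bias for the input, but judging from the source code, the cell already handles this internally, so I think it is unnecessary.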
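One way to see this is to count the parameters implied by the shapes in the source excerpt. The kernel's first dimension is input_depth + h_depth, so the mapping from an input of any depth to the gates is already part of the cell. The helper below is a hypothetical sketch (the name `basic_lstm_param_count` is mine) and assumes h_depth equals num_units, as it does for a cell without a projection.

```python
def basic_lstm_param_count(input_depth, num_units):
    """Trainable parameters implied by the kernel and bias shapes."""
    kernel = (input_depth + num_units) * 4 * num_units  # [input_depth + h_depth, 4 * num_units]
    bias = 4 * num_units                                # [4 * num_units]
    return kernel + bias

# Hypothetical sizes, for illustration only.
print(basic_lstm_param_count(3, 5))      # (3 + 5) * 20 + 20 = 180
print(basic_lstm_param_count(28, 128))   # (28 + 128) * 512 + 512 = 80384
```

The input_depth term in the count is the input weight matrix that people sometimes add by hand in front of the cell.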
