Notes on Implementing Gated CNN x Separable Convolution

Posted at 2018-06-05

Key Points

  • In the Gated CNN, the regular convolution is replaced with a separable convolution (see the sketch after this list).
  • Sequential MNIST is used.
  • Planned follow-ups: quantifying the parameter-count reduction and the trade-off between dilation and filter size.
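
A minimal sketch of what this replacement means, written against the TF 1.x API with an illustrative [batch, 1, length, channels] layout matching the sample code below (the placeholder shape and filter sizes here are assumptions for illustration only):

import tensorflow as tf

# Illustrative input: [batch, 1, sequence_length, channels]
x = tf.placeholder(tf.float32, [None, 1, 14*14, 32])

# Regular convolution: a single filter [1, filter_size, in_channels, out_channels].
w = tf.get_variable('w', [1, 5, 32, 32])
y_regular = tf.nn.conv2d(x, w, strides = [1, 1, 1, 1], padding = 'SAME')

# Separable convolution: a depthwise filter [1, filter_size, in_channels, 1]
# followed by a pointwise (1x1) filter [1, 1, in_channels, out_channels].
w_d = tf.get_variable('w_d', [1, 5, 32, 1])
w_p = tf.get_variable('w_p', [1, 1, 32, 32])
y_separable = tf.nn.separable_conv2d(x, w_d, w_p, \
                  strides = [1, 1, 1, 1], padding = 'SAME')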

References

1. Language Modeling with Gated Convolutional Networks
2. Depthwise Separable Convolutions for Neural Machine Translation
3. Notes on Gated CNN

Verification Method

  • Sequential MNIST is downscaled to [14*14, 1].
  • Parameter counts: Regular ≈ 16,000, Separable ≈ 4,000 (see the rough count after this list).
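
A rough, per-gate illustration of where the roughly 4x reduction comes from (bias terms and the initial/final layers are left out, so the totals quoted above are not reproduced exactly):

filter_size = 5
n_units = 32

# One gate of a regular gated layer: filter [1, filter_size, n_units, n_units].
regular = 1 * filter_size * n_units * n_units                  # 5,120

# One gate of a separable gated layer:
# depthwise [1, filter_size, n_units, 1] + pointwise [1, 1, n_units, n_units].
separable = 1 * filter_size * n_units * 1 \
            + 1 * 1 * n_units * n_units                        # 160 + 1,024 = 1,184

print(regular, separable)  # 5120 vs 1184, roughly a 4x reduction per gate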


Data

MNIST handwritten digits

import numpy as np
import skimage.measure
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('***/mnist', \
                                     one_hot = True)

# Downscale each 28x28 image to 14x14 with 2x2 max pooling,
# then flatten back into a length-196 sequence.
train_images = np.reshape(mnist.train.images, [-1, 28, 28])
train_images = skimage.measure.block_reduce(train_images, \
                    block_size = (1, 2, 2), func = np.max)
train_images = np.reshape(train_images, [-1, 14*14])

Verification Results

Settings for the numerical experiment:

  • n_in = 14*14
  • filter_size = 5
  • n_units = 32
  • n_out = 10
  • n_layers = 4
  • residual_step = 2
  • learning_rate = 0.01
  • batch_size = 64

Regular Convolution
(figure omitted)

Separable Convolution
(figure omitted)

Sample Code

  def dilated_gated_separable_conv(self, x, shape_d, shape_p, rate):
    # Two separable convolutions share the same input: f is the candidate
    # output and g is the gate (GLU: f * sigmoid(g)).
    w_d = self.weight_variable('w_d', shape_d)    # depthwise filter
    w_p = self.weight_variable('w_p', shape_p)    # pointwise (1x1) filter
    b = self.weight_variable('b', shape_p[-1])
    v_d = self.weight_variable('v_d', shape_d)
    v_p = self.weight_variable('v_p', shape_p)
    c = self.weight_variable('c', shape_p[-1])

    f = tf.add(tf.nn.separable_conv2d(x, w_d, w_p, \
            strides = [1, 1, 1, 1], padding = 'VALID', \
            rate = [1, rate]), b)
    g = tf.add(tf.nn.separable_conv2d(x, v_d, v_p, \
            strides = [1, 1, 1, 1], padding = 'VALID', \
            rate = [1, rate]), c)

    output = tf.multiply(f, tf.sigmoid(g))  # GLU
    #output = tf.multiply(tf.tanh(f), tf.sigmoid(g))  # GTU
    #output = tf.nn.relu(f)  # ReLU
    #output = tf.tanh(f)  # Tanh

    return output

  def inference_2(self, x, filter_size, n_in, n_units, \
                    n_out, n_layers, residual_step):

    width = n_in
    channel = 1
    x = tf.reshape(x, [-1, 1, width, channel])

    # Halve the sequence length with max pooling (e.g. 196 -> 98 with the
    # settings above) before the gated layers.
    x = tf.nn.max_pool(x, ksize = [1, 1, 2, 1], \
               strides = [1, 1, 2, 1], padding = 'SAME')

    # 1x1 convolution lifting the input from 1 channel to n_units channels.
    shape = [1, 1, channel, n_units]
    with tf.variable_scope('initial'):
      y = self.conv(x, shape)

    inputs = y

    # for regular convolution
    #shape = [1, filter_size, n_units, n_units]
    # for separable convolution
    shape_d = [1, filter_size, n_units, 1]
    shape_p = [1, 1, n_units, n_units]
    for i in range(n_layers):
      with tf.variable_scope('layer_{}'.format(i + 1)):

        # Causal (left) padding so that the dilated convolution with
        # rate 2**i only looks at past positions.
        y = tf.pad(y, [[0, 0], [0, 0], \
                 [2**i * (filter_size - 1), 0], [0, 0]])
        # Separable Convolution Without Weight Normalization
        y = self.dilated_gated_separable_conv(y, \
                 shape_d, shape_p, 2**i)

        # Residual connection every residual_step layers.
        if (i + 1) % residual_step == 0:
          y += inputs
          inputs = y

    # Use only the last position of the sequence for classification.
    y = y[:, :, -1, :]
    y = tf.squeeze(y, axis = 1)

    with tf.variable_scope('final'):
      w = self.weight_variable('w', [n_units, n_out])
      b = self.bias_variable('b', [n_out])

      y = tf.add(tf.matmul(y, w), b)
      y = tf.nn.softmax(y, axis = 1)

    return y
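
The helper methods weight_variable, bias_variable and conv are referenced above but not shown. A minimal sketch that would be compatible with the sample code (an assumption, not necessarily the original implementation):

  # Assumed helpers (not shown in the article): plain tf.get_variable wrappers.
  def weight_variable(self, name, shape):
    initializer = tf.variance_scaling_initializer()
    return tf.get_variable(name, shape, initializer = initializer)

  def bias_variable(self, name, shape):
    initializer = tf.zeros_initializer()
    return tf.get_variable(name, shape, initializer = initializer)

  def conv(self, x, shape):
    w = self.weight_variable('w', shape)
    b = self.bias_variable('b', [shape[-1]])
    return tf.add(tf.nn.conv2d(x, w, strides = [1, 1, 1, 1], \
                               padding = 'SAME'), b)

With the settings from the verification section, the graph might be wired up as follows (the loss and optimizer are assumptions, and "model" stands for an instance of the class containing inference_2):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 14*14])
t = tf.placeholder(tf.float32, [None, 10])

y = model.inference_2(x, filter_size = 5, n_in = 14*14, n_units = 32, \
                      n_out = 10, n_layers = 4, residual_step = 2)

# Cross-entropy on the softmax outputs returned by inference_2.
loss = -tf.reduce_mean(tf.reduce_sum( \
           t * tf.log(tf.clip_by_value(y, 1e-10, 1.0)), axis = 1))
train_step = tf.train.AdamOptimizer(learning_rate = 0.01).minimize(loss)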
