
MNIST Data Recognition with TensorFlow + CNN (Convolutional Neural Net)

Posted at 2016-10-10, last updated at 2016-10-23

Introduction

So far I have been tinkering with multilayer neural networks, but it is about time to move on to CNNs! So let's get straight to it.

Source

(Provisional) revised version with a class introduced: [mnist_CNN_Graph_adhoc2.py](https://github.com/WaterIsland/DLStudy/blob/master/tensorflow/source/mine/mnist/mnist_CNN_Graph_adhoc2.py)
(Provisional) revised version with a class introduced: ExtendedTensorflowCNN.py

Implementation overview

This time I implemented a CNN (Convolutional Neural Net).
The source is based on the code published on the official site. I modified it so that TensorBoard's GRAPHS and EVENTS can be displayed.

Execution environment

Roughly the following environment.
・Mac OS X 10.10.5
・Python 3.5.1
・virtualenv
・IPython
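Processing details (notable parts)

As mentioned at the beginning, the code is set up to output TensorBoard's GRAPHS.
As in the source I wrote previously, the parts where with blocks pile up, like the one below, are what produce the GRAPHS output.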

mnist_CNN_Graph_adhoc.py

    with tf.name_scope('input') as scope:
        x = tf.placeholder(tf.float32, shape=[None, 784], name='x') # --> placeholder for the raw input
        x_image = tf.reshape(x, [-1,28,28,1], name='x-pixel_order') # --> input rearranged into pixel order
    with tf.name_scope('teach') as scope:
        y_ = tf.placeholder(tf.float32, shape=[None, 10], name='d') # --> placeholder for the teacher labels compared against the output
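To make the "pixel order" reshape concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of how a flat 784-element vector maps onto 28x28 pixel positions. It assumes the row-major order that tf.reshape uses. The helper names `flat_to_pixel` and `reshape_flat_image` are hypothetical and are not part of the original script.

```python
IMAGE_SIDE = 28  # MNIST images are 28x28 pixels, flattened to 784 values


def flat_to_pixel(index, side=IMAGE_SIDE):
    """Map a flat index (0..783) to (row, col), assuming row-major order."""
    if not 0 <= index < side * side:
        raise IndexError("index out of range for a %dx%d image" % (side, side))
    return divmod(index, side)


def reshape_flat_image(flat, side=IMAGE_SIDE):
    """Rebuild a side x side nested list from a flat list (one image, one channel)."""
    if len(flat) != side * side:
        raise ValueError("expected %d values, got %d" % (side * side, len(flat)))
    return [flat[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    flat = list(range(784))          # stand-in for one flattened image
    image = reshape_flat_image(flat)
    row, col = flat_to_pixel(29)
    print(row, col, image[row][col])
```

The leading -1 in [-1,28,28,1] only tells TensorFlow to infer the batch size, and the trailing 1 is the single grayscale channel, so the sketch handles one image at a time.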

The EVENTS output, in turn, comes from the "tf.scalar_summary(...)" calls below.
You will notice that the with blocks for the GRAPHS output appear here as well.

mnist_CNN_Graph_adhoc.py

    with tf.name_scope('loss') as scope:
        cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_)) # --> definition of the loss function
        tf.scalar_summary('cross_entropy', cross_entropy) # --> so it can be viewed in TensorBoard's EVENTS
    with tf.name_scope('training') as scope:
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) # --> definition of the training method
    with tf.name_scope('test') as scope:
        correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1)) # --> check whether the bit is set at the same position
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # --> definition of the accuracy
        tf.scalar_summary('accuracy', accuracy) # --> so it can be viewed in TensorBoard's EVENTS
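As a rough illustration of what the 'loss' and 'test' scopes compute, the following plain-Python sketch reproduces the same arithmetic on tiny hand-made data. It is not the TensorFlow implementation. The function names are hypothetical, and the three-class logits are made up purely for the example.

```python
import math


def argmax(values):
    """Index of the largest element (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_batch, onehot_batch):
    """Fraction of rows whose predicted class matches the one-hot label."""
    hits = [argmax(l) == argmax(t) for l, t in zip(logits_batch, onehot_batch)]
    return sum(hits) / float(len(hits))


def softmax_cross_entropy(logits, onehot):
    """Cross entropy between softmax(logits) and a one-hot label (one sample)."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(v - shift) for v in logits))
    log_probs = [v - shift - log_sum for v in logits]
    return -sum(t * lp for t, lp in zip(onehot, log_probs))


if __name__ == "__main__":
    logits = [[2.0, 0.5, 0.1], [0.1, 0.2, 3.0], [1.0, 2.0, 0.0]]  # made-up outputs
    labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]                    # made-up one-hot labels
    losses = [softmax_cross_entropy(l, t) for l, t in zip(logits, labels)]
    print("mean loss:", sum(losses) / len(losses))
    print("accuracy:", accuracy(logits, labels))
```

Here the third sample is misclassified (class 1 is predicted, class 0 is the label), so two of the three predictions count as correct.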

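The other notable part is the following. As the comment "adhoc technique" says, it takes the crude approach of splitting the test data into 20 parts and testing each part in turn.
# When I tried to test on everything at once, memory consumption went over 6 GB and things got ugly... I want a better machine (again)
# Ideally, the two commented-out lines at the very bottom would be all that is needed!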

mnist_CNN_Graph_adhoc.py

    # adhoc technique
    split_number = 20 # --> split into 20 parts; the rest is self-explanatory
    total_number = len(mnist.test.images)
    odd_number = total_number % split_number
    div_number = int((total_number - odd_number) / split_number)
    numbers = [div_number for i in range(split_number)]
    if odd_number > 0:
        numbers.append(odd_number)
        split_number = split_number + 1
    print(numbers)

    total_accuracy = 0
    start_number = 0
    for i in range(split_number):
        local_accuracy = accuracy.eval(feed_dict={x: mnist.test.images[start_number:start_number + numbers[i]], 
                                       y_: mnist.test.labels[start_number:start_number + numbers[i]], 
                                       keep_prob: 1.0})
        total_accuracy = total_accuracy + local_accuracy * numbers[i]
        print("[%5d-%5d]test accuracy[%d]: %.3f" % (start_number, start_number + numbers[i], i, local_accuracy))
        start_number = start_number + numbers[i]

    print("Total Accuracy is %.3f" % (total_accuracy / total_number))
#    print("test accuracy %g"%accuracy.eval(feed_dict={
#        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
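The same idea can be sketched more compactly. The version below is an illustration rather than the author's code: it walks over the data in fixed-size chunks and combines the per-chunk accuracies into a sample-weighted mean, which is why the result matches the accuracy over the whole test set. `evaluate(start, stop)` is a hypothetical callback. In the script above it would wrap the accuracy.eval call on the corresponding slices.

```python
def chunk_bounds(total, chunk_size):
    """Yield (start, stop) index pairs that cover range(total) in order."""
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)


def chunked_accuracy(evaluate, total, chunk_size):
    """Sample-weighted mean of per-chunk accuracies.

    evaluate(start, stop) is a hypothetical callback that returns the
    accuracy on samples [start, stop).
    """
    weighted_sum = 0.0
    for start, stop in chunk_bounds(total, chunk_size):
        weighted_sum += evaluate(start, stop) * (stop - start)
    return weighted_sum / total


if __name__ == "__main__":
    correct = [i % 4 != 0 for i in range(10)]  # fake per-sample results

    def evaluate(start, stop):
        part = correct[start:stop]
        return sum(part) / float(len(part))

    # The last chunk holds only one sample, like the odd_number case above.
    print(chunked_accuracy(evaluate, len(correct), 3))
```

Weighting each chunk by its size matters only when the last chunk is shorter than the others. With the 10000 MNIST test images and 20 parts, every chunk has 500 samples and the weights are all equal.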

Results

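Normally training would run for 20000 steps, but for lack of time I cut it down to 1001.
1001? What an odd number! you might think, but it is because of the EVENTS written to TensorBoard.
# EVENTS are written every 200 steps. A product of my laziness...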
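The reason 1001 works better than 1000 can be shown with a small sketch. It assumes the training loop counts steps from 0 and writes a summary whenever the step number is a multiple of 200. The exact condition in the script is not shown here, so treat this as an assumption. `logged_steps` is a hypothetical helper.

```python
def logged_steps(total_steps, interval):
    """Steps at which a summary would be written, assuming 'step % interval == 0'."""
    return [step for step in range(total_steps) if step % interval == 0]


if __name__ == "__main__":
    print(logged_steps(1000, 200))  # stops at step 800
    print(logged_steps(1001, 200))  # also includes step 1000
```

With 1000 iterations the last step is 999, so the final point on the EVENTS chart would be step 800. One extra iteration brings step 1000 into the log.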
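Also, as you can probably tell from the image of the training run below, one training step takes roughly 0.35 seconds. Training for 20000 steps would take about 2 hours, so I decided to leave that for when I have the time!
# As an aside, this is faster than the multilayer neural net I wrote from scratch earlier. All I can say is that Google is amazing.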
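The time estimate is simple arithmetic on the rough 0.35 seconds per step quoted above. A minimal sketch follows, in which `estimated_hours` is a hypothetical helper.

```python
SECONDS_PER_STEP = 0.35  # rough per-step time quoted in the text


def estimated_hours(steps, seconds_per_step=SECONDS_PER_STEP):
    """Estimated wall-clock training time in hours for a given step count."""
    return steps * seconds_per_step / 3600.0


if __name__ == "__main__":
    print("20000 steps: %.2f hours" % estimated_hours(20000))
    print("1001 steps: %.1f minutes" % (estimated_hours(1001) * 60))
```

20000 steps at 0.35 seconds each come to 7000 seconds, a little under 2 hours, while the 1001-step run takes about 6 minutes.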

It looks like this.
The accuracy on the training data is 96.0%.