
Object Recognition with MLP and CNN: Improving Accuracy (Part 1) / Kaggle CIFAR-10

Last updated at 2021-02-08 / Posted at 2021-02-08

Using Kaggle's CIFAR-10 competition as the subject, this article builds models and summarizes the process of improving their accuracy.

Competition overview

CIFAR-10 - Object Recognition in Images
https://www.kaggle.com/c/cifar-10

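The task uses CIFAR-10, a well-known object recognition dataset, and asks you to classify what each image shows.
The 60,000 images are split into 50,000 training images and 10,000 test images, and there are 10 ground-truth labels such as airplane, bird, cat, and automobile.
Each image is 32×32 pixels.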
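
As a small aid for reading predictions later, the sketch below maps CIFAR-10 integer labels to class names. It assumes the standard CIFAR-10 label order; `CIFAR10_CLASSES` and `label_name` are illustrative names introduced here, not part of Keras.

```python
# Standard CIFAR-10 class order (list index = integer label).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_name(label):
    """Return the class name for an integer label in the range 0-9."""
    return CIFAR10_CLASSES[int(label)]

print(label_name(3))  # cat
```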

What this article covers

Build deep learning models with an MLP and a CNN, tune each of them, and check how the accuracy improves (or drops...).

That's all.

Remarks

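The execution environment is a Kaggle Notebook. We use the GPU, so be careful not to overuse it (especially if you are working on other competitions).
The intended reader has a basic grasp of Python syntax and has finished a few machine learning tutorials.
Note that this article does not cover the theoretical background of MLPs or CNNs. The focus is purely on building a model quickly and getting it running.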

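Preparation

Enough preamble; let's get started.
Open the competition page, go to the Notebooks tab, and select "New Notebook".

A newly created Notebook contains a pre-filled cell, so run it.

First, load the data.
Normally a Kaggle Notebook has the dataset attached, with train.csv and test.csv under input/<competition name>/, and you would use those. This time, however, Keras ships the data as a built-in dataset, so we use that instead.¹
You need to enable "Internet" under "Settings" in the Notebook sidebar beforehand.²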

# Import the data
import numpy as np
from tensorflow.keras.datasets import cifar10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train:', X_train.shape)
print('y_train:', y_train.shape)
print('X_test:', X_test.shape)
print('y_test:', y_test.shape)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 2s 0us/step
X_train: (50000, 32, 32, 3)
y_train: (50000, 1)
X_test: (10000, 32, 32, 3)
y_test: (10000, 1)

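Also turn on the GPU: under the same "Settings", choose "GPU" for "Accelerator".³

Building the model, part 1: MLP

Let's start by building a model with a basic MLP (Multilayer Perceptron).
Before that, we normalize the data so that the model can receive and interpret it properly.

# Image parameters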
img_row = X_train.shape[1]
img_col = X_train.shape[2]
img_shape = img_row * img_col * X_train.shape[3]

# Normalize the training and test data
from tensorflow.keras.utils import to_categorical

X_train, X_test = X_train.reshape(-1, img_shape), X_test.reshape(-1, img_shape)
X_train, X_test = X_train.astype('float32'), X_test.astype('float32')
X_train, X_test = X_train/255.0, X_test/255.0
y_train, y_test = to_categorical(y_train), to_categorical(y_test)
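
To make the preprocessing above concrete, here is a minimal NumPy-only sketch on a tiny synthetic batch. The random images and labels are hypothetical stand-ins for CIFAR-10, and the `np.eye` indexing is only meant to mirror what `to_categorical` produces for 10 classes.

```python
import numpy as np

# Hypothetical stand-in for CIFAR-10: 4 random 32x32 RGB images with uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
labels = np.array([[3], [0], [9], [3]])  # same (N, 1) layout as y_train

# Flatten each image to a 3072-element vector and scale pixels to [0, 1].
flat = images.reshape(-1, 32 * 32 * 3).astype("float32") / 255.0

# One-hot encode the labels: row i has a 1 in column labels[i].
one_hot = np.eye(10, dtype="float32")[labels.ravel()]

print(flat.shape, one_hot.shape)
print(one_hot[0])
```

After this step each image is a flat vector of 3,072 values, which is the input size the Dense layer expects.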

The model's layer structure is kept very simple: a single hidden layer, with the number of neurons chosen arbitrarily.

# Build the neural network (MLP)
import time
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

# Sequential object
model = Sequential()

# First layer
model.add(Dense(256, input_shape=(img_shape,), activation='relu'))

# Output layer
model.add(Dense(10, activation='softmax'))

# Compile
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train (GPU: on)
start = time.time()
history = model.fit(
    X_train,
    y_train,
    epochs=100,
    batch_size=128,
    validation_data=(X_test, y_test)
)
end = time.time()

# Display the result
score = model.evaluate(X_test, y_test, verbose=1)

print('accuracy:', score[1])
print('loss    :', score[0])
print('elapsed :', end - start, 'sec')

Running this gave the following result (only the last epoch line is shown).

Epoch 100/100
391/391 [==============================] - 1s 4ms/step - loss: 1.1486 - accuracy: 0.5889 - val_loss: 1.4864 - val_accuracy: 0.4949
313/313 [==============================] - 1s 2ms/step - loss: 1.4864 - accuracy: 0.4949
accuracy: 0.4948999881744385
loss    : 1.4863722324371338
elapsed : 132.7412850856781 sec

The accuracy is about 0.495, so the model gets roughly half of its predictions right.
That is certainly better than pure random guessing, but it is still an unsatisfying number.
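
For context on the numbers in the log, the sketch below works out where 391 and 313 come from and what pure chance would score. It assumes the Keras default batch size of 32 for `model.evaluate` and equally sized classes; the variable names are illustrative.

```python
import math

n_train, n_test, n_classes = 50_000, 10_000, 10
fit_batch_size = 128   # batch_size passed to model.fit
eval_batch_size = 32   # assumed Keras default for model.evaluate

# Number of batches needed to cover every sample once.
steps_fit = math.ceil(n_train / fit_batch_size)
steps_eval = math.ceil(n_test / eval_batch_size)

# Accuracy expected from guessing uniformly among balanced classes.
chance_accuracy = 1 / n_classes

print(steps_fit, steps_eval, chance_accuracy)  # 391 313 0.1
```

So roughly 0.495 is about five times better than the 0.1 chance level.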

Incidentally, the training progress can also be checked from history.

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 15))
plt.plot(history.history['accuracy'], label='train', color='black')
plt.plot(history.history['val_accuracy'], label='Val Acc', color='red')
plt.legend()
plt.grid()
plt.xlabel('epoch')
plt.ylabel('accuracy')

plt.show()
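
Besides plotting, the same per-epoch list can be inspected numerically. The sketch below finds the epoch with the highest validation accuracy; the values are hypothetical placeholders, not results from this article, and in the notebook they would come from `history.history['val_accuracy']`.

```python
import numpy as np

# Hypothetical validation accuracies, one value per epoch (not real results).
val_accuracy = [0.41, 0.45, 0.48, 0.47, 0.49, 0.48]

# argmax is 0-based, while the Keras log counts epochs from 1.
best_epoch = int(np.argmax(val_accuracy)) + 1
best_value = float(np.max(val_accuracy))

print(best_epoch, best_value)  # 5 0.49
```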

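Building the model, part 2: MLP hyperparameter tuning

We want a bit more accuracy from the MLP, so let's tune its hyperparameters.
An MLP has many tunable parameters, such as the number of layers, the batch size, and the optimizer.
Tuning by hand is not impossible, but I would rather not, so let's leave it to the machine.

This time I chose to tune the number of hidden layers, the number of units in each layer, and the optimizer.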
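
To get a feel for what changing the layer count and unit counts does to model size, the following sketch counts the trainable parameters of a fully connected network. `mlp_param_count` is a hypothetical helper written for this illustration; it is not a Keras function.

```python
def mlp_param_count(input_dim, hidden_units, n_classes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    prev = input_dim
    for units in list(hidden_units) + [n_classes]:
        total += prev * units + units  # weight matrix plus bias vector
        prev = units
    return total

print(mlp_param_count(3072, [256], 10))           # 789258
print(mlp_param_count(3072, [512, 200, 50], 10))  # 1686536
```

Most of the parameters sit in the first layer, because the flattened input has 3,072 values.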

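The search can be run with the Hyperopt library, used here through Hyperas, a wrapper that connects Hyperopt to Keras.
It is not installed by default in Kaggle Notebooks, so install it with pip.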

!pip install hyperas

First, the data creation function.

def create_data():
    import numpy as np
    import time
    from tensorflow.keras.datasets import cifar10
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Activation, Dropout
    from tensorflow.keras.utils import to_categorical

    (X_train, y_train), (X_test, y_test) = cifar10.load_data()

    X_train, X_test = X_train.reshape(-1, 3072), X_test.reshape(-1, 3072)
    X_train, X_test = X_train.astype('float32'), X_test.astype('float32')
    X_train, X_test = X_train/255.0, X_test/255.0
    y_train, y_test = to_categorical(y_train), to_categorical(y_test)
    
    return X_train, X_test, y_train, y_test

Next, the function that builds the model.

from hyperopt import hp
from hyperopt import Trials, tpe, STATUS_OK
from hyperas import optim
from hyperas.distributions import choice, uniform

def create_model(X_train, X_test):
    # Sequential object
    model = Sequential()

    # First layer
    model.add(Dense({{choice([256, 512])}}, input_shape=(3072,), activation='relu'))

    # Second and third layers
    # Note: each {{choice(...)}} below is treated as its own search parameter,
    # so the three conditions are sampled independently of one another.
    if {{choice([0, 1, 2])}} == 0:
        pass
    elif {{choice([0, 1, 2])}} == 1:
        model.add(Dense({{choice([100, 200])}}, activation='relu'))
    elif {{choice([0, 1, 2])}} == 2:
        model.add(Dense({{choice([100, 200])}}, activation='relu'))
        model.add(Dense({{choice([25, 50])}}, activation='relu'))

    # Output layer
    model.add(Dense(10, activation='softmax'))

    model.compile(loss='categorical_crossentropy', optimizer={{choice(['adam', 'rmsprop'])}}, metrics=['accuracy'])

    # Train (GPU: on)
    history = model.fit(
        X_train,
        y_train,
        epochs=30,
        batch_size=128,
        validation_data=(X_test, y_test),
        verbose=0
    )
    # Hyperopt minimizes the loss, so return the negated best validation accuracy
    val_acc = np.amax(history.history['val_accuracy'])
    return {'loss': -val_acc, 'status': STATUS_OK, 'model': model}

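In the code above, several parameters are wrapped in double braces in the form {{choice([x, y])}}, for example:

# First layer
model.add(Dense({{choice([256, 512])}}, input_shape=(3072,), activation='relu'))

The search runs over these choices and works out the best combination of parameter values.

Note also that Hyperas requires the model-building part to be wrapped in a function, as shown above.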
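
The idea of "try combinations, keep the one with the smallest loss" can be sketched without any tuning library. The example below uses plain random search rather than the TPE algorithm used in this article, and `toy_objective` returns made-up scores instead of training a model, so every name and number in it is hypothetical.

```python
import random

# Hypothetical search space: each key lists the candidate values to try.
SPACE = {"units": [256, 512], "optimizer": ["adam", "rmsprop"]}

def toy_objective(params):
    """Stand-in for training a model; returns a made-up validation accuracy."""
    score = 0.50 + (0.02 if params["units"] == 512 else 0.0)
    score += 0.01 if params["optimizer"] == "adam" else 0.0
    # The accuracy is negated because the search minimizes the loss.
    return {"loss": -score, "status": "ok", "params": params}

rng = random.Random(0)
trials = []
for _ in range(10):
    params = {name: rng.choice(values) for name, values in SPACE.items()}
    trials.append(toy_objective(params))

# Keep the trial with the smallest loss, i.e. the highest accuracy.
best = min(trials, key=lambda t: t["loss"])
print(best["params"], -best["loss"])
```

This is why the model function returns the negated validation accuracy as its loss: minimizing that value is the same as maximizing accuracy.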

Now run the search with the code below and display the result.

# Run the search
best_run, best_model = optim.minimize(
                model=create_model, 
                data=create_data, 
                algo=tpe.suggest, 
                max_evals=10, 
                eval_space=True, 
                notebook_name='__notebook_source__', 
                trials=Trials()
)

# Check the result
X_train, X_test, y_train, y_test = create_data()
score = best_model.evaluate(X_test, y_test)

print('accuracy:', score[1])
print('loss    :', score[0])

100%|██████████| 10/10 [06:00<00:00, 36.04s/trial, best loss: -0.5304999947547913]
accuracy: 0.5304999947547913
loss    : 1.3674978017807007

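The accuracy went from about 0.495 to about 0.530, so we got a quick improvement.
Although not done this time, Dropout and the batch size can be searched in the same way.

# Search over Dropout
from hyperas.distributions import quniform

# Search from 0.1 to 0.5 in steps of 0.1
model.add(Dropout({{quniform(0.1, 0.5, 0.1)}}))

# Search over the batch size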
history = model.fit(
        X_train,
        y_train,
        epochs=30,
        batch_size={{choice([10, 50, 100])}},
        validation_data=(X_test, y_test),
        verbose=0
)
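
For a stepped range such as "0.1 to 0.5 in steps of 0.1", Hyperopt documents a quantized uniform distribution defined as round(uniform(low, high) / q) * q. The NumPy sketch below only imitates that formula to show which values can come out; `quantized_uniform` is an illustrative helper, not the library function.

```python
import numpy as np

def quantized_uniform(rng, low, high, q):
    """Draw from uniform(low, high), then round to the nearest multiple of q."""
    return np.round(rng.uniform(low, high) / q) * q

rng = np.random.default_rng(0)
samples = [float(quantized_uniform(rng, 0.1, 0.5, 0.1)) for _ in range(1000)]

# Collect the distinct values, rounded to hide floating-point noise.
distinct = sorted({round(s, 1) for s in samples})
print(distinct)
```

Under this formula the two end values are drawn less often than the middle ones, because each end is reached from only half a step of the underlying range.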

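Naturally, the wider the search range, the longer the search takes.
With this code it is roughly 20 minutes.

This is getting a little long, so I will stop here for now.
Next time I will try a CNN and its tuning.

→ I tried it:
Object Recognition with MLP and CNN: Improving Accuracy (Part 2) / Kaggle CIFAR-10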

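¹ In a normal competition, the training data must be split into training and validation sets for training and validation; this is skipped here, and all of the training data is fed to the model.
² To use an internet connection from a Notebook, SMS verification is required the first time.
³ Kaggle Notebooks have a weekly limit on GPU usage time, so be careful not to leave the GPU running.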
