
A tensorflow tutorial for classifying CIFAR-100 [2022]

Posted at 2022-04-10

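What is this?

A tutorial article on multi-class classification with tensorflow using CIFAR-100.
Experiment write-ups using CIFAR-10 are common, so I tried it with CIFAR-100 instead.
That said, I classify into the 20 superclasses, so it isn't much different from classifying CIFAR-10.
▶︎ Vuepress version of this article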

My environment

GPU: a BTO PC with an RTX 3090
python=3.7.11 installed via pyenv
jupyter notebook

Required libraries

I connect to the GPU PC over SSH from VS Code and launch jupyter notebook there.

The requirements.txt is shown below.

absl-py==0.8.1
appnope==0.1.0
astor==0.8.0
astunparse==1.6.3
attrs==19.2.0
backcall==0.1.0
bleach==3.1.0
cachetools==4.1.1
certifi==2020.6.20
chardet==3.0.4
cycler==0.10.0
decorator==4.4.0
defusedxml==0.6.0
entrypoints==0.3
gast==0.3.3
google-auth==1.18.0
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.44.0
h5py==2.10.0
idna==2.10
imageio==2.6.1
importlib-metadata==0.23
ipykernel==5.1.2
ipython==7.8.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
jedi==0.15.1
Jinja2==2.10.3
jsonschema==3.1.1
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.0.0
jupyter-core==4.6.0
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
mistune==0.8.4
more-itertools==7.2.0
music21==5.7.0
nbconvert==5.6.0
nbformat==4.4.0
networkx==2.3
notebook==6.0.1
numpy==1.21.5
oauthlib==3.1.0
opt-einsum==3.1.0
pandas==0.25.1
pandocfilters==1.4.2
parso==0.5.1
pexpect==4.7.0
pickleshare==0.7.5
Pillow==6.2.0
prometheus-client==0.7.1
prompt-toolkit==2.0.10
protobuf==3.10.0
ptyprocess==0.6.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pydot==1.4.1
pydotplus==2.0.2
Pygments==2.4.2
pyparsing==2.4.2
pyrsistent==0.15.4
python-dateutil==2.8.0
pytz==2019.3
PyWavelets==1.3.0
PyYAML==5.1.2
pyzmq==18.1.0
qtconsole==4.5.5
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
scikit-image==0.17.2
scipy==1.4.1
Send2Trash==1.5.0
six==1.12.0
tensorboard==2.2.2
tensorboard-plugin-wit==1.7.0
tensorflow==2.2.0
tensorflow-addons==0.10.0
tensorflow-estimator==2.2.0
termcolor==1.1.0
terminado==0.8.2
testpath==0.4.2
tifffile==2021.11.2
tornado==6.0.3
traitlets==4.3.3
typeguard==2.9.1
urllib3==1.25.9
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.16.0
widgetsnbextension==3.5.1
wrapt==1.11.2
zipp==0.6.0

If you want to reproduce this, running
`pip install -r requirements.txt`
in a python=3.7.11 environment will build the environment.

CUDA 11.5 is installed.

Source code walkthrough

Importing libraries

Import the libraries needed up to the training step. If you use jupyter notebook, copying and pasting each unit into its own cell works nicely.

import numpy as np

from tensorflow.keras.layers import Input, Flatten, Dense, Conv2D, BatchNormalization, LeakyReLU, Dropout, Activation
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.backend as K 

from tensorflow.keras.datasets import cifar100

Setting the number of classes

# Depends on the dataset you train on (20 superclasses here)
NUM_CLASSES = 20

Loading the data

(x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode='coarse')

Let's quickly check the shape of the data.

x_train[0].shape

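Data preprocessing

To train a multi-class classifier, convert the y data into one-hot vectors. A function from keras.utils does this, so it's easy.

# Normalize the data (scale pixel values to the range 0 to 1)
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# One-hot encode the ground-truth labels
y_train = to_categorical(y_train, NUM_CLASSES)
y_test = to_categorical(y_test, NUM_CLASSES)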
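
As a rough illustration of what the preprocessing above does, here is a NumPy-only sketch that repeats the scaling and one-hot steps on a tiny made-up batch. The helper name `one_hot` and the sample values are assumptions for illustration only; in this article the real work is done by `to_categorical`.

```python
import numpy as np

NUM_CLASSES = 20

def one_hot(labels, num_classes):
    """Hypothetical stand-in for to_categorical: integer labels -> one-hot rows."""
    labels = np.asarray(labels, dtype="int64").reshape(-1)  # (N, 1) -> (N,)
    return np.eye(num_classes, dtype="float32")[labels]

# Made-up stand-ins for the CIFAR-100 arrays: 4 images of 32x32x3 uint8 pixels
# and coarse labels shaped (N, 1).
x_dummy = np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
y_dummy = np.array([[0], [19], [7], [7]])

x_scaled = x_dummy.astype("float32") / 255.0  # pixel values now lie in [0, 1]
y_onehot = one_hot(y_dummy, NUM_CLASSES)

print(x_scaled.min() >= 0.0, x_scaled.max() <= 1.0)
print(y_onehot.shape)
print(y_onehot[1])
```

Each row of `y_onehot` has a single 1 at the position of its class index, which is the target format that a softmax output trained with categorical cross-entropy expects.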

Building the layers

input_layer = Input(shape=(32,32,3))
# Convolutional layer (input side)
conv_layer_1 = Conv2D(
    filters = 10
    , kernel_size = (4,4)
    , strides = 2
    , padding = 'same'
    )(input_layer)
# Convolutional layer
conv_layer_2 = Conv2D(
    filters = 20
    , kernel_size = (3,3)
    , strides = 2
    , padding = 'same'
    )(conv_layer_1)
# Flatten layer
flatten_layer = Flatten()(conv_layer_2)
# Fully connected layer (output)
output_layer = Dense(units=NUM_CLASSES, activation = 'softmax')(flatten_layer)

model = Model(input_layer, output_layer)

Quickly check the model information.

model.summary()
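
The numbers that model.summary() prints for this small network can be cross-checked by hand. The sketch below assumes the usual rules: a 'same'-padded convolution outputs ceil(input / stride) pixels per side, and a layer's parameter count is its weights plus one bias per filter or unit. The helper names are made up for illustration.

```python
import math

def conv_same_out(size, stride):
    """Spatial output size of a conv layer with padding='same'."""
    return math.ceil(size / stride)

def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, units):
    """Weights plus one bias per unit."""
    return (in_units + 1) * units

size, channels = 32, 3  # CIFAR images are 32x32 RGB

# conv_layer_1: 10 filters, 4x4 kernel, stride 2
p1 = conv_params(4, 4, channels, 10)
size, channels = conv_same_out(size, 2), 10

# conv_layer_2: 20 filters, 3x3 kernel, stride 2
p2 = conv_params(3, 3, channels, 20)
size, channels = conv_same_out(size, 2), 20

flat = size * size * channels   # length of the Flatten output
p3 = dense_params(flat, 20)     # 20-way softmax output layer

print(size, flat)        # 8 1280
print(p1, p2, p3)        # 490 1820 25620
print(p1 + p2 + p3)      # 27930
```

Almost all of the parameters sit in the final Dense layer, which is typical when a Flatten layer feeds directly into the output.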

Building the model

This is an attempt to improve accuracy by following each layer with a BatchNormalization layer, which normalizes the activations, and a LeakyReLU layer.

input_layer = Input((32,32,3))

x = Conv2D(filters = 32, kernel_size = 3, strides = 1, padding = 'same')(input_layer)
x = BatchNormalization()(x)
x = LeakyReLU()(x)

x = Conv2D(filters = 32, kernel_size = 3, strides = 2, padding = 'same')(x)
x = BatchNormalization()(x)
x = LeakyReLU()(x)

x = Conv2D(filters = 64, kernel_size = 3, strides = 1, padding = 'same')(x)
x = BatchNormalization()(x)
x = LeakyReLU()(x)

x = Conv2D(filters = 64, kernel_size = 3, strides = 1, padding = 'same')(x)
x = BatchNormalization()(x)
x = LeakyReLU()(x)

x = Conv2D(filters = 64, kernel_size = 3, strides = 2, padding = 'same')(x)
x = BatchNormalization()(x)
x = LeakyReLU()(x)

x = Flatten()(x)

x = Dense(128)(x)
x = BatchNormalization()(x)
x = LeakyReLU()(x)
x = Dropout(rate = 0.5)(x)

x = Dense(NUM_CLASSES)(x)
output_layer = Activation('softmax')(x)

model = Model(input_layer, output_layer)

Quickly check the model.

model.summary()
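
To make the roles of BatchNormalization and LeakyReLU a bit more concrete, here is a NumPy sketch of what the two layers compute in training mode. It assumes the Keras defaults of this era (epsilon = 1e-3 for BatchNormalization, alpha = 0.3 for LeakyReLU) and freshly initialized gamma = 1 and beta = 0. The function names are hypothetical, and the moving averages Keras keeps for inference are left out.

```python
import numpy as np

def batch_norm_train(x, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize each channel (last axis) over the batch and spatial axes."""
    axes = tuple(range(x.ndim - 1))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def leaky_relu(x, alpha=0.3):
    """Pass positives through; scale negatives by alpha instead of zeroing them."""
    return np.where(x >= 0, x, alpha * x)

rng = np.random.default_rng(0)
# Made-up activations: a batch of 8 feature maps, 16x16 with 32 channels,
# deliberately shifted and scaled away from zero mean and unit variance.
x = rng.normal(loc=5.0, scale=3.0, size=(8, 16, 16, 32))

h = batch_norm_train(x)
print(np.abs(h.mean(axis=(0, 1, 2))).max())  # per-channel mean, close to 0
print(h.std(axis=(0, 1, 2)).mean())          # per-channel std, close to 1

activated = leaky_relu(np.array([-2.0, 0.0, 3.0]))
print(activated)
```

Normalizing first and applying the activation afterwards keeps the inputs to LeakyReLU centered around zero, so both the positive and the negative branch of the activation get used.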

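Setting the optimizer and compiling

For the optimizer I use the well-known Adam. And then compile. Build the model and compile. Build the model and compile. I mustn't run away, I mustn't run away.

opt = Adam(learning_rate=0.0005)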
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
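
For a feel of what the learning rate of 0.0005 means, here is a sketch of the textbook Adam update applied to a toy one-dimensional problem. The default values for beta1, beta2 and eps are assumed to match the Keras defaults, the function name `adam_step` is made up, and the Keras implementation may differ in small details, so treat this as an illustration rather than a copy of the library code.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.0005, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem (not from this article): minimize f(w) = (w - 3)^2 starting at w = 0
w, m, v = 0.0, 0.0, 0.0
history = []
for t in range(1, 2001):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t)
    history.append(w)

print(history[0])    # after 1 step
print(history[-1])   # after 2000 steps
```

Because Adam divides by the running gradient magnitude, the first step moves the parameter by roughly the learning rate itself, regardless of how large the gradient is. That is why a small value like 0.0005 gives slow but steady progress.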

Training the model

In a few trial runs the accuracy kept rising the longer the model trained,
so this time I train for 100 epochs.

model.fit(x_train
          , y_train
          , batch_size=32
          , epochs=100
          , shuffle=True
          , validation_data = (x_test, y_test))
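
The batch size and epoch count above determine how many weight updates this run performs. A quick back-of-the-envelope sketch, assuming the standard CIFAR-100 split of 50,000 training images:

```python
import math

train_samples = 50_000  # standard CIFAR-100 training split
batch_size = 32
epochs = 100

# The last batch of each epoch is only partially filled, hence the ceil
steps_per_epoch = math.ceil(train_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1563
print(total_updates)    # 156300
```

Note that the test set doubles as the validation data here, so the validation figures reported during training and the final evaluation result come from the same images.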

Evaluating the model

model.evaluate(x_test, y_test, batch_size=1000)

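The result was as follows.
loss: 2.8621 - accuracy: 0.5325

Hmm. That's pretty low.
Maybe there isn't much point in training for as many as 100 epochs.

Tweaking the model architecture, the parameters, or the number of epochs changes the accuracy quite a bit,
so give it a try if you're curious.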
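
To put the loss and accuracy above in context, it can help to compare them with a model that knows nothing. The sketch below computes the cross-entropy of a uniform guess over the 20 superclasses and the corresponding chance accuracy, assuming balanced classes. The reported figures are copied from the result above.

```python
import math

NUM_CLASSES = 20

chance_accuracy = 1 / NUM_CLASSES
uniform_loss = -math.log(1 / NUM_CLASSES)  # cross-entropy of a uniform guess

reported_loss = 2.8621
reported_accuracy = 0.5325

print(f"chance accuracy: {chance_accuracy:.4f}")
print(f"uniform-guess loss: {uniform_loss:.4f}")
print(f"accuracy relative to chance: {reported_accuracy / chance_accuracy:.2f}x")
print(f"loss gap to a uniform guess: {uniform_loss - reported_loss:.4f}")
```

The accuracy is well above chance while the loss is only slightly below that of a uniform guess. One possible reading is that the model is very confident on the examples it gets wrong, which can happen after long training, but that is an interpretation and not something measured in this article.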

Inference

# Labels used for inference (index = coarse label ID)
super_class_names = [
  'aquatic mammals',  # 0
  'fish',  # 1
  'flowers',  # 2
  'food containers',  # 3
  'fruit and vegetables',  # 4
  'household electrical devices',  # 5
  'household furniture',  # 6
  'insects',  # 7
  'large carnivores',  # 8
  'large man-made outdoor things',  # 9
  'large natural outdoor scenes',  # 10
  'large omnivores and herbivores',  # 11
  'medium-sized mammals',  # 12
  'non-insect invertebrates',  # 13
  'people',  # 14
  'reptiles',  # 15
  'small mammals',  # 16
  'trees',  # 17
  'vehicles 1',  # 18
  'vehicles 2',  # 19
]

# Inference
CLASSES = np.array(super_class_names)

preds = model.predict(x_test)
preds_single = CLASSES[np.argmax(preds, axis = -1)]
actual_single = CLASSES[np.argmax(y_test, axis = -1)]
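
The decoding step above simply picks the index with the highest probability and looks up its name. Here is a tiny NumPy sketch with made-up probabilities over three classes (a shortened, illustrative list), which also shows how the same arrays yield an accuracy figure:

```python
import numpy as np

classes = np.array(["fish", "flowers", "trees"])  # shortened, illustrative list

# Made-up softmax outputs for 4 samples and their one-hot ground truth
preds = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3],
                  [0.4, 0.4, 0.2]])
y_true = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0]])

pred_idx = np.argmax(preds, axis=-1)   # index of the most probable class
true_idx = np.argmax(y_true, axis=-1)  # position of the 1 in each one-hot row

pred_names = classes[pred_idx]         # array indexing maps indices to names
true_names = classes[true_idx]
accuracy = np.mean(pred_idx == true_idx)

print(pred_names)
print(true_names)
print(accuracy)
```

When two classes tie, as in the last row, np.argmax returns the first of them.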

Checking the results

import matplotlib.pyplot as plt
# Font setting so that Japanese text can be displayed
plt.rcParams['font.family'] = "Noto Serif CJK JP"

n_to_show = 10
indices = np.random.choice(range(len(x_test)), n_to_show)

fig = plt.figure(figsize=(25, 5))
fig.subplots_adjust(hspace=1, wspace=0.8)

for i, idx in enumerate(indices):
    img = x_test[idx]
    ax = fig.add_subplot(1, n_to_show, i+1)
    ax.axis('off')
    # '==推論結果==' means "prediction", '==正解ラベル==' means "ground-truth label"
    ax.text(0.5, -0.3, '==推論結果==', fontsize=12, ha='center', transform=ax.transAxes)
    ax.text(0.5, -0.6, str(preds_single[idx]), fontsize=12, ha='center', transform=ax.transAxes)
    ax.text(0.5, -0.9, '==正解ラベル==', fontsize=12, ha='center', transform=ax.transAxes)
    ax.text(0.5, -1.2, str(actual_single[idx]), fontsize=12, ha='center', transform=ax.transAxes)
    ax.imshow(img)

The resolution is a bit low, but the results are shown below.
With an accuracy of roughly one half, this looks like a reasonable result.

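Summary

This article walked through the steps of multi-class classification with tensorflow, with a little explanation along the way.
Once the tensorflow environment is set up, you can have a lot of fun tweaking the parameters and layers, so I recommend it.

That's all for this time.

Bonus

I settled on the "Noto Serif CJK JP" font because, using the code below, I picked one that displays both Japanese and English properly.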

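import matplotlib.font_manager
import matplotlib.pyplot as plt

# Collect the unique names of every font matplotlib knows about
fonts = sorted({f.name for f in matplotlib.font_manager.fontManager.ttflist})

plt.figure(figsize=(10, len(fonts) / 4))

# Draw one sample line per font, mixing Japanese text with the font's English name
for i, font in enumerate(fonts):
    plt.text(0, i, f"日本語：{font}", fontname=font)

plt.ylim(0, len(fonts))
plt.axis("off")

plt.show()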
