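These are notes on what I learned from tackling Leaf Classification, a past Kaggle competition.
Problem overview
- Classify leaves into 99 class labels using images and 192-dimensional features.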
Datasets
- 192-dimensional features + images
- Samples per label: 10
- Number of classes: 99
- Training samples: 990 (99 x 10)
- Evaluation (test) samples: 594
Solutions
- Build a network that takes the image and the features as inputs and outputs a 99-class softmax.
- Because there are few images, keep the network small and simple.
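As a rough sketch of the two-input network described above, the following uses the current `tensorflow.keras` functional API rather than the older Keras layer names (`Convolution2D`, `Merge`) that appear in the model summary. The layer sizes follow that summary; the 3x3 kernels and `same` padding are inferred from its parameter counts and output shapes. The ReLU activations, the optimizer, the loss, and the function name `build_model` are assumptions, not details confirmed by the original.

```python
from tensorflow.keras import Model, layers


def build_model(image_size=96, n_features=192, n_classes=99):
    """Small CNN on the image, concatenated with the numeric features."""
    # Image branch: three Conv -> BatchNorm -> Activation blocks,
    # with 2x2 max pooling after the first two blocks only.
    image_in = layers.Input(shape=(image_size, image_size, 1), name="image")
    x = image_in
    for block in range(3):
        x = layers.Conv2D(32, (3, 3), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)  # assumption: activation type is not in the summary
        if block < 2:
            x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Flatten()(x)
    x = layers.Dense(96, activation="relu")(x)  # assumption: activation

    # Numeric branch: the 192 precomputed features are fed in as-is.
    features_in = layers.Input(shape=(n_features,), name="features")

    # Concatenate plays the role of the older Merge layer: 96 + 192 = 288 units.
    merged = layers.Concatenate()([x, features_in])
    out = layers.Dense(n_classes, activation="softmax")(merged)

    model = Model(inputs=[image_in, features_in], outputs=out)
    # Assumption: optimizer and loss are illustrative choices.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_model().summary()` should print a layer listing comparable to the one in the Architecture section, with different layer names.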
Architecture
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 96, 96, 1)     0
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 96, 96, 32)    320         input_1[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 96, 96, 32)    128         convolution2d_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 96, 96, 32)    0           batchnormalization_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 48, 48, 32)    0           activation_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 48, 48, 32)    9248        maxpooling2d_1[0][0]
____________________________________________________________________________________________________
batchnormalization_2 (BatchNorma (None, 48, 48, 32)    128         convolution2d_2[0][0]
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 48, 48, 32)    0           batchnormalization_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 24, 24, 32)    0           activation_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 24, 24, 32)    9248        maxpooling2d_2[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNorma (None, 24, 24, 32)    128         convolution2d_3[0][0]
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 24, 24, 32)    0           batchnormalization_3[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 18432)         0           activation_3[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 96)            1769568     flatten_1[0][0]
____________________________________________________________________________________________________
input_2 (InputLayer)             (None, 192)           0
____________________________________________________________________________________________________
merge_1 (Merge)                  (None, 288)           0           dense_1[0][0]
                                                                   input_2[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 99)            28611       merge_1[0][0]
====================================================================================================
Total params: 1,817,379
Trainable params: 1,817,187
Non-trainable params: 192
____________________________________________________________________________________________________
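The parameter counts in the summary can be reproduced by hand, which also shows where nearly all of the weights sit (the first dense layer). The short check below is only a sketch: the 3x3 kernel size is inferred from the 320 parameters of the first convolution, and the helper names are made up for illustration.

```python
def conv2d_params(kernel, in_channels, out_channels):
    # One kernel x kernel x in_channels filter plus one bias per output channel.
    return (kernel * kernel * in_channels + 1) * out_channels


def batchnorm_params(channels):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable).
    return 4 * channels


def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit.
    return (n_in + 1) * n_out


conv1 = conv2d_params(3, 1, 32)            # 320
conv2 = conv2d_params(3, 32, 32)           # 9248
conv3 = conv2d_params(3, 32, 32)           # 9248
bn = batchnorm_params(32)                  # 128 per layer, three layers
dense1 = dense_params(24 * 24 * 32, 96)    # 1769568 (flattened 18432 inputs)
dense2 = dense_params(96 + 192, 99)        # 28611 (288 merged inputs)

total = conv1 + conv2 + conv3 + 3 * bn + dense1 + dense2
non_trainable = 3 * 2 * 32                 # moving mean/variance of each BatchNorm
trainable = total - non_trainable

print(total, trainable, non_trainable)     # 1817379 1817187 192
```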
Memo
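- Performance is not very good with images alone or with numeric features alone.
- You can learn how to write a custom ImageDataGenerator.
- You can learn how to build a network that mixes image and numeric-feature inputs.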
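The point about a custom generator comes down to keeping each augmented image paired with the numeric features of the same sample. The NumPy-only sketch below illustrates that idea; it is not the author's generator and does not subclass Keras's `ImageDataGenerator`. The function name `mixed_batch_generator`, its parameters, and the random horizontal flip are all assumptions chosen for illustration.

```python
import numpy as np


def mixed_batch_generator(images, features, labels,
                          batch_size=32, augment=True, seed=0):
    """Yield ((image_batch, feature_batch), label_batch) forever.

    images:   array of shape (n, height, width, channels)
    features: array of shape (n, n_features)
    labels:   array of shape (n, n_classes), one-hot encoded
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        # Reshuffle once per pass; the same indices select images,
        # features and labels, so the three stay aligned.
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch_images = images[idx].copy()
            if augment:
                # Illustrative augmentation: flip about half of the images
                # left-right (reverse the width axis).
                flip = rng.random(len(idx)) < 0.5
                batch_images[flip] = batch_images[flip, :, ::-1, :]
            yield (batch_images, features[idx]), labels[idx]
```

Only the images are augmented; the features and labels are indexed with the same shuffled indices, so each batch can be passed to a two-input model without losing the pairing.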