About
An experiment on how the amount of training data affects learning results, using baby classification as the task.
For details of the baby-classification task itself, see the previous article:
https://qiita.com/Phoeboooo/items/2c7457d1bfba514e2dc8
Model
For a fair comparison, the same model is used with both the small and the large dataset.
from keras import models, layers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
Total params: 244,993
Trainable params: 244,993
Non-trainable params: 0
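The parameter total can be sanity-checked by hand: each Conv2D layer has k·k·in·out weights plus one bias per filter, pooling layers contribute nothing, and each Dense layer has in·out weights plus out biases. A quick sketch:

```python
# Recompute the model's parameter count layer by layer.
def conv2d_params(k, c_in, c_out):
    # k*k kernel weights per (input, output) channel pair, plus one bias per filter
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

layer_params = [
    conv2d_params(3, 3, 32),     # Conv2D(32)  -> 896
    conv2d_params(3, 32, 64),    # Conv2D(64)  -> 18,496
    conv2d_params(3, 64, 128),   # Conv2D(128) -> 73,856
    conv2d_params(3, 128, 128),  # Conv2D(128) -> 147,584
    dense_params(128, 32),       # Dense(32)   -> 4,128 (GlobalAveragePooling2D outputs 128 features)
    dense_params(32, 1),         # Dense(1)    -> 33
]
print(sum(layer_params))  # 244993, matching the summary above
```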
With less data
Found 5825 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
Training results
Epoch 1/15
233/233 [==============================] - 546s 2s/step - loss: 0.6685 - acc: 0.6113 - val_loss: 0.6959 - val_acc: 0.5000
Epoch 2/15
233/233 [==============================] - 504s 2s/step - loss: 0.6679 - acc: 0.6117 - val_loss: 0.6951 - val_acc: 0.5000
Epoch 3/15
233/233 [==============================] - 509s 2s/step - loss: 0.6664 - acc: 0.6117 - val_loss: 0.6945 - val_acc: 0.5000
Epoch 4/15
233/233 [==============================] - 512s 2s/step - loss: 0.6676 - acc: 0.6117 - val_loss: 0.6998 - val_acc: 0.5000
acc hovers around 61% throughout, while val_acc sits at exactly 50% and never moves: the model is not learning at all.
With more data
Found 10000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
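As a sanity check on the step counts in the training logs: 233 steps per epoch for 5825 images and 200 steps per epoch for 10000 images are consistent with batch sizes of 25 and 50 respectively (the batch sizes are my inference; the article does not state them):

```python
import math

def steps_per_epoch(n_images, batch_size):
    # Keras runs ceil(n / batch_size) batches to cover the whole set once per epoch
    return math.ceil(n_images / batch_size)

print(steps_per_epoch(5825, 25))   # 233 steps (small-data run)
print(steps_per_epoch(10000, 50))  # 200 steps (large-data run)
```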
Training results
Epoch 1/100
200/200 [==============================] - 605s 1s/step - loss: 0.6939 - acc: 0.5062 - val_loss: 0.6926 - val_acc: 0.5350
Epoch 2/100
200/200 [==============================] - 610s 1s/step - loss: 0.6932 - acc: 0.5129 - val_loss: 0.6921 - val_acc: 0.5213
Epoch 3/100
200/200 [==============================] - 609s 1s/step - loss: 0.6924 - acc: 0.5183 - val_loss: 0.6905 - val_acc: 0.5712
...
Epoch 18/20
200/200 [==============================] - 1177s 6s/step - loss: 0.4492 - acc: 0.7786 - val_loss: 0.2808 - val_acc: 0.8819
Epoch 19/20
200/200 [==============================] - 1168s 6s/step - loss: 0.4490 - acc: 0.7791 - val_loss: 0.2894 - val_acc: 0.8747
Epoch 20/20
200/200 [==============================] - 1177s 6s/step - loss: 0.4438 - acc: 0.7839 - val_loss: 0.2831 - val_acc: 0.8791
*The epoch numbers do not line up because the training logs are shown in abbreviated form.
With data augmentation, val_acc reached 87.91% (82% without augmentation).
Total epochs: 550
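The augmentation code is not shown in the article, but the kind of transforms typically applied with Keras's ImageDataGenerator (random flips and small shifts) can be sketched in plain NumPy. This is a hypothetical illustration, not the author's actual pipeline:

```python
import numpy as np

def augment(img, rng, max_shift=10):
    """Randomly flip an HxWxC image horizontally and shift it a few pixels.
    np.roll wraps pixels around the edge; real pipelines usually pad or
    fill instead, but the effect is similar for small shifts."""
    if rng.random() < 0.5:
        img = img[:, ::-1, :]  # horizontal flip
    dy = int(rng.integers(-max_shift, max_shift + 1))
    dx = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

rng = np.random.default_rng(0)
batch = np.random.rand(4, 150, 150, 3)  # fake batch of 150x150 RGB images
augmented = np.stack([augment(im, rng) for im in batch])
print(augmented.shape)  # (4, 150, 150, 3): shape is preserved
```

Because both transforms only permute pixels, the augmented images keep the same value distribution as the originals; fresh random variants of each image are drawn every epoch, which is what lets the same model keep improving for hundreds of epochs without overfitting as quickly.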
Comparison results
More data clearly helped: going from 5825 to 10000 training images took the same model from chance-level val_acc (50%) to 87.91%. Collecting and organizing data is unglamorous work, but it is hugely important.