
Honestly, it's a weird model... ^^;

Posted at 2018-04-09

While comparing a CapsNet, an AveragePooling version, and a SpatialPyramidPooling version...

...running each one separately was gradually becoming a chore.

So I decided to run the three models at the same time.

Something more or less workable came out of it, so I'll introduce it here.

Program structure

The main model is as follows.
It is keras/examples/cifar10_cnn_capsule.py with a few extra layers added. The original implementation appears to be by the author in China below.
Capsule Implement is from https://github.com/bojone/Capsule/

def model_cifar(input_image=Input(shape=(None, None, 3))):
    # A common Conv2D model
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = BatchNormalization(axis=3)(x)
    x = Dropout(0.5)(x)
    x = AveragePooling2D((2, 2))(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = BatchNormalization(axis=3)(x)
    x = Dropout(0.5)(x)
    x = AveragePooling2D((2, 2))(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    #x = BatchNormalization(axis=3)(x)
    x = Dropout(0.5)(x)
    return x, input_image
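Because the trunk uses only 'same'-padded convolutions, it accepts any input size (hence shape=(None, None, 3)); only the two 2x2 AveragePooling2D layers shrink the spatial dimensions. A tiny helper of my own (not part of the script) to compute the output size:

```python
def trunk_output_hw(h, w):
    """Spatial size of the trunk output: 'same' convs keep H/W,
    each of the two 2x2 average-pooling layers halves them."""
    for _ in range(2):
        h, w = h // 2, w // 2
    return h, w
```

For CIFAR-10's 32x32 images this gives (8, 8), i.e. an 8x8x256 feature map, matching the fixed-size model summary below.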

Calling the model: the three output heads

# SPP
x1, input_image1 = model_cifar(input_image=Input(shape=(None, None, 3)))
x1 = SpatialPyramidPooling([1])(x1)    # [1,2,4]
output1 = Dense(num_classes, activation='softmax')(x1)

# AveragePooling
x2, input_image2 = model_cifar(input_image=Input(shape=(32, 32, 3)))
x2 = AveragePooling2D(pool_size=(2, 2), strides=None, padding='valid',
                      data_format='channels_last')(x2)
x2 = Flatten()(x2)
output2 = Dense(num_classes, activation='softmax')(x2)

# Capsule
x3, input_image3 = model_cifar(input_image=Input(shape=(None, None, 3)))
x3 = Reshape((-1, 128))(x3)
capsule = Capsule(10, 96, 3, True)(x3)  # 16
output3 = Lambda(lambda z: K.sqrt(K.sum(K.square(z), 2)))(capsule)
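The Lambda on the capsule head turns each capsule's 96-dim output vector into a class score via its L2 norm (axis 2, matching K.sum(..., 2)). A NumPy equivalent, for illustration only:

```python
import numpy as np

def capsule_lengths(caps):
    """caps: array of shape (batch, 10, 96). Returns (batch, 10)
    class scores, the L2 norm of each capsule's output vector."""
    return np.sqrt(np.sum(np.square(caps), axis=2))
```

The longest capsule wins, so no softmax is needed on this head.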

Model declaration

model1 = Model(inputs=input_image1, outputs=output1)
model2 = Model(inputs=input_image2, outputs=output2)
model3 = Model(inputs=input_image3, outputs=output3)

# we use a margin loss
model1.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
model2.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
model3.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
#model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model1.summary()
model2.summary()
model3.summary()
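margin_loss itself is not defined in this snippet; in the Keras example it is the Capsule margin loss. A NumPy sketch of that formula (the constants lamb=0.5 and margin=0.6 follow the example; treat the exact values as an assumption):

```python
import numpy as np

def margin_loss(y_true, y_pred, lamb=0.5, margin=0.6):
    """Penalize present classes whose score falls below 1 - margin,
    and absent classes whose score rises above margin."""
    pos = y_true * np.square(np.maximum(0.0, 1.0 - margin - y_pred))
    neg = lamb * (1.0 - y_true) * np.square(np.maximum(0.0, y_pred - margin))
    return np.sum(pos + neg, axis=-1)
```

Unlike cross-entropy, it gives exactly zero loss once every class score is on the right side of its margin, which suits the capsule-length outputs of model3.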

Running this prints the following model summaries:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, None, None, 64)    1792
_________________________________________________________________
conv2d_2 (Conv2D)            (None, None, None, 64)    36928
_________________________________________________________________
batch_normalization_1 (Batch (None, None, None, 64)    256
_________________________________________________________________
dropout_1 (Dropout)          (None, None, None, 64)    0
_________________________________________________________________
average_pooling2d_1 (Average (None, None, None, 64)    0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, None, None, 128)   73856
_________________________________________________________________
conv2d_4 (Conv2D)            (None, None, None, 128)   147584
_________________________________________________________________
batch_normalization_2 (Batch (None, None, None, 128)   512
_________________________________________________________________
dropout_2 (Dropout)          (None, None, None, 128)   0
_________________________________________________________________
average_pooling2d_2 (Average (None, None, None, 128)   0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, None, None, 256)   295168
_________________________________________________________________
conv2d_6 (Conv2D)            (None, None, None, 256)   590080
_________________________________________________________________
dropout_3 (Dropout)          (None, None, None, 256)   0
_________________________________________________________________
spatial_pyramid_pooling_1 (S (None, 5376)              0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                53770
=================================================================
Total params: 1,199,946
Trainable params: 1,199,562
Non-trainable params: 384
_________________________________________________________________
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         (None, 32, 32, 3)         0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 32, 32, 64)        1792
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 32, 32, 64)        36928
_________________________________________________________________
batch_normalization_3 (Batch (None, 32, 32, 64)        256
_________________________________________________________________
dropout_4 (Dropout)          (None, 32, 32, 64)        0
_________________________________________________________________
average_pooling2d_3 (Average (None, 16, 16, 64)        0
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 16, 16, 128)       73856
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 16, 16, 128)       147584
_________________________________________________________________
batch_normalization_4 (Batch (None, 16, 16, 128)       512
_________________________________________________________________
dropout_5 (Dropout)          (None, 16, 16, 128)       0
_________________________________________________________________
average_pooling2d_4 (Average (None, 8, 8, 128)         0
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 8, 8, 256)         295168
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 8, 8, 256)         590080
_________________________________________________________________
dropout_6 (Dropout)          (None, 8, 8, 256)         0
_________________________________________________________________
average_pooling2d_5 (Average (None, 4, 4, 256)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                40970
=================================================================
Total params: 1,187,146
Trainable params: 1,186,762
Non-trainable params: 384
_________________________________________________________________
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_4 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, None, None, 64)    1792
_________________________________________________________________
conv2d_14 (Conv2D)           (None, None, None, 64)    36928
_________________________________________________________________
batch_normalization_5 (Batch (None, None, None, 64)    256
_________________________________________________________________
dropout_7 (Dropout)          (None, None, None, 64)    0
_________________________________________________________________
average_pooling2d_6 (Average (None, None, None, 64)    0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, None, None, 128)   73856
_________________________________________________________________
conv2d_16 (Conv2D)           (None, None, None, 128)   147584
_________________________________________________________________
batch_normalization_6 (Batch (None, None, None, 128)   512
_________________________________________________________________
dropout_8 (Dropout)          (None, None, None, 128)   0
_________________________________________________________________
average_pooling2d_7 (Average (None, None, None, 128)   0
_________________________________________________________________
conv2d_17 (Conv2D)           (None, None, None, 256)   295168
_________________________________________________________________
conv2d_18 (Conv2D)           (None, None, None, 256)   590080
_________________________________________________________________
dropout_9 (Dropout)          (None, None, None, 256)   0
_________________________________________________________________
reshape_1 (Reshape)          (None, None, 128)         0
_________________________________________________________________
capsule_1 (Capsule)          (None, 10, 96)            122880
_________________________________________________________________
lambda_1 (Lambda)            (None, 10)                0
=================================================================
Total params: 1,269,056
Trainable params: 1,268,672
Non-trainable params: 384
_________________________________________________________________

Results

Not using data augmentation.
*****j=  0
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 35s 694us/step - loss: 0.3911 - acc: 0.5035 - val_loss: 0.3821 - val_acc: 0.5240
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 589us/step - loss: 0.4079 - acc: 0.4789 - val_loss: 0.3943 - val_acc: 0.5071
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 38s 764us/step - loss: 0.4780 - acc: 0.3260 - val_loss: 0.4251 - val_acc: 0.3842
*****j=  1
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 32s 646us/step - loss: 0.2779 - acc: 0.6537 - val_loss: 0.2571 - val_acc: 0.6793
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 578us/step - loss: 0.2888 - acc: 0.6391 - val_loss: 0.3097 - val_acc: 0.6087
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 38s 754us/step - loss: 0.3588 - acc: 0.4833 - val_loss: 0.3383 - val_acc: 0.5223
*****j=  2
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 32s 649us/step - loss: 0.2315 - acc: 0.7109 - val_loss: 0.2631 - val_acc: 0.6706
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 583us/step - loss: 0.2414 - acc: 0.6993 - val_loss: 0.2923 - val_acc: 0.6395
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 38s 751us/step - loss: 0.3023 - acc: 0.5718 - val_loss: 0.3411 - val_acc: 0.5316
*****j=  3
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 32s 649us/step - loss: 0.2044 - acc: 0.7456 - val_loss: 0.2445 - val_acc: 0.7010
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 575us/step - loss: 0.2099 - acc: 0.7408 - val_loss: 0.2086 - val_acc: 0.7432
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 38s 752us/step - loss: 0.2595 - acc: 0.6397 - val_loss: 0.2517 - val_acc: 0.6451

Incidentally,


# SPP
x, input_image1 = model_cifar(input_image=Input(shape=(32, 32, 3)))
x1 = SpatialPyramidPooling([1])(x)    # [1,2,4]
output1 = Dense(num_classes, activation='softmax')(x1)

# AveragePooling
# x2, input_image2 = model_cifar(input_image=Input(shape=(32, 32, 3)))
x2 = AveragePooling2D(pool_size=(2, 2), strides=None, padding='valid',
                      data_format='channels_last')(x)
x2 = Flatten()(x2)
output2 = Dense(num_classes, activation='softmax')(x2)

# Capsule
# x3, input_image3 = model_cifar(input_image=Input(shape=(None, None, 3)))
x3 = Reshape((-1, 128))(x)
capsule = Capsule(10, 96, 3, True)(x3)  # 16
output3 = Lambda(lambda z: K.sqrt(K.sum(K.square(z), 2)))(capsule)

model1 = Model(inputs=input_image1, outputs=output1)
model2 = Model(inputs=input_image1, outputs=output2)
model3 = Model(inputs=input_image1, outputs=output3)

When I tried to substitute a single model_cifar call for the three separate ones, as above, the result was as follows.

Not using data augmentation.
*****j=  0
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 35s 690us/step - loss: 0.4076 - acc: 0.4796 - val_loss: 0.4171 - val_acc: 0.4970
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 586us/step - loss: 0.3161 - acc: 0.6046 - val_loss: 0.2713 - val_acc: 0.6659
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 38s 756us/step - loss: nan - acc: 0.1011 - val_loss: nan - val_acc: 0.1000
*****j=  1
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 32s 647us/step - loss: 0.6400 - acc: 0.0982 - val_loss: 0.6400 - val_acc: 0.1000
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 29s 573us/step - loss: 0.6400 - acc: 0.0977 - val_loss: 0.6400 - val_acc: 0.1000
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 37s 745us/step - loss: nan - acc: 0.1000 - val_loss: nan - val_acc: 0.1000
*****j=  2

This seems to be because the weight tensors end up shared across the three models, and the CapsNet head then fails to converge.
So I'll compare the three models simultaneously using the separate-call form shown earlier.

The code is here:
cifar10_cnn_capsule_alt.py
To run it you also need Capsule.py and SpatialPyramidPooling.py, placed in the same location.
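The "*****j=" markers in the logs come from an outer loop that trains each of the three models for one epoch per iteration. A hypothetical sketch of that loop (the function name, batch size, and history bookkeeping are my assumptions, not taken from the script):

```python
def train_all(models, x_train, y_train, x_test, y_test, n_iters=100):
    """Train every model one epoch per outer iteration j, so the three
    heads see the data in lockstep; collect each model's val accuracy."""
    history = [[] for _ in models]
    for j in range(n_iters):
        print('*****j= ', j)
        for i, model in enumerate(models):
            h = model.fit(x_train, y_train, batch_size=128, epochs=1,
                          validation_data=(x_test, y_test))
            history[i].append(h.history['val_acc'][0])
    return history
```

Interleaving one epoch at a time keeps the three training curves directly comparable at every j.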

And the result after running 100 iterations:

*****j=  90
Epoch 1/1
391/391 [==============================] - 32s 81ms/step - loss: 0.0736 - acc: 0.9123 - val_loss: 0.1039 - val_acc: 0.8839
Epoch 1/1
391/391 [==============================] - 28s 72ms/step - loss: 0.0682 - acc: 0.9174 - val_loss: 0.0971 - val_acc: 0.8917
Epoch 1/1
391/391 [==============================] - 37s 94ms/step - loss: 0.0652 - acc: 0.9255 - val_loss: 0.0896 - val_acc: 0.8961
*****j=  97
Epoch 1/1
391/391 [==============================] - 32s 81ms/step - loss: 0.0705 - acc: 0.9149 - val_loss: 0.0991 - val_acc: 0.8866
Epoch 1/1
391/391 [==============================] - 28s 72ms/step - loss: 0.0630 - acc: 0.9235 - val_loss: 0.1006 - val_acc: 0.8872
Epoch 1/1
391/391 [==============================] - 37s 94ms/step - loss: 0.0630 - acc: 0.9291 - val_loss: 0.0934 - val_acc: 0.8945
*****j=  98
Epoch 1/1
391/391 [==============================] - 32s 81ms/step - loss: 0.0693 - acc: 0.9163 - val_loss: 0.1010 - val_acc: 0.8856
Epoch 1/1
391/391 [==============================] - 28s 73ms/step - loss: 0.0637 - acc: 0.9232 - val_loss: 0.1153 - val_acc: 0.8740
Epoch 1/1
391/391 [==============================] - 37s 94ms/step - loss: 0.0633 - acc: 0.9286 - val_loss: 0.0858 - val_acc: 0.9019
*****j=  99
Epoch 1/1
391/391 [==============================] - 32s 82ms/step - loss: 0.0676 - acc: 0.9191 - val_loss: 0.1062 - val_acc: 0.8799
Epoch 1/1
391/391 [==============================] - 28s 72ms/step - loss: 0.0629 - acc: 0.9252 - val_loss: 0.1087 - val_acc: 0.8794
Epoch 1/1
391/391 [==============================] - 37s 95ms/step - loss: 0.0630 - acc: 0.9298 - val_loss: 0.0926 - val_acc: 0.8959

In other words, after 100 epochs the CapsNet model came out on top with a best accuracy of 90.19%, SPP [1,2,4] peaked at 88.66%, and the plain AveragePooling version at 89.17%.

With the paper's 90% mark cleared, it does look like CapsNet has some effect.
