Caffe: testing a CNN on defect images

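  • This assumes Caffe is already installed. Add "~/caffe-windows/Build/x64/Release" to PATH; convert_imageset.exe, compute_image_mean.exe, and caffe.exe are all in that directory.
  • The task is defect detection, so it is a two-class classification (normal or abnormal). The images used are easy to classify.
  • Since this is only a functional test, the LeNet model is used.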


Windows 7 64bit + RAM 16GB + Cygwin (CPU mode)


 ├ data/
 │  └ mytest/
 │     ├ train.txt #created by ../../example/mytest/train.sh
 │     ├ test.txt #created by ../../example/mytest/train.sh
 │     ├ answer.txt #created by ../../example/mytest/deploy.sh
 │     ├ mean.binaryproto #created by ../../example/mytest/train.sh
 │     ├ mytest_train_leveldb/ #created by ../../example/mytest/train.sh
 │     ├ mytest_test_leveldb/ #created by ../../example/mytest/train.sh
 │     ├ mytest_deploy_leveldb/ #created by ../../example/mytest/eval.sh
 │     └ src/
 │        ├ 0001.bmp
 │        ├ 0002.bmp
 │        └ (rest omitted. Training: 10000 train images and 2000 test images. Evaluation: 7000 eval images)
 ├ example/
 │  ├ eval.sh
 │  ├ train.sh
 │  └ mytest/
 │     ├ solver.prototxt #used by ../../example/mytest/train.sh (training)
 │     ├ train_test.prototxt #used by ../../example/mytest/train.sh (training)
 │     ├ mytest_iter_100000.caffemodel #generated after ../../example/mytest/train.sh (training) finishes
 │     └ eval.prototxt #used by ../../example/mytest/eval.sh (evaluation)
 └ Build/
    └ x64/
       └ Release/ #add this directory to PATH



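  • Prepare the image data (30 px square) in advance. For this test the images were created artificially by pasting an anomaly at a random position, so that the defects are easy to see.
    Normal and abnormal images were prepared at a ratio of roughly 1:1.1. All images are grayscale.

  • The label lists (train.txt, test.txt, answer.txt, and so on) contain one image file name and its class label per line: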

12001.bmp 0
12002.bmp 1
12003.bmp 1
12004.bmp 0
12005.bmp 0
12006.bmp 0
12007.bmp 0
12008.bmp 1
12009.bmp 0
12010.bmp 0
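
As a minimal sketch of how such a list could be produced and checked, the following Python snippet writes "<file name> <label>" lines and reads them back. The helper names write_label_list and read_label_list, the temporary output path, and the sample pairs are assumptions for illustration; they are not part of the original setup.

```python
import os
import tempfile


def write_label_list(path, samples):
    """Write one '<file name> <label>' line per sample."""
    # newline="\n" keeps LF line endings even when run on Windows.
    with open(path, "w", newline="\n") as f:
        for name, label in samples:
            f.write(f"{name} {label}\n")


def read_label_list(path):
    """Parse a label list back into (file name, integer label) pairs."""
    pairs = []
    with open(path) as f:
        for line in f:
            if line.strip():
                # Split on the last whitespace so the label is always last.
                name, label = line.rsplit(None, 1)
                pairs.append((name, int(label)))
    return pairs


# Hypothetical samples. File names are relative to the image root folder
# that is passed to convert_imageset as its first argument.
samples = [("12001.bmp", 0), ("12002.bmp", 1), ("12003.bmp", 1)]
list_path = os.path.join(tempfile.mkdtemp(), "train.txt")
write_label_list(list_path, samples)
print(read_label_list(list_path))
```

Reading the file back is a cheap way to catch malformed lines before convert_imageset is run on the list.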


  • solver.prototxt is written as follows
net: "examples/mytest/lenet/train_test.prototxt"
test_iter: 100 #NOTE that ${test_iter} * ${batch_size} = {number of test data}
test_interval: 500 
base_lr: 0.01 #adjust value
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 500 iterations
display: 500
# The maximum number of iterations
max_iter: 100000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mytest/lenet/mytest"
# solver mode: CPU or GPU
solver_mode: CPU
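
The note on test_iter can be checked with a few lines of Python. This is only a sketch: the function name required_test_iter is hypothetical, the 2000 test images come from the directory listing, and the batch size of 20 comes from the TEST-phase data layer in train_test.prototxt.

```python
def required_test_iter(num_test_images, test_batch_size):
    """Return the test_iter that covers every test image exactly once."""
    if num_test_images % test_batch_size != 0:
        raise ValueError("batch size does not divide the number of test images")
    return num_test_images // test_batch_size


num_test_images = 2000  # test images listed in test.txt
test_batch_size = 20    # batch_size of the TEST-phase data layer
print(required_test_iter(num_test_images, test_batch_size))  # 100
```

If the batch size does not divide the number of test images, some images would be skipped or visited twice in one test pass, which is why the sketch raises an error in that case.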
  • train_test.prototxt is written as follows
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625 # = 1/256; brings 8-bit pixel values (0-255) down to about unit range
    mean_file: "data/mytest/mean.binaryproto"
  }
  data_param {
    source: "data/mytest/mytest_train_leveldb"
    batch_size: 100
    backend: LEVELDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
    mean_file: "data/mytest/mean.binaryproto"
  }
  data_param {
    source: "data/mytest/mytest_test_leveldb"
    batch_size: 20 # see the test_iter parameter in solver.prototxt
    backend: LEVELDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2 # two-class problem (normal or abnormal), so num_output is changed from LeNet's original 10 to 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
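
To see how a 30 px input flows through these layers, the spatial size after each convolution and pooling layer can be worked out by hand. The sketch below does this in Python, assuming the usual Caffe behaviour of rounding down for convolution and rounding up for pooling, with no padding as in the definition above. The function names conv_out and pool_out are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Output size of a pooling layer (rounds up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


size = 30                      # input images are 30 px square
conv1 = conv_out(size, 5)      # kernel_size: 5, stride: 1
pool1 = pool_out(conv1, 2, 2)  # kernel_size: 2, stride: 2
conv2 = conv_out(pool1, 5)
pool2 = pool_out(conv2, 2, 2)
ip1_inputs = 50 * pool2 * pool2  # conv2 has num_output: 50
print(conv1, pool1, conv2, pool2, ip1_inputs)  # 26 13 9 5 1250
```

Under these assumptions ip1 receives 1250 inputs per image, compared with 800 for the 28 px MNIST images LeNet was designed for, so no layer parameters need to change for the 30 px input.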


Run the following in a terminal from the caffe-windows directory. (The usage of convert_imageset and compute_image_mean can be displayed with the --help option.) Unless the --gray option is given, the images are treated as RGB.

$ ./examples/mytest/learn.sh
Creating leveldb...
I1019 04:52:57.107475  9896 convert_imageset.cpp:86] A total of 10000 images.
I1019 04:52:57.127476  9896 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_train_leveldb
I1019 04:52:59.663730  9896 convert_imageset.cpp:144] Processed 1000 files.
I1019 04:53:00.843848  9896 convert_imageset.cpp:144] Processed 2000 files.
I1019 04:53:02.155979  9896 convert_imageset.cpp:144] Processed 3000 files.
I1019 04:53:03.460110  9896 convert_imageset.cpp:144] Processed 4000 files.
I1019 04:53:05.120276  9896 convert_imageset.cpp:144] Processed 5000 files.
I1019 04:53:06.133378  9896 convert_imageset.cpp:144] Processed 6000 files.
I1019 04:53:07.143478  9896 convert_imageset.cpp:144] Processed 7000 files.
I1019 04:53:09.223686  9896 convert_imageset.cpp:144] Processed 8000 files.
I1019 04:53:10.565820  9896 convert_imageset.cpp:144] Processed 9000 files.
I1019 04:53:11.995964  9896 convert_imageset.cpp:144] Processed 10000 files.
I1019 04:53:12.098973  6780 convert_imageset.cpp:86] A total of 2000 images.
I1019 04:53:12.110975  6780 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_test_leveldb
I1019 04:53:12.894053  6780 convert_imageset.cpp:144] Processed 1000 files.
I1019 04:53:13.695133  6780 convert_imageset.cpp:144] Processed 2000 files.
Computing image mean...
I1019 04:53:13.972162   752 caffe.cpp:179] Use CPU.
I1019 04:53:13.973161   752 solver.cpp:48] Initializing solver from parameters:
test_iter: 10
test_interval: 500
I1019 05:16:55.221271  8968 solver.cpp:404]     Test net output #0: accuracy = 1
I1019 05:16:55.221271  8968 solver.cpp:404]     Test net output #1: loss = 0.00398486 (* 1 = 0.00398486 loss)
I1019 05:16:55.348284  8968 solver.cpp:228] Iteration 9500, loss = 0.000340846
I1019 05:16:55.348284  8968 solver.cpp:244]     Train net output #0: loss = 0.000340768 (* 1 = 0.000340768 loss)
I1019 05:16:55.348284  8968 sgd_solver.cpp:106] Iteration 9500, lr = 0.00606002
I1019 05:18:03.680117  8968 solver.cpp:454] Snapshotting to binary proto file examples/mytest/lenet/mytest_iter_10000.caffemodel
I1019 05:18:03.696118  8968 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mytest/lenet/mytest_iter_10000.solverstate
I1019 05:18:03.783128  8968 solver.cpp:317] Iteration 10000, loss = 0.000334693
I1019 05:18:03.783128  8968 solver.cpp:337] Iteration 10000, Testing net (#0)
I1019 05:18:03.861135  8968 solver.cpp:404]     Test net output #0: accuracy = 1
I1019 05:18:03.861135  8968 solver.cpp:404]     Test net output #1: loss = 0.00207814 (* 1 = 0.00207814 loss)
I1019 05:18:03.861135  8968 solver.cpp:322] Optimization Done.
I1019 05:18:03.861135  8968 caffe.cpp:223] Optimization Done.

Because the images are easy to classify, test net output #0 reports accuracy = 1.
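
The learning rate printed in the log (Iteration 9500, lr = 0.00606002) can be reproduced from the solver settings. The sketch below evaluates Caffe's "inv" policy, lr = base_lr * (1 + gamma * iter)^(-power), with the values from solver.prototxt. The function name inv_learning_rate is hypothetical.

```python
def inv_learning_rate(base_lr, gamma, power, iteration):
    """Learning rate under the "inv" policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# base_lr, gamma and power as set in solver.prototxt
lr = inv_learning_rate(base_lr=0.01, gamma=0.0001, power=0.75, iteration=9500)
print(f"{lr:.8f}")  # should be close to the 0.00606002 shown in the log
```

Small differences in the last digits are expected, because Caffe computes the value in single precision.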

  • Contents of train.sh
#!/usr/bin/env sh
# This script converts the mytest data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND, and then trains the net.

# Assumed values, inferred from the paths used in the prototxt files and the logs.
DATA=data/mytest
BACKEND="leveldb"

echo "Creating ${BACKEND}..."

rm -rf $DATA/mytest_train_${BACKEND}
rm -rf $DATA/mytest_test_${BACKEND}
rm -rf $DATA/mean.binaryproto

# -gray: store the images as single-channel grayscale
convert_imageset.exe $DATA/src/ \
  $DATA/train.txt $DATA/mytest_train_${BACKEND} -backend=${BACKEND} -gray
# -gray: store the images as single-channel grayscale
convert_imageset.exe $DATA/src/ \
  $DATA/test.txt $DATA/mytest_test_${BACKEND} -backend=${BACKEND} -gray

echo "Computing image mean..."

compute_image_mean.exe -backend=${BACKEND} \
  $DATA/mytest_train_${BACKEND} $DATA/mean.binaryproto
echo "Done."

caffe train --solver=examples/mytest/lenet/solver.prototxt
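
The mean file produced by compute_image_mean is used by the transform_param block of the data layers: each pixel has the training-set mean for that position subtracted and is then multiplied by scale. The following is a minimal pure-Python sketch of that arithmetic on tiny made-up "images". The function names pixel_mean and transform and the 2x2 sample data are assumptions for illustration only.

```python
def pixel_mean(images):
    """Per-pixel mean over a list of equally sized, flattened images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def transform(image, mean, scale):
    """Subtract the mean image, then apply the scale factor."""
    return [(p - m) * scale for p, m in zip(image, mean)]


# Two hypothetical 2x2 grayscale images, flattened row by row.
images = [[0, 128, 255, 64],
          [0, 128, 1, 192]]
mean = pixel_mean(images)
scaled = transform(images[0], mean, scale=0.00390625)  # scale from train_test.prototxt
print(mean)
print(scaled)
```

Because the mean is subtracted first, the network input is centred around zero rather than lying in the 0 to 1 range.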



$ ./examples/mytest/eval.sh 7000
Creating leveldb...
I1019 05:20:49.808728  7768 convert_imageset.cpp:86] A total of 7000 images.
I1019 05:20:49.819730  7768 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_deploy_leveldb
I1019 05:20:50.637811  7768 convert_imageset.cpp:144] Processed 1000 files.
I1019 05:20:51.431890  7768 convert_imageset.cpp:144] Processed 2000 files.
I1019 05:20:52.224969  7768 convert_imageset.cpp:144] Processed 3000 files.
I1019 05:20:53.003047  7768 convert_imageset.cpp:144] Processed 4000 files.
I1019 05:20:53.798127  7768 convert_imageset.cpp:144] Processed 5000 files.
I1019 05:20:54.590206  7768 convert_imageset.cpp:144] Processed 6000 files.
I1019 05:20:55.389286  7768 convert_imageset.cpp:144] Processed 7000 files.
I1019 05:20:55.488296  2020 caffe.cpp:247] Use CPU.
I1019 05:20:55.493296  2020 net.cpp:49] Initializing net from parameters:
name: "LeNet"
state {
  phase: TEST
layer {

I1019 05:20:15.893337  9280 caffe.cpp:276] Batch 6998, accuracy = 1
I1019 05:20:15.893337  9280 caffe.cpp:276] Batch 6998, loss = 1.3113e-006
I1019 05:20:15.894337  9280 caffe.cpp:276] Batch 6999, accuracy = 1
I1019 05:20:15.894337  9280 caffe.cpp:276] Batch 6999, loss = 0.00046085
I1019 05:20:15.894337  9280 caffe.cpp:281] Loss: 0.00181827
I1019 05:20:15.894337  9280 caffe.cpp:293] accuracy = 1
I1019 05:20:15.894337  9280 caffe.cpp:293] loss = 0.00181827 (* 1 = 0.00181827 loss)


$ ./examples/mytest/eval.sh 7000
Creating leveldb...
I1019 05:22:51.454891  9516 convert_imageset.cpp:86] A total of 7000 images.
I1019 05:22:51.466892  9516 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_deploy_leveldb
I1019 05:22:52.261972  9516 convert_imageset.cpp:144] Processed 1000 files.
I1019 05:22:53.054051  9516 convert_imageset.cpp:144] Processed 2000 files.
I1019 05:22:53.842130  9516 convert_imageset.cpp:144] Processed 3000 files.
I1019 05:22:54.529199  9516 convert_imageset.cpp:144] Processed 4000 files.
I1019 05:22:55.265272  9516 convert_imageset.cpp:144] Processed 5000 files.
I1019 05:22:56.077353  9516 convert_imageset.cpp:144] Processed 6000 files.
I1019 05:22:56.871433  9516 convert_imageset.cpp:144] Processed 7000 files.
I1019 05:22:57.544500  9368 caffe.cpp:247] Use CPU.
I1019 05:22:57.548501  9368 net.cpp:49] Initializing net from parameters:
name: "LeNet"
state {
  phase: TEST
layer {
  name: "test0"
I1019 05:22:57.598506  9368 net.cpp:219] label_test0_1_split does not need backward computation.
I1019 05:22:57.598506  9368 net.cpp:219] test0 does not need backward computation.
I1019 05:22:57.598506  9368 net.cpp:261] This network produces output accuracy
I1019 05:22:57.598506  9368 net.cpp:261] This network produces output loss
I1019 05:22:57.598506  9368 net.cpp:274] Network initialization done.
I1019 05:22:57.604506  9368 net.cpp:752] Ignoring source layer mnist
I1019 05:22:57.604506  9368 caffe.cpp:253] Running for 7000 iterations.
I1019 05:22:57.608507  9368 caffe.cpp:276] Batch 0, accuracy = 0
I1019 05:22:57.608507  9368 caffe.cpp:276] Batch 0, loss = 12.3964
I1019 05:22:57.613507  9368 caffe.cpp:276] Batch 1, accuracy = 1
I1019 05:25:11.542899  9588 caffe.cpp:276] Batch 6998, accuracy = 1
I1019 05:25:11.542899  9588 caffe.cpp:276] Batch 6998, loss = 1.3113e-006
I1019 05:25:11.543900  9588 caffe.cpp:276] Batch 6999, accuracy = 1
I1019 05:25:11.543900  9588 caffe.cpp:276] Batch 6999, loss = 0.00046085
I1019 05:25:11.543900  9588 caffe.cpp:281] Loss: 0.00358918
I1019 05:25:11.543900  9588 caffe.cpp:293] accuracy = 0.999857
I1019 05:25:11.543900  9588 caffe.cpp:293] loss = 0.00358918 (* 1 = 0.00358918 loss)


I1019 05:22:57.608507  9368 caffe.cpp:276] Batch 0, accuracy = 0
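
The final accuracy of 0.999857 in the second run is consistent with exactly one misclassified image. The sketch below averages per-batch accuracies the way the summary line does, assuming one image per batch (the per-batch values shown in the log are all 0 or 1) and that Batch 0 was the only miss. The function name mean_accuracy is hypothetical.

```python
def mean_accuracy(batch_accuracies):
    """Average of the per-batch accuracies over all test iterations."""
    return sum(batch_accuracies) / len(batch_accuracies)


# Assumption: 7000 batches of one image each, with only Batch 0 wrong.
batch_accuracies = [0.0] + [1.0] * 6999
print(f"{mean_accuracy(batch_accuracies):.6f}")  # 0.999857
```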


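  • The contents of eval.sh are as follows.
#!/usr/bin/env sh
# This script converts the mytest evaluation data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND, and then runs the test.

# Assumed values, inferred from the paths used in the prototxt files and the logs.
EXAMPLE=examples/mytest
DATA=data/mytest
BACKEND="leveldb"

echo "Creating ${BACKEND}..."
rm -rf $DATA/mytest_deploy_${BACKEND}

convert_imageset.exe $DATA/src/ \
  $DATA/answer.txt $DATA/mytest_deploy_${BACKEND} -backend=${BACKEND} -gray

# $1: number of test iterations (for example 7000)
caffe test -model $EXAMPLE/lenet/eval.prototxt -weights $EXAMPLE/lenet/mytest_iter_100000.caffemodel -iterations $1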
  • The contents of eval.prototxt are as follows.
name: "mytest"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape: {
      dim: 1 # batch size: number of images evaluated at a time
      dim: 1 # number of colour channels
      dim: 30 # height
      dim: 30 # width
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2 # two-class problem (normal or abnormal), so num_output is changed from LeNet's original 10 to 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
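
The final Softmax layer turns the two raw scores from ip2 into class probabilities that sum to 1. A minimal Python sketch of that computation follows. The function name softmax and the example scores are made up for illustration and do not come from the trained model.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum first avoids overflow in exp().
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical ip2 outputs for one image: [score for class 0, score for class 1]
probs = softmax([2.0, -1.0])
print(probs)
```

The predicted class is the index of the larger probability, which is the same as the index of the larger raw score.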







  Flags from ..\..\tools\convert_imageset.cpp:
    -backend (The backend {lmdb, leveldb} for storing the result) type: string
      default: "lmdb"
    -check_size (When this option is on, check that all the datum have the same
      size) type: bool default: false
    -encode_type (Optional: What type should we encode the image as
      ('png','jpg',...).) type: string default: ""
    -encoded (When this option is on, the encoded image will be save in datum)
      type: bool default: false
    -gray (When this option is on, treat images as grayscale ones) type: bool
      default: false
    -resize_height (Height images are resized to) type: int32 default: 0
    -resize_width (Width images are resized to) type: int32 default: 0
    -shuffle (Randomly shuffle the order of images and their labels) type: bool
      default: false


  Flags from ..\..\tools\compute_image_mean.cpp:
    -backend (The backend {leveldb, lmdb} containing the images) type: string
      default: "lmdb"


  Flags from ..\..\tools\caffe.cpp:
    -gpu (Optional; run in GPU mode on given device IDs separated by ','.Use
      '-gpu all' to run on all available GPUs. The effective training batch
      size is multiplied by the number of devices.) type: string default: ""
    -iterations (The number of iterations to run.) type: int32 default: 50
    -model (The model definition protocol buffer text file.) type: string
      default: ""
    -sighup_effect (Optional; action to take when a SIGHUP signal is received:
      snapshot, stop or none.) type: string default: "snapshot"
    -sigint_effect (Optional; action to take when a SIGINT signal is received:
      snapshot, stop or none.) type: string default: "stop"
    -snapshot (Optional; the snapshot solver state to resume training.)
      type: string default: ""
    -solver (The solver definition protocol buffer text file.) type: string
      default: ""
    -weights (Optional; the pretrained weights to initialize finetuning,
      separated by ','. Cannot be set simultaneously with snapshot.)
      type: string default: ""

