CNN Operation Test on Defect Images
Prerequisites
- Caffe is assumed to be installed already. Add "~/caffe-windows/Build/x64/Release" to PATH; convert_imageset.exe, compute_image_mean.exe, and caffe.exe are located in this directory.
- Since these are defect images, the task is 2-class classification (normal or defect). Easy-to-classify images are used.
- As this is only an operation test, the LeNet model is used.
Environment
Windows 7 64-bit + 16 GB RAM + Cygwin (CPU mode)
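Since everything below assumes the Release directory is on PATH, it can be worth confirming up front that the three tools are actually reachable. A minimal sketch (`check_tool` is a hypothetical helper, not part of Caffe):

```shell
# check_tool NAME: report whether NAME is reachable on PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

for tool in convert_imageset.exe compute_image_mean.exe caffe.exe; do
  check_tool "$tool"
done
```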
Directory layout
caffe-windows/
├ data/
│ └ mytest/
│   ├ train.txt # used by ../../examples/mytest/train.sh
│   ├ test.txt # used by ../../examples/mytest/train.sh
│   ├ answer.txt # used by ../../examples/mytest/eval.sh
│   ├ mean.binaryproto # created by ../../examples/mytest/train.sh
│   ├ mytest_train_leveldb/ # created by ../../examples/mytest/train.sh
│   ├ mytest_test_leveldb/ # created by ../../examples/mytest/train.sh
│   ├ mytest_deploy_leveldb/ # created by ../../examples/mytest/eval.sh
│   └ src/
│     ├ 0001.bmp
│     ├ 0002.bmp
│     └ (rest omitted; training: 10000 train images, 2000 test images; evaluation: 7000 eval images)
├ examples/
│ └ mytest/
│   ├ train.sh
│   ├ eval.sh
│   └ lenet/
│     ├ solver.prototxt # used by train.sh (training)
│     ├ train_test.prototxt # used by train.sh (training)
│     ├ mytest_iter_100000.caffemodel # generated by train.sh after training
│     └ eval.prototxt # used by eval.sh (evaluation)
└ Build/
  └ x64/
    └ Release/ # add this directory to PATH
Training
Preparing for training
Prepare the image data (30 px square). For this test, defect images were generated artificially by attaching anomalies at random positions, so that classification is easy.
Images were prepared at a normal:defect ratio of roughly 1:1.1, all grayscale. The label files (train.txt, test.txt, answer.txt, etc.) look like:
12001.bmp 0
12002.bmp 1
12003.bmp 1
12004.bmp 0
12005.bmp 0
12006.bmp 0
12007.bmp 0
12008.bmp 1
12009.bmp 0
12010.bmp 0
(..以下省略)
i.e. one "<filename> <label>" pair per line.
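If a single master label list is available, the train/test split can be produced with standard tools. A minimal sketch (`split_labels` is a hypothetical helper; the file names and the 10000/2000 split are assumptions matching this setup):

```shell
# split_labels ALL TRAIN TEST N_TRAIN N_TEST:
# shuffle the master list ALL ("<filename> <label>" per line), write the
# first N_TRAIN lines to TRAIN and the next N_TEST lines to TEST.
split_labels() {
  shuf "$1" > "$1.shuffled"
  head -n "$4" "$1.shuffled" > "$2"
  tail -n +"$(( $4 + 1 ))" "$1.shuffled" | head -n "$5" > "$3"
  rm -f "$1.shuffled"
}

# e.g.: split_labels all.txt train.txt test.txt 10000 2000
```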
- solver.prototxt is written as follows:
net: "examples/mytest/lenet/train_test.prototxt"
test_iter: 100 # NOTE: test_iter * batch_size = number of test images (100 * 20 = 2000)
test_interval: 500
base_lr: 0.01 # tune this value for your data
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 500
# The maximum number of iterations
max_iter: 100000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mytest/lenet/mytest"
# solver mode: CPU or GPU
solver_mode: CPU
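The constraint noted on test_iter can be checked directly: one test pass must cover every test image, so test_iter is the test-image count divided by the TEST batch size (2000 images and batch size 20 in this setup):

```shell
# test_iter * batch_size must equal the number of test images
TEST_IMAGES=2000   # number of test images in this setup
BATCH_SIZE=20      # batch_size of the TEST Data layer in train_test.prototxt
echo $(( TEST_IMAGES / BATCH_SIZE ))   # -> 100, the test_iter value above
```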
- train_test.prototxt is written as follows:
name: "LeNet"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625 # = 1/256; scales raw 8-bit pixel values down to the [0, 1) range
mean_file: "data/mytest/mean.binaryproto"
}
data_param {
source: "data/mytest/mytest_train_leveldb"
batch_size: 100
backend: LEVELDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
mean_file: "data/mytest/mean.binaryproto"
}
data_param {
source: "data/mytest/mytest_test_leveldb"
batch_size: 20 # see test_iter in solver.prototxt (100 * 20 = 2000 test images)
backend: LEVELDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2 # NOTE: 2-class classification (normal or defect), so num_output is 2 instead of LeNet's default 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
Running the training
Run the following in a terminal from the caffe-windows directory. (The flags of convert_imageset and compute_image_mean can be viewed with the --help option.) Note that without the --gray option, images are read as RGB.
$ ./examples/mytest/train.sh
Creating leveldb...
I1019 04:52:57.107475 9896 convert_imageset.cpp:86] A total of 10000 images.
I1019 04:52:57.127476 9896 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_train_leveldb
I1019 04:52:59.663730 9896 convert_imageset.cpp:144] Processed 1000 files.
I1019 04:53:00.843848 9896 convert_imageset.cpp:144] Processed 2000 files.
I1019 04:53:02.155979 9896 convert_imageset.cpp:144] Processed 3000 files.
I1019 04:53:03.460110 9896 convert_imageset.cpp:144] Processed 4000 files.
I1019 04:53:05.120276 9896 convert_imageset.cpp:144] Processed 5000 files.
I1019 04:53:06.133378 9896 convert_imageset.cpp:144] Processed 6000 files.
I1019 04:53:07.143478 9896 convert_imageset.cpp:144] Processed 7000 files.
I1019 04:53:09.223686 9896 convert_imageset.cpp:144] Processed 8000 files.
I1019 04:53:10.565820 9896 convert_imageset.cpp:144] Processed 9000 files.
I1019 04:53:11.995964 9896 convert_imageset.cpp:144] Processed 10000 files.
I1019 04:53:12.098973 6780 convert_imageset.cpp:86] A total of 2000 images.
I1019 04:53:12.110975 6780 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_test_leveldb
I1019 04:53:12.894053 6780 convert_imageset.cpp:144] Processed 1000 files.
I1019 04:53:13.695133 6780 convert_imageset.cpp:144] Processed 2000 files.
Computing image mean...
Done.
I1019 04:53:13.972162 752 caffe.cpp:179] Use CPU.
I1019 04:53:13.973161 752 solver.cpp:48] Initializing solver from parameters:
test_iter: 10
test_interval: 500
(..中略)
I1019 05:16:55.221271 8968 solver.cpp:404] Test net output #0: accuracy = 1
I1019 05:16:55.221271 8968 solver.cpp:404] Test net output #1: loss = 0.00398486 (* 1 = 0.00398486 loss)
I1019 05:16:55.348284 8968 solver.cpp:228] Iteration 9500, loss = 0.000340846
I1019 05:16:55.348284 8968 solver.cpp:244] Train net output #0: loss = 0.000340768 (* 1 = 0.000340768 loss)
I1019 05:16:55.348284 8968 sgd_solver.cpp:106] Iteration 9500, lr = 0.00606002
I1019 05:18:03.680117 8968 solver.cpp:454] Snapshotting to binary proto file examples/mytest/lenet/mytest_iter_10000.caffemodel
I1019 05:18:03.696118 8968 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mytest/lenet/mytest_iter_10000.solverstate
I1019 05:18:03.783128 8968 solver.cpp:317] Iteration 10000, loss = 0.000334693
I1019 05:18:03.783128 8968 solver.cpp:337] Iteration 10000, Testing net (#0)
I1019 05:18:03.861135 8968 solver.cpp:404] Test net output #0: accuracy = 1
I1019 05:18:03.861135 8968 solver.cpp:404] Test net output #1: loss = 0.00207814 (* 1 = 0.00207814 loss)
I1019 05:18:03.861135 8968 solver.cpp:322] Optimization Done.
I1019 05:18:03.861135 8968 caffe.cpp:223] Optimization Done.
Since the images are easy to classify, the result is Test net output #0: accuracy = 1.
- Contents of train.sh:
#!/usr/bin/env sh
# This script converts the mytest data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
DATA=data/mytest
#BUILD=Build/x64/Release
BACKEND="leveldb"
echo "Creating ${BACKEND}..."
rm -rf $DATA/mytest_train_${BACKEND}
rm -rf $DATA/mytest_test_${BACKEND}
rm -rf $DATA/mean.binaryproto
# -gray option: treat images as grayscale
convert_imageset.exe $DATA/src/ \
$DATA/train.txt $DATA/mytest_train_${BACKEND} -backend=${BACKEND} -gray
# -gray option: treat images as grayscale
convert_imageset.exe $DATA/src/ \
$DATA/test.txt $DATA/mytest_test_${BACKEND} -backend=${BACKEND} -gray
echo "Computing image mean..."
compute_image_mean.exe -backend=${BACKEND} \
$DATA/mytest_train_${BACKEND} $DATA/mean.binaryproto
echo "Done."
caffe train --solver=examples/mytest/lenet/solver.prototxt
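Before running train.sh, it can also be worth verifying that every image referenced in the label lists actually exists under data/mytest/src, since entries pointing at missing files will cause convert_imageset to fail to read them. A sketch (`count_missing` is a hypothetical helper):

```shell
# count_missing LIST DIR: count lines of LIST ("<filename> <label>" per
# line) whose image file does not exist under DIR
count_missing() {
  missing=0
  while read -r f label; do
    [ -f "$2/$f" ] || missing=$(( missing + 1 ))
  done < "$1"
  echo "$missing"
}

# e.g.: count_missing data/mytest/train.txt data/mytest/src
```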
Evaluation
Run the following:
$ ./examples/mytest/eval.sh 7000
Creating leveldb...
I1019 05:20:49.808728 7768 convert_imageset.cpp:86] A total of 7000 images.
I1019 05:20:49.819730 7768 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_deploy_leveldb
I1019 05:20:50.637811 7768 convert_imageset.cpp:144] Processed 1000 files.
I1019 05:20:51.431890 7768 convert_imageset.cpp:144] Processed 2000 files.
I1019 05:20:52.224969 7768 convert_imageset.cpp:144] Processed 3000 files.
I1019 05:20:53.003047 7768 convert_imageset.cpp:144] Processed 4000 files.
I1019 05:20:53.798127 7768 convert_imageset.cpp:144] Processed 5000 files.
I1019 05:20:54.590206 7768 convert_imageset.cpp:144] Processed 6000 files.
I1019 05:20:55.389286 7768 convert_imageset.cpp:144] Processed 7000 files.
I1019 05:20:55.488296 2020 caffe.cpp:247] Use CPU.
I1019 05:20:55.493296 2020 net.cpp:49] Initializing net from parameters:
name: "LeNet"
state {
phase: TEST
}
layer {
(..中略)
I1019 05:20:15.893337 9280 caffe.cpp:276] Batch 6998, accuracy = 1
I1019 05:20:15.893337 9280 caffe.cpp:276] Batch 6998, loss = 1.3113e-006
I1019 05:20:15.894337 9280 caffe.cpp:276] Batch 6999, accuracy = 1
I1019 05:20:15.894337 9280 caffe.cpp:276] Batch 6999, loss = 0.00046085
I1019 05:20:15.894337 9280 caffe.cpp:281] Loss: 0.00181827
I1019 05:20:15.894337 9280 caffe.cpp:293] accuracy = 1
I1019 05:20:15.894337 9280 caffe.cpp:293] loss = 0.00181827 (* 1 = 0.00181827 loss)
Changing the ground-truth label on the first line of answer.txt produces output like this:
$ ./examples/mytest/eval.sh 7000
Creating leveldb...
I1019 05:22:51.454891 9516 convert_imageset.cpp:86] A total of 7000 images.
I1019 05:22:51.466892 9516 db_leveldb.cpp:18] Opened leveldb data/mytest/mytest_deploy_leveldb
I1019 05:22:52.261972 9516 convert_imageset.cpp:144] Processed 1000 files.
I1019 05:22:53.054051 9516 convert_imageset.cpp:144] Processed 2000 files.
I1019 05:22:53.842130 9516 convert_imageset.cpp:144] Processed 3000 files.
I1019 05:22:54.529199 9516 convert_imageset.cpp:144] Processed 4000 files.
I1019 05:22:55.265272 9516 convert_imageset.cpp:144] Processed 5000 files.
I1019 05:22:56.077353 9516 convert_imageset.cpp:144] Processed 6000 files.
I1019 05:22:56.871433 9516 convert_imageset.cpp:144] Processed 7000 files.
I1019 05:22:57.544500 9368 caffe.cpp:247] Use CPU.
I1019 05:22:57.548501 9368 net.cpp:49] Initializing net from parameters:
name: "LeNet"
state {
phase: TEST
}
layer {
name: "test0"
(..省略)
I1019 05:22:57.598506 9368 net.cpp:219] label_test0_1_split does not need backward computation.
I1019 05:22:57.598506 9368 net.cpp:219] test0 does not need backward computation.
I1019 05:22:57.598506 9368 net.cpp:261] This network produces output accuracy
I1019 05:22:57.598506 9368 net.cpp:261] This network produces output loss
I1019 05:22:57.598506 9368 net.cpp:274] Network initialization done.
I1019 05:22:57.604506 9368 net.cpp:752] Ignoring source layer mnist
I1019 05:22:57.604506 9368 caffe.cpp:253] Running for 7000 iterations.
I1019 05:22:57.608507 9368 caffe.cpp:276] Batch 0, accuracy = 0
I1019 05:22:57.608507 9368 caffe.cpp:276] Batch 0, loss = 12.3964
I1019 05:22:57.613507 9368 caffe.cpp:276] Batch 1, accuracy = 1
(..中略)
I1019 05:25:11.542899 9588 caffe.cpp:276] Batch 6998, accuracy = 1
I1019 05:25:11.542899 9588 caffe.cpp:276] Batch 6998, loss = 1.3113e-006
I1019 05:25:11.543900 9588 caffe.cpp:276] Batch 6999, accuracy = 1
I1019 05:25:11.543900 9588 caffe.cpp:276] Batch 6999, loss = 0.00046085
I1019 05:25:11.543900 9588 caffe.cpp:281] Loss: 0.00358918
I1019 05:25:11.543900 9588 caffe.cpp:293] accuracy = 0.999857
I1019 05:25:11.543900 9588 caffe.cpp:293] loss = 0.00358918 (* 1 = 0.00358918 loss)
Since the output now contains
I1019 05:22:57.608507 9368 caffe.cpp:276] Batch 0, accuracy = 0
for the flipped label, LeNet appears to be working correctly.
- Contents of eval.sh:
#!/usr/bin/env sh
# This script converts the evaluation data into lmdb/leveldb format
# (depending on $BACKEND) and then runs caffe test.
EXAMPLE=examples/mytest
DATA=data/mytest
BUILD=Build/x64/Release
BACKEND="leveldb"
echo "Creating ${BACKEND}..."
rm -rf $DATA/mytest_deploy_${BACKEND}
convert_imageset.exe $DATA/src/ \
$DATA/answer.txt $DATA/mytest_deploy_${BACKEND} -backend=${BACKEND} -gray
caffe test -model $EXAMPLE/lenet/eval.prototxt -weights $EXAMPLE/lenet/mytest_iter_100000.caffemodel -iterations $1
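eval.sh takes the iteration count as $1 (7000 above). Since eval.prototxt declares a batch size of 1, one iteration processes one image, so the argument can be derived from the label list instead of hard-coded. A sketch (`eval_iterations` is a hypothetical helper):

```shell
# eval_iterations LIST: number of evaluation images = number of lines in
# LIST, since eval.prototxt processes one image per iteration (batch size 1)
eval_iterations() {
  wc -l < "$1"
}

# e.g.: ./examples/mytest/eval.sh "$(eval_iterations data/mytest/answer.txt)"
```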
- Contents of eval.prototxt:
name: "mytest"
layer {
name: "data"
type: "Input"
top: "data"
input_param {
shape: {
dim: 1 # batch size (images evaluated at a time)
dim: 1 # number of color channels
dim: 30 # height
dim: 30 # width
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2 # NOTE: 2-class classification (normal or defect), so num_output is 2 instead of LeNet's default 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "ip2"
top: "prob"
}
References
Reference URLs
http://tutorial.caffe.berkeleyvision.org/tutorial/interfaces.html
http://qiita.com/wyamamo/items/1875561b030f7ff42617
Caffe tool specifications
Each tool's flags can be viewed with the --help option.
convert_imageset
Flags from ..\..\tools\convert_imageset.cpp:
  -backend (The backend {lmdb, leveldb} for storing the result) type: string default: "lmdb"
  -check_size (When this option is on, check that all the datum have the same size) type: bool default: false
  -encode_type (Optional: What type should we encode the image as ('png','jpg',...).) type: string default: ""
  -encoded (When this option is on, the encoded image will be save in datum) type: bool default: false
  -gray (When this option is on, treat images as grayscale ones) type: bool default: false
  -resize_height (Height images are resized to) type: int32 default: 0
  -resize_width (Width images are resized to) type: int32 default: 0
  -shuffle (Randomly shuffle the order of images and their labels) type: bool default: false
compute_image_mean
Flags from ..\..\tools\compute_image_mean.cpp:
  -backend (The backend {leveldb, lmdb} containing the images) type: string default: "lmdb"
caffe
Flags from ..\..\tools\caffe.cpp:
  -gpu (Optional; run in GPU mode on given device IDs separated by ','. Use '-gpu all' to run on all available GPUs. The effective training batch size is multiplied by the number of devices.) type: string default: ""
  -iterations (The number of iterations to run.) type: int32 default: 50
  -model (The model definition protocol buffer text file.) type: string default: ""
  -sighup_effect (Optional; action to take when a SIGHUP signal is received: snapshot, stop or none.) type: string default: "snapshot"
  -sigint_effect (Optional; action to take when a SIGINT signal is received: snapshot, stop or none.) type: string default: "stop"
  -snapshot (Optional; the snapshot solver state to resume training.) type: string default: ""
  -solver (The solver definition protocol buffer text file.) type: string default: ""
  -weights (Optional; the pretrained weights to initialize finetuning, separated by ','. Cannot be set simultaneously with snapshot.) type: string default: ""