Tensorflow-bin
TPU-MobilenetSSD
1.Introduction
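Generating the .tflite file succeeded, but the final step failed.
Update 2019.03.22: The procedure appears to have been wrong, so I will post a correction in a separate article at a later date. Reference information is here.
The very last step, compiling to a TPU model, fails with the error Uncaught application failure.
The compiler on the Coral side appears to raise an unhandled exception and terminate abnormally.
Everything up to that point went smoothly without a hitch, so this is a real shame.
Support for anything other than the initially supported models is marked "Coming Soon!!", so perhaps this is to be expected.
This time I tried transfer learning from COCO to VOC.
It is frustrating, so once the "Coming Soon!!" label is removed I would like to try again with the model generated here.
For now, I will keep tinkering with the speed-tuned Tensorflow Lite build for RaspberryPi3.
This brings back the nightmare of NCSDK, and I can only hope it does not turn into a framework with no flexibility at all.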
2.Environment
- Ubuntu 16.04 x86_64
- Core i7 Gen8
- GeForce GTX 1070
- Tensorflow-GPU v1.12.0
- CUDA 9.0
- cuDNN 7
- Pascal VOC 2012 Dataset
- Netron 2.8.1
3.Procedure
$ sudo apt-get install protobuf-compiler python-pil python-lxml python-tk
$ pip3 install --user Cython
$ pip3 install --user contextlib2
$ pip3 install --user jupyter
$ pip3 install --user matplotlib
$ cd ~
$ git clone https://github.com/tensorflow/models.git
$ cd models/research
### VOCtrainval_11-May-2012.tar <--- 1.86GB
$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1rATNHizJdVHnaJtt-hW9MOgjxoaajzdh" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1rATNHizJdVHnaJtt-hW9MOgjxoaajzdh" -o VOCtrainval_11-May-2012.tar
# Extract the data.
$ tar -xvf VOCtrainval_11-May-2012.tar;rm VOCtrainval_11-May-2012.tar
$ protoc object_detection/protos/*.proto --python_out=.
$ python3 object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=VOCdevkit \
--year=VOC2012 \
--set=train \
--output_path=pascal_train.record
$ python3 object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=VOCdevkit \
--year=VOC2012 \
--set=val \
--output_path=pascal_val.record
The label map for the PASCAL VOC data set can be found at object_detection/data/pascal_label_map.pbtxt
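For orientation, the conversion script reads each VOC XML annotation and stores bounding boxes as coordinates normalized by the image size. The following is only a minimal sketch of that idea using the standard library. The sample annotation and the function name `parse_voc_annotation` are made up for illustration; the real script uses lxml and its own helpers.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed VOC-style annotation used only for illustration.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>400</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>100</xmin><ymin>80</ymin><xmax>300</xmax><ymax>320</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return a list of boxes with coordinates normalized to the range [0, 1]."""
    root = ET.fromstring(xml_text)
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.find("name").text,
            # x coordinates are divided by the image width, y by the height.
            "xmin": float(bndbox.find("xmin").text) / width,
            "ymin": float(bndbox.find("ymin").text) / height,
            "xmax": float(bndbox.find("xmax").text) / width,
            "ymax": float(bndbox.find("ymax").text) / height,
        })
    return boxes


if __name__ == "__main__":
    for box in parse_voc_annotation(SAMPLE_XML):
        print(box)
```

Each class name is then mapped to an integer ID through the label map before being written into the record.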
$ cd ~/models/research
$ mkdir data
$ mkdir -p models/model/train
$ mkdir -p models/model/eval
$ cp object_detection/data/pascal_label_map.pbtxt data
$ mv pascal_train.record data
$ mv pascal_val.record data
In the object_detection/samples/configs folder, there are skeleton object_detection configuration files.
$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=10GqUhvgkEAT4JiV44AxqySqhHvc9DFsa" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=10GqUhvgkEAT4JiV44AxqySqhHvc9DFsa" -o ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
$ tar -zxvf ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz;rm ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
$ cp object_detection/samples/configs/ssdlite_mobilenet_v2_coco.config models/model/ssdlite_mobilenet_v2_voc.config
$ nano models/model/ssdlite_mobilenet_v2_voc.config
# SSDLite with Mobilenet v2 configuration for Pascal VOC Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
  ssd {
    num_classes: 20
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        use_depthwise: true
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2'
      min_depth: 16
      depth_multiplier: 1.0
      use_depthwise: true
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 3
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
train_config: {
  batch_size: 4
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  # Note: The below line limits the training process to 50K steps. Because this
  # is far fewer than decay_steps, it effectively bypasses the learning rate
  # schedule (the learning rate will never decay). Remove the below line to
  # train indefinitely.
  num_steps: 50000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "data/pascal_train.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
}
eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/pascal_val.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}
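To see why the learning rate never decays with these settings, here is a small sketch of the exponential decay formula using the values from the config above. It assumes the schedule uses staircase behaviour (which I believe is the default for exponential_decay_learning_rate, but treat that as an assumption); the function name `exponential_decay` is my own and not part of the Object Detection API.

```python
def exponential_decay(step, initial_lr=0.004, decay_steps=800720,
                      decay_factor=0.95, staircase=True):
    """Learning rate at a given step: initial_lr * decay_factor ** (step / decay_steps)."""
    if staircase:
        # Staircase mode only drops the rate once per full decay_steps interval.
        exponent = step // decay_steps
    else:
        exponent = step / decay_steps
    return initial_lr * decay_factor ** exponent


if __name__ == "__main__":
    for step in (0, 50000, 800720):
        print(step, exponential_decay(step), exponential_decay(step, staircase=False))
```

With num_steps: 50000 the staircase exponent stays at 0 for the whole run, so the rate remains 0.004. Even without staircase the rate would only drop by well under 1% over 50K steps.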
When running locally, the models/research and models/research/slim directories should be appended to PYTHONPATH. This can be done by running the following from models/research.
$ export PYTHONPATH=`pwd`:`pwd`/slim:$PYTHONPATH
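As a quick sanity check that the export took effect, something like the following sketch can be run with python3 from the same shell. The helper name `missing_modules` is hypothetical. It assumes that `object_detection` lives directly under models/research and that `nets` is one of the packages under slim; `pycocotools` is only installed in a later step, so it is expected to show up as missing until then.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the current path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["object_detection", "nets", "pycocotools"])
    if missing:
        print("Not importable yet:", ", ".join(missing))
    else:
        print("All modules were found on PYTHONPATH.")
```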
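A local training job can be run with the following commands. The COCO API is built and installed first, since the Object Detection API uses it for evaluation metrics.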
$ git clone https://github.com/pdollar/coco.git
$ cd coco/PythonAPI
$ make -j8
$ sudo make install
$ sudo python3 setup.py install
# From the models/research/ directory
$ cd ../..
$ PIPELINE_CONFIG_PATH=models/model/ssdlite_mobilenet_v2_voc.config
$ MODEL_DIR=models/model/train
$ NUM_TRAIN_STEPS=50000
$ SAMPLE_1_OF_N_EVAL_EXAMPLES=1
$ python3 object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--alsologtostderr
$ MODEL_DIR=models/model/train
$ tensorboard --logdir=${MODEL_DIR}
$ mkdir -p models/model/train/tf
$ python3 object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path models/model/ssdlite_mobilenet_v2_voc.config \
--trained_checkpoint_prefix models/model/train/model.ckpt-48323 \
--output_directory models/model/train/tf
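The step number in model.ckpt-48323 is simply what my training run happened to save last, so it will differ from run to run. As a rough sketch, the latest checkpoint prefix can be looked up like this. The helper `latest_checkpoint_prefix` is hypothetical and assumes the usual TensorFlow 1.x naming, where each checkpoint has a model.ckpt-<step>.index file.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None if absent."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(train_dir, "model.ckpt-%d" % max(steps))


if __name__ == "__main__":
    # Directory name taken from the commands in this article.
    train_dir = "models/model/train"
    if os.path.isdir(train_dir):
        print(latest_checkpoint_prefix(train_dir))
```

The printed value is what goes into --trained_checkpoint_prefix.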
$ mkdir -p models/model/train/tflite
$ python3 object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path models/model/ssdlite_mobilenet_v2_voc.config \
--trained_checkpoint_prefix models/model/train/model.ckpt-48323 \
--output_directory models/model/train/tflite \
--config_override " \
model{ \
ssd{ \
post_processing { \
batch_non_max_suppression { \
score_threshold: 0.0 \
iou_threshold: 0.5 \
} \
} \
} \
}"
$ sudo apt install -y libc-ares-dev
$ git clone https://github.com/PINTO0309/Bazel_bin.git
$ Bazel_bin/0.19.2/Ubuntu1604_x86_64/install.sh
$ git clone -b v1.12.0 https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git checkout -b v1.12.0
$ bazel clean
$ bazel build tensorflow/tools/graph_transforms:summarize_graph
$ bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
--in_graph=/home/<username>/models/research/models/model/train/tflite/tflite_graph.pb
Found 1 possible inputs: (name=normalized_input_image_tensor, type=float(1), shape=[1,320,320,3])
No variables spotted.
Found 1 possible outputs: (name=TFLite_Detection_PostProcess, op=TFLite_Detection_PostProcess)
Found 3414298 (3.41M) const parameters, 0 (0) variable parameters, and 0 control_edges
Op types used: 491 Identity, 420 Const, 76 FusedBatchNorm, 59 Relu6, 55 Conv2D,
33 DepthwiseConv2dNative, 12 BiasAdd, 12 Reshape, 10 Add, 2 ConcatV2, 1 Placeholder,
1 RealDiv, 1 Sigmoid, 1 Squeeze, 1 TFLite_Detection_PostProcess
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- \
--graph=/home/<username>/models/research/models/model/train/tflite/tflite_graph.pb \
--show_flops \
--input_layer=normalized_input_image_tensor \
--input_layer_type=float \
--input_layer_shape=1,320,320,3 \
--output_layer=TFLite_Detection_PostProcess
INPUT NODE = normalized_input_image_tensor
OUTPUT NODE = TFLite_Detection_PostProcess
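If you would rather not read the node names off the log by eye, they can be pulled out of the summarize_graph output with a small script. This is just a sketch: the sample text is copied from the output above, and `extract_io_names` is a helper name I made up.

```python
import re

# Lines copied from the summarize_graph output shown in this article.
SAMPLE_OUTPUT = """\
Found 1 possible inputs: (name=normalized_input_image_tensor, type=float(1), shape=[1,320,320,3])
No variables spotted.
Found 1 possible outputs: (name=TFLite_Detection_PostProcess, op=TFLite_Detection_PostProcess)
"""


def extract_io_names(text):
    """Return (input_names, output_names) found in summarize_graph output."""
    inputs, outputs = [], []
    for line in text.splitlines():
        # Each candidate node is printed as "(name=<node>, ...)".
        names = re.findall(r"\(name=([^,\s)]+)", line)
        if "possible inputs" in line:
            inputs.extend(names)
        elif "possible outputs" in line:
            outputs.extend(names)
    return inputs, outputs


if __name__ == "__main__":
    input_names, output_names = extract_io_names(SAMPLE_OUTPUT)
    print("INPUT NODE =", ", ".join(input_names))
    print("OUTPUT NODE =", ", ".join(output_names))
```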
$ cd /home/<username>/models/research
$ tflite_convert \
--output_file=models/model/train/tflite/ssdlite_mobilenet_v2_voc.tflite \
--graph_def_file=models/model/train/tflite/tflite_graph.pb \
--inference_type=QUANTIZED_UINT8 \
--input_shapes=1,320,320,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays=TFLite_Detection_PostProcess \
--default_ranges_min=0 \
--default_ranges_max=6 \
--mean_values=128 \
--std_dev_values=127 \
--allow_custom_ops
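The meaning of --mean_values and --std_dev_values is easier to see with numbers. As far as I understand the converter documentation, a quantized input is mapped back to a real value as (quantized - mean_value) / std_dev_value, and --default_ranges_min / --default_ranges_max supply an assumed range for activations that carry no min/max information. The sketch below only illustrates that arithmetic; both function names are my own.

```python
def dequantize_input(quantized, mean_value=128.0, std_dev_value=127.0):
    """Map a uint8 input value to the real value the float model would see."""
    return (quantized - mean_value) / std_dev_value


def default_range_scale(range_min=0.0, range_max=6.0, levels=256):
    """Step size when the assumed range [range_min, range_max] is split into uint8 levels."""
    return (range_max - range_min) / (levels - 1)


if __name__ == "__main__":
    for q in (0, 1, 128, 255):
        print(q, "->", dequantize_input(q))
    print("step for default range:", default_range_scale())
```

With mean 128 and std dev 127, pixel values 1 to 255 map onto -1.0 to 1.0, which matches the float input range implied by the name normalized_input_image_tensor. The default range of 0 to 6 corresponds to the Relu6 activations in the graph. Note that these default ranges are only a stand-in for real quantization statistics, so accuracy may suffer.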
4.Reference articles
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/preparing_inputs.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_pets.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/configuring_jobs.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
https://coral.withgoogle.com/web-compiler/
https://github.com/tensorflow/models/issues/5808