
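I wanted to squeeze a bit more speed out of the Edge TPU Accelerator, so on the off chance it would work I generated a .tflite for MobileNetv2-SSDLite (Pascal VOC) and tried to compile it into a TPU model (Part 1)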

Posted at 2019-03-21, last updated at 2019-03-24

Tensorflow-bin　

TPU-MobilenetSSD　

１．Introduction

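I succeeded in generating the .tflite file, but failed at the final step.
Added 2019.03.22: The procedure appears to have been wrong, so I will correct it in a separate article later. Reference information is here.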

The very last step, compiling to a TPU model, fails with the error Uncaught application failure.
It appears that the compiler on the Coral side raises an unhandled exception and terminates abnormally.

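Everything up to that point went smoothly without a hitch... which makes it all the more disappointing...
Support for anything other than the initially supported models is marked "Coming Soon!!", so perhaps this is only to be expected.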

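This time I tried transfer learning from COCO to VOC.
It is frustrating, so once the "Coming Soon!!" label comes off I would like to verify again using the model generated here.
For the time being, I will keep tinkering with the speed-tuned Tensorflow Lite build for RaspberryPi3.
It brings back the nightmare of the NCSDK days, but I can only hope this does not turn into a framework that is useless for anything beyond its narrow purpose.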

２．Environment

Ubuntu 16.04 x86_64
Core i7 (8th Gen)
GeForce GTX 1070
Tensorflow-GPU v1.12.0
CUDA 9.0
cuDNN 7
Pascal VOC 2012 Dataset
Netron 2.8.1

３．Procedure

$ sudo apt-get install protobuf-compiler python-pil python-lxml python-tk
$ pip3 install --user Cython
$ pip3 install --user contextlib2
$ pip3 install --user jupyter
$ pip3 install --user matplotlib

$ cd ~
$ git clone https://github.com/tensorflow/models.git
$ cd models/research

Download_VOC_2012_datasets

### VOCtrainval_11-May-2012.tar <--- 1.86GB

$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1rATNHizJdVHnaJtt-hW9MOgjxoaajzdh" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1rATNHizJdVHnaJtt-hW9MOgjxoaajzdh" -o VOCtrainval_11-May-2012.tar

# Extract the data.
$ tar -xvf VOCtrainval_11-May-2012.tar;rm VOCtrainval_11-May-2012.tar
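The awk one-liner above pulls the confirmation code out of curl's cookie jar: it prints the last field of any line containing `_warning_`. As a rough illustration of that step only, here is a small Python sketch. The function name `extract_confirm_token` and the sample cookie line are my own assumptions, and unlike awk (which prints every matching line) it returns just the first match.

```python
from typing import Optional


def extract_confirm_token(cookie_text: str) -> Optional[str]:
    """Return the last whitespace-separated field of the first line
    containing '_warning_', or None when no such line exists."""
    for line in cookie_text.splitlines():
        if "_warning_" in line:
            fields = line.split()
            if fields:
                return fields[-1]
    return None


# Hypothetical cookie-jar content in curl's Netscape format (values are made up).
sample = (
    "# Netscape HTTP Cookie File\n"
    "#HttpOnly_.drive.google.com\tTRUE\t/uc\tTRUE\t0\tdownload_warning_123\tAbCd\n"
)

print(extract_confirm_token(sample))
```

If the function returns None, the cookie jar did not contain a warning cookie, and the second curl call would be sent with an empty `confirm` value.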

Generating_the_PASCAL_VOC_TFRecord_files

$ protoc object_detection/protos/*.proto --python_out=.
$ python3 object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit \
    --year=VOC2012 \
    --set=train \
    --output_path=pascal_train.record

$ python3 object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit \
    --year=VOC2012 \
    --set=val \
    --output_path=pascal_val.record
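Before moving on, it can be reassuring to check that the two .record files actually contain examples. The sketch below counts records using only the standard library, relying on the TFRecord framing (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC). The function name `count_tfrecords` is my own, and the CRCs are skipped rather than verified, so this is a sanity check and not a validator.

```python
import os
import struct


def count_tfrecords(path: str) -> int:
    """Count records by walking the TFRecord framing.

    CRC fields are skipped, not verified.
    """
    size = os.path.getsize(path)
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)  # uint64 little-endian payload length
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated length header")
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and the payload CRC (4 bytes).
            f.seek(4 + length + 4, os.SEEK_CUR)
            if f.tell() > size:
                raise ValueError("truncated record")
            count += 1
    return count


if __name__ == "__main__":
    # Paths assume the files have been moved into data/ as in this article.
    for name in ("data/pascal_train.record", "data/pascal_val.record"):
        if os.path.exists(name):
            print(name, count_tfrecords(name))
```

A count of zero, or a ValueError, would suggest that create_pascal_tf_record.py did not finish writing the file.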

The label map for the PASCAL VOC data set can be found at object_detection/data/pascal_label_map.pbtxt

$ cd ~/models/research
$ mkdir data
$ mkdir -p models/model/train
$ mkdir -p models/model/eval

$ cp object_detection/data/pascal_label_map.pbtxt data
$ mv pascal_train.record data
$ mv pascal_val.record data

In the object_detection/samples/configs folder, there are skeleton object_detection configuration files.

Download_data_for_transfer_learning_(ssdlite_mobilenet_v2_coco)

$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=10GqUhvgkEAT4JiV44AxqySqhHvc9DFsa" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=10GqUhvgkEAT4JiV44AxqySqhHvc9DFsa" -o ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
$ tar -zxvf ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz;rm ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz

$ cp object_detection/samples/configs/ssdlite_mobilenet_v2_coco.config models/model/ssdlite_mobilenet_v2_voc.config
$ nano models/model/ssdlite_mobilenet_v2_voc.config

ssdlite_mobilenet_v2_voc.config

# SSDLite with Mobilenet v2 configuration for Pascal VOC Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 20
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        use_depthwise: true
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2'
      min_depth: 16
      depth_multiplier: 1.0
      use_depthwise: true
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 3
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 4
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt"
  fine_tune_checkpoint_type:  "detection"
  # Note: The below line limits the training process to 50K steps. Because
  # decay_steps (800720) is far larger than that, this effectively bypasses
  # the learning rate schedule (the learning rate will never decay). Remove
  # the below line to train indefinitely.
  num_steps: 50000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/pascal_train.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/pascal_val.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}
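To see what the learning rate settings in this config amount to over a 50000-step run, here is a small arithmetic sketch of exponential decay, lr = initial_learning_rate * decay_factor^(step / decay_steps). The function name `exponential_decay_lr` is my own. Whether the Object Detection API applies the decay in staircase fashion by default is an assumption on my part, so both variants are shown.

```python
def exponential_decay_lr(step, initial_lr=0.004, decay_steps=800720,
                         decay_factor=0.95, staircase=True):
    """Exponential decay with the values from ssdlite_mobilenet_v2_voc.config.

    staircase=True floors step / decay_steps to an integer (assumed default);
    staircase=False decays continuously.
    """
    if staircase:
        exponent = step // decay_steps
    else:
        exponent = step / decay_steps
    return initial_lr * decay_factor ** exponent


for step in (0, 25000, 50000):
    print(step,
          exponential_decay_lr(step),
          exponential_decay_lr(step, staircase=False))
```

With staircase decay the rate stays at 0.004 for the whole run, and even with continuous decay it drops by well under one percent, because num_steps is only a small fraction of decay_steps.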

When running locally, the models/research and models/research/slim directories should be appended to PYTHONPATH. This can be done by running the following from models/research.

$ export PYTHONPATH=`pwd`:`pwd`/slim:$PYTHONPATH

Install the COCO API (pycocotools), which the evaluation code depends on, and then run a local training job with the following commands.

$ git clone https://github.com/pdollar/coco.git
$ cd coco/PythonAPI
$ make -j8
$ sudo make install
$ sudo python3 setup.py install

# From the models/research/ directory

$ cd ../..
$ PIPELINE_CONFIG_PATH=models/model/ssdlite_mobilenet_v2_voc.config
$ MODEL_DIR=models/model/train
$ NUM_TRAIN_STEPS=50000
$ SAMPLE_1_OF_N_EVAL_EXAMPLES=1

$ python3 object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--alsologtostderr

Check_your_learning_progress

$ MODEL_DIR=models/model/train
$ tensorboard --logdir=${MODEL_DIR}

Graph_output_for_Tensorflow

$ mkdir -p models/model/train/tf
$ python3 object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path models/model/ssdlite_mobilenet_v2_voc.config \
--trained_checkpoint_prefix models/model/train/model.ckpt-48323 \
--output_directory models/model/train/tf
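The checkpoint number 48323 above is simply the newest checkpoint I had at the time, so it will differ on your machine. As a hedged sketch, the helper below looks for the highest-numbered `model.ckpt-<step>.index` file in the training directory and returns the matching prefix. The name `latest_checkpoint_prefix` is my own; TensorFlow's `tf.train.latest_checkpoint` serves a similar purpose if you prefer to rely on the `checkpoint` state file instead.

```python
import glob
import os
import re
from typing import Optional


def latest_checkpoint_prefix(train_dir: str) -> Optional[str]:
    """Return the 'model.ckpt-<step>' prefix with the largest step, or None."""
    best_step = -1
    best_prefix = None
    for index_file in glob.glob(os.path.join(train_dir, "model.ckpt-*.index")):
        match = re.search(r"model\.ckpt-(\d+)\.index$", index_file)
        if match is None:
            continue
        step = int(match.group(1))  # compare numerically, not as strings
        if step > best_step:
            best_step = step
            best_prefix = index_file[: -len(".index")]
    return best_prefix


if __name__ == "__main__":
    # Directory used in this article; prints None if no checkpoint exists yet.
    print(latest_checkpoint_prefix("models/model/train"))
```

The returned string can be passed as `--trained_checkpoint_prefix` to both export scripts.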

Graph_output_for_Tensorflow_Lite

$ mkdir -p models/model/train/tflite
$ python3 object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path models/model/ssdlite_mobilenet_v2_voc.config \
--trained_checkpoint_prefix models/model/train/model.ckpt-48323 \
--output_directory models/model/train/tflite \
--config_override " \
        model{ \
          ssd{ \
            post_processing { \
              batch_non_max_suppression { \
                      score_threshold: 0.0 \
                      iou_threshold: 0.5 \
              } \
            } \
          } \
        }"

Overview_check_of_graph_structure

$ sudo apt install -y libc-ares-dev
$ git clone https://github.com/PINTO0309/Bazel_bin.git
$ Bazel_bin/0.19.2/Ubuntu1604_x86_64/install.sh

$ git clone -b v1.12.0 https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git checkout -b v1.12.0
$ bazel clean
$ bazel build tensorflow/tools/graph_transforms:summarize_graph
$ bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
--in_graph=/home/<username>/models/research/models/model/train/tflite/tflite_graph.pb

output_sample

Found 1 possible inputs: (name=normalized_input_image_tensor, type=float(1), shape=[1,320,320,3]) 
No variables spotted.
Found 1 possible outputs: (name=TFLite_Detection_PostProcess, op=TFLite_Detection_PostProcess) 
Found 3414298 (3.41M) const parameters, 0 (0) variable parameters, and 0 control_edges
Op types used: 491 Identity, 420 Const, 76 FusedBatchNorm, 59 Relu6, 55 Conv2D, 
33 DepthwiseConv2dNative, 12 BiasAdd, 12 Reshape, 10 Add, 2 ConcatV2, 1 Placeholder, 
1 RealDiv, 1 Sigmoid, 1 Squeeze, 1 TFLite_Detection_PostProcess
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- \
--graph=/home/<username>/models/research/models/model/train/tflite/tflite_graph.pb \
--show_flops \
--input_layer=normalized_input_image_tensor \
--input_layer_type=float \
--input_layer_shape=1,320,320,3 \
--output_layer=TFLite_Detection_PostProcess

INPUT NODE = normalized_input_image_tensor
OUTPUT NODE = TFLite_Detection_PostProcess
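The input and output node names are read off the summarize_graph output by eye here. If you would rather extract them programmatically, the following sketch parses the "possible inputs" and "possible outputs" lines with a regular expression. The function name `parse_io_names` is my own, and it assumes the output format matches the sample shown in this article.

```python
import re


def parse_io_names(summary: str):
    """Extract node names from the 'possible inputs/outputs' lines."""
    inputs = []
    outputs = []
    for line in summary.splitlines():
        names = re.findall(r"\(name=([^,\s)]+)", line)
        if "possible inputs" in line:
            inputs.extend(names)
        elif "possible outputs" in line:
            outputs.extend(names)
    return inputs, outputs


# Lines copied from the output sample in this article.
sample = (
    "Found 1 possible inputs: (name=normalized_input_image_tensor, "
    "type=float(1), shape=[1,320,320,3])\n"
    "No variables spotted.\n"
    "Found 1 possible outputs: (name=TFLite_Detection_PostProcess, "
    "op=TFLite_Detection_PostProcess)\n"
)

print(parse_io_names(sample))
```

The two lists correspond to the values passed to `--input_arrays` and `--output_arrays` in the conversion step.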

TOCO_convert_(.pb->.tflite)

$ cd /home/<username>/models/research
$ tflite_convert \
--output_file=models/model/train/tflite/ssdlite_mobilenet_v2_voc.tflite \
--graph_def_file=models/model/train/tflite/tflite_graph.pb \
--inference_type=QUANTIZED_UINT8 \
--input_shapes=1,320,320,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays=TFLite_Detection_PostProcess \
--default_ranges_min=0 \
--default_ranges_max=6 \
--mean_values=128 \
--std_dev_values=127 \
--allow_custom_ops
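The `--mean_values` and `--std_dev_values` options describe how a uint8 input pixel maps to the real value the network sees, namely real = (quantized - mean) / std_dev. The sketch below just works through that arithmetic for the values used here (128 and 127). The function names are my own, and the rounding and clamping in `quantize_input` are a simple illustration rather than the converter's exact implementation.

```python
def dequantize_input(q, mean_value=128.0, std_dev_value=127.0):
    """Map a uint8 input value to the real value it represents."""
    return (q - mean_value) / std_dev_value


def quantize_input(real, mean_value=128.0, std_dev_value=127.0):
    """Inverse mapping, rounded and clamped to the uint8 range."""
    q = int(round(real * std_dev_value + mean_value))
    return max(0, min(255, q))


for q in (0, 128, 255):
    print(q, dequantize_input(q))
```

With these settings the uint8 range 0 to 255 covers roughly -1.008 to 1.0. The `--default_ranges_min=0` and `--default_ranges_max=6` options supply a fallback activation range for tensors that carry no min/max information, which lines up with the Relu6 activations in this model.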

４．Reference articles

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/preparing_inputs.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_pets.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/configuring_jobs.md
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
https://coral.withgoogle.com/web-compiler/
https://github.com/tensorflow/models/issues/5808
