545-class object detection with a model pre-trained on the Open Images Dataset

Posted at 2017-11-23

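Overview

Sometimes you want to detect 545 classes of objects, don't you? A pre-trained model that fits that need has been added to the TensorFlow Object Detection API. It is a Faster R-CNN with Inception ResNet v2 trained on the Open Images Dataset, a heavy model in several senses, so using it as-is may be hard, but I am writing this down as a memo.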

Environment

Ubuntu 14.04
- Running on VirtualBox. Memory allocation: 6 GB (with 4 GB, even inference alone fails with an out-of-memory error.)
Anaconda

Setup

TensorFlow 1.4.0

conda create -n tf140 python=3.5
source activate tf140
conda install numpy
pip install tensorflow==1.4.0
# pip install tensorflow-gpu==1.4.0

TensorFlow Models

cd workspace/
git clone https://github.com/tensorflow/models.git

TensorFlow Object Detection API
- Install it by following
  https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
  When the final model_builder_test.py prints OK, the installation is complete.
- If protoc fails, see https://github.com/tensorflow/models/issues/1834

Running

Open the tutorial in Jupyter.

cd object_detection/
jupyter notebook object_detection_tutorial.ipynb

Run as-is, the notebook uses an SSD MobileNet model trained on MS COCO, so rewrite the following part before running it. (Note that the file to download is about 700 MB.)

# What model to download.
MODEL_NAME = 'faster_rcnn_inception_resnet_v2_atrous_oid_2017_11_08'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'oid_bbox_trainable_label_map.pbtxt')

NUM_CLASSES = 545
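
The tutorial notebook downloads and unpacks the archive by itself. As a rough standard-library sketch of that step, something like the following could fetch the tarball only once and pull out just the frozen graph. The helper names `download_if_missing` and `extract_frozen_graph` are made up for this illustration and are not part of the Object Detection API.

```python
import os
import tarfile
import urllib.request

MODEL_NAME = 'faster_rcnn_inception_resnet_v2_atrous_oid_2017_11_08'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists (hypothetical helper)."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


def extract_frozen_graph(tar_path, out_dir='.'):
    """Extract only frozen_inference_graph.pb members and return their paths."""
    extracted = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if os.path.basename(member.name) == 'frozen_inference_graph.pb':
                tar.extract(member, out_dir)
                extracted.append(os.path.join(out_dir, member.name))
    return extracted
```

Calling `download_if_missing(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)` followed by `extract_frozen_graph(MODEL_FILE)` would then leave the graph at the location PATH_TO_CKPT points to, assuming the archive keeps the graph under a directory named after MODEL_NAME.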

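Example results

(Image source: https://www.flickr.com/photos/honzasoukup/5384702747 Honza Soukup)
Many classes are detected, but the confidence for Nose is only 3% and the Bathtub box (the outer gray frame) is misplaced, so there still seems to be plenty of room for improvement.
(The display threshold can be changed with min_score_thresh of vis_util.visualize_boxes_and_labels_on_image_array.)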
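
To illustrate what a display threshold does, here is a minimal sketch that keeps only the detections whose score exceeds the threshold. It is not the code inside vis_util. `filter_by_score` and the sample detections are hypothetical, and plain Python lists stand in for the arrays the model returns.

```python
def filter_by_score(boxes, classes, scores, min_score_thresh=0.5):
    """Return (box, class_id, score) tuples whose score exceeds the threshold."""
    return [
        (box, cls, score)
        for box, cls, score in zip(boxes, classes, scores)
        if score > min_score_thresh
    ]


# Made-up detections; boxes are [ymin, xmin, ymax, xmax] in normalized coordinates.
boxes = [[0.1, 0.1, 0.5, 0.5], [0.2, 0.3, 0.4, 0.6], [0.0, 0.0, 1.0, 1.0]]
classes = [1, 2, 3]
scores = [0.92, 0.03, 0.40]

for thresh in (0.5, 0.01):
    kept = filter_by_score(boxes, classes, scores, thresh)
    print(thresh, [cls for _, cls, _ in kept])
```

With a threshold this low, a 3% detection such as the Nose box above would be drawn, which is why lowering the value reveals many weak detections.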

(Image source: https://www.flickr.com/photos/bluumwezi/5233255485 Blue moon in her eyes)
There is a Sushi class, too. The other classes and data can be browsed at:
http://www.cvdfoundation.org/datasets/open-images-dataset/vis/index.html
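
Whether a given class exists can also be checked locally against the label map file. The tutorial loads that file through the API's own label_map_util. The sketch below is instead a simplified regex parser, assuming the file consists of flat `item { ... }` blocks with `name`, `id` and `display_name` fields. `parse_label_map` is a hypothetical helper, and the sample entries (including their ids) are illustrative rather than copied from the real file.

```python
import re

# Matches one `item { ... }` block; assumes no nested braces inside an item.
ITEM_RE = re.compile(r'item\s*\{(.*?)\}', re.DOTALL)
# Matches `key: "string"` or `key: 123` inside a block.
FIELD_RE = re.compile(r'(\w+)\s*:\s*(?:"([^"]*)"|(\d+))')


def parse_label_map(text):
    """Return {id: display_name} parsed from label-map text (simplified parser)."""
    id_to_name = {}
    for block in ITEM_RE.findall(text):
        fields = {}
        for key, quoted, number in FIELD_RE.findall(block):
            fields[key] = int(number) if number else quoted
        if 'id' in fields:
            id_to_name[fields['id']] = fields.get('display_name', fields.get('name'))
    return id_to_name


# Illustrative sample, not the real file contents.
SAMPLE = '''
item {
  name: "/m/01g317"
  id: 1
  display_name: "Person"
}
item {
  name: "/m/07030"
  id: 2
  display_name: "Sushi"
}
'''

names = set(parse_label_map(SAMPLE).values())
print('Sushi' in names, 'Serval' in names)
```

Applied to the real file, for example with `parse_label_map(open(PATH_TO_LABELS).read())`, the size of the returned dictionary could be compared against NUM_CLASSES.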

(Image source: https://ja.wikipedia.org/wiki/%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB:Serval_in_Tanzania.jpg )
There is no Serval class. For the number of animal classes it can detect, YOLO9000 probably wins hands down. https://qiita.com/shinya7y/items/d3cb285784c2a1dd8d63

(Image source: https://www.youtube.com/watch?v=lh_GcdBamD4 )
Amazing!

Supplement: Training on the Open Images Dataset

Judging from faster_rcnn_inception_resnet_v2_atrous_oid.config, it does not seem to differ much from training on MS COCO……

  num_steps: 8000000

8M iterations!
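
For comparing this value across several pipeline configs, a small regex sketch like the one below could pull num_steps out of the config text. `read_num_steps` is a hypothetical helper and the sample text is a made-up fragment containing only the line quoted above. A real workflow would more likely parse the file with the protobuf tooling the API ships with.

```python
import re


def read_num_steps(config_text):
    """Return the first num_steps value found in the text, or None if absent."""
    match = re.search(r'^\s*num_steps\s*:\s*(\d+)', config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Made-up fragment for illustration; only the num_steps line comes from the article.
SAMPLE_CONFIG = '''
train_config: {
  num_steps: 8000000
}
'''

print(read_num_steps(SAMPLE_CONFIG))
```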
