
Running the de facto evaluation framework for pedestrian detection

Posted at 2016-10-25

Background

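Pedestrian detection, alongside face detection, is one of the most important technologies in practice. With autonomous driving attracting so much attention, it remains an active research area.
A well-maintained evaluation framework for pedestrian detection exists and has become the de facto standard, so this post walks through running it.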

As a quick start, the following commands should produce the graphs.

git clone https://gist.github.com/32febdc55658f5b98c251c16cd4c53b8.git caltech_pedestrian
cd caltech_pedestrian
chmod u+x caltech_evaluation.sh
./caltech_evaluation.sh
cd code
matlab
addpath(genpath('../pdollar_toolbox'))
dbEval

Caltech Pedestrian Detection Benchmark

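The de facto standard evaluation framework for pedestrian detection is the Caltech Pedestrian Detection Benchmark.
Besides its own Caltech Pedestrian Dataset, it makes evaluation on other datasets such as INRIA, ETH, TUD-Brussels, and Daimler easy. Raw result files for many methods are also archived, so comprehensive comparisons are possible without reimplementing the other methods. As a result, most papers include graphs produced with this framework.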

Requirements

Piotr's Matlab Toolbox is required to handle the custom file formats used by the evaluation datasets. It ships with compiled binaries, but if your environment differs, run external/toolboxCompile.m.

# download toolbox
git clone https://github.com/pdollar/toolbox pdollar_toolbox

The tool itself can be downloaded from "Matlab evaluation/labeling code (3.2.1)".

# download evaluation/labeling code
wget http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/code/code3.2.1.zip
mkdir code
unzip code3.2.1.zip -d code
rm code3.2.1.zip

Downloading the datasets

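The five datasets mentioned above can be downloaded from
https://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/
Each dataset consists of video data for training and testing (setXX.tar), ground-truth annotations (annotations.zip), and the results of existing methods (zip files under the res directory). Since the result graphs can be displayed with only the annotations and the result files, here we download just the annotations and results for the Caltech dataset (USA).
They must be extracted under the tool directory code, which contains the evaluation script dbEval.m, in the following layout: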
code
  └ data-[dataset name]
      ├ annotations
      └ res
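
The layout above can be checked before starting MATLAB. The following is a minimal sketch: the function name check_layout is hypothetical and not part of the benchmark code, and it only tests that the two subdirectories exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# check_layout (hypothetical helper, not part of the benchmark code):
# verifies that <code_dir>/data-<name>/annotations and <code_dir>/data-<name>/res exist.
check_layout() {
  code_dir="$1"   # tool directory that contains dbEval.m
  name="$2"       # dataset name, e.g. USA
  status=0
  for sub in annotations res; do
    if [ ! -d "${code_dir}/data-${name}/${sub}" ]; then
      echo "missing: ${code_dir}/data-${name}/${sub}"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_layout code USA && echo "layout OK"` run from the parent of the code directory reports any missing subdirectory.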

# download annotations
cd code
mkdir data-USA
cd data-USA
wget http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/USA/annotations.zip
unzip annotations.zip
rm annotations.zip

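Here, in addition to Viola-Jones (VJ) and HOG, we download the results of the well-known FPDW [1] and of two ECCV'16 methods [2, 3]. Adding more method names (the names of the zip files under the res directory, without the extension) to the list lets you compare many methods at once.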
[1] P. Dollar et al., "The Fastest Pedestrian Detector in the West," BMVC'10.
https://github.com/apennisi/fastestpedestriandetectorinthewest
[2] Z. Cai et al., "A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection," ECCV'16.
https://github.com/zhaoweicai/mscnn
[3] L. Zhang et al., "Is Faster R-CNN Doing Well for Pedestrian Detection?," ECCV'16.
https://github.com/zhangliliang/RPN_BF

# download results
mkdir res
cd res
list=("VJ" "HOG" "FPDW" "RPN+BF" "MS-CNN")
for method in "${list[@]}"
do
  wget http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/USA/res/${method}.zip
  unzip ${method}.zip
  rm ${method}.zip
done

Displaying the results

Start MATLAB in the tool directory code, which contains the evaluation script dbEval.m, and run:

addpath(genpath('[path to pdollar toolbox]'));
dbEval

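This creates the result graphs under code/results.
The percentage shown next to each method is the log-average miss rate: the mean of the miss rates at nine points evenly spaced on a log scale between $10^{-2}$ and $10^0$ on the horizontal axis (false positives per image).
(It is roughly equal to the miss rate at $10^{-1}$ false positives per image.)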
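
To make the definition concrete, here is a minimal sketch using awk. The function names reference_fppi and log_average_miss_rate are hypothetical, and the sketch assumes the mean is taken in the log domain (a geometric mean), as the name of the metric suggests. It also assumes the miss rates have already been sampled at the nine reference points; reading them off a detector's curve is not covered.

```shell
#!/usr/bin/env bash
# Print the 9 reference FPPI values, evenly spaced on a log scale in [1e-2, 1e0].
reference_fppi() {
  awk 'BEGIN { for (i = 0; i < 9; i++) printf "%.4f\n", 10 ^ (-2 + 0.25 * i) }'
}

# log_average_miss_rate (hypothetical helper): reads one miss rate per line
# (values in (0, 1], one per reference point) and prints exp(mean(log(mr))).
# Assumption: the average is taken in the log domain, i.e. a geometric mean.
log_average_miss_rate() {
  awk '{ s += log($1); n++ } END { if (n > 0) printf "%.4f\n", exp(s / n) }'
}
```

For example, `printf '0.1\n0.4\n' | log_average_miss_rate` prints 0.2000, the geometric mean of the two inputs. The fifth value printed by reference_fppi is 0.1000, the midpoint of the range on a log scale, which is why the metric tends to land near the miss rate at $10^{-1}$ false positives per image.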
Either way, the results make it clear just how impressive deep learning is.

The scripts above are collected in the following gist.
https://gist.github.com/yu4u/32febdc55658f5b98c251c16cd4c53b8

Next time, I will run an actual detection method and evaluate it.
