
Using keras-yolo3 on Jetson Nano

Last updated at 2019-09-12. Posted at 2019-07-31.

【Overview】

I wanted a quick way to run YOLOv3 on a Jetson Nano, so I tried keras-yolo3. I hit a few snags along the way, so I am recording what I learned here.

【Installing Keras】

As its name suggests, keras-yolo3 uses Keras, so install Keras first.
Keras needs scipy, and installing scipy took me a little effort.
Installing the dependency packages with the commands below is enough.
Note that installing scipy triggers a build, so be prepared to wait.

Installing scipy and keras

# Install the dependency packages
sudo apt install libatlas-base-dev gfortran

# Install scipy
sudo pip3 install scipy

# Install keras
sudo pip3 install keras
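
Once the install finishes, it can be worth confirming that both packages import cleanly before moving on. The following is a minimal sketch of such a check. `check_packages` is a hypothetical helper name of mine and is not part of Keras or keras-yolo3.

```python
import importlib


def check_packages(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing package raises ImportError; a broken build can raise other
            # errors at import time, so any exception is treated as "not usable".
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in check_packages(["scipy", "keras"]).items():
        print(name, version if version is not None else "NOT INSTALLED")
```

If either name is reported as not installed, rerunning the corresponding pip3 command and reading its build output is probably the quickest way to see what went wrong.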

【Getting the repository】

Clone the keras-yolo3 repository.

Cloning keras-yolo3

git clone https://github.com/qqwweee/keras-yolo3

【Getting and converting the weights file (yolov3)】

From here I follow the README, working inside the cloned keras-yolo3 directory.

Getting the weights file

wget https://pjreddie.com/media/files/yolov3.weights

Converting the weights file

python3 convert.py yolov3.cfg yolov3.weights model_data/yolo.h5

【Running inference on camera images (yolov3)】

Let's jump straight to real-time object detection on camera images.

yolo_video

python3 yolo_video.py --input 0

Output

GStreamer: Error opening bin: no element "0"
Traceback (most recent call last):
  File "yolo_video.py", line 75, in <module>
    detect_video(YOLO(**vars(FLAGS)), FLAGS.input, FLAGS.output)
  File "/home/iwama/keras-yolo3/yolo.py", line 176, in detect_video
    raise IOError("Couldn't open webcam or video")
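OSError: Couldn't open webcam or video

The error says that a video file or webcam named "0" could not be opened.
Looking at the source of yolo.py, it simply uses OpenCV's VideoCapture, but the argument arrives as a string, so the string "0" is passed through and apparently treated as a file path.
I therefore changed the source as shown below.

(I am a git beginner, so this diff is simply what I got from typing commands as best I could.)

yolo.py changes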

diff --git a/yolo.py b/yolo.py
index 4aa3486..385231b 100644
--- a/yolo.py
+++ b/yolo.py
@@ -171,6 +171,8 @@ class YOLO(object):
 
 def detect_video(yolo, video_path, output_path=""):
     import cv2
+    if video_path.isdigit():
+        video_path = int(video_path)
     vid = cv2.VideoCapture(video_path)
     if not vid.isOpened():
         raise IOError("Couldn't open webcam or video")

The change simply converts the video_path argument to int when it consists only of digits.
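
To see the conversion in isolation, here is a minimal sketch that does not need OpenCV or a camera. `normalize_video_source` is a hypothetical name I chose for illustration; it mirrors the two added lines of the diff, plus a type check so that a value that is already an int passes through untouched.

```python
def normalize_video_source(video_path):
    """Turn a digit-only string such as "0" into an int camera index.

    Anything else (for example a file path) is returned unchanged.
    """
    if isinstance(video_path, str) and video_path.isdigit():
        return int(video_path)
    return video_path


if __name__ == "__main__":
    for source in ["0", "1", "video.mp4", "./0.mp4"]:
        print(repr(source), "->", repr(normalize_video_source(source)))
```

Only strings made up entirely of digits are converted, so ordinary file names, including ones that merely contain digits, still reach VideoCapture as paths.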
With the change in place, run it again.

Running yolo_video

python3 yolo_video.py --input 0

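Incidentally, I got stuck here too.
The moment inference started, the screen went black and the board stopped responding…
The Jetson Nano had probably shut down because of insufficient power.
I had been using a USB power supply labeled "Output 2.4A", but it did not seem to deliver its rated current.
After switching to a 2.5 A Raspberry Pi power supply I had on hand and trying again, it worked.

Output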

(416, 416, 3)
Found 6 boxes for img
mouse 0.37 (357, 304) (377, 314)
laptop 0.98 (393, 245) (475, 305)
chair 0.46 (18, 241) (93, 405)
chair 0.99 (457, 294) (640, 480)
chair 0.99 (174, 283) (345, 480)
person 1.00 (429, 160) (633, 449)
57.69759187599993
Gtk-Message: 13:58:12.566: Failed to load module "canberra-gtk-module"
(416, 416, 3)
Found 7 boxes for img
mouse 0.37 (356, 304) (378, 314)
laptop 0.98 (393, 245) (475, 306)
chair 0.39 (334, 255) (376, 285)
chair 0.49 (19, 241) (93, 405)
chair 0.99 (458, 294) (640, 480)
chair 0.99 (174, 284) (343, 480)
person 1.00 (429, 158) (633, 453)
9.14090474500017
(416, 416, 3)
Found 7 boxes for img
mouse 0.34 (362, 301) (378, 311)
mouse 0.42 (357, 304) (378, 314)
laptop 0.99 (393, 245) (475, 306)
chair 0.30 (18, 241) (92, 405)
chair 0.99 (456, 294) (640, 480)
chair 0.99 (175, 285) (344, 480)
person 1.00 (429, 160) (633, 450)
5.609129336000024
(416, 416, 3)
Found 7 boxes for img
mouse 0.36 (357, 304) (378, 314)
laptop 0.98 (393, 245) (475, 306)
chair 0.36 (334, 255) (376, 284)
chair 0.40 (18, 240) (93, 404)
chair 0.99 (458, 294) (640, 480)
chair 0.99 (173, 284) (345, 480)
person 1.00 (428, 159) (633, 448)
3.7221094729998185
(416, 416, 3)
Found 7 boxes for img
mouse 0.44 (356, 304) (378, 314)
laptop 0.99 (393, 245) (475, 306)
chair 0.33 (333, 255) (376, 284)
chair 0.48 (20, 241) (92, 405)
chair 0.99 (459, 293) (640, 480)
chair 0.99 (174, 284) (344, 480)
person 1.00 (429, 159) (632, 451)
1.2987515009999697
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.33 (269, 177) (361, 228)
chair 0.55 (17, 241) (95, 404)
chair 0.99 (173, 285) (350, 480)
chair 1.00 (379, 299) (566, 480)
person 0.99 (366, 156) (555, 310)
1.1418892810002035
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.39 (270, 177) (361, 228)
chair 0.48 (18, 242) (94, 404)
chair 0.99 (174, 286) (352, 480)
chair 1.00 (378, 299) (565, 480)
person 0.99 (366, 156) (554, 310)
1.1087476640000204
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.34 (269, 177) (361, 228)
chair 0.48 (18, 242) (95, 404)
chair 0.99 (174, 286) (351, 480)
chair 1.00 (378, 300) (565, 480)
person 0.99 (365, 156) (555, 311)
1.1661839919997874
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.44 (269, 178) (361, 228)
chair 0.53 (18, 241) (94, 404)
chair 0.99 (174, 286) (350, 480)
chair 1.00 (378, 299) (565, 480)
person 0.99 (367, 156) (554, 312)
1.0812733180000578
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.41 (270, 177) (362, 228)
chair 0.52 (18, 241) (95, 404)
chair 0.99 (174, 285) (351, 480)
chair 1.00 (379, 299) (565, 480)
person 0.99 (366, 156) (555, 311)
1.0736771299998509
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.34 (270, 177) (360, 228)
chair 0.52 (19, 241) (93, 405)
chair 0.99 (173, 285) (351, 480)
chair 1.00 (379, 299) (565, 480)
person 0.98 (366, 157) (555, 310)
1.103763125000114
(416, 416, 3)
Found 5 boxes for img
tvmonitor 0.41 (269, 177) (362, 228)
chair 0.49 (17, 240) (95, 406)
chair 0.99 (174, 285) (351, 480)
chair 1.00 (378, 300) (566, 480)
person 0.99 (366, 157) (554, 311)
1.0687457690000883
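
For anyone who wants to post-process a log like this, the following is a rough sketch of a parser for the detection lines. I am reading each such line as a class label, a confidence score, and two pixel coordinate pairs for opposite corners of the bounding box; that reading is my assumption from the shape of the log, and `parse_detection` and the field names are hypothetical.

```python
import re

# Pattern for lines such as: chair 0.99 (174, 283) (345, 480)
DETECTION_RE = re.compile(
    r"^(?P<label>.+) (?P<score>\d+\.\d+) "
    r"\((?P<x1>\d+), (?P<y1>\d+)\) \((?P<x2>\d+), (?P<y2>\d+)\)$"
)


def parse_detection(line):
    """Return a dict for a detection line, or None for any other log line."""
    match = DETECTION_RE.match(line.strip())
    if match is None:
        return None
    return {
        "label": match.group("label"),
        "score": float(match.group("score")),
        # Assumed to be the two opposite corners of the box, in pixels.
        "box": tuple(int(match.group(k)) for k in ("x1", "y1", "x2", "y2")),
    }


if __name__ == "__main__":
    sample = [
        "(416, 416, 3)",
        "Found 5 boxes for img",
        "chair 0.99 (174, 285) (351, 480)",
        "person 0.99 (366, 157) (554, 311)",
        "1.0687457690000883",
    ]
    for line in sample:
        print(parse_detection(line))
```

Lines that are not detections, such as the shape tuple, the "Found N boxes" summary, and the timing value, come back as None and can simply be skipped.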

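It does run, but startup takes a terribly long time and the time lag is severe.
After the first few frames, inference settles at around 1 second per frame, but the lag is fatal and the result is nowhere near real time.

Incidentally, on my very first attempt the GPU ran out of memory and inference failed with an error.
So I tried tiny-yolov3 next.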

【Getting and converting the weights file (tiny-yolov3)】

First, download the weights file and convert it.

Getting the weights file

wget https://pjreddie.com/media/files/yolov3-tiny.weights

Converting the weights file

python3 convert.py yolov3-tiny.cfg yolov3-tiny.weights model_data/yolo-tiny.h5

yolov3-tiny.cfg is specified as the config file.
This file is included in the repository from the start.

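【Running inference on camera images (tiny-yolov3)】

As with yolov3, run real-time object detection on camera images.
I wanted to point it at the model and anchor files for tiny-yolo, but the --model and --anchors options described in the README did not work for me, so I changed the command-line arguments as follows.

yolo_video.py changes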

diff --git a/yolo_video.py b/yolo_video.py
index 7c39461..66b1f6f 100644
--- a/yolo_video.py
+++ b/yolo_video.py
@@ -25,17 +25,17 @@ if __name__ == '__main__':
     Command line options
     '''
     parser.add_argument(
-        '--model', type=str,
+        '--model_path', type=str,
         help='path to model weight file, default ' + YOLO.get_defaults("model_path")
     )
 
     parser.add_argument(
-        '--anchors', type=str,
+        '--anchors_path', type=str,
         help='path to anchor definitions, default ' + YOLO.get_defaults("anchors_path")
     )
 
     parser.add_argument(
-        '--classes', type=str,
+        '--classes_path', type=str,
         help='path to class definitions, default ' + YOLO.get_defaults("classes_path")
     )
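
The post does not say why `--model` had no effect, so the following is only my guess at the mechanism, sketched with stand-ins. The traceback earlier shows the parsed flags being passed on as `YOLO(**vars(FLAGS))`, and the help strings refer to defaults named `model_path`. If the detector keeps its defaults under that name, a flag called `--model` would arrive under the key `model` and leave `model_path` untouched. `FakeYolo` and `parse_flags` are hypothetical, and the constructor behavior is an assumption, not the actual keras-yolo3 code.

```python
import argparse


class FakeYolo:
    """Stand-in for the detector class: defaults first, then keyword overrides."""

    _defaults = {"model_path": "model_data/yolo.h5"}

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults)
        self.__dict__.update(kwargs)


def parse_flags(option_name, argv):
    """Parse argv with a single string option and return the values as a dict."""
    # SUPPRESS keeps options that were not given out of the namespace entirely.
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    parser.add_argument(option_name, type=str)
    return vars(parser.parse_args(argv))


if __name__ == "__main__":
    tiny = "model_data/yolo-tiny.h5"
    before = FakeYolo(**parse_flags("--model", ["--model", tiny]))
    after = FakeYolo(**parse_flags("--model_path", ["--model_path", tiny]))
    print("with --model      :", before.model_path)
    print("with --model_path :", after.model_path)
```

Under these assumptions, renaming the option so that its key matches the default's name is enough for the override to land, which is what the diff above does. Passing `dest='model_path'` to `add_argument` while keeping the `--model` spelling would be another way to get the same key.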

Once the source is changed, run it with the command below.
An anchor file for tiny_yolo is also provided in the repository from the start, so I use that.

Running yolo_video

python3 yolo_video.py --model_path model_data/yolo-tiny.h5 --anchors_path model_data/tiny_yolo_anchors.txt --input 0

Output

(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.66 (412, 33) (623, 224)
chair 0.39 (209, 280) (370, 476)
chair 0.67 (455, 368) (611, 478)
0.2118123630007176
(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.70 (413, 32) (623, 225)
chair 0.37 (210, 280) (369, 476)
chair 0.65 (454, 367) (611, 478)
0.23051902199949836
(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.67 (414, 34) (622, 224)
chair 0.36 (220, 282) (388, 476)
chair 0.61 (456, 367) (610, 478)
0.2246938799999043
(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.69 (413, 32) (623, 225)
chair 0.37 (219, 283) (388, 476)
chair 0.63 (454, 366) (612, 479)
0.25109094899926276
(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.73 (414, 33) (622, 225)
chair 0.37 (207, 283) (370, 476)
chair 0.58 (457, 367) (610, 478)
0.2618196370003716
(416, 416, 3)
Found 3 boxes for img
tvmonitor 0.66 (412, 33) (622, 224)
chair 0.40 (220, 284) (387, 475)
chair 0.63 (455, 367) (611, 478)
0.24236862999896402

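Inference takes around 0.25 seconds per frame, which works out to roughly 4 to 5 FPS.
The lag is not as bad as before, but at about 1.5 seconds it still makes real-time use questionable.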
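
As a quick sanity check on those figures, the per-frame times printed in the two logs can be turned into an average frame rate. This is a small sketch of that arithmetic. `mean_fps` and the `warmup` parameter are names I made up, and skipping the first five yolov3 samples is my own choice to keep the slow start-up frames out of the average.

```python
def mean_fps(inference_seconds, warmup=0):
    """Average frames per second implied by per-frame inference times.

    The first `warmup` samples are skipped so slow start-up frames do not
    distort the average.
    """
    steady = inference_seconds[warmup:]
    if not steady:
        raise ValueError("no samples left after skipping warm-up frames")
    return len(steady) / sum(steady)


# Per-frame times copied from the two logs, rounded to four decimals.
YOLOV3_SECONDS = [57.6976, 9.1409, 5.6091, 3.7221, 1.2988, 1.1419,
                  1.1087, 1.1662, 1.0813, 1.0737, 1.1038, 1.0687]
TINY_SECONDS = [0.2118, 0.2305, 0.2247, 0.2511, 0.2618, 0.2424]

if __name__ == "__main__":
    print("yolov3      : %.2f FPS" % mean_fps(YOLOV3_SECONDS, warmup=5))
    print("tiny-yolov3 : %.2f FPS" % mean_fps(TINY_SECONDS))
```

These numbers reflect inference time only. Capturing the frame and drawing the result add to each loop, so the rate seen on screen may well be lower.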

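【Closing】

Getting it to run was the goal in itself, so for now the goal is met.
All I can really say is that it ran, though…
Native Darknet is said to be somewhat faster, so I would like to try that separately.
Also, keras-yolo3 can be used to build custom models, but training on this board is probably too much to ask?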
