
Installing Ubuntu 16.04 + OpenVINO on a LattePanda Alpha 864 (no bundled OS) and enjoying blazing-fast Semantic Segmentation with the Neural Compute Stick (NCS1) and Neural Compute Stick 2 (NCS2)

Posted at 2018-11-24

TensorflowLite-UNet
Tensorflow-ENet
Tensorflow-ENet2
ICNet-tensorflow
OpenVINO-ADAS
OpenVINO-DeeplabV3

I wrote an English translation at the end of the article, here

#◆ Introduction
Let me start with the shocking conclusion.
When a model is optimized for an Intel CPU with OpenVINO, inference runs faster on the CPU than with the Neural Compute Stick 2.
The discussion I had with engineers from around the world on the official forum is here.
If you are bursting with excitement and simply cannot wait, you can jump straight to the evaluation results here.
In the end, segmentation runs at the speed shown below using only the CPU of a single-board computer.
It is nowhere near fast enough for autonomous driving, but considering it is CPU-only, it is remarkably fast.
A result whose significance only those in the know will appreciate.
sample.gif

I have already used the Neural Compute Stick to the fullest with MobileNet-SSD + RaspberryPi.
↓ Clicking plays the video on Youtube
Screenshot 2018-11-26 00:06:00.png
I wrote Qiita articles about it, committed to Github, and through that connection even sent a pull request to the official Tensorflow repository; having learned the very basics of deep learning with Caffe and Tensorflow, I feel it has already paid for itself in terms of personal growth.
On 2018/11/14 the release of the Neural Compute Stick 2 (8x the performance of the first generation) (link to the official page) was announced, but even now, a year after the first stick went on sale, the SDK quality is still appallingly low and the NCS2 does not support the RaspberryPi (ARM), so I hesitated a little over whether to buy one. In the end I could not resist and bought it anyway.
Exactly the same size as v1. And the "Movidius" lettering of the company Intel acquired is gone.
I do wish they had made it a little narrower, though...
ncs2-angled-down-lid-off-500x334.png aaa.jpg

V2 is said to surpass the performance of a Jetson TX2 with just three sticks, which really gets my inner tinkerer excited.

Neural Compute Stick v1 ... 100 GFLOPS
Neural Compute Stick v2 ... 800 GFLOPS (simply multiplied the above by 8)
TX2 ... 2 TFLOPS

In fact, it would not be an exaggeration to say that I wanted to try OpenVINO, which seems unlikely to be held back by SDK quality, precisely because the quality of NCSDK is so remarkably low.

This time I install Ubuntu 16.04 on the LattePanda Alpha 864 (no OS) obtained through the pre-order sale in early November, then install OpenVINO and verify that custom segmentation models run on the Neural Compute Stick and the Neural Compute Stick 2.
The reason I procured a LattePanda Alpha is to verify how useful Neural Compute Stick + OpenVINO is on a single-board computer.
Cost-wise this goes well beyond what makes sense as a hobby, so please do not copy what I am doing.

OpenVINO converts models built with Caffe, TensorFlow, MXNet, Kaldi and ONNX into an intermediate binary in a common format (IR [Intermediate Representation of the model]), which can then be executed in a uniform way via the Inference Engine API.
Note that the runtime does not support the ARM architecture; it only supports Intel x86/64 CPUs.
02.jpg
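
To make that workflow concrete, here is a minimal sketch (my own illustration, using the same Python API as the full samples later in this article; the model paths are placeholders) of loading a converted IR and running a single inference:

Minimal IR inference sketch
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

net = IENetwork.from_ir(model="model.xml", weights="model.bin")   # IR produced by the Model Optimizer
plugin = IEPlugin(device="CPU")             # or "MYRIAD" for a Neural Compute Stick
exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))
n, c, h, w = net.inputs[input_blob].shape   # IR inputs use NCHW layout
result = exec_net.infer(inputs={input_blob: np.zeros((n, c, h, w), dtype=np.float32)})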

1. Develop Multiplatform Computer Vision Solutions - Intel Developer Zone
2. Install the Intel® Distribution of OpenVINO™ toolkit for Linux - Intel Developer Zone
3. How to Integrate the Inference Engine in Your Application - Intel Inference Engine Developer Guide
4. Accelerate Deep Learning Inference with Integrated Intel® Processor Graphics Rev 2.0 - Intel Developer Zone

#◆ Appearance of the LattePanda Alpha
1. Outer box 1
03.jpg
2. Outer box 2 (classy)
04.jpg
3. Inner box (decorated with a panda face)
05.jpg
4. Package contents (no case included)
06.jpg
5. Size compared with a cigarette pack (slightly larger than a RaspberryPi in width and depth, but thinner, about half the thickness of the pack)
08.jpg

#◆ LattePanda Alpha specifications
Needlessly high-spec.

  • Price:

  • No-OS version: $358 (approx. ¥40,000)

  • Win10 bundled version: $398 (approx. ¥45,000)

  • CPU:

  • Intel 7th Gen Core m3-7y30

  • Core:

  • 1.6-2.6GHz Dual-Core,Four-Thread

  • Benchmark (PassMark):

  • Up to 3500, double computing power compared with same price range products in the market

  • Graphics:

  • Intel HD Graphics 615, 300-900MHz

  • RAM:

  • 8G LPDDR3 1866MHz Dual-Channel

  • Memory:

  • 64GB eMMC V5.0

  • External Memory:

  • 1x M.2 M Key, PCIe 4x, Supports NVMe SSD and SATA SSD

  • 1x M.2 E Key, PCIe 2x,Supports USB2.0, UART, PCM

  • Connectivity:

  • Wi-Fi 802.11 AC, 2.4G & 5G (gitekimark2.png has the Japanese 技適 radio-certification mark)

  • Dual Band Bluetooth 4.2

  • Gigabit Ethernet

  • USB Ports:

  • 3x USB 3.0 Type A

  • 1x USB Type C, supports PD, DP, USB 3.0

  • Display:

  • HDMI Output

  • Type-C DP Support

  • Extendable eDP touch displays

  • Co-processor:

  • Arduino Leonardo

  • GPIO & Other Features:

  • 2x 50p GPIOs including I2C

  • I2S, USB

  • RS232

  • UART

  • RT

  • Power Management

  • Extendable power button

  • OS Support:

  • Windows 10 Pro

  • Linux Ubuntu

#◆ Parts used for kitting

  • Windows 10 PC (anything that can create an Ubuntu 16.04 USB boot drive is fine)
  • LattePanda Alpha
  • Intel Movidius Neural Compute Stick v1 / v2
  • 16GB USB flash drive
  • HDMI cable
  • HDMI display
  • USB keyboard
  • USB mouse

#◆ Software installed / used

  • Ubuntu 16.04 x86_64
  • OpenVINO toolkit 2018 R4 (2018.4.420)
  • Python 3.5
  • OpenCV 3.4.3 (installed via pip3)
  • Rufus v3.3
  • Tensorflow v1.11.0 (installed via pip3)

#◆ Installing Ubuntu 16.04
##● Work on the Windows 10 PC (creating the Ubuntu 16.04 USB flash drive)
1. Download the Ubuntu 16.04.5 Desktop image (1.5GB)
http://releases.ubuntu.com/releases/16.04/ubuntu-16.04.5-desktop-amd64.iso

2. Download Rufus, the USB flash drive creation tool
rufus-128.png Official page - Rufus - Japanese

Download link https://github.com/pbatard/rufus/releases/download/v3.3/rufus-3.3.exe

3. Insert the USB flash drive into the Windows 10 PC

4. Launch Rufus (rufus-3.3.exe) and write the Ubuntu 16.04 image in DD mode
Rufus main screen (the DD-mode selection dialog appears after pressing the Start button)
01.png
Selecting DD mode
02.png
Writing in progress
03.png

5. Remove the USB flash drive from the Windows 10 PC

##● Work on the LattePanda Alpha 864

6. Connect the Wi-Fi antennas, keyboard, mouse, HDMI cable/display and the USB flash drive to the LattePanda Alpha, then connect the power last

Example 1) Connecting the Wi-Fi antennas (the Alpha has two antennas)
ezgif.com-optimize1.gif
Example 2) Connecting the HDMI cable
ezgif.com-optimize2.gif
Example 3) Everything connected (power off; I happened to take the photo the instant the blue LED went out)
12.jpg
Example 4) Connecting the Type-C power cable
When the Type-C cable is connected, the red power-indicator LED turns on and stays lit, and the blue LED lights up for a moment.
Wait for the blue LED to start blinking, then hold down the power button for 3 seconds; the board powers on and the blue LED stays lit.
ezgif.com-optimize3.gif

7. As soon as the LattePanda Alpha powers on, repeatedly press the Esc key on the keyboard
8. Select Boot → Boot Option #1 and press Enter
9. Select the name of the USB flash drive + Partition 1 and press Enter
10. Select Save & Exit → Save Changes and Exit and press Enter
11. Select Yes and press Enter
12. Select Install Ubuntu and press Enter
13. Wait for a while
14. Select English and click Continue
15. To use Wi-Fi, select Connect to this network, choose the SSID from the list and click Connect
16. Enter the Wi-Fi password and click Connect
17. Check Install third-party software for graphics and Wi-Fi hardware, Flash, MP3 and other media and click Continue
18. Select Erase disk and install Ubuntu and click Install Now
19. Continue
20. Select Tokyo and click Continue
21. Select Japanese in both the left and right panes and click Continue
22. Enter the user ID, machine name and password, then click Continue
23. Wait for a while
24. Restart Now
* The reboot should start; if it does not go well, unplug and re-plug the power cable and power on again
25. Ubuntu 16.04 booted normally without any drama.
26. After logging in, open a terminal and run the updates below.

Update commands
$ sudo apt-get update
$ sudo apt-get upgrade

Hmm, this is absurdly comfortable for a single-board computer.

Official installation guide
http://docs.lattepanda.com/content/alpha_edition/power_on/

#◆ Installing OpenVINO
OpenVINO version to install: 2018.4.420
##● Installing OpenVINO itself
I installed OpenVINO by following ammo0613's article AIを始めよう!OpenVINOのインストールからデモの実行まで - Qiita.
The steps are documented there in detail, so I will not repeat them here.
Note, however, that the command scripts change slightly with every toolkit release, so I recommend working through it with the official tutorial open alongside.
##● Additional installation steps for the Intel Movidius Neural Compute Stick v1/v2
Run the following commands.

Updating the USB access rules
$ cd ~
$ sudo usermod -a -G users "$(whoami)"
$ sudo cat <<EOF > 97-usbboot.rules
SUBSYSTEM=="usb", ATTRS{idProduct}=="2150", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="2485", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="f63b", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
EOF

$ sudo cp 97-usbboot.rules /etc/udev/rules.d/
$ sudo udevadm control --reload-rules
$ sudo udevadm trigger
$ sudo ldconfig
$ sudo rm 97-usbboot.rules
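
Not part of the official steps, but as a quick sanity check (a sketch of my own, not from the Intel docs) you can confirm that the sticks enumerate with the Movidius vendor ID 03e7 referenced in the rules above:

Quick check for the Movidius vendor ID (optional)
import glob

for path in glob.glob("/sys/bus/usb/devices/*/idVendor"):
    with open(path) as f:
        if f.read().strip() == "03e7":
            print("Movidius device found:", path.rsplit("/", 2)[1])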

Running sudo ldconfig produced the errors below.
It looks like the symbolic links are not set up correctly.

Errors from sudo ldconfig
alpha@LattePandaAlpha:~$ sudo ldconfig
/sbin/ldconfig.real: /opt/intel/common/mdf/lib64/igfxcmrt64.so is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libmfxhw64.so.1 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libmfx.so.1 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libva-glx.so.2 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libva.so.2 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libigdgmm.so.1 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libva-drm.so.2 is not a symbolic link
/sbin/ldconfig.real: /opt/intel/mediasdk/lib64/libva-x11.so.2 is not a symbolic link

Looking into it, the files were in the state shown below.
Classic Intel; they never fail to let you down.

Files that appear related to the errors
./igfxcmrt64.so
./libigfxcmrt64.so

./libmfxhw64.so
./libmfxhw64.so.1
./libmfxhw64.so.1.28

./libmfx.so
./libmfx.so.1
./libmfx.so.1.28

./libva-glx.so
./libva-glx.so.2
./libva-glx.so.2.300.0

./libva.so
./libva.so.2
./libva.so.2.300.0

./libigdgmm.so
./libigdgmm.so.1
./libigdgmm.so.1.0.0

./libva-drm.so
./libva-drm.so.2
./libva-drm.so.2.300.0

./libva-x11.so
./libva-x11.so.2
./libva-x11.so.2.300.0

Create the symbolic links manually with the following commands.

Commands to create the symbolic links manually
$ cd /opt/intel/common/mdf/lib64
$ sudo mv igfxcmrt64.so igfxcmrt64.so.org
$ sudo ln -s libigfxcmrt64.so igfxcmrt64.so

$ cd /opt/intel/mediasdk/lib64
$ sudo mv libmfxhw64.so.1 libmfxhw64.so.1.org
$ sudo mv libmfx.so.1 libmfx.so.1.org
$ sudo mv libva-glx.so.2 libva-glx.so.2.org
$ sudo mv libva.so.2 libva.so.2.org
$ sudo mv libigdgmm.so.1 libigdgmm.so.1.org
$ sudo mv libva-drm.so.2 libva-drm.so.2.org
$ sudo mv libva-x11.so.2 libva-x11.so.2.org
$ sudo ln -s libmfxhw64.so.1.28 libmfxhw64.so.1
$ sudo ln -s libmfx.so.1.28 libmfx.so.1
$ sudo ln -s libva-glx.so.2.300.0 libva-glx.so.2
$ sudo ln -s libva.so.2.300.0 libva.so.2
$ sudo ln -s libigdgmm.so.1.0.0 libigdgmm.so.1
$ sudo ln -s libva-drm.so.2.300.0 libva-drm.so.2
$ sudo ln -s libva-x11.so.2.300.0 libva-x11.so.2

Pull myself together and run sudo ldconfig once more.

Re-running sudo ldconfig
$ cd ~
$ sudo ldconfig

This time it finished normally.

The OpenCV 4.0.0-pre installed by default has a GStreamer bug and does not work properly, so I reinstall OpenCV 3.4.3 myself.
Run the following commands.

Installing OpenCV 3.4.3
$ sudo -H pip3 install opencv-python==3.4.3.18
$ nano ~/.bashrc
export PYTHONPATH=/usr/local/lib/python3.5/dist-packages/cv2:$PYTHONPATH

$ source ~/.bashrc
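
As a quick check (my own addition, not part of the original steps), confirm that Python now picks up the pip-installed OpenCV rather than the bundled 4.0.0-pre:

Checking which OpenCV Python resolves
import cv2
print(cv2.__version__)   # should print 3.4.3 if the pip package takes precedence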

Official installation guide
Additional installation steps for the Intel® Movidius™ Neural Compute Stick and Intel® Neural Compute Stick 2

##● Upgrading to Tensorflow v1.11.0
The model optimizer step later on fails with the old Tensorflow v1.9.0 installed by default, so upgrade it to Tensorflow v1.11.0.

Example command to check the Tensorflow version
$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
1.9.0
Commands to upgrade to Tensorflow v1.11.0 (and, while we are at it, upgrade pip too)
$ sudo -H pip3 install pip --upgrade
$ sudo -H pip3 install tensorflow==1.11.0 --upgrade

##● Configuring offloading of custom-layer operations to Tensorflow
Intel official tutorial - Offloading Computations to TensorFlow*
Operations from custom layers that the standard OpenVINO API does not support can be offloaded to Tensorflow.
The mechanism is interesting: specific operations can be carved out and delegated entirely to Tensorflow.
Run the commands below to build the inference-engine layer yourself against the Tensorflow runtime.
Note, however, that the script Intel provides has a bug, so part of it must be fixed by hand.
Also, the Bazel version installed at this step must be 0.18.1.
As of 2018-11-17 the inference-engine layer does not build correctly with 0.19.0 or later, so beware.

On a device that does not have plenty of RAM (unlike the LattePanda Alpha), for example one with 1 GB of RAM, the build may succeed if you replace
sudo -H $HOME/bin/bazel build --config monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so

with
sudo -H $HOME/bin/bazel --host_jvm_args=-Xmx512m build --config monolithic --local_resources 1024.0,0.5,0.5 //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so

Building the inference-engine layer with the TensorFlow runtime (46 minutes on the LattePanda Alpha)
$ sudo apt-get install -y git pkg-config zip g++ zlib1g-dev unzip
$ cd ~
$ wget https://github.com/bazelbuild/bazel/releases/download/0.18.1/bazel-0.18.1-installer-linux-x86_64.sh
$ sudo chmod +x bazel-0.18.1-installer-linux-x86_64.sh
$ ./bazel-0.18.1-installer-linux-x86_64.sh --user
$ echo 'export PATH=$PATH:$HOME/bin' >> ~/.bashrc
$ source ~/.bashrc
$ cd /opt
$ sudo git clone -b v1.11.0 https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ sudo git checkout -b v1.11.0
$ echo 'export TF_ROOT_DIR=/opt/tensorflow' >> ~/.bashrc
$ source ~/.bashrc
$ sudo nano /opt/intel/computer_vision_sdk/bin/setupvars.sh

#Before
INSTALLDIR=/opt/intel//computer_vision_sdk_2018.4.420
↓
#After
INSTALLDIR=/opt/intel/computer_vision_sdk_2018.4.420

$ source /opt/intel/computer_vision_sdk/bin/setupvars.sh
$ sudo nano /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/tf_call_ie_layer/build.sh

#Before
bazel build --config=monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so
↓
#After
sudo -H $HOME/bin/bazel build --config monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so

$ sudo -E /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/tf_call_ie_layer/build.sh

The inference-engine layer is generated at the path below.

Path to libtensorflow_call_layer.so
/opt/tensorflow/bazel-bin/tensorflow/cc/inference_engine_layer/libtensorflow_call_layer.so

As it stands, running python as a regular user fails with a Permission denied error because regular users have no access rights under /opt, so relocate the library.

Relocating the .so
$ su -
$ cp /opt/tensorflow/bazel-bin/tensorflow/cc/inference_engine_layer/libtensorflow_call_layer.so /usr/local/lib
$ exit
$ nano ~/.bashrc
export PYTHONPATH=$PYTHONPATH:/usr/local/lib
$ source ~/.bashrc
$ sudo ldconfig

#◆ Tasting the demo programs
##● Image classification sample
Run the following commands.

Image_Classification_Sample_SqueezeNet
$ cd /opt/intel/computer_vision_sdk/deployment_tools/demo
$ ./demo_squeezenet_download_convert_run.sh

It reads the image below...
car.jpg
...and recognized it as a sports car with roughly 80% confidence.
On the surface it is too ordinary to be interesting.
Still, it is reporting an otherworldly figure of 153 FPS (lol).

Result
###################################################

Run Inference Engine classification sample

Run ./classification_sample -d CPU -i /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png -m /home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.xml 

[ INFO ] InferenceEngine: 
	API version ............ 1.4
	Build .................. 17328
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ]     /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png
[ INFO ] Loading plugin

	API version ............ 1.4
	Build .................. lnx_20181004
	Description ....... MKLDNNPlugin
[ INFO ] Loading network files:
	/home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.xml
	/home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image is resized from (787, 259) to (227, 227)
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference (1 iterations)
[ INFO ] Processing output blobs

Top 10 results:

Image /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png

817 0.8363345 label sports car, sport car
511 0.0946488 label convertible
479 0.0419131 label car wheel
751 0.0091071 label racer, race car, racing car
436 0.0068161 label beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
656 0.0037564 label minivan
586 0.0025741 label half track
717 0.0016069 label pickup, pickup truck
864 0.0012027 label tow truck, tow car, wrecker
581 0.0005882 label grille, radiator grille


total inference time: 6.5318211
Average running time of one iteration: 6.5318211 ms

Throughput: 153.0966609 FPS

[ INFO ] Execution successful


###################################################

Demo completed successfully.
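
For reference, that throughput figure is simply the reciprocal of the per-iteration time reported above: 1000 ms ÷ 6.53 ms ≈ 153 FPS.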

##● Three-stage inference sample
This sample appears to run three inference stages in sequence, each with a different trained model.
As an idea it is ordinary enough that even someone like me could come up with it.

  • Vehicle detection (including attributes such as black car / white car)
  • License plate detection
  • Character recognition inside the detected license plate

Run the following commands.

Three-stage inference sample
$ cd /opt/intel/computer_vision_sdk/deployment_tools/demo
$ ./demo_security_barrier_camera.sh

It is a bundled sample, so of course it works. Not very exciting.
license-plate.jpeg

##● Other sample programs
Intel® Distribution of OpenVINO™ Toolkit - Inference Engine Samples

#◆ Converting your own model, and sample conversion scripts
Official tutorial - Using the Model Optimizer to Convert TensorFlow* Models
Official tutorial - Model Optimizer Developer Guide - TensorFlow* Models with Custom Layers

Below is a sample script that converts a Tensorflow .pb (frozen graph) into OpenVINO's IR format.

Conversion command
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ python3 mo_tf.py --input_model <INPUT_MODEL>.pb
**Conversion command options and descriptions**
Conversion command options
optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        Tensorflow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
  --scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         The name of the input operation of the given model.
                        Usually this is a name of the input placeholder of the
                        model.
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --move_to_preprocess  Move mean values to IR preprocess section
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that
                        correspond to log level equals ERROR, that can be set
                        with the following option: --log_level. By default,
                        log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided
                        value, e.g.: "node_name->True"
  --generate_deprecated_IR_V2
                        Force to generate legacy/deprecated IR V2 to work with
                        previous versions of the Inference Engine. The
                        resulting IR may or may not be correctly loaded by
                        Inference Engine API (including the most recent and
                        old versions of Inference Engine) and provided as a
                        partially-validated backup option for specific
                        deployment scenarios. Use it at your own discretion.
                        By default, without this option, the Model Optimizer
                        generates IR V3.
**Tensorflow-specific conversion command options and descriptions**
Tensorflow-specific conversion command options
TensorFlow*-specific parameters:
  --input_model_is_text
                        TensorFlow*: treat the input model file as a text
                        protobuf format. If not specified, the Model Optimizer
                        treats it as a binary file by default.
  --input_checkpoint INPUT_CHECKPOINT
                        TensorFlow*: variables file to load.
  --input_meta_graph INPUT_META_GRAPH
                        Tensorflow*: a file with a meta-graph of the model
                        before freezing
  --saved_model_dir SAVED_MODEL_DIR
                        TensorFlow*: directory representing non frozen model
  --saved_model_tags SAVED_MODEL_TAGS
                        Group of tag(s) of the MetaGraphDef to load, in string
                        format, separated by ','. For tag-set contains
                        multiple tags, all tags must be passed in.
  --offload_unsupported_operations_to_tf
                        TensorFlow*: automatically offload unsupported
                        operations to TensorFlow*
  --tensorflow_subgraph_patterns TENSORFLOW_SUBGRAPH_PATTERNS
                        TensorFlow*: a list of comma separated patterns that
                        will be applied to TensorFlow* node names to infer a
                        part of the graph using TensorFlow*.
  --tensorflow_operation_patterns TENSORFLOW_OPERATION_PATTERNS
                        TensorFlow*: a list of comma separated patterns that
                        will be applied to TensorFlow* node type (ops) to
                        infer these operations using TensorFlow*.
  --tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
                        TensorFlow*: update the configuration file with node
                        name patterns with input/output nodes information.
  --tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
                        TensorFlow*: use the configuration file with custom
                        operation description.
  --tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
                        TensorFlow*: path to the pipeline configuration file
                        used to generate model created with help of Object
                        Detection API.
  --tensorboard_logdir TENSORBOARD_LOGDIR
                        TensorFlow*: dump the input graph to a given directory
                        that should be used with TensorBoard.
  --tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
                        TensorFlow*: comma separated list of shared libraries
                        with TensorFlow* custom operations implementation.
  --disable_nhwc_to_nchw
                        Disables default translation from NHWC to NCHW

#◆ Converting my own Semantic Segmentation model "UNet"
Now we finally get to the real subject of this verification.
I check whether models that could not be executed with the Intel Movidius Neural Compute Stick's native SDK, "NCSDK v2.x", run on OpenVINO.
Personally, I will be overjoyed if this model merely succeeds at inference on the NCS.
Let me stress once more that the purpose of this article is not to benchmark the LattePanda Alpha itself.

First, I try UNet, whose structure is extremely simple.
The .pb file used is the one placed at TensorflowLite-UNet - PINTO0309 - Github.
It is a Semantic Segmentation model trained on the Person class only.
TensorflowLite-UNet/model/semanticsegmentation_frozen_person_32.pb (31.1MB)

##● Conversion to data type FP16
Run the following commands.
--input_model is the name of the .pb file (frozen graph) to convert
--output_dir is the output path for the converted IR files
--input is the input node name (placeholder name)
--output is the output node name
--data_type is the target data precision [FP16/FP32/half/float]
--batch forcibly replaces the input batch size (when the .pb has an undefined batch size such as [-1, 256, 256, 3], the -1 can be forcibly replaced; OpenVINO apparently does not accept a batch size of -1)
--scale divides each BGR value by 255 (UInt8) to normalize it into the 0-1 range
--mean_values specifies the per-pixel BGR mean values to subtract
--offload_unsupported_operations_to_tf offloads Tensorflow custom layers that OpenVINO cannot handle to Tensorflow for processing
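
As a side note (my own example, not something used in the conversion below): adding --scale 255 to the mo_tf.py command would bake the 0-1 normalization into the IR itself, in which case the / 255.0 division done in the Python preprocessing later in this article would no longer be necessary.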

Script to convert my "UNet" model to IR FP16
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/UNet
$ sudo mkdir -p 10_lrmodels/UNet/FP16
$ sudo wget https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb -P 01_pbmodels/UNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP16 \
--input input \
--output output/BiasAdd \
--data_type FP16 \
--batch 1 

<Reference posts for computing per-channel mean values>
https://forums.fast.ai/t/images-normalization/4058
https://github.com/DrSleep/tensorflow-deeplab-resnet/issues/106

Sample Python logic for computing the per-channel (BGR) mean
import glob
import numpy as np
import cv2

meanB = meanG = meanR = 0.0
imgpaths = glob.glob("train_images/*.jpg")   # example path to the training images
imgcnt = len(imgpaths)
for imgpath in imgpaths:
    jpgimg = cv2.imread(imgpath)             # loaded in BGR order
    # Per-image per-channel mean
    mean = np.mean(jpgimg, axis=(0, 1))
    meanB += mean[0]
    meanG += mean[1]
    meanR += mean[2]
# Mean over all training images
print("meanB =", meanB / imgcnt)
print("meanG =", meanG / imgcnt)
print("meanR =", meanR / imgcnt)

Somehow the conversion seems to have succeeded.
Because the FP32 model was converted to FP16, the file size on disk halved to 15.5MB.

**IR conversion log**
Conversion log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16
	- IR output name: 	semanticsegmentation_frozen_person_32
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	output/BiasAdd
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP16
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.bin
[ SUCCESS ] Total execution time: 3.86 seconds.
![SITWXV~4.jpg](https://qiita-image-store.s3.amazonaws.com/0/194769/4a35598f-de96-e839-246d-ec74882671e8.jpeg)

##● Conversion to data type FP32
Run the following commands.

Script to convert my "UNet" model to IR FP32
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/UNet
$ sudo mkdir -p 10_lrmodels/UNet/FP32
$ sudo wget https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb -P 01_pbmodels/UNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP32 \
--input input \
--output output/BiasAdd \
--data_type FP32 \
--batch 1

This one also seems to have succeeded.
Since the precision is not changed between source and output, the final file size stays the same.

**IR conversion log**
Conversion log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32
	- IR output name: 	semanticsegmentation_frozen_person_32
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	output/BiasAdd
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP32
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.bin
[ SUCCESS ] Total execution time: 3.70 seconds. 
![SKTIWX~T.jpg](https://qiita-image-store.s3.amazonaws.com/0/194769/db3d5960-33c9-78c7-f390-9add9635ec87.jpeg)

#◆ Converting my own Semantic Segmentation model "ENet" (part 1)
This is another model that could not be executed with the NCS SDK "NCSDK v2.x".
Incidentally, at the moment it does not run as-is on the official Tensorflow Lite either.
Again, I will be delighted if this one works.
The .pb file used is the one placed at TensorFlow-ENet - PINTO0309 - Github.

##● [Failed] Conversion to data type FP16
Run the following commands.

Script to convert my "ENet" model to IR FP16
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/ENet
$ sudo mkdir -p 10_lrmodels/ENet/FP16
$ sudo wget https://github.com/PINTO0309/TensorFlow-ENet/raw/pinto0309work/checkpoint/semanticsegmentation_enet.pb -P 01_pbmodels/ENet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/ENet/semanticsegmentation_enet.pb \
--output_dir 10_lrmodels/ENet/FP16 \
--input input \
--output ENet/logits_to_softmax \
--data_type FP16 \
--batch 1 \
--offload_unsupported_operations_to_tf \
--tensorflow_operation_patterns Range,ScatterNd

No good. For some reason a type-conversion error occurs in the EagerExecution of ScatterNd.
I cannot figure out what is missing...

Error output
[ ERROR ]  Cannot infer shapes or values for node "TFSubgraphCall_2743".
[ ERROR ]  Error converting shape to a TensorShape: only size-1 arrays can be converted to Python scalars.
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function tf_subgraph_infer at 0x7fc3fc967400>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Stopped shape/value propagation at "TFSubgraphCall_2743" node. 
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38. 

#◆ Converting my own Semantic Segmentation model "ENet" (part 2)
This old man does not give up! ( ✧Д✧)

This time I borrow the repository below and apply my own customizations for CPU support and a smaller model size.
segmentation - fregu856 - Github

After customization
Tensorflow-ENet2 - PINTO0309 - Github

##● [Failed] Conversion to data type FP16
Run the following commands.

Script to convert the "ENet" model to IR FP16
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/ENet
$ sudo mkdir -p 10_lrmodels/ENet/FP16
$ sudo wget https://github.com/PINTO0309/Tensorflow-ENet2/raw/master/training_logs/best_model/semanticsegmentation_frozen_enet.pb -P 01_pbmodels/ENet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/ENet/semanticsegmentation_frozen_enet.pb \
--output_dir 10_lrmodels/ENet/FP16 \
--input imgs_ph \
--output fullconv/Relu \
--data_type FP16 \
--batch 1 \
--offload_unsupported_operations_to_tf \
--tensorflow_operation_patterns Range,ScatterNd

No good. The same type-conversion error occurs in the EagerExecution of ScatterNd.
How do I upsample without using ScatterNd...? I have no idea...

Error output
[ ERROR ]  Cannot infer shapes or values for node "TFSubgraphCall_1695".
[ ERROR ]  Error converting shape to a TensorShape: only size-1 arrays can be converted to Python scalars.
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function tf_subgraph_infer at 0x7f425c167400>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Stopped shape/value propagation at "TFSubgraphCall_1695" node. 
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38. 

★ References for alternative Unpooling implementations
https://assiaben.github.io/posts/2018-06-tf-unpooling/
https://github.com/assiaben/w/blob/master/unpooling/unpool_test.py
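
For the record, the kind of ScatterNd-free workaround I have in mind (only a sketch based on the references above; I have not verified it with this ENet model) is to replace the index-based max-unpooling with a plain nearest-neighbor upsampling that the Model Optimizer can handle, at the cost of throwing away the pooling indices:

ScatterNd-free upsampling sketch (assumption, untested)
import tensorflow as tf

def unpool_approx(x, factor=2):
    # Nearest-neighbor resize as a stand-in for index-based max-unpooling.
    # Avoids ScatterNd entirely, but discards the argmax information from pooling.
    shape = tf.shape(x)
    return tf.image.resize_nearest_neighbor(x, size=[shape[1] * factor, shape[2] * factor])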

#◆ Converting a borrowed model: the Semantic Segmentation model for ADAS (advanced driver-assistance systems)
It is frustrating, but I will use the sample model Intel publishes officially.
"Use" is putting it strongly; models already converted to both FP16 and FP32 precision appear to be installed automatically when OpenVINO is set up.
Since the Neural Compute Stick will be used this time, the FP16 version is the one used in the later steps.

/opt/intel/computer_vision_sdk/deployment_tools/intel_models/semantic-segmentation-adas-000/FP16
/opt/intel/computer_vision_sdk/deployment_tools/intel_models/semantic-segmentation-adas-000/FP32

semantic-segmentation-adas-0001.bin
semantic-segmentation-adas-0001.xml

#◆ Converting my own Semantic Segmentation model "ICNet"
The .pb file used is the one placed at ICNet-tensorflow - PINTO0309 - Github.
##● Conversion to data type FP16
Run the following commands.

Script to convert the "ICNet" model to IR FP16
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/ICNet
$ sudo mkdir -p 10_lrmodels/ICNet/FP16
$ sudo wget https://github.com/PINTO0309/ICNet-tensorflow/raw/pinto0309work/snapshots/semanticsegmentation_ICNet.pb -P 01_pbmodels/ICNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/ICNet/semanticsegmentation_ICNet.pb \
--output_dir 10_lrmodels/ICNet/FP16 \
--input input \
--output ResizeBilinear_19 \
--data_type FP16

It seems to have succeeded.

**IR conversion log**
Conversion log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/ICNet/semanticsegmentation_ICNet.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP16
	- IR output name: 	semanticsegmentation_ICNet
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	ResizeBilinear_19
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP16
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	True
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP16/semanticsegmentation_ICNet.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP16/semanticsegmentation_ICNet.bin
[ SUCCESS ] Total execution time: 6.58 seconds. 
![S3SVMM~H.PNG](https://qiita-image-store.s3.amazonaws.com/0/194769/9f4a9e13-e805-a6ef-a507-05ba07d24dbc.png)

##● Conversion to data type FP32
Run the following commands.

Script to convert the "ICNet" model to IR FP32
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/ICNet
$ sudo mkdir -p 10_lrmodels/ICNet/FP32
$ sudo wget https://github.com/PINTO0309/ICNet-tensorflow/raw/pinto0309work/snapshots/semanticsegmentation_ICNet.pb -P 01_pbmodels/ICNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/ICNet/semanticsegmentation_ICNet.pb \
--output_dir 10_lrmodels/ICNet/FP32 \
--input input \
--output ResizeBilinear_19 \
--data_type FP32

This one also seems to have succeeded.

**IR conversion log**
Conversion log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/ICNet/semanticsegmentation_ICNet.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP32
	- IR output name: 	semanticsegmentation_ICNet
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	ResizeBilinear_19
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP32
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	True
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP32/semanticsegmentation_ICNet.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP32/semanticsegmentation_ICNet.bin
[ SUCCESS ] Total execution time: 8.47 seconds. 
![STROU9~D.jpg](https://qiita-image-store.s3.amazonaws.com/0/194769/e14e2bac-7f06-c890-2173-f8dca1b059e0.jpeg)

#◆ Building and running a UNet environment with OpenVINO
I really wanted to implement ENet, but the ScatterNd conversion error could not be cleared no matter what, so I reluctantly implement UNet instead.

Sample program for real-time segmentation with UNet
import sys
import cv2
import numpy as np
from PIL import Image
import time
from openvino.inference_engine import IENetwork, IEPlugin

model_xml='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.xml'
model_bin='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.bin'
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
seg_image = Image.open("data/input/009649.png")
palette = seg_image.getpalette() # Get a color palette
index_void = 2 # Define index_void Back Ground
camera_width = 320
camera_height = 240
fps = ""
elapsedTime = 0

plugin = IEPlugin(device="HETERO:MYRIAD,CPU")
plugin.set_config({"TARGET_FALLBACK": "HETERO:MYRIAD,CPU"})
plugin.set_initial_affinity(net)

#plugin = IEPlugin(device="MYRIAD")
#plugin = IEPlugin(device="CPU")

exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))        #input_blob = 'input'
out_blob   = next(iter(net.outputs))       #out_blob   = 'output/BiasAdd'
n, c, h, w = net.inputs[input_blob].shape  #n, c, h, w = 1, 3, 256, 256

del net

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 30)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, camera_width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, camera_height)
time.sleep(1)

while cap.isOpened():
    t1 = time.time()
    ret, frame = cap.read()
    if not ret:
        break
    #frame = cv2.imread('data/input/000003.jpg')
    prepimg = frame[:, :, ::-1].copy()
    #prepimg = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    prepimg = Image.fromarray(prepimg)
    prepimg = prepimg.resize((256, 256), Image.ANTIALIAS)
    prepimg = np.asarray(prepimg) / 255.0
    prepimg = prepimg.transpose((2, 0, 1)).reshape((1, c, h, w))

    t2 = time.perf_counter()
    exec_net.start_async(request_id=0, inputs={input_blob: prepimg})

    if exec_net.requests[0].wait(-1) == 0:
        outputs = exec_net.requests[0].outputs[out_blob] # (1, 3, 256, 256)
        print("SegmentationTime = {:.7f}".format(time.perf_counter() - t2))
        outputs = outputs.transpose((2, 3, 1, 0)).reshape((h, w, c)) # (256, 256 3)
        outputs = cv2.resize(outputs, (camera_width, camera_height)) # (240, 320, 3)

        # View
        res = np.argmax(outputs, axis=2)
        if index_void is not None:
            res = np.where(res == index_void, 0, res)
        image = Image.fromarray(np.uint8(res), mode="P")
        image.putpalette(palette)
        image = image.convert("RGB")

        image = np.asarray(image)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        image = cv2.addWeighted(frame, 1, image, 0.9, 0)

    cv2.putText(image, fps, (camera_width-180,15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (38,0,255), 1, cv2.LINE_AA)
    cv2.imshow("Result", image)

    if cv2.waitKey(1)&0xFF == ord('q'):
        break
    elapsedTime = time.time() - t1
    fps = "(Playback) {:.1f} FPS".format(1/elapsedTime)

cv2.destroyAllWindows()
del exec_net
del plugin

000003.jpg
4.jpg

#◆ Processing-speed measurement results
Inference_Time2.jpg
Hey! The CPU is faster! Σ(゚ロ゚;)
Surely the Intel 7th Gen Core m3-7y30 cannot actually be outperforming the dedicated AI chip, the Myriad X... can it???
Is MKL-DNN really that powerful???
I was so genuinely skeptical that I immediately raised an issue on the official forum.
In any case, with the current measurement method, performance is somehow better without the Neural Compute Stick.
That said, it is clear that the Neural Compute Stick v2 delivers more than twice the performance of the v1.
It is also clear that this is a huge speed-up compared with the RaspberryPi3's ARM CPU alone, which took 11 seconds to process the same model.
Incidentally, the sample program above can segment video captured from a USB camera in real time if you adjust the commented-out parts.
With USB camera capture it achieved roughly 4 FPS to 5 FPS.
That said, the accuracy is so terrible that it is not really usable...
[Added 2018/11/29]
After verifying together with engineers overseas, we reached the conclusion that it can at least beat an Intel Celeron (lol). The verification results are here.

#◆ Issue posted to the official forum
https://software.intel.com/en-us/forums/computer-vision/topic/800215

#◆ [Partial success] Building and running an ADAS segmentation environment with OpenVINO

See below for the list of downloadable models.
Intel's official tutorial is too sloppy, so you may as well borrow from the OpenCV repository on Github instead.
I compared the two, and the OpenCV content appears to be the newer one.
Incidentally, the OpenCV repository also offers a wider variety of downloadable models.
This is only for reference; you can continue the verification that follows without doing it.

OpenCV - Github - Public Topologies Downloader

Example of downloading models from OpenCV's model-download repository
$ sudo -H pip3 install pyyaml requests
$ cd ~
$ git clone https://github.com/opencv/open_model_zoo.git
$ cd open_model_zoo/model_downloader
$ ./downloader.py --name semantic-segmentation-adas-0001
$ ./downloader.py --name semantic-segmentation-adas-0001-fp16
$ ./downloader.py --name <whatever-model-you-need>
 :

Now for the main topic. Even if you skipped the step above, just carry out the steps from here onward.
Run the command below to build the sample programs.

Running the shell script that builds the sample programs
$ sudo /opt/intel/computer_vision_sdk/deployment_tools/inference_engine/samples/build_samples.sh

For some reason the built binaries end up under home/<username>/inference_engine_samples_build/intel64/Release.

Moving into the sample program folder and displaying the usage
$ cd ~/inference_engine_samples_build/intel64
$ sudo chmod 777 Release
$ cd Release
$ ./segmentation_demo -h

[ INFO ] InferenceEngine: 
	API version ............ 1.4
	Build .................. 17328
[ INFO ] Parsing input parameters

segmentation_demo [OPTION]
Options:

    -h                        Print a usage message.
    -i "<path>"               Required. Path to an .bmp image.
    -m "<path>"               Required. Path to an .xml file with a trained model.
      -l "<absolute_path>"    Required for MKLDNN (CPU)-targeted custom layers. Absolute path to a shared library with the kernels impl.
          Or
      -c "<absolute_path>"    Required for clDNN (GPU)-targeted custom kernels. Absolute path to the xml file with the kernels desc.
    -pp "<path>"              Path to a plugin folder.
    -d "<device>"             Specify the target device to infer on: CPU, GPU, FPGA or MYRIAD is acceptable. The demo will look for a suitable plugin for a specified device (CPU by default).
    -ni "<integer>"           Number of iterations (default 1)
    -pc                       Enables per-layer performance report

First, run it in Neural Compute Stick mode.

Running the SemanticSegmentation sample program
$ ./segmentation_demo \
-i test.png \
-m /opt/intel/computer_vision_sdk/deployment_tools/intel_models/semantic-segmentation-adas-000/FP16/semantic-segmentation-adas-0001.xml \
-d MYRIAD \
-pc

It does not work... Are you kidding me, Intel?
Well, I do understand that the Python API itself is still only a preview release.

Execution error log
[ INFO ] InferenceEngine: 
	API version ............ 1.4
	Build .................. 17328
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ]     ./test.png
[ INFO ] Loading plugin

	API version ............ 1.4
	Build .................. 17328
	Description ....... myriadPlugin
[ INFO ] Loading network files
[ INFO ] Preparing input blobs
[ WARNING ] Image is resized from (512, 256) to (2048, 1024)
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ ERROR ] Cannot convert layer "argmax" due to unsupported layer type "ArgMax"

ArgMax is something you could write yourself outside the model in one or two lines, so I went looking for the pre-conversion model, only to find that, for some reason, the caffemodel is not published.
Staggering... The low quality and the disregard for users are worse than that famous ○i○r○s○f○.
Still, this old man does not get discouraged over every little thing. On to the next attempt.
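
For reference, this is roughly what I mean by doing the ArgMax outside the model (just a sketch with a dummy output tensor; the real shape and class count would come from the model's raw output):

ArgMax as post-processing outside the network (sketch)
import numpy as np

raw_output = np.random.rand(1, 20, 256, 512).astype(np.float32)  # dummy (N, C, H, W) class scores
class_map = np.argmax(raw_output[0], axis=0)                     # (H, W) per-pixel class indices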

Next, try running in CPU mode.

Running the SemanticSegmentation sample program
$ ./segmentation_demo \
-i test.png \
-m /opt/intel/computer_vision_sdk/deployment_tools/intel_models/semantic-segmentation-adas-000/FP16/semantic-segmentation-adas-0001.xml \
-d CPU \
-pc

Feeding in the test image below...
test.jpg
...it was segmented quite cleanly.
Even though it ran on the CPU, the inference time was 909 ms.
Pretty fast!!
out_0.jpg

#◆ Building an ICNet environment with OpenVINO
The last bastion: ICNet. The future of edge segmentation rests on your shoulders.
Run the program below.
This time the CPU extension library is enabled.

Sample program for real-time segmentation with ICNet
import sys
import cv2
import numpy as np
from PIL import Image
import time
from openvino.inference_engine import IENetwork, IEPlugin

model_xml='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP32/semanticsegmentation_ICNet.xml'
model_bin='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/ICNet/FP32/semanticsegmentation_ICNet.bin'
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
seg_image = Image.open("data/input/009649.png")
palette = seg_image.getpalette() # Get a color palette
index_void = 2 # Define index_void Back Ground
camera_width = 320
camera_height = 240
fps = ""
elapsedTime = 0

#plugin = IEPlugin(device="HETERO:MYRIAD,CPU")
#plugin.set_config({"TARGET_FALLBACK": "HETERO:MYRIAD,CPU"})
#plugin.set_initial_affinity(net)

#plugin = IEPlugin(device="MYRIAD")
plugin = IEPlugin(device="CPU")

plugin.add_cpu_extension("/home/alpha/inference_engine_samples_build/intel64/Release/lib/libcpu_extension.so")
exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))        #input_blob = 'input'
out_blob   = next(iter(net.outputs))       #out_blob   = 'ResizeBilinear_19'
#print(net.inputs[input_blob].shape)
h, w, c    = net.inputs[input_blob].shape  #h, w, c = 256, 512, 3

del net

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 30)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, camera_width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, camera_height)
time.sleep(1)

while cap.isOpened():
    t1 = time.time()
    #ret, frame = cap.read()
    #if not ret:
    #    break
    frame = cv2.imread('data/input/000003.jpg')
    camera_height, camera_width, channels = frame.shape[:3]
    prepimg = frame[:, :, ::-1].copy()
    prepimg = Image.fromarray(prepimg)
    prepimg = prepimg.resize((512, 256), Image.ANTIALIAS)
    if prepimg.mode == "RGBA":
        prepimg = prepimg.convert("RGB")
    t2 = time.perf_counter()

    exec_net.start_async(request_id=0, inputs={input_blob: prepimg})

    if exec_net.requests[0].wait(-1) == 0:
        outputs = exec_net.requests[0].outputs[out_blob] # (1, 19, 256, 256)

    print(outputs[0].shape)
    print("SegmentationTime = {:.7f}".format(time.perf_counter() - t2))
    outputs = outputs[0] # (19, 256, 512)
    outputs = np.argmax(outputs, axis=0) # (256, 512)

    # View
    image = Image.fromarray(np.uint8(outputs), mode="P")
    image.putpalette(palette)
    image = image.convert("RGB")
    image = image.resize((camera_width, camera_height))

    image.save("2.jpg")
    image = np.asarray(image)

    cv2.putText(image, fps, (camera_width-180,15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (38,0,255), 1, cv2.LINE_AA)
    cv2.imshow("Result", image)

    if cv2.waitKey(1)&0xFF == ord('q'):
        break
    elapsedTime = time.time() - t1
    fps = "(Playback) {:.1f} FPS".format(1/elapsedTime)

cv2.destroyAllWindows()
del exec_net
del plugin

Inference ran at a furious 60 ms, but the result came out completely scrambled.
I wonder where the bug is...

#◆ Articles I referred to, and acknowledgements
ammo0613 beat me to it... (´;Д;`)
AIを始めよう!OpenVINOのインストールからデモの実行まで - ammo0613 - Qiita
AIを始めよう!OpenVINOで使うモデルを整備する - ammo0613 - Qiita
AIを始めよう!PythonでOpenVINOの仕組みを理解する - ammo0613 - Qiita

#◆ Today's summary

  • The native SDK, NCSDK v1/v2, is riddled with bugs and Movidius does not seem inclined to fix it; it really is hopeless
  • Models that could not be executed with NCSDK turned out to run on OpenVINO
  • OpenVINO is quite fast, and as an SDK it is remarkably polished
  • If your device has an Intel x86/64 CPU of the 7th generation or later, OpenVINO is very much recommended
  • Neural Compute Stick 2... at this point it does not deliver the performance I hoped for, so I am not sure I can recommend it...
  • Personally, weighing up cost versus performance, buying a LattePanda with an Intel CPU on its own seems more versatile than buying an ARM-based TX2 or the like (frankly, the Stick is painfully slow, and with no ARM support it is as good as junk)
  • Next time I plan to play with real-time segmentation using the Semantic Segmentation model Intel provides

★★ Notes to self ★★
https://software.intel.com/en-us/articles/OpenVINO-InferEngine#CPU%20Extensions
https://software.intel.com/en-us/articles/OpenVINO-InferEngine#Adding%20your%20own%20kernels

#◆ Next article
Forcing RealTime Semantic Segmentation with the CPU alone [1 FPS / CPU only]

#Introducing Ubuntu 16.04 + OpenVINO to Latte Panda Alpha 864 (without OS included) and enjoying Semantic Segmentation with Neural Compute Stick and Neural Compute Stick 2
#◆ Introduction
In the end, the CPU alone achieves segmentation at the speed shown below.
sample.gif

Last article, [Detection rate approx. 30FPS] RaspberryPi3 Model B(plus none) is slightly later than TX2, acquires object detection rate of MobilenetSSD and corresponds to MultiModel (VOC+WIDER FACE).
↓ Youtube plays on click (Neural Compute Stick + MobilenetSSD + RaspberryPi)
Screenshot 2018-11-26 00:06:00.png
It would not be an exaggeration to say we tried OpenVINO, which seems unlikely to be affected by SDK quality, precisely because the quality of NCSDK is remarkably low.

I installed Ubuntu 16.04 on a Latte Panda Alpha 864 (without OS) obtained by pre-order in early November, then installed OpenVINO and verified the operation of custom segmentation models on the Neural Compute Stick and Neural Compute Stick 2.
The purpose of procuring the Latte Panda Alpha is to verify the usefulness of Neural Compute Stick + OpenVINO on a single-board computer.
Because this completely exceeds the level of a hobby in terms of cost, please do not imitate it.

OpenVINO converts models generated by Caffe, TensorFlow, MXNet, Kaldi and ONNX into an intermediate binary in a common format (IR [Intermediate Representation of the model]) and runs them in a uniform way via the inference engine API (Inference Engine).
The execution platform does not support the ARM architecture; it only supports Intel x86/64 CPUs.
02.jpg

1. Develop Multiplatform Computer Vision Solutions - Intel Developer Zone
2. Install the Intel® Distribution of OpenVINO™ toolkit for Linux - Intel Developer Zone
3. How to Integrate the Inference Engine in Your Application - Intel Inference Engine Developer Guide
4. Accelerate Deep Learning Inference with Integrated Intel® Processor Graphics Rev 2.0 - Intel Developer Zone

#◆ Appearance of Latte Panda Alpha
1. Outer case1
03.jpg
2. Outer case2
04.jpg
3.Inside box
05.jpg
4.Supplied package (Case not included)
06.jpg
5.Sense of size compared with a cigarette box (slightly larger than a RaspberryPi in length and width, but thinner: about half the thickness of a cigarette box)
08.jpg

#◆ Specification of LattePanda Alpha

  • Price:
  • OS-less version:$358 (¥40,000)
  • Win10 bundle version:$398 (¥45,000)
  • CPU:
  • Intel 7th Gen Core m3-7y30
  • Core:
  • 1.6-2.6GHz Dual-Core,Four-Thread
  • Benchmark (PassMark):
  • Up to 3500, double computing power compared with same price range products in the market
  • Graphics:
  • Intel HD Graphics 615, 300-900MHz
  • RAM:
  • 8G LPDDR3 1866MHz Dual-Channel
  • Memory:
  • 64GB eMMC V5.0l
  • External Memory:
  • 1x M.2 M Key, PCIe 4x, Supports NVMe SSD and SATA SSD
  • 1x M.2 E Key, PCIe 2x,Supports USB2.0, UART, PCM
  • Connectivity:
  • Wi-Fi 802.11 AC, 2.4G & 5G
  • Dual Band Bluetooth 4.2
  • Gigabit Ethernet
  • USB Ports:
  • 3x USB 3.0 Type A
  • 1x USB Type C, supports PD, DP, USB 3.0
  • Display:
  • HDMI Output
  • Type-C DP Support
  • Extendable eDP touch displays
  • Co-processor:
  • Arduino Leonardo
  • GPIO & Other Features:
  • 2x 50p GPIOs including I2C
  • I2S, USB
  • RS232
  • UART
  • RT
  • Power Management
  • Extendable power button
  • OS Support:
  • Windows 10 Pro
  • Linux Ubuntu

#◆ Parts used for kitting

  • Windows 10 PC (Anything is OK if you can create USB boot media for Ubuntu 16.04)
  • LattePanda Alpha
  • Intel Movidius Neural Compute Stick v1 / v2
  • USB Memory 16GB
  • HDMI cable
  • HDMI display
  • USB keyboard
  • USB mouse

#◆ Installation / use software

  • Ubuntu 16.04 x86_64
  • OpenVINO toolkit 2018 R4 (2018.4.420)
  • Python 3.5
  • OpenCV 3.4.3 (pip3 install)
  • Rufus v3.3
  • Tensorflow v1.11.0 (pip3 install)

#◆ Installation of Ubuntu 16.04
##● Working with Windows 10 PC (Create USB flash drive of Ubuntu1604)
1.Ubuntu 16.04.5 Desktop Image Download (1.5GB)
http://releases.ubuntu.com/releases/16.04/ubuntu-16.04.5-desktop-amd64.iso

2.Download USB flash drive creation tool Rufus

rufus-128.png Official page - Rufus - Japanese

Download link https://github.com/pbatard/rufus/releases/download/v3.3/rufus-3.3.exe

3.Insert USB memory into Windows 10 PC

4.Start Rufus (rufus-3.3.exe) and write the Ubuntu 16.04 image in DD mode
Rufus main screen (DD mode designation dialog is displayed after pressing the start button)
01.png
Specify DD mode
02.png
State of writing
03.png

5.Remove USB memory from Windows 10 PC

##● Working with LattePanda Alpha 864

6.Connect the Wi-Fi antenna, keyboard, mouse, HDMI cable / display, USB memory to LattePanda Alpha and finally connect the power

Example 1) Wi-Fi antenna connection (there are two antennas)
ezgif.com-optimize1.gif
Example 2) Connecting the HDMI cable
ezgif.com-optimize2.gif
Example 3) All parts connected
12.jpg
Example 4) Connect the Type-C power cable
When the Type-C cable is connected, the red power-indicator LED stays lit and the blue LED lights momentarily.
Wait for the blue LED to start blinking, then press and hold the power button for 3 seconds to turn on the power; the blue LED will stay lit.
ezgif.com-optimize3.gif

7.As soon as the LattePanda Alpha powers on, press the Esc key on the keyboard
8.Under Boot, select Boot Option #1, then press Enter
DSC_0110.jpg
9.Select the USB memory name + Partition 1, then press Enter
DSC_0111.jpg
10.Under Save & Exit, select Save Changes and Exit, then press Enter
DSC_0112.jpg
11.Select Yes, then press Enter
DSC_0113.jpg
12.Select Install Ubuntu, then press Enter
DSC_0114.jpg
13.Wait for a while
DSC_0115.jpg
14.Select English, then Continue
DSC_0116.jpg
15.To connect to Wi-Fi, select Connect to this network, choose the SSID from the list, then Connect
DSC_0117.jpg
16.Enter the Wi-Fi password, and Connect
DSC_0118.jpg
17.Select Install third-party software for graphics and Wi-Fi hardware, Flash, MP3 and other media, then Continue
DSC_0119.jpg
18.Select Erase disk and install Ubuntu, then Install Now
DSC_0120.jpg
19.Continue
DSC_0121.jpg
20.Select Tokyo, then Continue
DSC_0122.jpg
21.Select Japanese in both the left and right columns, then Continue
DSC_0123.jpg
22.Enter the user ID, machine name and password, then Continue
DSC_0124.jpg
23.Wait for a while
DSC_0125.jpg
24.Restart Now
※Rebooting starts; if it hangs, disconnect the power cable once and power on again
DSC_0126.jpg
25.Ubuntu 16.04 startup completed
DSC_0127.jpg
26.After logging on, start the terminal and update

Update_command
$ sudo apt-get update
$ sudo apt-get upgrade

Official installation procedure
http://docs.lattepanda.com/content/alpha_edition/power_on/

#◆ Installation of OpenVINO
OpenVINO version to be installed: 2018.4.420
##● Installation of OpenVINO itself
Official tutorial
##● Additional installation for Intel Movidius Neural Compute Stick v1 / v2
Execute the following command.

Update_USB_access_rule
$ cd ~
$ sudo usermod -a -G users "$(whoami)"
$ sudo cat <<EOF > 97-usbboot.rules
SUBSYSTEM=="usb", ATTRS{idProduct}=="2150", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="2485", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="f63b", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
EOF

$ sudo cp 97-usbboot.rules /etc/udev/rules.d/
$ sudo udevadm control --reload-rules
$ sudo udevadm trigger
$ sudo ldconfig
$ sudo rm 97-usbboot.rules

Use the following commands to create the symbolic links manually.

Symbolic_manual_generation_command
$ cd /opt/intel/common/mdf/lib64
$ sudo mv igfxcmrt64.so igfxcmrt64.so.org
$ sudo ln -s libigfxcmrt64.so igfxcmrt64.so

$ cd /opt/intel/mediasdk/lib64
$ sudo mv libmfxhw64.so.1 libmfxhw64.so.1.org
$ sudo mv libmfx.so.1 libmfx.so.1.org
$ sudo mv libva-glx.so.2 libva-glx.so.2.org
$ sudo mv libva.so.2 libva.so.2.org
$ sudo mv libigdgmm.so.1 libigdgmm.so.1.org
$ sudo mv libva-drm.so.2 libva-drm.so.2.org
$ sudo mv libva-x11.so.2 libva-x11.so.2.org
$ sudo ln -s libmfxhw64.so.1.28 libmfxhw64.so.1
$ sudo ln -s libmfx.so.1.28 libmfx.so.1
$ sudo ln -s libva-glx.so.2.300.0 libva-glx.so.2
$ sudo ln -s libva.so.2.300.0 libva.so.2
$ sudo ln -s libigdgmm.so.1.0.0 libigdgmm.so.1
$ sudo ln -s libva-drm.so.2.300.0 libva-drm.so.2
$ sudo ln -s libva-x11.so.2.300.0 libva-x11.so.2

Run sudo ldconfig again.

Rerun_sudo_ldconfig
$ cd ~
$ sudo ldconfig

The OpenCV 4.0.0-pre installed by default has a GStreamer bug and did not work properly, so install OpenCV 3.4.3 separately.
Execute the following command.

Introduction_of_OpenCV3.4.3
$ sudo -H pip3 install opencv-python==3.4.3.18
$ nano ~/.bashrc
export PYTHONPATH=/usr/local/lib/python3.5/dist-packages/cv2:$PYTHONPATH

$ source ~/.bashrc

Official installation procedure
Intel®Movidius™Neural Compute Stick and Intel®Neural Compute Stick 2 additional installation procedure

##● Upgrade to Tensorflow v1.11.0
Upgrade the old Tensorflow v1.9.0 that is installed by default to Tensorflow v1.11.0; otherwise the subsequent Model Optimizer processing will fail.

Tensorflow_version_check_command_example
$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
1.9.0
Upgrade_command_to_Tensorflow_v1.11.0
$ sudo -H pip3 install pip --upgrade
$ sudo -H pip3 install tensorflow==1.11.0 --upgrade

##● Settings for offloading custom layer behavior to Tensorflow
Intel official tutorial - Offloading Computations to TensorFlow*
Custom layer operations that are not supported by the standard OpenVINO API can be offloaded to the Tensorflow side.
Execute the following commands to build the inference engine layer yourself using the Tensorflow runtime.
However, the scripts provided by Intel contain bugs, so they need to be corrected manually.
Bazel 0.18.1 also needs to be installed at this point.
As of November 17, 2018, the inference engine layer cannot be built with Bazel 0.19.0 or later, so be careful.

On a device that is not as well equipped with RAM as the LattePanda Alpha, for example one with only 1 GB of RAM, rewriting
sudo -H $HOME/bin/bazel build --config monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so

as
sudo -H $HOME/bin/bazel --host_jvm_args=-Xmx512m build --config monolithic --local_resources 1024.0,0.5,0.5 //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so
may allow the build to succeed.

Build_inference_engine_layer_at_TensorFlow_runtime
$ sudo apt-get install -y git pkg-config zip g++ zlib1g-dev unzip
$ cd ~
$ wget https://github.com/bazelbuild/bazel/releases/download/0.18.1/bazel-0.18.1-installer-linux-x86_64.sh
$ sudo chmod +x bazel-0.18.1-installer-linux-x86_64.sh
$ ./bazel-0.18.1-installer-linux-x86_64.sh --user
$ echo 'export PATH=$PATH:$HOME/bin' >> ~/.bashrc
$ source ~/.bashrc
$ cd /opt
$ sudo git clone -b v1.11.0 https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ sudo git checkout -b v1.11.0
$ echo 'export TF_ROOT_DIR=/opt/tensorflow' >> ~/.bashrc
$ source ~/.bashrc
$ sudo nano /opt/intel/computer_vision_sdk/bin/setupvars.sh

#Before
INSTALLDIR=/opt/intel//computer_vision_sdk_2018.4.420
↓
#After
INSTALLDIR=/opt/intel/computer_vision_sdk_2018.4.420

$ source /opt/intel/computer_vision_sdk/bin/setupvars.sh
$ sudo nano /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/tf_call_ie_layer/build.sh

#Before
bazel build --config=monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so
↓
#After
sudo -H $HOME/bin/bazel build --config monolithic //tensorflow/cc/inference_engine_layer:libtensorflow_call_layer.so

$ sudo -E /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/tf_call_ie_layer/build.sh

The inference engine layer is generated in the following path.

libtensorflow_call_layer.so's_PATH
/opt/tensorflow/bazel-bin/tensorflow/cc/inference_engine_layer/libtensorflow_call_layer.so

As it stands, running python as an ordinary user raises a Permission denied error because the user has no access rights under /opt, so move the library to a different location.

Changing_the_location_of_.so
$ su -
$ cp /opt/tensorflow/bazel-bin/tensorflow/cc/inference_engine_layer/libtensorflow_call_layer.so /usr/local/lib
$ exit
$ nano ~/.bashrc
export PYTHONPATH=$PYTHONPATH:/usr/local/lib
$ source ~/.bashrc
$ sudo ldconfig

#◆ Trying out the demo programs
##● Sample image classification
Execute the following command.

Image_Classification_Sample_SqueezeNet
$ cd /opt/intel/computer_vision_sdk/deployment_tools/demo
$ ./demo_squeezenet_download_convert_run.sh

The image shown below is loaded...
car.jpg
It was recognized as a sports car with about 84% probability (0.836).

Result
###################################################

Run Inference Engine classification sample

Run ./classification_sample -d CPU -i /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png -m /home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.xml 

[ INFO ] InferenceEngine: 
	API version ............ 1.4
	Build .................. 17328
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ]     /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png
[ INFO ] Loading plugin

	API version ............ 1.4
	Build .................. lnx_20181004
	Description ....... MKLDNNPlugin
[ INFO ] Loading network files:
	/home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.xml
	/home/alpha/openvino_models/ir/squeezenet1.1/FP32/squeezenet1.1.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image is resized from (787, 259) to (227, 227)
[ INFO ] Batch size is 1
[ INFO ] Preparing output blobs
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference (1 iterations)
[ INFO ] Processing output blobs

Top 10 results:

Image /opt/intel/computer_vision_sdk/deployment_tools/demo/../demo/car.png

817 0.8363345 label sports car, sport car
511 0.0946488 label convertible
479 0.0419131 label car wheel
751 0.0091071 label racer, race car, racing car
436 0.0068161 label beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
656 0.0037564 label minivan
586 0.0025741 label half track
717 0.0016069 label pickup, pickup truck
864 0.0012027 label tow truck, tow car, wrecker
581 0.0005882 label grille, radiator grille


total inference time: 6.5318211
Average running time of one iteration: 6.5318211 ms

Throughput: 153.0966609 FPS

[ INFO ] Execution successful


###################################################

Demo completed successfully.

##● Sample of three-step inference
This appears to be a sample that runs three stages of inference consecutively, each with a separately trained model (a rough sketch of such a cascaded pipeline is included at the end of this subsection):

  • Car detection
  • Detection of license plate
  • Character recognition in the identified license plate

Execute the following command.

Sample_of_three_step_inference
$ cd /opt/intel/computer_vision_sdk/deployment_tools/demo
$ ./demo_security_barrier_camera.sh

license-plate.jpeg
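
For reference, a rough sketch of how such a cascaded pipeline can be written with the Inference Engine Python API: each stage's result is cropped from the frame and fed to the next model. The IR file names and the crop placeholders below are my own assumptions for illustration, not the models or logic actually used by demo_security_barrier_camera.sh; exec_net.infer() is the synchronous call of the same Python API used elsewhere in this article.

Cascaded_three_stage_inference_sketch
import cv2
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

def load_model(xml, weights, plugin):
    # Load one IR pair and return the executable network plus its I/O metadata
    net = IENetwork.from_ir(model=xml, weights=weights)
    in_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))
    shape = net.inputs[in_blob].shape      # (n, c, h, w)
    return plugin.load(network=net), in_blob, out_blob, shape

def preprocess(img, shape):
    # Resize to the model input size and convert HWC/BGR to NCHW
    n, c, h, w = shape
    return cv2.resize(img, (w, h)).transpose((2, 0, 1)).reshape((n, c, h, w))

plugin = IEPlugin(device="CPU")
# Placeholder IR names - substitute the real detection / plate / OCR models
car_net, car_in, car_out, car_shape = load_model("car-detection.xml", "car-detection.bin", plugin)
plate_net, plate_in, plate_out, plate_shape = load_model("plate-detection.xml", "plate-detection.bin", plugin)
ocr_net, ocr_in, ocr_out, ocr_shape = load_model("plate-ocr.xml", "plate-ocr.bin", plugin)

frame = cv2.imread("license-plate.jpeg")

# Stage 1: detect cars in the frame
cars = car_net.infer(inputs={car_in: preprocess(frame, car_shape)})[car_out]
car_crop = frame       # placeholder: crop the detected car bounding box in a real pipeline
# Stage 2: detect the license plate inside the car crop
plates = plate_net.infer(inputs={plate_in: preprocess(car_crop, plate_shape)})[plate_out]
plate_crop = car_crop  # placeholder: crop the detected plate region
# Stage 3: read the characters on the plate crop
text = ocr_net.infer(inputs={ocr_in: preprocess(plate_crop, ocr_shape)})[ocr_out]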

##● Various sample programs other than the above
Intel® Distribution of OpenVINO™ Toolkit - Inference Engine Samples

#◆ Proprietary model conversion and execution sample script
Official tutorial - Using the Model Optimizer to Convert TensorFlow* Models
Official tutorial - Model Optimizer Developer Guide - TensorFlow* Models with Custom Layers

Below is a sample script that converts a Tensorflow .pb (frozen graph) into the IR format for OpenVINO.

Convert_command
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ python3 mo_tf.py --input_model <INPUT_MODEL>.pb
**Conversion command options and explanation**
Conversion_command_options
optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        Tensorflow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
  --scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         The name of the input operation of the given model.
                        Usually this is a name of the input placeholder of the
                        model.
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --move_to_preprocess  Move mean values to IR preprocess section
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that
                        correspond to log level equals ERROR, that can be set
                        with the following option: --log_level. By default,
                        log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided
                        value, e.g.: "node_name->True"
  --generate_deprecated_IR_V2
                        Force to generate legacy/deprecated IR V2 to work with
                        previous versions of the Inference Engine. The
                        resulting IR may or may not be correctly loaded by
                        Inference Engine API (including the most recent and
                        old versions of Inference Engine) and provided as a
                        partially-validated backup option for specific
                        deployment scenarios. Use it at your own discretion.
                        By default, without this option, the Model Optimizer
                        generates IR V3.
**Tensorflow-specific conversion command options and explanation**
Tensorflow-specific_conversion_command_options
TensorFlow*-specific parameters:
  --input_model_is_text
                        TensorFlow*: treat the input model file as a text
                        protobuf format. If not specified, the Model Optimizer
                        treats it as a binary file by default.
  --input_checkpoint INPUT_CHECKPOINT
                        TensorFlow*: variables file to load.
  --input_meta_graph INPUT_META_GRAPH
                        Tensorflow*: a file with a meta-graph of the model
                        before freezing
  --saved_model_dir SAVED_MODEL_DIR
                        TensorFlow*: directory representing non frozen model
  --saved_model_tags SAVED_MODEL_TAGS
                        Group of tag(s) of the MetaGraphDef to load, in string
                        format, separated by ','. For tag-set contains
                        multiple tags, all tags must be passed in.
  --offload_unsupported_operations_to_tf
                        TensorFlow*: automatically offload unsupported
                        operations to TensorFlow*
  --tensorflow_subgraph_patterns TENSORFLOW_SUBGRAPH_PATTERNS
                        TensorFlow*: a list of comma separated patterns that
                        will be applied to TensorFlow* node names to infer a
                        part of the graph using TensorFlow*.
  --tensorflow_operation_patterns TENSORFLOW_OPERATION_PATTERNS
                        TensorFlow*: a list of comma separated patterns that
                        will be applied to TensorFlow* node type (ops) to
                        infer these operations using TensorFlow*.
  --tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
                        TensorFlow*: update the configuration file with node
                        name patterns with input/output nodes information.
  --tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
                        TensorFlow*: use the configuration file with custom
                        operation description.
  --tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
                        TensorFlow*: path to the pipeline configuration file
                        used to generate model created with help of Object
                        Detection API.
  --tensorboard_logdir TENSORBOARD_LOGDIR
                        TensorFlow*: dump the input graph to a given directory
                        that should be used with TensorBoard.
  --tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
                        TensorFlow*: comma separated list of shared libraries
                        with TensorFlow* custom operations implementation.
  --disable_nhwc_to_nchw
                        Disables default translation from NHWC to NCHW

#◆ Converting my own Semantic Segmentation model "UNet"
First, I will start with UNet, whose structure is very simple.
The .pb file is available at TensorflowLite-UNet - PINTO0309 - Github
This is a Semantic Segmentation model that I trained on the Person class only.
TensorflowLite-UNet/model/semanticsegmentation_frozen_person_32.pb (31.1MB)

##● Conversion to data type FP16
Execute the following command.
--input_model is the name of the .pb file to convert (the frozen graph name)
--output_dir is the output destination path for the converted IR files
--input is the input node name (placeholder name)
--output is the output node name
--data_type is the data precision type after conversion [FP16/FP32/half/float]
--batch forcibly overrides the input batch size
--scale normalizes the value range
--mean_values specifies the per-pixel BGR mean values to subtract
--offload_unsupported_operations_to_tf offloads Tensorflow custom layers that OpenVINO cannot process to the Tensorflow side

Conversion_script_of_my_own_"UNet"_model_to_IR_FP16
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/UNet
$ sudo mkdir -p 10_lrmodels/UNet/FP16
$ sudo wget https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb -P 01_pbmodels/UNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP16 \
--input input \
--output output/BiasAdd \
--data_type FP16 \
--batch 1 

<Reference POST for calculating RGB average value>
https://forums.fast.ai/t/images-normalization/4058
https://github.com/DrSleep/tensorflow-deeplab-resnet/issues/106

Sample_logic_for_average_calculation_of_RGB_by_PYTHON
import glob
import cv2
import numpy as np

meanB = meanG = meanR = 0.0
imgcnt = 0
for path in glob.glob("data/train/*.jpg"):  # training image files (path is an example)
    jpgimg = cv2.imread(path)               # loaded in BGR channel order
    # Calculate the per-channel average value of one image
    mean = np.mean(jpgimg, axis=(0, 1))
    meanB += mean[0]
    meanG += mean[1]
    meanR += mean[2]
    imgcnt += 1
# Average the per-channel values over all training images
print("meanB =", meanB / imgcnt)
print("meanG =", meanG / imgcnt)
print("meanR =", meanR / imgcnt)

Since the model was converted from FP32 to FP16, the file size became 15.5 MB, half the size before conversion.

**IR conversion log**
conversion_log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16
	- IR output name: 	semanticsegmentation_frozen_person_32
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	output/BiasAdd
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP16
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP16/semanticsegmentation_frozen_person_32.bin
[ SUCCESS ] Total execution time: 3.86 seconds.
![SITWXV~4.jpg](https://qiita-image-store.s3.amazonaws.com/0/194769/4a35598f-de96-e839-246d-ec74882671e8.jpeg)

##● Conversion to data type FP32
Execute the following command.

Conversion_script_of_my_own_"UNet"_model_to_IR_FP32
$ cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
$ sudo mkdir -p 01_pbmodels/UNet
$ sudo mkdir -p 10_lrmodels/UNet/FP32
$ sudo wget https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb -P 01_pbmodels/UNet
$ sudo python3 mo_tf.py \
--input_model 01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb \
--output_dir 10_lrmodels/UNet/FP32 \
--input input \
--output output/BiasAdd \
--data_type FP32 \
--batch 1
**IR conversion log**
conversion_log
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/01_pbmodels/UNet/semanticsegmentation_frozen_person_32.pb
	- Path for generated IR: 	/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32
	- IR output name: 	semanticsegmentation_frozen_person_32
	- Log level: 	ERROR
	- Batch: 	1
	- Input layers: 	input
	- Output layers: 	output/BiasAdd
	- Input shapes: 	Not specified, inherited from the model
	- Mean values: 	Not specified
	- Scale values: 	Not specified
	- Scale factor: 	Not specified
	- Precision of IR: 	FP32
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	False
	- Reverse input channels: 	False
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Offload unsupported operations: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Operations to offload: 	None
	- Patterns to offload: 	None
	- Use the config file: 	None
Model Optimizer version: 	1.4.292.6ef7232d

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.xml
[ SUCCESS ] BIN file: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.bin
[ SUCCESS ] Total execution time: 3.70 seconds. 
![SKTIWX~T.jpg](https://qiita-image-store.s3.amazonaws.com/0/194769/db3d5960-33c9-78c7-f390-9add9635ec87.jpeg)

#◆ Building and running the UNet execution environment with OpenVINO

UNet_executable_program_sample_for_real-time_segmentation
import sys
import cv2
import numpy as np
from PIL import Image
import time
from openvino.inference_engine import IENetwork, IEPlugin

model_xml='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.xml'
model_bin='/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/10_lrmodels/UNet/FP32/semanticsegmentation_frozen_person_32.bin'
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
seg_image = Image.open("data/input/009649.png")
palette = seg_image.getpalette() # Get a color palette
index_void = 2 # Define index_void Back Ground
camera_width = 320
camera_height = 240
fps = ""
elapsedTime = 0

plugin = IEPlugin(device="HETERO:MYRIAD,CPU")
plugin.set_config({"TARGET_FALLBACK": "HETERO:MYRIAD,CPU"})
plugin.set_initial_affinity(net)

#plugin = IEPlugin(device="MYRIAD")
#plugin = IEPlugin(device="CPU")

exec_net = plugin.load(network=net)

input_blob = next(iter(net.inputs))        #input_blob = 'input'
out_blob   = next(iter(net.outputs))       #out_blob   = 'output/BiasAdd'
n, c, h, w = net.inputs[input_blob].shape  #n, c, h, w = 1, 3, 256, 256

del net

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 30)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, camera_width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, camera_height)
time.sleep(1)

while cap.isOpened():
    t1 = time.time()
    ret, frame = cap.read()
    if not ret:
        break
    #frame = cv2.imread('data/input/000003.jpg')
    prepimg = frame[:, :, ::-1].copy()
    #prepimg = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    prepimg = Image.fromarray(prepimg)
    prepimg = prepimg.resize((256, 256), Image.ANTIALIAS)
    prepimg = np.asarray(prepimg) / 255.0
    prepimg = prepimg.transpose((2, 0, 1)).reshape((1, c, h, w))

    t2 = time.perf_counter()
    exec_net.start_async(request_id=0, inputs={input_blob: prepimg})

    if exec_net.requests[0].wait(-1) == 0:
        outputs = exec_net.requests[0].outputs[out_blob] # (1, 3, 256, 256)
        print("SegmentationTime = {:.7f}".format(time.perf_counter() - t2))
        outputs = outputs.transpose((2, 3, 1, 0)).reshape((h, w, c)) # (256, 256, 3)
        outputs = cv2.resize(outputs, (camera_width, camera_height)) # (240, 320, 3)

        # View
        res = np.argmax(outputs, axis=2)
        if index_void is not None:
            res = np.where(res == index_void, 0, res)
        image = Image.fromarray(np.uint8(res), mode="P")
        image.putpalette(palette)
        image = image.convert("RGB")

        image = np.asarray(image)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        image = cv2.addWeighted(frame, 1, image, 0.9, 0)

    cv2.putText(image, fps, (camera_width-180,15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (38,0,255), 1, cv2.LINE_AA)
    cv2.imshow("Result", image)

    if cv2.waitKey(1)&0xFF == ord('q'):
        break
    elapsedTime = time.time() - t1
    fps = "(Playback) {:.1f} FPS".format(1/elapsedTime)

cv2.destroyAllWindows()
del exec_net
del plugin

000003.jpg
4.jpg

#◆ Measurement result of processing speed
Inference_Time2.jpg
With live USB camera capture, performance of 4 to 5 FPS was achieved.
