
Running the Coral USB Accelerator (TensorFlow Lite) on Ubuntu 20.04 on a Raspberry Pi 4 Model B

Last updated at 2020-07-30. Posted at 2020-07-23.

I want to run object detection with a webcam so that I can compare the results against Vitis AI.

Installing Ubuntu 20.04 on the Raspberry Pi

Use Raspberry Pi Imager to write Ubuntu to an SD card.

On first boot, log in with username ubuntu and password ubuntu, set a new password, and update the packages. (This takes quite a while.)

sudo apt update
sudo apt full-upgrade -y

Network setup

Create /etc/wpa_supplicant/wpa_supplicant-wlan0.conf with the following contents.

network={
    ssid="<SSID name>"
    scan_ssid=1
    key_mgmt=WPA-PSK
    psk="<password>"
}

Run systemctl enable wpa_supplicant@wlan0.
Then create /etc/systemd/network/25-wlan.network with the following contents.

[Match]
Name=wl*

[Network]
DHCP=true
MulticastDNS=true

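Create one for the wired interface as well, because screen forwarding is slow over wireless.
Create /etc/systemd/network/20-eth.network with the following contents.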

[Match]
Name=eth*

[Network]
DHCP=true

Boot is slow because of the message "A start job is running for wait for network to be configured.", so apply the following workaround.
Write the following to /etc/netplan/70-netcfg.yaml.

network:
    ethernets:
        wlan0:
            optional: true
    version: 2

Apply the configuration

sudo netplan apply

Building a minimal GUI (does not work properly?)

I followed the article "Building a minimal GUI with wayland and weston on Debian / Ubuntu" (Debian / Ubuntu上のwaylandとwestonで最小限のGUIを構築する).

First, remove the unneeded packages

sudo apt --purge remove nplan ifupdown avahi-autoipd

Switch to the Japanese locale

sudo dpkg-reconfigure locales
# Select ja_JP.UTF-8

Install the packages that seem to be needed

sudo apt install libnss-resolve libnss-systemd dbus-user-session policykit-1

Install the minimal set of Wayland/weston packages

sudo apt install weston xwayland xfonts-base fonts-vlgothic fonts-ipafont poppler-data

Write the following settings to .weston.ini

[core]
xwayland=true
modules=systemd-notify.so

How to launch (it fails to start)

weston --log=weston.log --config=.weston.ini

Using Python 3.7.8 on Ubuntu 20.04 (pyenv)

TensorFlow Lite supports only up to Python 3.7, while Ubuntu 20.04 ships Python 3.8, so I use pyenv.
Reference: https://www.kkaneko.jp/tools/ubuntu/ubuntu_pyenv.html

# Install the required packages
sudo apt -yV install --no-install-recommends make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev

# Download pyenv with git
git clone https://github.com/pyenv/pyenv.git ~/.pyenv

# Configuration
echo 'export PYENV_ROOT="${HOME}/.pyenv"' >> ~/.bashrc
echo 'if [ -d "${PYENV_ROOT}" ]; then' >> ~/.bashrc
echo '    export PATH=${PYENV_ROOT}/bin:$PATH' >> ~/.bashrc
echo '    eval "$(pyenv init -)"' >> ~/.bashrc
echo 'fi' >> ~/.bashrc
exec $SHELL -l
source ~/.bashrc

Install Python 3.7.8 and activate it

pyenv install 3.7.8
pyenv shell 3.7.8

Installing opencv into Python 3.7.8 under pyenv

I want an easy way to handle both the webcam and TensorFlow Lite, so I decided to use opencv.

First, install the required packages

sudo apt install libgtk2.0-dev pkg-config libusb-1.0-0-dev

Then build from source. If you want to read video files with VideoCapture, -DWITH_FFMPEG=ON is required.
Reference: https://zv-louis.hatenablog.com/entry/2018/05/08/063000

wget https://github.com/opencv/opencv/archive/4.4.0.zip -O opencv-4.4.0.zip
unzip opencv-4.4.0.zip
wget https://github.com/opencv/opencv_contrib/archive/4.4.0.zip -O opencv_contrib-4.4.0.zip
unzip opencv_contrib-4.4.0.zip

cd opencv-4.4.0
mkdir build
cd build
cmake .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DCMAKE_INSTALL_PREFIX=~/.pyenv/versions/3.7.8/usr/local/ \
-DINSTALL_C_EXAMPLES=OFF \
-DWITH_FFMPEG=ON \
-DBUILD_NEW_PYTHON_SUPPORT=ON \
-DBUILD_opencv_python3=ON \
-DBUILD_opencv_legacy=OFF \
-DINSTALL_PYTHON_EXAMPLES=ON \
-DBUILD_EXAMPLES=ON \
-DPYTHON_EXECUTABLE=~/.pyenv/versions/3.7.8/bin/python \
-DPYTHON_LIBRARY=~/.pyenv/versions/3.7.8/lib/libpython3.7m.a \
-DPYTHON_INCLUDE_DIR=~/.pyenv/versions/3.7.8/include/python3.7m \
-DPYTHON_INCLUDE_DIRS=~/.pyenv/versions/3.7.8/include/python3.7m \
-DPYTHON_INCLUDE_DIRS2=~/.pyenv/versions/3.7.8/include/python3.7m \
-DINCLUDE_DIRS=~/.pyenv/versions/3.7.8/include/python3.7m \
-DINCLUDE_DIRS2=~/.pyenv/versions/3.7.8/include/python3.7m \
-DPYTHON_PACKAGES_PATH=~/.pyenv/versions/3.7.8/lib/python3.7/site-packages \
-DPYTHON_NUMPY_INCLUDE_DIR=~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/numpy/core/include \
-DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.4.0/modules

# The Raspberry Pi has 4 cores, but using all 4 tends to cause swapping and other trouble
make -j 3 && make install

# Create a symlink so that the module can be imported as cv2
ln -s ~/.pyenv/versions/3.7.8/usr/local/lib/python3.7/site-packages/cv2/python-3.7/cv2.cpython-37m-aarch64-linux-gnu.so ~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/
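The interpreter paths passed to cmake above are hard-coded for 3.7.8. As a minimal sketch, similar values could be read from the running interpreter with the standard `sysconfig` module. The function name `opencv_python_flags` is hypothetical, only three of the flags from the command above are covered, and whether the derived paths match the hand-written ones exactly depends on how pyenv built the interpreter.

```python
import sys
import sysconfig


def opencv_python_flags():
    """Return a few cmake -D flags that describe the running interpreter.

    The flag names are taken from the cmake command above; the values
    come from sysconfig instead of being typed by hand.
    """
    paths = sysconfig.get_paths()
    return [
        "-DPYTHON_EXECUTABLE=" + (sys.executable or ""),
        "-DPYTHON_INCLUDE_DIR=" + paths["include"],
        "-DPYTHON_PACKAGES_PATH=" + paths["purelib"],
    ]


if __name__ == "__main__":
    # Print the flags in a form that can be pasted into the cmake call.
    print(" \\\n".join(opencv_python_flags()))
```

Running this with the pyenv interpreter active (after `pyenv shell 3.7.8`) prints values for that interpreter rather than for the system Python 3.8.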

If there is not enough swap (the build fails with some memory-related error), do the following.
Reference: https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-20-04

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
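To confirm that the swap file is actually in use before restarting the build, one option is to read the `SwapTotal` field of `/proc/meminfo`. The following is a small sketch under that assumption; the helper names are hypothetical and the sample text in the usage note is made up for illustration.

```python
def swap_total_kib(meminfo_text):
    """Return the SwapTotal value (in kB) found in /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line looks like "SwapTotal:  2097148 kB".
            return int(line.split()[1])
    return 0


def read_swap_total_kib(path="/proc/meminfo"):
    """Read SwapTotal from the live system (Linux only)."""
    with open(path) as f:
        return swap_total_kib(f.read())
```

For example, `swap_total_kib("MemTotal: 3884360 kB\nSwapTotal: 2097148 kB\n")` returns 2097148, and a result of 0 from `read_swap_total_kib()` would mean that no swap is active.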

Installing the TensorFlow Lite components

Set up the environment by following "Get started with the USB Accelerator".

Install the Edge TPU runtime

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update

# Use libedgetpu1-max if the device can be cooled with a fan or similar, and libedgetpu1-std if you want to play it safe
# sudo apt-get install libedgetpu1-std
sudo apt-get install libedgetpu1-max

Installing tflite_runtime

Python 3.8 is not supported yet, so Python 3.7 is required.

pip3 install https://dl.google.com/coral/python/tflite_runtime-2.1.0.post1-cp37-cp37m-linux_aarch64.whl
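The wheel filename encodes the interpreter it was built for (`cp37-cp37m-linux_aarch64`), which is why pip refuses it under Python 3.8. As a rough sketch, the tag for a given interpreter can be derived as follows. The function name `wheel_tag` is hypothetical, and the code relies on the fact that CPython dropped the `m` ABI suffix starting with 3.8.

```python
import platform
import sys


def wheel_tag(version_info=sys.version_info, machine=None):
    """Build a '<python>-<abi>-<platform>' tag for a CPython on Linux."""
    machine = machine or platform.machine()
    major, minor = version_info[0], version_info[1]
    py = "cp{}{}".format(major, minor)
    # CPython 3.7 and earlier carry the "m" (pymalloc) ABI flag.
    abi = py + ("m" if (major, minor) < (3, 8) else "")
    return "{}-{}-linux_{}".format(py, abi, machine)


if __name__ == "__main__":
    print(wheel_tag())
```

With Python 3.7.8 on the Raspberry Pi this gives `cp37-cp37m-linux_aarch64`, matching the wheel above, while the system Python 3.8 gives `cp38-cp38-linux_aarch64`.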

Demo (sanity check)

git clone https://github.com/google-coral/tflite.git ./google-coral_tflite
cd google-coral_tflite
cd python/examples/classification
bash install_requirements.sh

# Run
python3 classify_image.py \
--model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
--labels models/inat_bird_labels.txt \
--input images/parrot.jpg
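For orientation, the core of such a classification script can be sketched with the `tflite_runtime` interpreter and the Edge TPU delegate. This is not the code of classify_image.py itself. The function names are hypothetical, image loading and resizing are left out, and `labels` is assumed to be a list or dict indexed by class id.

```python
def make_interpreter(model_path, use_edgetpu=True):
    """Create a TFLite interpreter, optionally with the Edge TPU delegate."""
    # Imported lazily so that top_k() below works without tflite_runtime.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = [load_delegate("libedgetpu.so.1")] if use_edgetpu else None
    return Interpreter(model_path=model_path, experimental_delegates=delegates)


def top_k(scores, labels, k=3):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [(labels[i], score) for i, score in ranked[:k]]


def classify(interpreter, image, labels, k=3):
    """Run one inference.

    `image` must already match the input tensor's shape and dtype,
    including the leading batch dimension.
    """
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp["index"], image)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    return top_k(list(scores), labels, k)
```

Passing `use_edgetpu=False` skips the delegate, so the same sketch can be used to compare CPU and Edge TPU execution, provided the model file is one that runs on the CPU.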

USB webcam demo

Try running the tflite sample.

Prepare as follows.

git clone https://github.com/google-coral/examples-camera google-coral_examples-camera
cd google-coral_examples-camera

# Download the models
bash download_models.sh

# Try the opencv sample
cd opencv
# Required packages
sudo apt-get -y install libhdf5-1* libatlas-base-dev
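Since the sample depends on both the self-built cv2 and the tflite_runtime wheel, it may save time to check that they are importable from the active interpreter before running it. A minimal sketch, where the helper name `missing_modules` is hypothetical:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["cv2", "numpy", "tflite_runtime"])
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all modules found")
```

If cv2 is reported as missing, the likely causes are the symlink step or a different pyenv version being active.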

Run (in the end I used X11 forwarding)

python3 detect.py --camera_idx 0

Depending on the camera, it exceeds 20 fps. pic.twitter.com/KcQuI53eKM
— nv-h (@saido_nv) July 23, 2020

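It exceeds 20 fps even while displaying the video, which is fairly fast, but you cannot launch multiple apps for multiple cameras. (The TPU dies.)
CPU usage is fairly low. (Or, given that the Raspberry Pi is supposed to be fast, is this actually high?)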

C++ app (details not verified)

go-tflite, described later, looks promising, so I have put off checking the details here.

Cross-compile on the host
https://github.com/google-coral/examples-camera/tree/master/nativeapp

sudo apt-get install libglib2.0-dev libgstreamer1.0-dev libedgetpu-dev

cd ~/sources
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
# git checkout d855adfc5a0195788bf5f92c3c7352e638aa1109
# https://github.com/tensorflow/tensorflow/issues/36184#issuecomment-579274152
git checkout -b v2.2.0 refs/tags/v2.2.0 # the r2.2 branch would probably also work
./tensorflow/lite/tools/make/download_dependencies.sh
./tensorflow/lite/tools/make/build_aarch64_lib.sh
cd ..
tar -cvzf tensorflow.tar.gz tensorflow
# Transfer tensorflow.tar.gz via sftp or similar

Try building the following sample on the Raspberry Pi.
https://github.com/mattn/webcam-detect-tflite

Download mplus-TESTFLIGHT-*.tar.xz from the following page and extract mplus-1c-thin.ttf.
https://osdn.net/projects/mplus-fonts/releases/62344

sudo apt install libopencv-dev
cd ~/sources

git clone https://github.com/mattn/webcam-detect-tflite
cd webcam-detect-tflite
cp <somewhere>/mplus-1c-thin.ttf . # the font is required

The Makefile targets Windows, so rewrite it as follows.

CXXFLAGS ?= \
	-I$(HOME)/sources/tensorflow \
	-I$(HOME)/sources/tensorflow/tensorflow/lite/tools/make/downloads/flatbuffers/include \
	-I$(HOME)/sources/tensorflow/tensorflow/lite/tools/make/downloads/absl \
	-I/usr/include/freetype2/ \
	-I/usr/include/opencv4

LDFLAGS ?= \
	-L$(HOME)/sources/tensorflow/tensorflow/lite/tools/make/gen/linux_aarch64/lib/

.PHONY: all clean

all: webcam-detector

webcam-detector: main.o
	gcc -O3 -o webcam-detector main.o \
	$(LDFLAGS) \
	-ltensorflow-lite \
	-lstdc++ -lpthread -ldl -lm \
	-lopencv_videoio -lopencv_core -lopencv_highgui -lopencv_imgproc\
	-lfreetype

main.o : main.cxx
	g++ -c --std=c++11 main.cxx -O3 $(CXXFLAGS)

clean:
	rm -f webcam-detector main.o

When run, it appears to work fairly fast.

Go binding for TensorFlow Lite

I want to write things like multithreading easily in Go, so I want to use this.
https://github.com/mattn/go-tflite

First, the tflite C API has to be installed.
Building the tflite C API requires bazel, and there does not seem to be an apt package for arm64, so install the binary instead.

If you only use the CPU (no EdgeTPU), the latest tag (v2.2.0) is fine, but if you also want to use the EdgeTPU, you need to use commit d855adfc5a0195788bf5f92c3c7352e638aa1109.
(@mattn Thank you for the comment!)

sudo apt install g++ unzip zip openjdk-11-jdk

cd ~/Downloads
wget https://github.com/bazelbuild/bazel/releases/download/3.4.1/bazel-3.4.1-linux-arm64
chmod +x bazel-3.4.1-linux-arm64 # the downloaded binary is not executable by default

cd ~/sources
git clone https://github.com/tensorflow/tensorflow
cd tensorflow

# The EdgeTPU build needs this commit (verification is described later)
git checkout d855adfc5a0195788bf5f92c3c7352e638aa1109
# https://github.com/tensorflow/tensorflow/issues/36184#issuecomment-579274152

# If the CPU build is enough, this is also fine
# git checkout -b v2.2.0 refs/tags/v2.2.0 # the r2.2 branch would probably also work

cd ~/sources/tensorflow
pyenv global 3.7.8 # "pyenv shell 3.7.8" does not work here

# Build the C API
~/Downloads/bazel-3.4.1-linux-arm64 build -c opt --config monolithic //tensorflow/lite/c:tensorflowlite_c

Now that the C API is built, build go-tflite.

# golang is required
sudo apt install golang-go

cd ~/sources
git clone https://github.com/mattn/go-tflite
cd go-tflite
# Set the paths to the C API
export CGO_CFLAGS=-I/home/ubuntu/sources/tensorflow
export CGO_LDFLAGS=-L/home/ubuntu/sources/tensorflow/bazel-out/aarch64-opt/bin/tensorflow/lite/c
export LD_LIBRARY_PATH=/home/ubuntu/sources/tensorflow/bazel-out/aarch64-opt/bin/tensorflow/lite/c

Try running label_image_edgetpu.

cd _example/label_image_edgetpu
./fetch_testfiles.sh
go run main.go

The output looks like this

EdgeTPU Version: BuildLabel(COMPILER=6.3.0 20170516,DATE=redacted,TIME=redacted,CL_NUMBER=317268237), RuntimeVersion(13)
01: 1  bicycle: 0.643137
02: 2  car: 0.533333
00: 0  person: 0.250980
03: 3  motorcycle: 0.235294
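The scores printed above look like 8-bit quantized values divided by 255. That is an observation from the numbers, not something stated by go-tflite. A short sketch that checks this, with a hypothetical helper name:

```python
def as_uint8(score, tol=1e-6):
    """Return q if score is (within tol) equal to q / 255, else None."""
    q = round(score * 255)
    return q if abs(q / 255 - score) < tol else None


if __name__ == "__main__":
    # Scores copied from the output above.
    for score in (0.643137, 0.533333, 0.250980, 0.235294):
        print(score, "->", as_uint8(score))
```

All four scores map to whole numbers (164, 136, 64, and 60), which is consistent with a uint8 output tensor being scaled by 1/255 for display.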

The CPU version, label_image, also runs without problems.

cd ../label_image
go run main.go
# -> 85: peacock: 0.976471

Likewise, ssd and ssd_edgetpu also run, but the console output of ssd_edgetpu is noisy.

ubuntu@ubuntu:~/sources/go-tflite/_example/ssd_edgetpu$ go run main.go
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
EdgeTPU Version: BuildLabel(COMPILER=6.3.0 20170516,DATE=redacted,TIME=redacted,CL_NUMBER=317268237), RuntimeVersion(13)
I :319] e-fuse programming revision: 2
2020/07/27 11:45:33 width: 300, height: 300, type: UInt8, scale: 0.0078125, zeropoint: 128
2020/07/27 11:45:33 input tensor count: 1, output tensor count: 4
I :1137] bulk in 1024 bytes from buffer index [0]
I :1137] bulk in 1024 bytes from buffer index [1]
I :1137] bulk in 1024 bytes from buffer index [2]
I :1137] bulk in 1024 bytes from buffer index [3]
I :1137] bulk in 1024 bytes from buffer index [4]
I :1137] bulk in 1024 bytes from buffer index [5]
I :1137] bulk in 1024 bytes from buffer index [6]
I :1137] bulk in 504 bytes from buffer index [7]
I :1137] bulk in 1024 bytes from buffer index [8]
I :1137] bulk in 1024 bytes from buffer index [9]
I :1137] bulk in 1024 bytes from buffer index [10]
I :1137] bulk in 1024 bytes from buffer index [11]
I :1137] bulk in 1024 bytes from buffer index [12]
I :1137] bulk in 1024 bytes from buffer index [13]
I :1137] bulk in 1024 bytes from buffer index [14]
I :1137] bulk in 1024 bytes from buffer index [15]
I :1137] bulk in 1024 bytes from buffer index [16]
I :1137] bulk in 1024 bytes from buffer index [17]
I :1137] bulk in 1024 bytes from buffer index [18]
I :1137] bulk in 1024 bytes from buffer index [19]
I :1137] bulk in 1024 bytes from buffer index [20]
I :1137] bulk in 1024 bytes from buffer index [21]
I :1137] bulk in 1024 bytes from buffer index [22]
I :1137] bulk in 1024 bytes from buffer index [23]
I :1137] bulk in 1024 bytes from buffer index [24]
I :1137] bulk in 1024 bytes from buffer index [25]
I :1137] bulk in 1024 bytes from buffer index [26]
I :1137] bulk in 1024 bytes from buffer index [27]
I :1137] bulk in 1024 bytes from buffer index [28]
I :1137] bulk in 1024 bytes from buffer index [29]
I :1137] bulk in 1024 bytes from buffer index [30]
I :1137] bulk in 1024 bytes from buffer index [31]
I :1137] bulk in 1024 bytes from buffer index [0]
I :1137] bulk in 1024 bytes from buffer index [1]
I :1137] bulk in 1024 bytes from buffer index [2]

...this seems to repeat

The console output of the CPU version is clean, as shown below.

ubuntu@ubuntu:~/sources/go-tflite/_example/ssd$ go run main.go
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
2020/07/27 11:37:35 width: 300, height: 300, type: UInt8, scale: 0.0078125, zeropoint: 128
2020/07/27 11:37:35 input tensor count: 1, output tensor count: 4
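The log line `type: UInt8, scale: 0.0078125, zeropoint: 128` describes how the input tensor is quantized. With the usual TensorFlow Lite convention, real = scale * (q - zero_point), the 8-bit input range maps to roughly -1 to 1. A worked sketch, where the function names are hypothetical and the constants are taken from the log:

```python
SCALE = 0.0078125   # "scale" from the log line (this is 1/128)
ZERO_POINT = 128    # "zeropoint" from the log line


def dequantize(q, scale=SCALE, zero_point=ZERO_POINT):
    """Map a quantized uint8 value to the real value it represents."""
    return scale * (q - zero_point)


def quantize(x, scale=SCALE, zero_point=ZERO_POINT):
    """Map a real value to uint8, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


if __name__ == "__main__":
    print(dequantize(0))    # -1.0
    print(dequantize(128))  # 0.0
    print(dequantize(255))  # 0.9921875
```

So a raw camera pixel value of 0 corresponds to -1.0 and 255 to just under 1.0, which means that 8-bit image data can be fed to this model without an extra normalization step.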

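The CPU version runs at 15 fps and the EdgeTPU version at 20 fps, so there is not much difference. The camera's fps setting does not seem to matter, so I think it is quite likely that the processing is serial and the display is the bottleneck.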
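If capture, inference, and display really do run one after another, the per-frame times add up, so a faster inference stage only helps a little. The sketch below illustrates that arithmetic. The stage timings are hypothetical numbers chosen for illustration, not measurements from this setup.

```python
def serial_fps(stage_seconds):
    """Frame rate when every stage runs one after another for each frame."""
    return 1.0 / sum(stage_seconds)


def pipelined_fps(stage_seconds):
    """Upper bound on the frame rate when stages overlap (e.g. in threads).

    Throughput is then limited by the slowest stage only.
    """
    return 1.0 / max(stage_seconds)


if __name__ == "__main__":
    # Hypothetical per-frame costs in seconds: capture, inference, display.
    stages = [0.01, 0.01, 0.03]
    print("serial:    %.1f fps" % serial_fps(stages))     # about 20 fps
    print("pipelined: %.1f fps" % pipelined_fps(stages))  # about 33 fps
```

Under these assumed numbers, moving the display into its own thread or goroutine would matter more than speeding up inference further, which is one reason to try the multithreaded Go version.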
