[Failed] Building an ultra-fast Debian Buster arm64 (aarch64) emulation environment on x86_64 Ubuntu with QEMU static mode, in anticipation of using the RaspberryPi4 (for building Tensorflow aarch64)

Last updated at 2019-07-06, posted at 2019-07-05

１．Introduction

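In the environment from the previous article, "Building a fast Debian Buster arm64 (aarch64) emulation environment on x86_64 Ubuntu with QEMU static mode, in anticipation of using the RaspberryPi4", Bazel did not run correctly. So this time I rebuild the setup as the environment below. In the end, a QEMU bug prevents the Bazel build from running to completion, but rather than waste the effort I am publishing this as a failure report for future reference. I can only hope the next QEMU release fixes it. For purposes other than building Tensorflow, I think this gives you an up-to-date, very fast, and comfortable build environment.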

Ubuntu 16.04 x86_64 + Docker -> Ubuntu 19.04 x86_64 + chroot -> Debian Buster arm64(aarch64)

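This runs Ubuntu 19.04 on Docker on Ubuntu 16.04, and then runs Debian Buster (arm64/aarch64) under QEMU on that Ubuntu 19.04. The reason for such a convoluted setup is that the QEMU shipped with Ubuntu 18.04 and earlier is very old, and the Bazel build could not even start properly. That is not a problem if you just want to run applications for fun, but it is fatal for a Tensorflow build.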
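
Since the QEMU version is the deciding factor here, it can help to compare versions before investing time in a build. The following is a minimal sketch, not part of the original procedure: `version_ge` is a hypothetical helper, the threshold 3.1.0 is simply the version used in this article (not a documented minimum for Bazel), and the "installed" value is a made-up example of an older release.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# 3.1.0 is the QEMU version used in this article; treating it as a threshold
# is an illustrative assumption.
required="3.1.0"
installed="2.11.1"   # hypothetical example value for an older release

if version_ge "$installed" "$required"; then
    echo "QEMU $installed is new enough"
else
    echo "QEMU $installed is older than $required"
fi
```

In practice the installed version would have to be read from the QEMU binary itself (for example from the output of `qemu-aarch64-static --version`); parsing that output is left out of this sketch.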

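The effort of building this environment paid off: on my main PC, a laptop, I can now run a 12-thread parallel build. It is quite fast. I can't wait to get a RaspberryPi4 that has passed Japan's technical conformity certification (技適).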

２．Environment

Ubuntu 16.04 xenial (x86_64)
Docker
Ubuntu 19.04 disco (x86_64)
QEMU 3.1.0
Debian Buster (arm64/aarch64)
Bazel 0.24.1 (arm64/aarch64)
Tensorflow v1.14.0 (arm64/aarch64)
Python 3.7.3

３．Procedure

Ubuntu_16.04

$ git clone --single-branch https://github.com/tianon/docker-brew-ubuntu-core.git
$ cd docker-brew-ubuntu-core/disco
$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1T_BJB0VURffAyuQOk5rzDjbOlsQIagwf" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1T_BJB0VURffAyuQOk5rzDjbOlsQIagwf" -o ubuntu-disco-core-cloudimg-amd64-root.tar.gz
$ sudo docker build -t ubuntu1904 .
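
The two-step `curl` download above depends on a confirmation code that the first request leaves in the cookie file; the `awk` expression prints the last field of the line containing `_warning_`. A small sketch of just that extraction step, run against a made-up cookie line (the cookie name suffix and the value `AbCd` are invented for illustration, and `extract_confirm_code` is a hypothetical helper name):

```shell
#!/bin/sh
# extract_confirm_code FILE: print the last field of the cookie line whose
# name contains "_warning_" (same awk expression as in the commands above).
extract_confirm_code() {
    awk '/_warning_/ {print $NF}' "$1"
}

# Build a sample cookie jar; every value here is made up for illustration.
sample_cookie="$(mktemp)"
printf '# Netscape HTTP Cookie File\n' > "$sample_cookie"
printf '.drive.google.com\tTRUE\t/uc\tFALSE\t1562400000\tdownload_warning_EXAMPLE\tAbCd\n' >> "$sample_cookie"

CODE="$(extract_confirm_code "$sample_cookie")"
echo "confirm code: ${CODE}"
rm -f "$sample_cookie"
```

If `CODE` comes back empty, the second `curl` call will most likely save an HTML page rather than the archive, so checking it before the download is a cheap safeguard. Whether Google Drive still issues this cookie today is not something the sketch verifies.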

Ubuntu_16.04

$ cd ~
$ mkdir qemu-debian-arm64;cd qemu-debian-arm64
$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1mkxiqwDYYJ3OjBNn7rUh3Y-uTuJw1qL0" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1mkxiqwDYYJ3OjBNn7rUh3Y-uTuJw1qL0" -o 20190628_raspberry-pi-3_buster_PREVIEW.img.xz
$ sudo xz -dv 20190628_raspberry-pi-3_buster_PREVIEW.img.xz
$ sudo losetup -P -f -r 20190628_raspberry-pi-3_buster_PREVIEW.img
$ sudo losetup -nl
/dev/loop1         0      0         1  1 /var/lib/snapd/snaps/shotcut_47.snap
/dev/loop6         0      0         1  1 /var/lib/snapd/snaps/vidcutter_14.snap
/dev/loop4         0      0         1  1 /var/lib/snapd/snaps/core_7169.snap
/dev/loop2         0      0         1  1 /var/lib/snapd/snaps/vott_x1.snap
/dev/loop0         0      0         1  1 /var/lib/snapd/snaps/core_7270.snap
/dev/loop7         0      0         1  1 /home/b920405/qemu-debian-arm64/20190628_raspberry-pi-3_buster_PREVIEW.img
/dev/loop5         0      0         1  1 /var/lib/snapd/snaps/skype_66.snap
/dev/loop3         0      0         1  1 /var/lib/snapd/snaps/shotcut_45.snap

$ sudo fdisk -l /dev/loop7
Disk /dev/loop7: 1.5 GiB, 1572864000 bytes, 3072000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x36969c21

Device       Boot  Start     End Sectors  Size Id Type
/dev/loop7p1        2048   614399  612352  299M  c W95 FAT32 (LBA)
/dev/loop7p2      614400  3071999 2457600  1.2G 83 Linux

$ sudo mount -o ro /dev/loop7p2 /mnt
$ sudo tar cvjf ./debian-buster-arm64.tar.bz2 -C /mnt .

$ sudo umount /mnt
$ sudo losetup -d /dev/loop7

$ mkdir -p debian;cd debian
$ sudo tar xjf ../debian-buster-arm64.tar.bz2

$ sudo docker run --privileged=true -v ${HOME}/qemu-debian-arm64:/qemu-debian-arm64 -it --name="ubuntu1904" ubuntu1904 /bin/bash
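
The `fdisk` output above also gives enough information to locate the root filesystem by byte offset: the Linux partition starts at sector 614400 and sectors are 512 bytes. A minimal sketch of that arithmetic follows (`partition_offset_bytes` is a hypothetical helper; the commented `mount` line is an untested alternative to the `losetup -P` route used in this article):

```shell
#!/bin/sh
# partition_offset_bytes START_SECTOR SECTOR_SIZE: byte offset of a partition
# inside a disk image.
partition_offset_bytes() {
    echo $(( $1 * $2 ))
}

# Values taken from the fdisk output for /dev/loop7p2 above.
offset="$(partition_offset_bytes 614400 512)"
echo "rootfs offset: ${offset} bytes"

# With the offset known, the partition could be mounted directly, e.g.:
#   sudo mount -o ro,loop,offset=${offset} 20190628_raspberry-pi-3_buster_PREVIEW.img /mnt
```

Note that `/dev/loop7` is simply the device that happened to be free on the author's machine; on another host the number will differ, so the `losetup -nl` listing should be checked before running the `fdisk` and `mount` steps.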

Docker_+_Ubuntu_19.04

# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=19.04
DISTRIB_CODENAME=disco
DISTRIB_DESCRIPTION="Ubuntu 19.04"

# apt-get update
# apt-get install -y qemu qemu-user-static \
libhdf5-dev libc-ares-dev libeigen3-dev \
libatlas3-base net-tools build-essential \
zip unzip python3-pip curl wget git nano

# update-binfmts --display | grep aarch64
qemu-aarch64 (enabled):
 interpreter = /usr/bin/qemu-aarch64-static

# cd /qemu-debian-arm64
# cp /usr/bin/qemu-aarch64-static debian/usr/bin/
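
The copy in the last step matters because the registered interpreter path is looked up inside the chroot, so the static QEMU binary needs to exist at the same path under `debian/`. A minimal sketch that pulls the interpreter path out of `update-binfmts --display` output is shown here; `binfmt_interpreter` is a hypothetical helper and the sample text is the two lines quoted above.

```shell
#!/bin/sh
# binfmt_interpreter: read `update-binfmts --display` output on stdin and
# print the interpreter path registered for the qemu-aarch64 entry.
binfmt_interpreter() {
    awk '
        /^qemu-aarch64 / { in_block = 1; next }   # header of the entry we want
        /^[^ ]/          { in_block = 0 }         # header of any other entry
        in_block && $1 == "interpreter" { print $3; exit }
    '
}

# Sample input: the two lines shown in the article.
sample='qemu-aarch64 (enabled):
 interpreter = /usr/bin/qemu-aarch64-static'

interp="$(printf '%s\n' "$sample" | binfmt_interpreter)"
echo "registered interpreter: ${interp}"

# Inside the container one might then check the copy in the chroot tree:
#   [ -x "debian${interp}" ] && echo "interpreter present in chroot"
```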

chroot_+_Debian_Buster

# mount -t sysfs sysfs debian/sys
# mount -t proc proc debian/proc
# mount -t devtmpfs udev debian/dev
# mount -t devpts devpts debian/dev/pts
# chroot debian /bin/bash

# nano /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

# apt-get update
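
The article does not show how to tear the chroot down again. As a rough sketch, assuming the same four mount points as above, the following prints the matching `umount` commands in reverse mount order. It is a dry run (nothing is unmounted), and `print_umount_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# print_umount_commands ROOT: emit umount commands for the four mounts made
# above, deepest mount point first (dev/pts has to go before dev).
print_umount_commands() {
    root="$1"
    for mp in dev/pts dev proc sys; do
        echo "umount ${root}/${mp}"
    done
}

# Dry run for the chroot directory used in this article.
print_umount_commands debian
```

After leaving the chroot with `exit`, the printed commands could be reviewed and then run as root from `/qemu-debian-arm64`.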

chroot_+_Debian_Buster

# apt-get install -y \
libhdf5-dev libc-ares-dev libeigen3-dev \
libatlas3-base net-tools build-essential \
zip unzip python3-pip curl wget git
# pip3 install pip --upgrade
# pip3 install zipper
# pip3 install keras_applications==1.0.7 --no-deps
# pip3 install keras_preprocessing==1.0.9 --no-deps
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/absl_py-0.7.1-cp37-none-any.whl
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/gast-0.2.2-cp37-none-any.whl
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/grpcio-1.21.1-cp37-cp37m-linux_aarch64.whl
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/h5py-2.9.0-cp37-cp37m-linux_aarch64.whl
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/numpy-1.16.4-cp37-cp37m-linux_aarch64.whl
# wget https://github.com/PINTO0309/Tensorflow-bin/raw/master/packages/wrapt-1.11.2-cp37-cp37m-linux_aarch64.whl
# pip3 install *.whl
# apt-get install -y openmpi-bin libopenmpi-dev
# pip3 install -U --user mock zipper wheel

# apt-get update
# apt-get remove -y openjdk-8* --purge
# apt-get install -y openjdk-11-jdk

# cd ~
# mkdir bazel;cd bazel
# wget https://github.com/bazelbuild/bazel/releases/download/0.24.1/bazel-0.24.1-dist.zip
# unzip bazel-0.24.1-dist.zip
# export EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk"

# nano compile.sh

#################################################################################
bazel_build "src:bazel_nojdk${EXE_EXT}" \
  --action_env=PATH \
  --host_platform=@bazel_tools//platforms:host_platform \
  --platforms=@bazel_tools//platforms:target_platform \
  || fail "Could not build Bazel"
#################################################################################
↓
#################################################################################
bazel_build "src:bazel_nojdk${EXE_EXT}" \
  --host_javabase=@local_jdk//:jdk \
  --action_env=PATH \
  --host_platform=@bazel_tools//platforms:host_platform \
  --platforms=@bazel_tools//platforms:target_platform \
  || fail "Could not build Bazel"
#################################################################################

# bash ./compile.sh
# cp output/bazel /usr/local/bin

# bazel version
Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.24.1- (@non-git)
Build target: bazel-out/aarch64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Sun Jun 23 20:46:48 2019 (1561322808)
Build timestamp: 1561322808
Build timestamp as int: 1561322808
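
To confirm in a script that the bootstrapped binary reports the expected version before moving on to `./configure`, the `Build label:` line can be parsed. A minimal sketch, using the label line from the output above as sample input (`bazel_label` is a hypothetical helper):

```shell
#!/bin/sh
# bazel_label: read `bazel version` output on stdin and print the version
# number from the "Build label:" line (a trailing "-" as in "0.24.1-" and
# anything after it is dropped).
bazel_label() {
    sed -n 's/^Build label: \([0-9][0-9.]*\).*/\1/p'
}

# Sample input taken from the output shown in this article.
sample='Build label: 0.24.1- (@non-git)
Build timestamp: 1561322808'

printf '%s\n' "$sample" | bazel_label
```

With a working installation the same function could be fed from `bazel version` directly.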


# cd ~
# git clone -b v1.14.0 https://github.com/tensorflow/tensorflow.git
# cd tensorflow
# git checkout -b v1.14.0

Append the following two lines to the end of the file.

tensorflow/lite/python/interpreter.py

  def set_num_threads(self, i):
    return self._interpreter.SetNumThreads(i)

Add InterpreterWrapper::SetNumThreads(int i).

tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.cc

// Modify the area near the end of the file as follows
PyObject* InterpreterWrapper::ResetVariableTensors() {
  TFLITE_PY_ENSURE_VALID_INTERPRETER();
  TFLITE_PY_CHECK(interpreter_->ResetVariableTensors());
  Py_RETURN_NONE;
}

PyObject* InterpreterWrapper::SetNumThreads(int i) {
  interpreter_->SetNumThreads(i);
  Py_RETURN_NONE;
}

}  // namespace interpreter_wrapper
}  // namespace tflite

Add SetNumThreads(int i).

tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.h

  // should be the interpreter object providing the memory.
  PyObject* tensor(PyObject* base_object, int i);

  PyObject* SetNumThreads(int i);

 private:
  // Helper function to construct an `InterpreterWrapper` object.
  // It only returns InterpreterWrapper if it can construct an `Interpreter`.

Change BUILD_WITH_NNAPI = true to BUILD_WITH_NNAPI = false.

tensorflow/lite/tools/make/Makefile

BUILD_WITH_NNAPI=false
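
Instead of editing the Makefile by hand, the same change could be scripted. This is only a sketch: `disable_nnapi` is a hypothetical helper that works as a stdin-to-stdout filter, and it assumes the setting sits at the start of a line, with or without spaces around `=`.

```shell
#!/bin/sh
# disable_nnapi: rewrite "BUILD_WITH_NNAPI=true" (spaces around "=" allowed)
# to "...false" on stdin; all other lines pass through unchanged.
disable_nnapi() {
    sed 's/^\(BUILD_WITH_NNAPI[[:space:]]*=[[:space:]]*\)true/\1false/'
}

# Demonstration on a single sample line.
printf 'BUILD_WITH_NNAPI=true\n' | disable_nnapi
```

Applied to the real file, the output would be written to a temporary file and compared with the original before replacing it.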

Modify the file as follows.

tensorflow/lite/tools/make/targets/aarch64_makefile.inc

# Settings for generic aarch64 boards such as Odroid C2 or Pine64.
ifeq ($(TARGET),aarch64)
  # The aarch64 architecture covers all 64-bit ARM chips. This arch mandates
  # NEON, so FPU flags are not needed below.
  TARGET_ARCH := armv8-a
  TARGET_TOOLCHAIN_PREFIX := aarch64-linux-gnu-

  CXXFLAGS += \
    -march=armv8-a \
    -funsafe-math-optimizations \
    -ftree-vectorize \
    -flax-vector-conversions \
    -fomit-frame-pointer \
    -fPIC

  CFLAGS += \
    -march=armv8-a \
    -funsafe-math-optimizations \
    -ftree-vectorize \
    -flax-vector-conversions \
    -fomit-frame-pointer \
    -fPIC

  LDFLAGS := \
    -Wl,--no-export-dynamic \
    -Wl,--exclude-libs,ALL \
    -Wl,--gc-sections \
    -Wl,--as-needed

  LIBS := \
    -lstdc++ \
    -lpthread \
    -lm \
    -ldl \
    -lrt

endif

Add the four lines marked with + below, omitting the + marks themselves.

tensorflow/lite/build_def.bzl

            "/DTF_COMPILE_LIBRARY",
            "/wd4018",  # -Wno-sign-compare
        ],
+       str(Label("//tensorflow:linux_aarch64")): [
+           "-flax-vector-conversions",
+           "-fomit-frame-pointer",
+       ],
        "//conditions:default": [
            "-Wno-sign-compare",
        ],
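
When copying the marked lines out of this page, the leading `+` has to go while the indentation stays intact. One possible way, sketched here with a hypothetical `strip_plus_markers` filter, is to replace each leading `+` with a space so the added lines end up aligned with their neighbours:

```shell
#!/bin/sh
# strip_plus_markers: replace a leading "+" with a space on each stdin line,
# so the indentation of the added lines matches the surrounding entries.
strip_plus_markers() {
    sed 's/^+/ /'
}

# The four added lines from the snippet above, with their "+" markers.
printf '%s\n' \
    '+       str(Label("//tensorflow:linux_aarch64")): [' \
    '+           "-flax-vector-conversions",' \
    '+           "-fomit-frame-pointer",' \
    '+       ],' \
    | strip_plus_markers
```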

Configuration_of_Tensorflow

# ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python3]: 


Found possible Python library paths:
  /usr/local/lib/python3.7/dist-packages
  /usr/lib/python3/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.7/dist-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
	--config=mkl         	# Build with MKL support.
	--config=monolithic  	# Config for mostly static monolithic build.
	--config=gdr         	# Build with GDR support.
	--config=verbs       	# Build with libverbs support.
	--config=ngraph      	# Build with Intel nGraph support.
	--config=numa        	# Build with NUMA support.
	--config=dynamic_kernels	# (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
	--config=noaws       	# Disable AWS S3 filesystem support.
	--config=nogcp       	# Disable GCP support.
	--config=nohdfs      	# Disable HDFS support.
	--config=noignite    	# Disable Apache Ignite support.
	--config=nokafka     	# Disable Apache Kafka support.
	--config=nonccl      	# Disable NVIDIA NCCL support.
Configuration finished

Tensorflow_v1.14.0_build_by_Bazel_0.24.1

# bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--local_resources=8192.0,4.0,1.0 \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-flax-vector-conversions \
--copt=-fomit-frame-pointer \
//tensorflow/tools/pip_package:build_pip_package
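
The `--local_resources` value in the command above is a triple of available RAM in MB, CPU cores, and relative I/O capability. A minimal sketch for assembling that flag from two numbers follows; `local_resources_flag` is a hypothetical helper, and the I/O value is fixed at 1.0 as in the article.

```shell
#!/bin/sh
# local_resources_flag MEM_MB CPUS: format the --local_resources value as
# "RAM in MB, CPU cores, I/O capability" (I/O fixed at 1.0, as in the article).
local_resources_flag() {
    printf '%s\n' "--local_resources=${1}.0,${2}.0,1.0"
}

# 8192 MB and 4 cores reproduce the value used in the build command above.
local_resources_flag 8192 4

# On a real machine the inputs could come from the system, for example half
# of the total memory (an arbitrary assumption) and all cores:
#   local_resources_flag "$(awk '/MemTotal/ {print int($2/1024/2)}' /proc/meminfo)" "$(nproc)"
```

Under emulation it may be worth starting with conservative values and raising them only if the build stays stable.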

４．Finally

It fails at the Bazel build. All I can do is hope for a future QEMU release, but this procedure looks reusable for purposes beyond arm64 (aarch64) emulation as well. It may be ideal for build work other than Tensorflow.
