1.Introduction
This time I tried what is probably the easiest build procedure, using Google Cloud Storage as the caching backend. A TensorFlow build that normally takes about an hour can be shortened to roughly 2 minutes 20 seconds. The built binaries and source files are hashed and cached on the storage bucket; from the second build onward, the cached hashes are compared with hashes computed from the current files, steps whose hashes match are skipped, and only the files whose hashes differ are compiled. The very first build is therefore slightly slower than a normal build, because the cache is still empty and extra time is spent on the caching itself. In my tests, when TensorFlow was upgraded to a new version the hashes that the cache is keyed on were completely different, and the cache was not effective. On the other hand, it delivers a very strong speed-up when you rebuild after repeatedly changing build parameters through trial and error, or after a one-off bug fix to a single source file. It seems well suited to sharing pre-built binaries among multiple developers to make debugging a little more efficient. Next time, I plan to try building a completely free local cache environment using nginx.
Build TensorFlow super fast using Bazel's remote caching feature. It is about 24 times faster than the standard build method. However, as the results in this article show, the cache cannot absorb the differences introduced by a version upgrade of the repository. Remote caching provides a shared, incremental compilation environment in which pre-built binaries can be reused among multiple developers. Next time, I would like to try a local cache setup using nginx.
I recorded a video of a full build of TensorFlow v1.14.0 with Bazel's remote caching enabled. It is so fast it makes me laugh. https://t.co/fACRf2WfKu
— PINTO0309 (@PINTO03091) August 4, 2019
2.Environment
- Ubuntu 16.04 x86_64
- Tensorflow v1.13.2 + Bazel 0.20.0
- Tensorflow v1.14.0 + Bazel 0.24.1
- Google Cloud Storage
3.Prepare Google Cloud Storage for Caching
You need a Google Cloud account with billing enabled.
3−1.Create storage bucket
https://docs.bazel.build/versions/master/remote-caching.html#google-cloud-storage
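If you prefer the command line to the Cloud Console, the bucket can also be created with gsutil. This is just a sketch: the bucket name matches the one used in the build commands later in this article, and the region is an arbitrary example, so adjust both to your environment.
# Create the cache bucket (region chosen as an example)
$ gsutil mb -l us-central1 gs://bucket-bazel-tensorflow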
3−2.Create service account
https://cloud.google.com/iam/docs/creating-managing-service-accounts#creating_a_service_account
A JSON key file with a name such as xxxxx-xxxxxxxxxxxx.json will be downloaded automatically.
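The same steps can also be done from the command line. The sketch below is an assumption-heavy illustration: the service account name bazel-cache and the <your-project-id> placeholder are hypothetical, and the key file name simply reuses the placeholder above. The essential point is that the account needs read/write access to the cache bucket (for example roles/storage.objectAdmin).
# Create a service account, download a JSON key for it, and grant it access to the bucket
$ gcloud iam service-accounts create bazel-cache --display-name "bazel-cache"
$ gcloud iam service-accounts keys create ~/Downloads/xxxxx-xxxxxxxxxxxx.json \
    --iam-account bazel-cache@<your-project-id>.iam.gserviceaccount.com
$ gsutil iam ch \
    serviceAccount:bazel-cache@<your-project-id>.iam.gserviceaccount.com:roles/storage.objectAdmin \
    gs://bucket-bazel-tensorflow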
3−3.Preparation for connection to Remote cache (Google Cloud Storage)
$ cd ~
$ mkdir bazel-caching
$ mv ~/Downloads/xxxxx-xxxxxxxxxxxx.json ${HOME}/bazel-caching
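As an optional convenience, the remote cache settings can be written to ~/.bazelrc once instead of being passed on every bazel command line. This is only a sketch that reuses the bucket name and key file from the build commands in the following sections; since the builds below are run with sudo, make sure the rc file is visible to the user that actually runs bazel.
# Append the remote cache options to the user's bazelrc (${HOME} expands when the file is written)
$ cat >> ~/.bazelrc <<EOF
build --remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow
build --google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json
EOF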
4.(First time) Building Tensorflow v1.13.2
$ sudo apt-get install -y libhdf5-dev libc-ares-dev libeigen3-dev
$ sudo pip3 install keras_applications==1.0.7 --no-deps
$ sudo pip3 install keras_preprocessing==1.0.9 --no-deps
$ sudo pip3 install h5py==2.9.0
$ sudo apt-get install -y openmpi-bin libopenmpi-dev
$ sudo -H pip3 install -U --user six numpy wheel mock
$ sudo apt update;sudo apt upgrade
$ cd ~
$ git clone https://github.com/PINTO0309/Bazel_bin.git
$ Bazel_bin/0.20.0/Ubuntu1604_x86_64/install.sh
$ cd ~
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git branch -a
* master
remotes/origin/0.6.0
remotes/origin/HEAD -> origin/master
remotes/origin/bananabowl-patch-1
remotes/origin/ewilderj-patch-1
remotes/origin/jvishnuvardhan-patch-1
remotes/origin/jvishnuvardhan-patch-9
remotes/origin/master
remotes/origin/patch-cherry-pick-tf-data
remotes/origin/r0.10
remotes/origin/r0.11
remotes/origin/r0.12
remotes/origin/r0.7
remotes/origin/r0.8
remotes/origin/r0.9
remotes/origin/r1.0
remotes/origin/r1.1
remotes/origin/r1.10
remotes/origin/r1.11
remotes/origin/r1.12
remotes/origin/r1.13
remotes/origin/r1.14
remotes/origin/r1.2
remotes/origin/r1.3
remotes/origin/r1.4
remotes/origin/r1.5
remotes/origin/r1.6
remotes/origin/r1.7
remotes/origin/r1.8
remotes/origin/r1.9
remotes/origin/r2.0
remotes/origin/release_1.14.0
remotes/origin/rthadur-patch-1
$ git checkout -b r1.13 origin/r1.13
$ sudo bazel clean
$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 202dac03-9d5d-4544-ba79-90001b7b2ca9
You have bazel 0.20.0- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
/opt/intel/openvino_2019.2.242/python/python3
/opt/intel/openvino_2019.2.242/python/python3.5
/home/b920405/git/caffe-jacinto/python
/opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
.
/opt/movidius/caffe/python
/usr/lib/python3/dist-packages
/usr/local/lib/python3.5/dist-packages
/usr/local/lib
Please input the desired Python library path to use. Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.
Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apacha Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished
$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package
In the cache bucket, action result metadata is stored under the /ac/ path and output files are stored under the /cas/ path.
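After the first build completes, you can confirm that the cache bucket has actually been populated under those two paths, for example:
# List the top-level cache paths and a sample of the cached output objects
$ gsutil ls gs://bucket-bazel-tensorflow/
$ gsutil ls gs://bucket-bazel-tensorflow/cas/ | head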
5.(Second time) Building Tensorflow v1.14.0
$ cd ~
$ Bazel_bin/0.24.1/Ubuntu1604_x86_64/install.sh
$ cd ~
$ cd tensorflow
$ git checkout -b r1.14 origin/r1.14
$ sudo bazel clean
$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
/opt/intel/openvino_2019.2.242/python/python3
/opt/intel/openvino_2019.2.242/python/python3.5
/home/b920405/git/caffe-jacinto/python
/opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
.
/opt/movidius/caffe/python
/usr/lib/python3/dist-packages
/usr/local/lib/python3.5/dist-packages
/usr/local/lib
Please input the desired Python library path to use. Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.
Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=numa # Build with NUMA support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apache Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished
$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package
Caching does not seem to work well if you switch the version or branch of the project being built.
6.(Third time) Building Tensorflow v1.14.0 (fix the source and rebuild after removing Bazel's previously built binaries)
# Add the following two lines at the end of the file
# (the TensorFlow Lite Python Interpreter class, presumably tensorflow/lite/python/interpreter.py)
  def set_num_threads(self, i):
    return self._interpreter.SetNumThreads(i)
// Modify the area near the end of the file as follows
// (presumably tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.cc):
// the existing ResetVariableTensors() stays as-is and SetNumThreads() is added after it.
PyObject* InterpreterWrapper::ResetVariableTensors() {
  TFLITE_PY_ENSURE_VALID_INTERPRETER();
  TFLITE_PY_CHECK(interpreter_->ResetVariableTensors());
  Py_RETURN_NONE;
}

PyObject* InterpreterWrapper::SetNumThreads(int i) {
  interpreter_->SetNumThreads(i);
  Py_RETURN_NONE;
}

}  // namespace interpreter_wrapper
}  // namespace tflite
// In the corresponding header (presumably tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.h),
// declare SetNumThreads() after the existing tensor() declaration, before the private section:
  // should be the interpreter object providing the memory.
  PyObject* tensor(PyObject* base_object, int i);

  PyObject* SetNumThreads(int i);

 private:
  // Helper function to construct an `InterpreterWrapper` object.
  // It only returns InterpreterWrapper if it can construct an `Interpreter`.
# Disable NNAPI by changing BUILD_WITH_NNAPI to false (presumably in tensorflow/lite/tools/make/Makefile)
BUILD_WITH_NNAPI=false
# Presumably tensorflow/contrib/__init__.py: the cloud import near the top is commented out ...
from tensorflow.contrib import checkpoint
# if os.name != "nt" and platform.machine() != "s390x":
#   from tensorflow.contrib import cloud
from tensorflow.contrib import cluster_resolver

# ... and a guarded import is added near the bottom of the file instead.
from tensorflow.contrib.summary import summary

if os.name != "nt" and platform.machine() != "s390x":
  try:
    from tensorflow.contrib import cloud
  except ImportError:
    pass

from tensorflow.python.util.lazy_loader import LazyLoader
ffmpeg = LazyLoader("ffmpeg", globals(),
                    "tensorflow.contrib.ffmpeg")
$ sudo bazel clean
$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
/opt/intel/openvino_2019.2.242/python/python3
/opt/intel/openvino_2019.2.242/python/python3.5
/home/b920405/git/caffe-jacinto/python
/opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
.
/opt/movidius/caffe/python
/usr/lib/python3/dist-packages
/usr/local/lib/python3.5/dist-packages
/usr/local/lib
Please input the desired Python library path to use. Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.
Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=numa # Build with NUMA support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apache Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished
$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package
If you only apply a bug fix or adjust the build procedure, without switching the version or branch of the project being built, caching works very well.
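Once a build finishes, the usual next step (not shown in this article) is to run the generated build_pip_package script and install the resulting wheel; the output directory /tmp/tensorflow_pkg below is an arbitrary choice.
# Package the wheel and install it
$ sudo bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl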
7.Appendix
To clear the remote cache, delete the /ac/ and /cas/ paths from the bucket:
$ gsutil -m rm -r gs://bucket-bazel-tensorflow/ac
$ gsutil -m rm -r gs://bucket-bazel-tensorflow/cas
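The cached objects are billed as ordinary Cloud Storage usage, so it can be worth checking how much space the cache occupies before deciding whether to clear it, for example:
# Show the total size of the cache bucket
$ gsutil du -s -h gs://bucket-bazel-tensorflow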
8.Reference articles
https://docs.bazel.build/versions/master/remote-caching.html