Ultra-fast build of Tensorflow with Bazel Remote Caching [Google Cloud Storage version]

Posted on 2019-08-04

1.Introduction

This time I tried the simplest build procedure that uses Google Cloud Storage as the cache backend for Bazel's Remote Caching feature. A Tensorflow build that normally takes about 1 hour is shortened to about 2 minutes 20 seconds, roughly 24 times faster than the standard build.

Built outputs and source files are hashed and cached in the storage bucket. On the second and later builds, the hashes already in the cache are compared with the hashes computed from the current files. Where they match, the build step is skipped and the cached output is reused, so only files whose hashes differ are compiled. Because the cache is empty on the first build and extra time is spent populating it, the first build is slightly slower than a normal build.

In my tests, when the Tensorflow version was upgraded, the hashes that the cache is keyed on appeared to be completely different, so the cache was of little use. In other words, the cache does not seem to carry over across a version upgrade of the repository. It is very effective, however, when you rebuild repeatedly while adjusting build parameters by trial and error, or after a one-off bug fix to a single source file. It also looks suitable for sharing pre-built binaries among multiple developers to speed up debugging a little.

Next time, I would like to try building a completely free local cache environment with nginx.
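The hash comparison described above can be illustrated with a toy script. This is only a sketch of the idea, not Bazel's real implementation: `cached_build`, `CACHE_DIR`, and the `tr` call that stands in for a slow compiler are all hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Toy illustration of hash-keyed build caching (NOT how Bazel is implemented).

CACHE_DIR="$(mktemp -d)"   # hypothetical local stand-in for the remote bucket

# Build one "source file": reuse the cached output when its content hash is
# already known, otherwise run the simulated compile step and store the result.
cached_build() {
  local src="$1" out="$2"
  local key
  key="$(sha256sum "$src" | cut -d' ' -f1)"
  if [ -f "$CACHE_DIR/$key" ]; then
    cp "$CACHE_DIR/$key" "$out"
    echo "hit"
  else
    tr 'a-z' 'A-Z' < "$src" > "$out"   # stand-in for a slow compiler
    cp "$out" "$CACHE_DIR/$key"
    echo "miss"
  fi
}
```

Calling `cached_build` twice on an unchanged file reports a miss and then a hit. Editing the file changes its hash, so the next call is a miss again, which mirrors why only modified files are recompiled.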

2.Environment

  • Ubuntu 16.04 x86_64
  • Tensorflow v1.13.2 + Bazel 0.20.0
  • Tensorflow v1.14.0 + Bazel 0.24.1
  • Google Cloud Storage

[Figure: bazel-caching diagram]

3.Prepare Google Cloud Storage for Caching

You need a Google Cloud account with billing enabled.

3-1.Create storage bucket

https://docs.bazel.build/versions/master/remote-caching.html#google-cloud-storage

[Screenshots: creating the storage bucket in the Google Cloud console, ending at the bucket details page]

3-2.Create service account

https://cloud.google.com/iam/docs/creating-managing-service-accounts#creating_a_service_account

[Screenshots: creating the service account under IAM & Admin in the Google Cloud console]

A JSON key file with a name such as xxxxx-xxxxxxxxxxxx.json is downloaded automatically.

[Screenshots: service account creation screen and the service account list in the Google Cloud console]
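For readers who prefer the command line to the console, sections 3-1 and 3-2 could be sketched roughly as follows. This is not what the article did (the article used the web console). The project ID, the service account name, and the choice of the Storage Object Admin role are assumptions made for illustration only. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# CLI sketch of sections 3-1 and 3-2. Prints the commands unless DRY_RUN=0.

PROJECT_ID="my-first-project"        # hypothetical project ID
BUCKET="bucket-bazel-tensorflow"     # bucket name used by the build commands in this article
SA_NAME="bazel-remote-cache"         # hypothetical service account name
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
KEY_FILE="${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_cache_bucket() {
  # 3-1: create the bucket.
  run gsutil mb -p "$PROJECT_ID" "gs://${BUCKET}"
  # 3-2: create the service account, let it read and write objects
  # (assumed role: Storage Object Admin), and download a JSON key.
  run gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID"
  run gsutil iam ch "serviceAccount:${SA_EMAIL}:objectAdmin" "gs://${BUCKET}"
  run gcloud iam service-accounts keys create "$KEY_FILE" --iam-account "$SA_EMAIL"
}

setup_cache_bucket
```

Reviewing the printed commands first and then rerunning with `DRY_RUN=0` keeps the billing-relevant steps explicit.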

3-3.Preparation for connection to Remote cache (Google Cloud Storage)

$ cd ~
$ mkdir -p bazel-caching
$ mv ~/Downloads/xxxxx-xxxxxxxxxxxx.json ${HOME}/bazel-caching

4.(First time) Building Tensorflow v1.13.2

$ sudo apt-get install -y libhdf5-dev libc-ares-dev libeigen3-dev
$ sudo pip3 install keras_applications==1.0.7 --no-deps
$ sudo pip3 install keras_preprocessing==1.0.9 --no-deps
$ sudo pip3 install h5py==2.9.0
$ sudo apt-get install -y openmpi-bin libopenmpi-dev
$ sudo -H pip3 install -U --user six numpy wheel mock
$ sudo apt update && sudo apt upgrade

$ cd ~
$ git clone https://github.com/PINTO0309/Bazel_bin.git
$ Bazel_bin/0.20.0/Ubuntu1604_x86_64/install.sh

$ cd ~
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git branch -a
* master
  remotes/origin/0.6.0
  remotes/origin/HEAD -> origin/master
  remotes/origin/bananabowl-patch-1
  remotes/origin/ewilderj-patch-1
  remotes/origin/jvishnuvardhan-patch-1
  remotes/origin/jvishnuvardhan-patch-9
  remotes/origin/master
  remotes/origin/patch-cherry-pick-tf-data
  remotes/origin/r0.10
  remotes/origin/r0.11
  remotes/origin/r0.12
  remotes/origin/r0.7
  remotes/origin/r0.8
  remotes/origin/r0.9
  remotes/origin/r1.0
  remotes/origin/r1.1
  remotes/origin/r1.10
  remotes/origin/r1.11
  remotes/origin/r1.12
  remotes/origin/r1.13
  remotes/origin/r1.14
  remotes/origin/r1.2
  remotes/origin/r1.3
  remotes/origin/r1.4
  remotes/origin/r1.5
  remotes/origin/r1.6
  remotes/origin/r1.7
  remotes/origin/r1.8
  remotes/origin/r1.9
  remotes/origin/r2.0
  remotes/origin/release_1.14.0
  remotes/origin/rthadur-patch-1

$ git checkout -b r1.13 origin/r1.13
$ sudo bazel clean

$ ./configure

WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 202dac03-9d5d-4544-ba79-90001b7b2ca9
You have bazel 0.20.0- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3


Found possible Python library paths:
  /opt/intel/openvino_2019.2.242/python/python3
  /opt/intel/openvino_2019.2.242/python/python3.5
  /home/b920405/git/caffe-jacinto/python
  /opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
  .
  /opt/movidius/caffe/python
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.5/dist-packages
  /usr/local/lib
Please input the desired Python library path to use.  Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=gdr            # Build with GDR support.
    --config=verbs          # Build with libverbs support.
    --config=ngraph         # Build with Intel nGraph support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws          # Disable AWS S3 filesystem support.
    --config=nogcp          # Disable GCP support.
    --config=nohdfs         # Disable HDFS support.
    --config=noignite       # Disable Apacha Ignite support.
    --config=nokafka        # Disable Apache Kafka support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished

$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package
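The two cache-related flags in the command above could also be kept in a bazelrc file so they do not have to be retyped for every build. The helper below is a minimal sketch. `write_cache_rc` is a hypothetical name, and the choice of the user-level `~/.bazelrc` is an assumption (this article runs bazel under sudo, so which rc file is read depends on the HOME that sudo passes through). As far as I know bazelrc files do not expand `${HOME}`, so the shell expands it when the file is written. Newer Bazel releases use `--remote_cache` in place of `--remote_http_cache`, so check the flag name for your version.

```shell
#!/usr/bin/env bash
# Append the remote-cache flags used in this article to a bazelrc file.
write_cache_rc() {
  local rc_file="$1"
  # Unquoted EOF: ${HOME} is expanded now, so the file contains an absolute path.
  cat >> "$rc_file" <<EOF
build --remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow
build --google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json
EOF
}

# Example (assumption: the user-level rc file is the one Bazel picks up):
# write_cache_rc "${HOME}/.bazelrc"
```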

[Screenshot: build result, 2019-08-04 00:15:46]
Action result metadata is stored under the path /ac/
Output files are stored under the path /cas/
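To make the /cas/ layout concrete: in Bazel's HTTP caching protocol, output files are addressed by the SHA-256 of their content. The sketch below computes the URL under which a given local file would be stored. `cas_url` is a hypothetical helper written for this example, not a Bazel or gsutil command.

```shell
#!/usr/bin/env bash
# Print the content-addressed URL for a file: <base>/cas/<sha256 of the content>.
cas_url() {
  local base="$1" file="$2"
  local digest
  digest="$(sha256sum "$file" | cut -d' ' -f1)"
  # ${base%/} drops one trailing slash so the result never contains "//cas".
  printf '%s/cas/%s\n' "${base%/}" "$digest"
}

# Example:
# cas_url https://storage.googleapis.com/bucket-bazel-tensorflow path/to/some/output/file
```

Entries under /ac/ are keyed differently (by a hash of the action, not of an output file), so this helper applies only to /cas/.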

[Screenshots: bucket details page in the Google Cloud console after the build]

5.(Second time) Building Tensorflow v1.14.0

$ cd ~
$ Bazel_bin/0.24.1/Ubuntu1604_x86_64/install.sh

$ cd ~
$ cd tensorflow
$ git checkout -b r1.14 origin/r1.14
$ sudo bazel clean

$ ./configure

WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3


Found possible Python library paths:
  /opt/intel/openvino_2019.2.242/python/python3
  /opt/intel/openvino_2019.2.242/python/python3.5
  /home/b920405/git/caffe-jacinto/python
  /opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
  .
  /opt/movidius/caffe/python
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.5/dist-packages
  /usr/local/lib
Please input the desired Python library path to use.  Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=gdr            # Build with GDR support.
    --config=verbs          # Build with libverbs support.
    --config=ngraph         # Build with Intel nGraph support.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws          # Disable AWS S3 filesystem support.
    --config=nogcp          # Disable GCP support.
    --config=nohdfs         # Disable HDFS support.
    --config=noignite       # Disable Apache Ignite support.
    --config=nokafka        # Disable Apache Kafka support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished

$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package

Caching does not seem to work well when you switch the version or branch of the source tree being built.
[Screenshot: build result, 2019-08-04 14:20:18]
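One plausible explanation for this result, offered as an assumption and not something measured in this article: a cache entry is looked up by a key derived from everything that goes into a build step, not only from the source file. This build changed both the Tensorflow branch and the Bazel version, so most keys would differ. The toy model below shows the effect. `action_key` and its three inputs are hypothetical, and Bazel's real key computation is more involved.

```shell
#!/usr/bin/env bash
# Simplified model of a cache key: hash of (toolchain id, command line, input digest).
action_key() {
  local toolchain="$1" cmdline="$2" input="$3"
  local input_digest
  input_digest="$(sha256sum "$input" | cut -d' ' -f1)"
  printf '%s\n%s\n%s\n' "$toolchain" "$cmdline" "$input_digest" \
    | sha256sum | cut -d' ' -f1
}

# Example: identical arguments give the same key; changing any one of the
# three arguments gives a different key, i.e. a cache miss.
# action_key "toolchain-A" "-O2 -c" some_file.cc
```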

6.(Third time) Building Tensorflow v1.14.0 (modify the source and rebuild after clearing Bazel's local build outputs)

tensorflow/lite/python/interpreter.py

# Append the following two lines to the end of the file
  def set_num_threads(self, i):
    return self._interpreter.SetNumThreads(i)

tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.cc

// Edit the end of the file so that it reads as follows
PyObject* InterpreterWrapper::ResetVariableTensors() {
  TFLITE_PY_ENSURE_VALID_INTERPRETER();
  TFLITE_PY_CHECK(interpreter_->ResetVariableTensors());
  Py_RETURN_NONE;
}

PyObject* InterpreterWrapper::SetNumThreads(int i) {
  interpreter_->SetNumThreads(i);
  Py_RETURN_NONE;
}

}  // namespace interpreter_wrapper
}  // namespace tflite

tensorflow/lite/python/interpreter_wrapper/interpreter_wrapper.h

  // should be the interpreter object providing the memory.
  PyObject* tensor(PyObject* base_object, int i);

  PyObject* SetNumThreads(int i);

 private:
  // Helper function to construct an `InterpreterWrapper` object.
  // It only returns InterpreterWrapper if it can construct an `Interpreter`.

tensorflow/lite/tools/make/Makefile

BUILD_WITH_NNAPI=false

tensorflow/contrib/__init__.py

from tensorflow.contrib import checkpoint
#if os.name != "nt" and platform.machine() != "s390x":
#  from tensorflow.contrib import cloud
from tensorflow.contrib import cluster_resolver

tensorflow/contrib/__init__.py

from tensorflow.contrib.summary import summary

if os.name != "nt" and platform.machine() != "s390x":
  try:
    from tensorflow.contrib import cloud
  except ImportError:
    pass

from tensorflow.python.util.lazy_loader import LazyLoader
ffmpeg = LazyLoader("ffmpeg", globals(),
                    "tensorflow.contrib.ffmpeg")

$ sudo bazel clean

$ ./configure

WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3


Found possible Python library paths:
  /opt/intel/openvino_2019.2.242/python/python3
  /opt/intel/openvino_2019.2.242/python/python3.5
  /home/b920405/git/caffe-jacinto/python
  /opt/intel/openvino_2019.2.242/deployment_tools/model_optimizer
  .
  /opt/movidius/caffe/python
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.5/dist-packages
  /usr/local/lib
Please input the desired Python library path to use.  Default is [/opt/intel/openvino_2019.2.242/python/python3]
/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=gdr            # Build with GDR support.
    --config=verbs          # Build with libverbs support.
    --config=ngraph         # Build with Intel nGraph support.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws          # Disable AWS S3 filesystem support.
    --config=nogcp          # Disable GCP support.
    --config=nohdfs         # Disable HDFS support.
    --config=noignite       # Disable Apache Ignite support.
    --config=nokafka        # Disable Apache Kafka support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished

$ sudo bazel build \
--config=opt \
--config=noaws \
--config=nogcp \
--config=nohdfs \
--config=noignite \
--config=nokafka \
--config=nonccl \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--remote_http_cache=https://storage.googleapis.com/bucket-bazel-tensorflow \
--google_credentials=${HOME}/bazel-caching/xxxxx-xxxxxxxxxxxx.json \
//tensorflow/tools/pip_package:build_pip_package

Caching seems to work well when you only fix a bug or change the build procedure without switching the version or branch of the source tree being built.
[Screenshot: build result, 2019-08-04 14:37:03]

7.Appendix

Delete the cached objects from the bucket with gsutil:
$ gsutil -m rm -r gs://bucket-bazel-tensorflow/ac
$ gsutil -m rm -r gs://bucket-bazel-tensorflow/cas

8.Reference articles

https://docs.bazel.build/versions/master/remote-caching.html
