Build a full TensorFlow v2.1.0 (v1 API) installer wheel with TensorRT 7 enabled [Docker version]

This procedure builds everything yourself, without using NGC containers.

1. Environment

  • Ubuntu 18.04 x86_64 RAM:16GB
  • GeForce GTX 1070
  • NVIDIA Driver 440.59
  • CUDA 10 (V10.0.130)
  • cuDNN
  • Docker 19.03.6, build 369ce74a3c
  • TensorFlow v2.1.0
  • TensorRT 7
  • TF-TRT
  • Bazel 0.29.1
  • Python 3.6

2. Procedure

$ cd ~
$ mkdir -p work/tensorrt && cd work/tensorrt

Download the TensorRT- archive and copy it to the work/tensorrt directory.

Create a Dockerfile with the following contents.

$ nano Dockerfile

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04

RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    protobuf-compiler python-pil python-lxml python-tk cython \
    autoconf automake libtool curl make g++ unzip wget git nano \
    libgflags-dev libgoogle-glog-dev liblmdb-dev libleveldb-dev \
    libhdf5-serial-dev libhdf5-dev python3-opencv python-opencv \
    python3-dev python3-numpy python3-skimage gfortran libturbojpeg \
    python-dev python-numpy python-skimage python3-pip python-pip \
    libboost-all-dev libopenblas-dev libsnappy-dev software-properties-common \
    libfreetype6-dev pkg-config \
    libpng-dev libhdf5-100 libhdf5-cpp-100 libc-ares-dev libblas-dev \
    libeigen3-dev libatlas-base-dev openjdk-8-jdk libopenblas-base \
    openmpi-bin libopenmpi-dev gcc libgfortran5 libatlas3-base liblapack-dev

RUN pip3 install pip --upgrade && \
    pip3 install Cython && \
    pip3 install contextlib2 && \
    pip3 install pillow && \
    pip3 install lxml && \
    pip3 install jupyter && \
    pip3 install matplotlib && \
    pip3 install keras_applications==1.0.8 --no-deps && \
    pip3 install keras_preprocessing==1.1.0 --no-deps && \
    pip3 install h5py==2.9.0 && \
    pip3 install -U --user six numpy wheel mock && \
    pip3 install pybind11 && \
    pip2 install Cython && \
    pip2 install contextlib2 && \
    pip2 install pillow && \
    pip2 install lxml && \
    pip2 install jupyter && \
    pip2 install matplotlib

# Create working directory
RUN mkdir -p /tensorrt && \
    cd /tensorrt
ARG work_dir=/tensorrt
WORKDIR ${work_dir}

# Clone Tensorflow v2.1.0, TF-TRT and install Bazel 
RUN git clone -b v2.1.0 --depth 1 https://github.com/tensorflow/tensorflow.git && \
    git clone --recursive https://github.com/NobuoTsukamoto/tf_trt_models.git && \
    wget https://github.com/bazelbuild/bazel/releases/download/0.29.1/bazel-0.29.1-installer-linux-x86_64.sh && \
    chmod +x bazel-0.29.1-installer-linux-x86_64.sh && \
    bash ./bazel-0.29.1-installer-linux-x86_64.sh

# Install TensorRT-7
COPY TensorRT- ${work_dir}
RUN tar -xvzf TensorRT- && \
    rm TensorRT-

# Setting environment variables
ENV TRT_RELEASE=${work_dir}/TensorRT-
ENV PATH=/usr/local/cuda-10.0/bin:$TRT_RELEASE:$TRT_RELEASE/bin:$PATH \
    LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$TRT_RELEASE/lib:$LD_LIBRARY_PATH \
    TF_CUDA_VERSION=10.0

Build the Docker image and confirm that it was created.

$ docker build --tag tensorrt .
$ docker images

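tensorrt    latest  cb6f0fc656d1  17 seconds ago  9.04GB

Start a container from the image.

$ docker run \
  --gpus all \
  --name tensorrt \
  -it \
  --privileged \
  -p 8888:8888 \
  tensorrt

Inside the container, check the CUDA, cuDNN, driver, and Bazel versions.

# nvcc -V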

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

# cat /usr/include/cudnn.h | grep '#define'

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
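
The same version check can be scripted. The following is a minimal sketch, not part of the original procedure: the function name `cudnn_version` is hypothetical, and it assumes the header defines the version as plain `#define CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` lines, as in the output above.

```python
import re


def cudnn_version(header_text):
    """Collect the CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL defines.

    Hypothetical helper: returns a dict containing only the defines
    that are actually present in the given header text.
    """
    pattern = re.compile(
        r"^#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)\s*$", re.M
    )
    return {name: int(value) for name, value in pattern.findall(header_text)}


# The two lines shown in the output above
sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 6\n"
print(cudnn_version(sample))
```

Inside the container, the function could be fed the real header with `cudnn_version(open("/usr/include/cudnn.h").read())`.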

# nvidia-smi

| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce GTX 107...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   60C    P5    11W /  N/A |    410MiB /  8119MiB |      0%      Default |

| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |

# bazel version

Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.29.1
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Sep 10 13:44:39 2019 (1568123079)
Build timestamp: 1568123079
Build timestamp as int: 1568123079
# echo $PWD

# ls -l
total 12
drwxr-xr-x 10 root root 4096 Dec 17 02:30 TensorRT-
drwxr-xr-x  7 root root 4096 Feb 22 16:16 tensorflow
drwxr-xr-x  8 root root 4096 Feb 22 16:16 tf_trt_models

Configure the build.

# cd tensorflow
# ./configure

Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.29.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3

Found possible Python library paths:
Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

Found CUDA 10.0 in:
Found cuDNN 7 in:
Found TensorRT 7 in:

Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 6.1

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=ngraph         # Build with Intel nGraph support.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
    --config=v2             # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws          # Disable AWS S3 filesystem support.
    --config=nogcp          # Disable GCP support.
    --config=nohdfs         # Disable HDFS support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished

The build below assumes 16 GB of RAM and 8 cores. Adjust --local_resources to match your own machine, budgeting about 2 GB of RAM for each core used.
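
The 2 GB per core rule can be turned into a small helper. This is a minimal sketch rather than part of the original procedure: the function name `local_resources_flag` and its parameters are hypothetical, and the three comma-separated values follow the RAM (MB), CPU cores, I/O form used in the build command in this article.

```python
def local_resources_flag(ram_mb, cpu_cores, mb_per_core=2048, io=1.0):
    """Return a --local_resources value that budgets mb_per_core of RAM per core.

    Hypothetical helper: caps the core count so that
    cores * mb_per_core never exceeds the RAM given to the build.
    """
    if ram_mb < mb_per_core:
        raise ValueError("not enough RAM for even one build job")
    cores = max(1, min(cpu_cores, ram_mb // mb_per_core))
    return "--local_resources={:.1f},{:.1f},{:.1f}".format(ram_mb, cores, io)


# 16 GB of RAM and 8 cores, as in this article
print(local_resources_flag(16384, 8))
# 8 GB of RAM: only 4 of the 8 cores fit the 2 GB per core budget
print(local_resources_flag(8192, 8))
```

With 16 GB and 8 cores the helper reproduces the value used here, 16384.0,8.0,1.0. On a machine with less RAM it lowers the core count instead of overcommitting memory.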

# bazel build \
  --config=opt \
  --config=cuda \
  --config=noaws \
  --config=nohdfs \
  --config=nonccl \
  --config=v1 \
  --local_resources=16384.0,8.0,1.0 \
  --host_force_python=PY3 \
  --noincompatible_do_not_split_linking_cmdline \
  //tensorflow/tools/pip_package:build_pip_package

Build the wheel from the Bazel output.

# ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Sun Feb 23 11:13:32 UTC 2020 : === Preparing sources in dir: /tmp/tmp.2tVlFrnLCQ
/tensorrt/tensorflow /tensorrt/tensorflow
/tmp/tmp.2tVlFrnLCQ/tensorflow/include /tensorrt/tensorflow
Sun Feb 23 11:13:45 UTC 2020 : === Building wheel
warning: no files found matching 'README'
warning: no files found matching '*.pyd' under directory '*'
warning: no files found matching '*.pd' under directory '*'
warning: no files found matching '*.dylib' under directory '*'
warning: no files found matching '*.dll' under directory '*'
warning: no files found matching '*.lib' under directory '*'
warning: no files found matching '*.csv' under directory '*'
warning: no files found matching '*.h' under directory 'tensorflow_core/include/tensorflow'
warning: no files found matching '*' under directory 'tensorflow_core/include/third_party'
Sun Feb 23 11:14:06 UTC 2020 : === Output wheel file is in: /tmp/tensorflow_pkg

# cp /tmp/tensorflow_pkg/tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl /tensorrt
# cd ..
# ls -l

total 179920
drwxr-xr-x 10 root root      4096 Dec 17 02:30 TensorRT-
-rwxr-xr-x  1 root root  43791980 Sep 10 13:57 bazel-0.29.1-installer-linux-x86_64.sh
drwxr-xr-x  1 root root      4096 Feb 23 06:43 tensorflow
-rw-r--r--  1 root root 140419710 Feb 23 11:15 tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
drwxr-xr-x  8 root root      4096 Feb 23 06:09 tf_trt_models

Install the built wheel, then install tf_trt_models.

# pip3 uninstall tensorflow-gpu tensorflow
# pip3 install tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
# cd tf_trt_models
# ./install.sh python3

Confirm that TensorFlow loads the TensorRT libraries and detects the GPU.

# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow as tf
2020-02-23 12:01:46.274803: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:01:46.768715: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.7
2020-02-23 12:01:46.769356: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.7

>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2020-02-23 12:05:28.927561: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-23 12:05:28.987702: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:28.988503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 with Max-Q Design computeCapability: 6.1
coreClock: 1.2655GHz coreCount: 16 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2020-02-23 12:05:28.988540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:05:28.988579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-23 12:05:29.006807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-02-23 12:05:29.013005: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-02-23 12:05:29.058509: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-02-23 12:05:29.087138: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-02-23 12:05:29.087290: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-23 12:05:29.087578: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:29.090019: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:29.091523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-23 12:05:29.091629: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:05:30.075656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-23 12:05:30.075707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-02-23 12:05:30.075740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-02-23 12:05:30.075938: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.076505: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.077038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.077521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 7225 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 11937878305894780308
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 7576322048
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7297894123203666970
physical_device_desc: "device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
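
The interactive check above can also be wrapped in a script. This is a minimal sketch under stated assumptions: the names `gpu_devices` and `check_gpu` are hypothetical, and the code relies only on the `name`, `device_type` and `physical_device_desc` fields visible in the output above.

```python
def gpu_devices(devices):
    """Return (name, description) pairs for entries whose device_type is "GPU"."""
    return [(d.name, d.physical_device_desc)
            for d in devices if d.device_type == "GPU"]


def check_gpu():
    """List the GPUs TensorFlow can see; raise if none is found."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.python.client import device_lib
    found = gpu_devices(device_lib.list_local_devices())
    if not found:
        raise RuntimeError("TensorFlow did not detect any GPU device")
    return found
```

Calling `check_gpu()` inside the container should return one entry per visible GPU. On the setup described here, that would be expected to be a single entry for /device:GPU:0.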

3. Appendix

3-1. Commit container image

Detach from the Docker container without stopping it.
Ctrl + P
Ctrl + Q

Then commit the container as an image.

$ docker commit tensorrt tensorrt

3-2. Extract wheel from Docker container

$ docker cp tensorrt:/tensorrt/tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl .

3-3. Download pre-built wheel

$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1G2beyMH1_g2nYjF8uYtKnw79DYYoa0vK" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1G2beyMH1_g2nYjF8uYtKnw79DYYoa0vK" -o tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
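
The awk step in the middle picks the confirmation code out of the cookie file saved by the first curl call. The following sketch shows the same extraction in Python. It is illustrative only: the function name `confirm_code` and the sample cookie line are made up, and it assumes the tab-separated Netscape cookie-jar layout that `curl -c` writes. Whether Google Drive still issues such a cookie is not something this sketch checks.

```python
def confirm_code(cookie_text):
    """Return the last field of the first cookie line containing '_warning_'.

    Mirrors: awk '/_warning_/ {print $NF}' /tmp/cookie
    (awk prints every matching line; this sketch stops at the first one).
    """
    for line in cookie_text.splitlines():
        if "_warning_" in line:
            fields = line.split()
            if fields:
                return fields[-1]
    return None


# Made-up cookie line: domain, flag, path, secure, expiry, name, value
sample = "\t".join([
    ".drive.google.com", "TRUE", "/uc", "TRUE", "0",
    "download_warning_EXAMPLE", "AbCd",
])
print(confirm_code(sample))  # AbCd
```

The returned value plays the role of `${CODE}` in the `confirm=` query parameter of the final curl command.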

4. Reference articles

4-1. Various

  1. https://github.com/NVIDIA/TensorRT
  2. https://github.com/NobuoTsukamoto/tf_trt_models
  3. https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html
  4. https://www.google.com/search?q=NvInferVersion.h+-www.sejuku.net&oq=NvInferVersion.h&aqs=chrome..69i57.1147j0j8&sourceid=chrome&ie=UTF-8
  5. https://hub.docker.com/layers/nvidia/cuda/10.0-cudnn7-devel-ubuntu18.04/images/sha256-e277b9eef79d6995b10d07e30228daa9e7d42f49bcfc29d512c1534b42d91841?context=explore
  6. https://qiita.com/ksasaki/items/b20a785e1a0f610efa08
  7. https://github.com/tensorflow/tensorrt
  8. Trying TF-TRT on Jetson Nano (Image Classification) - nb.o's Diary
  9. Trying TF-TRT on Jetson Nano (Object detection) - nb.o's Diary
  10. AttributeError: module 'tensorflow' has no attribute 'version' #31576

4-2. CUDA/cuDNN/TensorRT Header files and libraries search path logic

  1. https://github.com/tensorflow/tensorflow/blob/v2.1.0/configure.py
  2. https://github.com/tensorflow/tensorflow/blob/v2.1.0/third_party/gpus/find_cuda_config.py

4-3. Check GPU Compute Capability

  1. https://developer.nvidia.com/cuda-gpus

