This is the procedure for building everything yourself, without using NGC containers.
1. Environment
- Ubuntu 18.04 x86_64, 16 GB RAM
- GeForce GTX 1070
- NVIDIA Driver 440.59
- CUDA 10 (V10.0.130)
- cuDNN 7.6.5.32
- Docker 19.03.6, build 369ce74a3c
- TensorFlow v2.1.0
- TensorRT 7
- TF-TRT
- Bazel 0.29.1
- Python 3.6
2. Procedure
Create working directory
$ cd ~
$ mkdir -p work/tensorrt && cd work/tensorrt
Download TensorRT-7.0.0.11 (an NVIDIA Developer account is required) and copy the tarball into the work/tensorrt directory.
https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/7.0/7.0.0.11/tars/TensorRT-7.0.0.11.Ubuntu-18.04.x86_64-gnu.cuda-10.0.cudnn7.6.tar.gz
Create Dockerfile
$ nano Dockerfile
Dockerfile
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y \
protobuf-compiler python-pil python-lxml python-tk cython \
autoconf automake libtool curl make g++ unzip wget git nano \
libgflags-dev libgoogle-glog-dev liblmdb-dev libleveldb-dev \
libhdf5-serial-dev libhdf5-dev python3-opencv python-opencv \
python3-dev python3-numpy python3-skimage gfortran libturbojpeg \
python-dev python-numpy python-skimage python3-pip python-pip \
libboost-all-dev libopenblas-dev libsnappy-dev software-properties-common \
libfreetype6-dev pkg-config \
libpng-dev libhdf5-100 libhdf5-cpp-100 libc-ares-dev libblas-dev \
libeigen3-dev libatlas-base-dev openjdk-8-jdk libopenblas-base \
openmpi-bin libopenmpi-dev gcc libgfortran5 libatlas3-base liblapack-dev
RUN pip3 install pip --upgrade && \
pip3 install Cython && \
pip3 install contextlib2 && \
pip3 install pillow && \
pip3 install lxml && \
pip3 install jupyter && \
pip3 install matplotlib && \
pip3 install keras_applications==1.0.8 --no-deps && \
pip3 install keras_preprocessing==1.1.0 --no-deps && \
pip3 install h5py==2.9.0 && \
pip3 install -U --user six numpy wheel mock && \
pip3 install pybind11 && \
pip2 install Cython && \
pip2 install contextlib2 && \
pip2 install pillow && \
pip2 install lxml && \
pip2 install jupyter && \
pip2 install matplotlib
# Create working directory
RUN mkdir -p /tensorrt
ARG work_dir=/tensorrt
WORKDIR ${work_dir}
# Clone Tensorflow v2.1.0, TF-TRT and install Bazel
RUN git clone -b v2.1.0 --depth 1 https://github.com/tensorflow/tensorflow.git && \
git clone --recursive https://github.com/NobuoTsukamoto/tf_trt_models.git && \
wget https://github.com/bazelbuild/bazel/releases/download/0.29.1/bazel-0.29.1-installer-linux-x86_64.sh && \
chmod +x bazel-0.29.1-installer-linux-x86_64.sh && \
bash ./bazel-0.29.1-installer-linux-x86_64.sh
# Install TensorRT-7
COPY TensorRT-7.0.0.11.Ubuntu-18.04.x86_64-gnu.cuda-10.0.cudnn7.6.tar.gz ${work_dir}
RUN tar -xvzf TensorRT-7.0.0.11.Ubuntu-18.04.x86_64-gnu.cuda-10.0.cudnn7.6.tar.gz && \
rm TensorRT-7.0.0.11.Ubuntu-18.04.x86_64-gnu.cuda-10.0.cudnn7.6.tar.gz
# Set environment variables so that TensorFlow's configure script can find CUDA, cuDNN, and TensorRT
ENV TRT_RELEASE=${work_dir}/TensorRT-7.0.0.11
ENV PATH=/usr/local/cuda-10.0/bin:$TRT_RELEASE:$TRT_RELEASE/bin:$PATH \
LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$TRT_RELEASE/lib:$LD_LIBRARY_PATH \
TF_CUDA_VERSION=10.0 \
TF_CUDNN_VERSION=7 \
TENSORRT_INSTALL_PATH=$TRT_RELEASE \
TF_TENSORRT_VERSION=7
Create Docker image
$ docker build --tag tensorrt .
Check that the Docker image was created
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
tensorrt latest cb6f0fc656d1 17 seconds ago 9.04GB
Start the Docker container
$ docker run \
--gpus all \
--name tensorrt \
-it \
--privileged \
-p 8888:8888 \
tensorrt \
/bin/bash
CUDA and cuDNN version check
# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
# cat /usr/include/cudnn.h | grep '#define'
# define CUDNN_MAJOR 7
# define CUDNN_MINOR 6
# define CUDNN_PATCHLEVEL 5
# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59 Driver Version: 440.59 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 107... Off | 00000000:01:00.0 Off | N/A |
| N/A 60C P5 11W / N/A | 410MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
# bazel version
Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.29.1
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Sep 10 13:44:39 2019 (1568123079)
Build timestamp: 1568123079
Build timestamp as int: 1568123079
Check the folder hierarchy
# echo $PWD
/tensorrt
# ls -l
total 12
drwxr-xr-x 10 root root 4096 Dec 17 02:30 TensorRT-7.0.0.11
drwxr-xr-x 7 root root 4096 Feb 22 16:16 tensorflow
drwxr-xr-x 8 root root 4096 Feb 22 16:16 tf_trt_models
Initial configuration of TensorFlow v2.1.0
# cd tensorflow
# ./configure
Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.29.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
/usr/lib/python3/dist-packages
/usr/local/lib/python3.6/dist-packages
Please input the desired Python library path to use. Default is [/usr/lib/python3/dist-packages]
/usr/local/lib/python3.6/dist-packages
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.
Found CUDA 10.0 in:
/usr/local/cuda-10.0/lib64
/usr/local/cuda-10.0/include
Found cuDNN 7 in:
/usr/lib/x86_64-linux-gnu
/usr/include
Found TensorRT 7 in:
/tensorrt/TensorRT-7.0.0.11/lib
/tensorrt/TensorRT-7.0.0.11/include
Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 6.1
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=ngraph # Build with Intel nGraph support.
--config=numa # Build with NUMA support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
--config=v2 # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished
Build with 16 GB of RAM and 8 cores. Adjust the values to the resources of your PC, assuming that each build job consumes about 2 GB of RAM. The --local_resources flag takes available RAM (MB), CPU cores, and I/O capability, so 16 GB of RAM and 8 cores becomes 16384.0,8.0,1.0 (a worked sketch follows the build command below).
Build TensorFlow v2.1.0
# bazel build \
--config=opt \
--config=cuda \
--config=noaws \
--config=nohdfs \
--config=nonccl \
--config=v1 \
--local_resources=16384.0,8.0,1.0 \
--host_force_python=PY3 \
--noincompatible_do_not_split_linking_cmdline \
//tensorflow/tools/pip_package:build_pip_package
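As a worked example of the 2 GB-per-core rule of thumb, here is a minimal, Linux-only Python sketch (my own illustration, not part of the build) that derives the --local_resources triple from the machine's actual RAM and core count:
import os

# Rule of thumb from the text: each Bazel build job needs about 2 GB of RAM,
# so cap the job count at total_ram_mb / 2048.
ram_mb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") // (1024 * 1024)
cores = min(os.cpu_count(), ram_mb // 2048)

# --local_resources takes RAM (MB), CPU cores, and I/O capability, in that order.
print(f"--local_resources={ram_mb:.1f},{cores:.1f},1.0")
On the 16 GB / 8-core machine used here this prints the value passed above, --local_resources=16384.0,8.0,1.0.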
Build the wheel
# ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Sun Feb 23 11:13:32 UTC 2020 : === Preparing sources in dir: /tmp/tmp.2tVlFrnLCQ
/tensorrt/tensorflow /tensorrt/tensorflow
/tensorrt/tensorflow
/tmp/tmp.2tVlFrnLCQ/tensorflow/include /tensorrt/tensorflow
/tensorrt/tensorflow
Sun Feb 23 11:13:45 UTC 2020 : === Building wheel
warning: no files found matching 'README'
warning: no files found matching '*.pyd' under directory '*'
warning: no files found matching '*.pd' under directory '*'
warning: no files found matching '*.dylib' under directory '*'
warning: no files found matching '*.dll' under directory '*'
warning: no files found matching '*.lib' under directory '*'
warning: no files found matching '*.csv' under directory '*'
warning: no files found matching '*.h' under directory 'tensorflow_core/include/tensorflow'
warning: no files found matching '*' under directory 'tensorflow_core/include/third_party'
Sun Feb 23 11:14:06 UTC 2020 : === Output wheel file is in: /tmp/tensorflow_pkg
# cp /tmp/tensorflow_pkg/tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl /tensorrt
# cd ..
# ls -l
total 179920
drwxr-xr-x 10 root root 4096 Dec 17 02:30 TensorRT-7.0.0.11
-rwxr-xr-x 1 root root 43791980 Sep 10 13:57 bazel-0.29.1-installer-linux-x86_64.sh
drwxr-xr-x 1 root root 4096 Feb 23 06:43 tensorflow
-rw-r--r-- 1 root root 140419710 Feb 23 11:15 tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
drwxr-xr-x 8 root root 4096 Feb 23 06:09 tf_trt_models
Install TensorFlow v2.1.0 (v1 API) with TensorRT, CUDA 10.0, cuDNN 7.6.5
# pip3 uninstall tensorflow-gpu tensorflow
# pip3 install tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
Install TF-TRT
# cd tf_trt_models
# ./install.sh python3
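With everything installed, TF-TRT conversion can be exercised directly through TensorFlow's bundled trt_convert API. A minimal sketch for this TF1-style (--config=v1) build; the SavedModel paths are hypothetical placeholders, and the tf_trt_models examples wrap this differently:
# Minimal TF-TRT conversion sketch. './saved_model' and './trt_saved_model'
# are hypothetical placeholder paths.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverter(
    input_saved_model_dir='./saved_model',
    precision_mode='FP16',           # TensorRT also supports FP32 and INT8
    max_batch_size=1)
converter.convert()                  # segment the graph and build TRT engines
converter.save('./trt_saved_model')  # write the TensorRT-optimized SavedModel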
Import test
# python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-02-23 12:01:46.274803: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:01:46.768715: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.7
2020-02-23 12:01:46.769356: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.7
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2020-02-23 12:05:28.927561: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-23 12:05:28.987702: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:28.988503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 with Max-Q Design computeCapability: 6.1
coreClock: 1.2655GHz coreCount: 16 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2020-02-23 12:05:28.988540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:05:28.988579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-23 12:05:29.006807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-02-23 12:05:29.013005: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-02-23 12:05:29.058509: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-02-23 12:05:29.087138: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-02-23 12:05:29.087290: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-23 12:05:29.087578: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:29.090019: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:29.091523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-23 12:05:29.091629: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-23 12:05:30.075656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-23 12:05:30.075707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-02-23 12:05:30.075740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-02-23 12:05:30.075938: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.076505: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.077038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-23 12:05:30.077521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 7225 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 11937878305894780308
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 7576322048
locality {
bus_id: 1
links {
}
}
incarnation: 7297894123203666970
physical_device_desc: "device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
>>>
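The same check can also be run non-interactively; a small sketch mirroring the session above (the file name check_gpu.py is arbitrary):
# check_gpu.py -- non-interactive version of the import test above.
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)  # expect 2.1.0
for device in device_lib.list_local_devices():
    print(device.name, device.device_type)  # expect a /device:GPU:0 entry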
3. Appendix
3-1. Commit container image
Detach from the Docker container (without stopping it):
Ctrl + P
Ctrl + Q
Commit the container image
$ docker commit tensorrt tensorrt
3-2. Extract the wheel from the Docker container
Extract wheel from Docker container
$ docker cp tensorrt:/tensorrt/tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl .
Specify the container name (tensorrt, set by --name above) or the container ID reported by docker ps.
3-3. Download pre-built wheel
For large files, Google Drive interposes a confirmation page instead of serving the file directly; the first curl captures the confirmation token into a cookie, and the second passes it back to download the wheel.
Download pre-built wheel
$ curl -sc /tmp/cookie "https://drive.google.com/uc?export=download&id=1G2beyMH1_g2nYjF8uYtKnw79DYYoa0vK" > /dev/null
$ CODE="$(awk '/_warning_/ {print $NF}' /tmp/cookie)"
$ curl -Lb /tmp/cookie "https://drive.google.com/uc?export=download&confirm=${CODE}&id=1G2beyMH1_g2nYjF8uYtKnw79DYYoa0vK" -o tensorflow-2.1.0-cp36-cp36m-linux_x86_64.whl
4. Reference articles
4-1. Various
- https://github.com/NVIDIA/TensorRT
- https://github.com/NobuoTsukamoto/tf_trt_models
- https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html
- https://hub.docker.com/layers/nvidia/cuda/10.0-cudnn7-devel-ubuntu18.04/images/sha256-e277b9eef79d6995b10d07e30228daa9e7d42f49bcfc29d512c1534b42d91841?context=explore
- https://qiita.com/ksasaki/items/b20a785e1a0f610efa08
- https://github.com/tensorflow/tensorrt
- Trying TF-TRT on the Jetson Nano (Image Classification) - nb.o's diary
- Trying TF-TRT on the Jetson Nano (Object detection) - nb.o's diary
- AttributeError: module 'tensorflow' has no attribute 'version' #31576
4-2. CUDA/cuDNN/TensorRT header file and library search path logic
- https://github.com/tensorflow/tensorflow/blob/v2.1.0/configure.py
- https://github.com/tensorflow/tensorflow/blob/v2.1.0/third_party/gpus/find_cuda_config.py