
Running Docker on Jetson Nano (Part 2)

Last updated at 2019-06-12. Posted at 2019-06-11.

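Introduction

This article is a continuation of "Running Docker on Jetson Nano (Part 1)". While running Docker on the Jetson Nano, I explore whether the GPU can also be used from inside a container. Since other people may be struggling with similar problems, I decided to paste the logs as-is, even though they make the article harder to read.

A follow-up, "Running Docker on Jetson Nano (Practice)", was added after this article.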

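How to check

Following "Checking whether TensorFlow recognizes the GPU with two lines of code", I use the official TensorFlow to get the list of available devices. That is, run the following two lines.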

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

On the Jetson Nano itself, without Docker, four devices were shown: "CPU", "XLA_CPU", "XLA_GPU", and "GPU".

Terminal

$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2019-06-11 18:23:03.920863: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-06-11 18:23:03.921884: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x2f146c10 executing computations on platform Host. Devices:
2019-06-11 18:23:03.921969: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): <undefined>, <undefined>
2019-06-11 18:23:03.993855: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-11 18:23:03.994153: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x2db1ff90 executing computations on platform CUDA. Devices:
2019-06-11 18:23:03.994224: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2019-06-11 18:23:03.994645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
totalMemory: 3.87GiB freeMemory: 240.11MiB
2019-06-11 18:23:03.994719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-11 18:23:05.018979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-11 18:23:05.019069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-11 18:23:05.019110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-11 18:23:05.019268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 63 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 18185780001274122513
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 9173129262831143728
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 5535528350812324515
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 66424832
locality {
  bus_id: 1
  links {
  }
}
incarnation: 17884479537418457558
physical_device_desc: "device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3"
]
>>>
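
The same check has to be repeated on the host and inside the container, so it can be convenient to wrap it in a small script. The following is only a sketch: `device_types`, `has_gpu`, and `check_tensorflow` are hypothetical helper names that do not appear in the original steps, and it assumes a TensorFlow build that provides `device_lib.list_local_devices()` as used above.

```python
def device_types(devices):
    """Return the set of device_type strings found in a device list."""
    return {d.device_type for d in devices}


def has_gpu(devices):
    """True if a plain "GPU" device (not only "XLA_GPU") is in the list."""
    return "GPU" in device_types(devices)


def check_tensorflow():
    """Ask TensorFlow for its local devices and report whether a GPU is visible.

    TensorFlow is imported lazily so the helpers above work without it.
    """
    from tensorflow.python.client import device_lib

    devices = device_lib.list_local_devices()
    print(sorted(device_types(devices)))
    return has_gpu(devices)
```

Calling `check_tensorflow()` on the host should return True for the output shown above. A result of False means only the CPU-side devices were listed.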

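Trying it in a Docker container

Installing TensorFlow

First, install TensorFlow in a container started with Docker. The "whale" example run in Part 1 was based on Alpine Linux, but here getting things to work comes first, so I use ubuntu:18.04, the same as the host. The Dockerfile I prepared is shown below. The key points are as follows.

To speed up downloads, apt is pointed at the jp mirror.
Even though libhdf5-dev is installed, h5py complained that libhdf5.so was missing, so pkg-config is installed as well (it took me a while to notice this).
Getting it to work has top priority, so consolidating the RUN instructions is left for later.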

Dockerfile

FROM ubuntu:18.04
  
RUN sed -i.org -e 's|ports.ubuntu.com|jp.archive.ubuntu.com|g' /etc/apt/sources.list

RUN set -x \
  && apt update \
  && apt upgrade -y --no-install-recommends \
  && apt install -y pkg-config

RUN apt install -y bash python3-pip libhdf5-serial-dev hdf5-tools zlib1g-dev zip libjpeg8-dev libhdf5-dev
RUN pip3 install -U numpy grpcio absl-py py-cpuinfo psutil portpicker six mock requests gast h5py astor termcolor
RUN pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu

COPY files/cuda-10-0.conf /etc/ld.so.conf.d/
COPY files/nvidia-tegra.conf /etc/ld.so.conf.d/

CMD ["/bin/bash"]

cuda-10-0.conf and nvidia-tegra.conf are the ones found in /etc/ld.so.conf.d/ on the host. That is, run the following commands.

Terminal

mkdir files
cp /etc/ld.so.conf.d/cuda-10-0.conf files/
cp /etc/ld.so.conf.d/nvidia-tegra.conf files/

The image can now be built. I tagged it gpu_test. The following build command takes a long time, so I recommend running it before going to bed.

Terminal

docker build . -t gpu_test

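Starting the container

The required libraries are provided by volume-mounting them from the host. Because the command gets long, I wrote a script, tf_gpu_test.sh. CUDA and cuDNN are mounted as well.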

tf_gpu_test.sh

#!/bin/sh

docker run -it --rm \
 -v /usr/local/cuda-10.0:/usr/local/cuda-10.0 \
 -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra \
 -v /usr/lib/aarch64-linux-gnu/libcudnn.so.7.3.1:/usr/lib/aarch64-linux-gnu/libcudnn.so.7.3.1 \
 -v /usr/lib/aarch64-linux-gnu/libcudnn.so.7:/usr/lib/aarch64-linux-gnu/libcudnn.so.7 \
 gpu_test /bin/bash
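
One thing to watch with `-v`: if the host-side path does not exist, Docker creates an empty directory there and mounts that instead, so a host whose cuDNN version differs from 7.3.1 would not produce an error at startup. A minimal sketch of a pre-flight check follows. `missing_host_paths` and `MOUNT_PATHS` are hypothetical names, and the path list simply mirrors the script above.

```python
import os

# Host paths bind-mounted by tf_gpu_test.sh (copied from the script above).
MOUNT_PATHS = [
    "/usr/local/cuda-10.0",
    "/usr/lib/aarch64-linux-gnu/tegra",
    "/usr/lib/aarch64-linux-gnu/libcudnn.so.7.3.1",
    "/usr/lib/aarch64-linux-gnu/libcudnn.so.7",
]


def missing_host_paths(paths):
    """Return the paths that do not exist on this machine, in input order."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    for path in missing_host_paths(MOUNT_PATHS):
        print("missing on host:", path)
```

Running this on the host before starting the container prints nothing when every path is present.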

Running it shows a prompt like this.

Terminal

$ chmod +x tf_gpu_test.sh
$ ./tf_gpu_test.sh
root@37b199b34731:/#

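After running ldconfig, start python3 and perform the check. ldconfig is run first so that the linker cache picks up the libraries that were mounted when the container started.

Inside the Docker container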

root@37b199b34731:/# ldconfig
root@37b199b34731:/# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2019-06-11 08:21:03.292745: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-06-11 08:21:03.293961: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x1ae47440 executing computations on platform Host. Devices:
2019-06-11 08:21:03.294030: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): <undefined>, <undefined>
2019-06-11 08:21:03.300220: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-06-11 08:21:03.300308: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:148] kernel driver does not appear to be running on this host (37b199b34731): /proc/driver/nvidia/version does not exist
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7750011564848998984
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 13754630370914000461
physical_device_desc: "device: XLA_CPU device"
]
>>>

Only the CPU devices show up...
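
The log narrows the cause down somewhat: `cuInit` was actually called and returned `CUDA_ERROR_NO_DEVICE`, which suggests that the CUDA driver library was loaded and that the device itself is what the container cannot see. To check the library side separately, something like the following sketch could be run inside the container after `ldconfig`. `resolve_libraries` is a hypothetical helper, and the library names `cuda`, `cudart`, and `cudnn` are my assumption of what is relevant here.

```python
from ctypes.util import find_library


def resolve_libraries(names):
    """Map each library name to what the system library lookup finds, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in resolve_libraries(["cuda", "cudart", "cudnn"]).items():
        print(name, "->", found or "NOT FOUND")
```

If all three names resolve and the GPU is still missing, the mounts and linker configuration are probably fine and the remaining suspect is device access.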

For now, change the startup script tf_gpu_test.sh so that it passes every device under /dev/ whose name starts with nv.

tf_gpu_test.sh with device options added

#!/bin/sh

docker run -it --rm \
 -v /usr/local/cuda-10.0:/usr/local/cuda-10.0 \
 -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra \
 -v /usr/lib/aarch64-linux-gnu/libcudnn.so.7.3.1:/usr/lib/aarch64-linux-gnu/libcudnn.so.7.3.1 \
 -v /usr/lib/aarch64-linux-gnu/libcudnn.so.7:/usr/lib/aarch64-linux-gnu/libcudnn.so.7 \
 --device=/dev/nvhost-as-gpu \
 --device=/dev/nvhost-ctrl \
 --device=/dev/nvhost-ctrl-gpu \
 --device=/dev/nvhost-ctrl-isp \
 --device=/dev/nvhost-ctrl-isp.1 \
 --device=/dev/nvhost-ctrl-nvdec \
 --device=/dev/nvhost-ctrl-vi \
 --device=/dev/nvhost-ctxsw-gpu \
 --device=/dev/nvhost-dbg-gpu \
 --device=/dev/nvhost-gpu \
 --device=/dev/nvhost-isp \
 --device=/dev/nvhost-isp.1 \
 --device=/dev/nvhost-msenc \
 --device=/dev/nvhost-nvdec \
 --device=/dev/nvhost-nvjpg \
 --device=/dev/nvhost-prof-gpu \
 --device=/dev/nvhost-sched-gpu \
 --device=/dev/nvhost-tsec \
 --device=/dev/nvhost-tsecb \
 --device=/dev/nvhost-tsg-gpu \
 --device=/dev/nvhost-vi \
 --device=/dev/nvhost-vic \
 --device=/dev/nvidiactl \
 --device=/dev/nvmap \
 gpu_test /bin/bash
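
Typing this list by hand is error-prone, so the options could also be generated from the contents of /dev. The following is a sketch under that assumption. `device_flags` is a hypothetical helper, and it matches on the `nv` prefix only, so on a machine with other devices named that way (for example NVMe storage) the result would need to be filtered.

```python
import glob
import os


def device_flags(dev_dir="/dev", prefix="nv"):
    """Build sorted --device=... options for entries in dev_dir starting with prefix."""
    pattern = os.path.join(glob.escape(dev_dir), glob.escape(prefix) + "*")
    return ["--device=" + path for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    # Print one option per line with a trailing backslash between them,
    # ready to paste into the docker run command.
    print(" \\\n".join(device_flags()))
```

This only reproduces the "pass everything" approach used here. It does not tell which of the devices are actually required.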

Run it again.

Terminal

$ ./tf_gpu_test.sh
root@e3f399786fb4:/#

Inside the Docker container

root@e3f399786fb4:/# ldconfig
root@e3f399786fb4:/# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2019-06-11 09:44:31.509413: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-06-11 09:44:31.510031: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x11384440 executing computations on platform Host. Devices:
2019-06-11 09:44:31.510095: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): <undefined>, <undefined>
2019-06-11 09:44:31.590013: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-11 09:44:31.590557: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0xf46d280 executing computations on platform CUDA. Devices:
2019-06-11 09:44:31.590622: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2019-06-11 09:44:31.591110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
totalMemory: 3.87GiB freeMemory: 354.46MiB
2019-06-11 09:44:31.591176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-11 09:44:32.630566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-11 09:44:32.630642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-06-11 09:44:32.630692: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-06-11 09:44:32.630899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 110 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14327199124955239069
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 12487769703329181913
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 15345880495943358632
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 116113408
locality {
  bus_id: 1
  links {
  }
}
incarnation: 12763557053701895456
physical_device_desc: "device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3"
]
>>>

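The GPU is recognized!

Summary

With the Docker that comes preinstalled with JetPack 4.2 on the Jetson Nano, a container was (apparently) able to recognize the GPU. Next, I would like to actually use it and find out what is still missing.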
