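Introduction
I wanted to try TensorFlow, but it could not use the GPU.
The command that checks whether a GPU is available returned False, so this post records the steps I took to get it to return True.
Command to check whether a GPU is available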
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
Output of the command above
result
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
2019-11-18 20:52:24.788675: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 20:52:24.789879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2070 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:0a:00.0
2019-11-18 20:52:24.789931: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-18 20:52:24.789948: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-18 20:52:24.789962: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-11-18 20:52:24.789976: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-11-18 20:52:24.789989: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-11-18 20:52:24.790003: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-11-18 20:52:24.790104: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/lib64
2019-11-18 20:52:24.790115: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2019-11-18 20:52:24.790130: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-18 20:52:24.790138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-11-18 20:52:24.790144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
False
It returned False, which was disappointing.
Cause
This line is the culprit:
tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/lib64
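In other words, the loader could not find libcudnn.so.7. Every other CUDA library in the log opened successfully, so this one missing file was enough for TensorFlow to skip registering the GPU.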
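The startup log is long, so it can help to pull out only the libraries that failed to load. Below is a minimal sketch that does this with a regular expression, assuming the message format matches the dso_loader line quoted above. The function name `missing_libraries` and the `sample` text are my own, purely for illustration.

```python
import re

# Matches the dso_loader failure message quoted above (format assumed from that log line).
DLOPEN_FAILURE = re.compile(r"Could not dlopen library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in order, without duplicates."""
    seen = []
    for name in DLOPEN_FAILURE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# A shortened excerpt of the log from this post (timestamps removed).
sample = (
    "I tensorflow/stream_executor/platform/default/dso_loader.cc:42] "
    "Successfully opened dynamic library libcusparse.so.10.0\n"
    "I tensorflow/stream_executor/platform/default/dso_loader.cc:53] "
    "Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: "
    "cannot open shared object file: No such file or directory\n"
)

print(missing_libraries(sample))  # ['libcudnn.so.7']
```

Saving the TensorFlow startup output to a file and passing its contents to this function should give the same kind of list for other setups.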
To check whether the file exists anywhere on the system, try the following.
check_libcudnn
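$ find / -name 'libcudnn*' 2>/dev/null
The trailing wildcard matters: -name matches the whole file name, and the library is called libcudnn.so.7, not libcudnn.
In my case nothing was printed, so I had simply forgotten to install it.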
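Searching the disk tells you whether the file exists, but not whether the dynamic loader can actually find it. As a complementary check, here is a small sketch that asks the loader directly through Python's standard `ctypes` module. The helper name `can_load` is hypothetical; the library name is the one from the error message.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name, otherwise False."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot find or open the shared library.
        return False
    return True


print("libcudnn.so.7 loadable:", can_load("libcudnn.so.7"))
```

If find locates the file but this still prints False, the directory is probably not on the loader's search path (for example, not listed in LD_LIBRARY_PATH).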
Solution
Install libcudnn.
Download the version that matches your environment from NVIDIA's download page.
Then install the downloaded .deb package.
$ cd ~/Downloads
$ sudo dpkg -i libcudnn...deb
$ sudo apt install libcudnn...
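Caveat
The libcudnn version has to match the CUDA version, but there is a catch. The "CUDA Version: ..." shown by $ nvidia-smi is the highest CUDA version the installed driver supports, so it may differ from the version of the CUDA toolkit that is actually installed. To see the installed toolkit version, try the following.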
check_cuda_version
$ cat /usr/local/cuda/version.txt
CUDA Version 10.0.130
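When picking the cuDNN download, what you need to compare is the major.minor part of this version. Here is a minimal sketch that parses the line into numbers, assuming the text has the "CUDA Version X.Y.Z" form shown above. The function name `parse_cuda_version` is my own; it does not handle the "CUDA Version: ..." form printed by nvidia-smi.

```python
import re


def parse_cuda_version(text):
    """Parse text such as 'CUDA Version 10.0.130' into a tuple of ints."""
    match = re.search(r"CUDA Version (\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no CUDA version found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


version = parse_cuda_version("CUDA Version 10.0.130")
print(version)      # (10, 0, 130)
print(version[:2])  # (10, 0)
```

On a real machine you could pass in the contents of /usr/local/cuda/version.txt instead of the literal string, provided that file exists on your installation.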
Verification
Run the same command again to check.
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
2019-11-18 21:08:51.949278: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-18 21:08:51.967978: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-11-18 21:08:52.205707: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.206162: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5a1b5d0 executing computations on platform CUDA. Devices:
2019-11-18 21:08:52.206191: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2070 SUPER, Compute Capability 7.5
2019-11-18 21:08:52.228279: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3792885000 Hz
2019-11-18 21:08:52.228901: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5dfa2b0 executing computations on platform Host. Devices:
2019-11-18 21:08:52.228923: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-11-18 21:08:52.229110: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.229777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2070 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:0a:00.0
2019-11-18 21:08:52.229983: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-18 21:08:52.231064: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-18 21:08:52.232104: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-11-18 21:08:52.232402: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-11-18 21:08:52.234146: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-11-18 21:08:52.235367: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-11-18 21:08:52.238753: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-11-18 21:08:52.238888: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.239618: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.240219: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-11-18 21:08:52.240264: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-18 21:08:52.240983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-18 21:08:52.240993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-11-18 21:08:52.240997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-11-18 21:08:52.241069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.241444: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-18 21:08:52.241797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 6983 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 SUPER, pci bus id: 0000:0a:00.0, compute capability: 7.5)
True
It now returns True!