Memo to self
Training was suddenly running on the CPU instead of the GPU, so I investigated.
Open python and run:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
This prints:
2018-08-01 09:53:52.674706: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-08-01 09:53:52.674743: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-08-01 09:53:52.674755: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-08-01 09:53:52.674765: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-08-01 09:53:52.674775: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-08-01 09:53:52.698489: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2018-08-01 09:53:52.698581: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: chopin
2018-08-01 09:53:52.698624: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: chopin
2018-08-01 09:53:52.698699: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.111.0
2018-08-01 09:53:52.699152: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.111 Tue Dec 19 23:51:45 PST 2017
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
"""
2018-08-01 09:53:52.699223: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.111.0
2018-08-01 09:53:52.699246: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.111.0
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 1341842
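That is the whole output: cuInit fails with CUDA_ERROR_NO_DEVICE and only /cpu:0 is listed. Apparently no GPU is specified for use.
So specifying the GPU directly with
export CUDA_VISIBLE_DEVICES='0'
makes it work.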
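The value of the variable can also be checked from Python before TensorFlow is imported. The following is only a rough sketch: visible_gpu_ids is a hypothetical helper name (not part of TensorFlow or CUDA), and it only looks at the environment variable, assuming a plain comma-separated list. The CUDA driver's own parsing has more rules than this.

```python
import os


def visible_gpu_ids(environ=None):
    """Hypothetical helper: interpret CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (CUDA applies no filter),
    an empty list when it is set to an empty string (every GPU is hidden),
    otherwise the list of device IDs as strings.
    """
    env = os.environ if environ is None else environ
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Splitting an empty string and dropping blanks yields [], i.e. no GPUs.
    return [token.strip() for token in value.split(",") if token.strip()]
```

Calling visible_gpu_ids() in the affected shell would return an empty list when the variable is set to '', which matches the CUDA_ERROR_NO_DEVICE seen in the log.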
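For some reason my .bashrc contained export CUDA_VISIBLE_DEVICES='' and it had been driving me crazy. An empty value hides every GPU from CUDA, which explains the CUDA_ERROR_NO_DEVICE above.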
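To find where such a line is hiding, a shell startup file can be scanned for assignments to the variable. This is a minimal sketch: find_assignments and the regular expression are my own illustration, and it only catches simple lines of the form CUDA_VISIBLE_DEVICES=... with an optional export, not every way a shell could set the variable.

```python
import re
from pathlib import Path

# Matches lines such as: export CUDA_VISIBLE_DEVICES=''  or  CUDA_VISIBLE_DEVICES=0
# Lines starting with '#' (comments) do not match.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?CUDA_VISIBLE_DEVICES=")


def find_assignments(path):
    """Return (line_number, line) pairs in `path` that assign CUDA_VISIBLE_DEVICES."""
    path = Path(path).expanduser()
    if not path.is_file():
        return []
    hits = []
    with path.open(encoding="utf-8", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if ASSIGNMENT.match(line):
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

For example, find_assignments("~/.bashrc") lists the offending lines with their line numbers. Other startup files (such as ~/.profile) can be checked the same way.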