Environment
Windows 10
Python 3.7.4
TensorFlow 2.1.0
CUDA 10.2
cuDNN 7.6.5
Introduction
The GPU is recognized:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
# Output
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 12939604985444578121
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 4990763008
locality {
bus_id: 1
links {
}
}
incarnation: 15893135237303968832
physical_device_desc: "device: 0, name: GeForce GTX 1660 SUPER, pci bus id: 0000:09:00.0, compute capability: 7.5"
]
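As an aside, the same check can be done through the public `tf.config` API instead of `device_lib` (a minimal sketch; this call is available in TF 2.1):

```python
import tensorflow as tf

# List the physical GPUs TensorFlow can see; returns an
# empty list when no GPU is visible.
gpus = tf.config.experimental.list_physical_devices('GPU')
print("Num GPUs:", len(gpus))
```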
Dense (fully connected) layers and the like worked fine, but running code containing Conv2D raised an error:
~~~ (omitted) ~~~
2020-09-06 02:09:49.391503: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-06 02:09:50.835593: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-09-06 02:09:50.836159: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
~~~ (omitted) ~~~
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
It appears to be a memory problem; apparently it works if you limit how much GPU memory TensorFlow is allowed to use.
Attempt 1
Code to limit memory usage:
import tensorflow as tf
tf.config.gpu.set_per_process_memory_fraction(0.75)
tf.config.gpu.set_per_process_memory_growth(True)
→ Result
AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'gpu'
It said the attribute doesn't exist — presumably this API belongs to a different TensorFlow version.
Attempt 2
There was another method, so I tried that as well.
Put the following code at the top of the script:
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
→ It worked!!
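To confirm the fix, a minimal Conv2D forward pass like the following (my own sketch, not from the original failing code) should now run without the cuDNN error:

```python
import tensorflow as tf

# Tiny Conv2D smoke test: one 28x28 grayscale image, 8 filters.
x = tf.random.normal([1, 28, 28, 1])
y = tf.keras.layers.Conv2D(8, 3, padding='same')(x)
print(y.shape)  # (1, 28, 28, 8)
```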