概要
Windows10にTensorFlow with GPU Support環境を構築する
公式に書いてある通りにやればたぶん大丈夫だけど、CUDA, cuDNNとかのバージョン違いによって苦しんだのでそこら辺を中心にメモ
https://www.tensorflow.org/install/install_windows
ドキュメントの流れを見ていると環境構築の流れも変わりやすいようなので永続的なチームでの利用を考えているのであればDockerで管理するほうがいい
詳細
GPU利用のためにCUDA, cuDNNのインストール
CUDA
- CUDA Toolkitをダウンロードした後、CUDAをインストール
- インストール後GPUが認識できるか、deviceQueryを使って必ず確認する
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release
deviceQuery.exe
- こんな感じでエラーになったらGPUのアップデートが必要、デバイスマネージャからアップデートしましょう
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
deviceQuery.exe Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL
- こんな感じに出力されれば読み取れている
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
deviceQuery.exe Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1080"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8192 MBytes (8589934592 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1847 MHz (1.85 GHz)
Memory Clock rate: 5005 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
R
cuDNN
cuDNNをダウンロードしてきてPATHを通す
https://developer.nvidia.com/cudnn
最新バージョンだと必要なdllが入っていなくて実行時エラーとなる、とりあえず5.1を使うことによって回避できる
追記
TensorFlow1.3から6.0対応予定
Thank you! @maru3
ダウンロード後PATHにcuDNNをダウンロード、展開したディレクトリにある/binファイルをPATHに設定する
python install
TensorFlow install
pip3を使ってinstall
Anacondaを使ったインストール方法もあるが、今回はpip3を使う
C:\> pip3 install --upgrade tensorflow-gpu
動作確認
$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
的な感じになればOK
ちなみに私の環境ではtf.Session()のタイミングで以下のWarningメッセージが出る
2017-07-18 12:01:40.776055: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776384: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776650: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776869: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777132: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777399: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777660: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777907: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
いろいろ設定すればspeed upできるらしいがとりあえずはこのままで
トラブルシューティング
import tensorflow時エラー
Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
return importlib.import_module(mname)
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
File "<frozen importlib._bootstrap>", line 577, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 906, in create_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: 指定されたモジュールが見つかりません。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
return importlib.import_module('_pywrap_tensorflow_internal')
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed: 指定されたモジュールが見つかりません。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\DevSDK\python\lib\site-packages\tensorflow__init.py", line 24, in <module>
from tensorflow.python import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\init.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
return importlib.import_module(mname)
File "D:\DevSDK\python\lib\importlib\init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
File "<frozen importlib._bootstrap>", line 577, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 906, in create_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: 指定されたモジュールが見つかりません。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
return importlib.import_module('_pywrap_tensorflow_internal')
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed: 指定されたモジュールが見つかりません。
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/install_sources#common_installation_problems
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
このエラーはいろいろなパターンがあるようだが、私の場合はcuDNNの最新バージョンを使っていたのが問題だった、5.1を利用することにより必要なDLLを取得できた
GPUが読み込めない
sess = tf.Session()
2017-07-18 11:28:17.078990: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-07-18 11:28:17.093631: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: MyComputer
2017-07-18 11:28:17.094340: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:165] hostname: MyComputer
GPUが認識できていない
ドライバをアップデートしたら解決した
参考
http://h-sao.com/blog/2017/04/10/how-to-install-tensorflow-gpu-on-windows/
https://blog.keiji.io/2016/05/cuda_error_no_device.html