LoginSignup
12
22

More than 5 years have passed since last update.

Windows10上にTensorFlow_GPU環境構築

Last updated at Posted at 2017-07-18

概要

Windows10にTensorFlow with GPU Support環境を構築する
公式に書いてある通りにやればたぶん大丈夫だけど、CUDA, cuDNNとかのバージョン違いによって苦しんだのでそこら辺を中心にメモ
https://www.tensorflow.org/install/install_windows

ドキュメントの流れを見ていると環境構築の流れも変わりやすいようなので永続的なチームでの利用を考えているのであればDockerで管理するほうがいい

詳細

GPU利用のためにCUDA, cuDNNのインストール

CUDA

  • CUDA Toolkitをダウンロードした後、CUDAをインストール
  • インストール後GPUが認識できるか、deviceQueryを使って必ず確認する
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release
deviceQuery.exe
  • こんな感じでエラーになったらGPUのアップデートが必要、デバイスマネージャからアップデートしましょう
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL
  • こんな感じに出力されれば読み取れている
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8192 MBytes (8589934592 bytes)
  (20) Multiprocessors, (128) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1847 MHz (1.85 GHz)
  Memory Clock rate:                             5005 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
R

cuDNN

cuDNNをダウンロードしてきてPATHを通す
https://developer.nvidia.com/cudnn
最新バージョンだと必要なdllが入っていなくて実行時エラーとなる、とりあえず5.1を使うことによって回避できる

追記
TensorFlow1.3から6.0対応予定
Thank you! @maru3

ダウンロード後PATHにcuDNNをダウンロード、展開したディレクトリにある/binファイルをPATHに設定する
cuDNN.PNG

python install

TensorFlow install

pip3を使ってinstall
Anacondaを使ったインストール方法もあるが、今回はpip3を使う

C:\> pip3 install --upgrade tensorflow-gpu

動作確認

$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))

Hello, TensorFlow!

的な感じになればOK

ちなみに私の環境ではtf.Session()のタイミングで以下のWarningメッセージが出る

2017-07-18 12:01:40.776055: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776384: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776650: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.776869: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777132: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777399: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777660: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 12:01:40.777907: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

いろいろ設定すればspeed upできるらしいがとりあえずはこのままで

トラブルシューティング

import tensorflow時エラー

Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
return importlib.import_module(mname)
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
File "<frozen importlib._bootstrap>", line 577, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 906, in create_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: 指定されたモジュールが見つかりません。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
return importlib.import_module('_pywrap_tensorflow_internal')
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed: 指定されたモジュールが見つかりません。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\DevSDK\python\lib\site-packages\tensorflow__init.py", line 24, in <module>
from tensorflow.python import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\init.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
return importlib.import_module(mname)
File "D:\DevSDK\python\lib\importlib\init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
File "<frozen importlib._bootstrap>", line 577, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 906, in create_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: 指定されたモジュールが見つかりません。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
return importlib.import_module('_pywrap_tensorflow_internal')
File "D:\DevSDK\python\lib\importlib__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed: 指定されたモジュールが見つかりません。

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

このエラーはいろいろなパターンがあるようだが、私の場合はcuDNNの最新バージョンを使っていたのが問題だった、5.1を利用することにより必要なDLLを取得できた

GPUが読み込めない

sess = tf.Session()

2017-07-18 11:28:17.078990: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-07-18 11:28:17.093631: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: MyComputer
2017-07-18 11:28:17.094340: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:165] hostname: MyComputer

GPUが認識できていない
ドライバをアップデートしたら解決した

参考

http://h-sao.com/blog/2017/04/10/how-to-install-tensorflow-gpu-on-windows/
https://blog.keiji.io/2016/05/cuda_error_no_device.html

12
22
4

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
12
22