Setting up a TensorFlow GPU environment on Windows 10

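    Overview

    This note covers building a TensorFlow with GPU Support environment on Windows 10.
    Following the official guide should mostly work, but I struggled with version mismatches involving CUDA and cuDNN, so these notes focus on that.
    https://www.tensorflow.org/install/install_windows

    Judging from how the documentation has changed, the setup procedure itself seems to change often, so if a team will use the environment long term, managing it with Docker is the better choice.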

    Details

    Installing CUDA and cuDNN for GPU use

    CUDA

    http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/#axzz4n8zhtk3x

    • Download the CUDA Toolkit, then install CUDA
    • After installation, always use deviceQuery to confirm that the GPU is detected
    C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release
    deviceQuery.exe
    
    • If it fails like this, the GPU driver needs updating; update it from Device Manager
    C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
    deviceQuery.exe Starting...
    
    CUDA Device Query (Runtime API) version (CUDART static linking)
    
    cudaGetDeviceCount returned 38
    -> no CUDA-capable device is detected
    Result = FAIL
    
    • If the output looks like this, the GPU is detected
    C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\bin\win64\Release>deviceQuery.exe
    deviceQuery.exe Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: "GeForce GTX 1080"
      CUDA Driver Version / Runtime Version          8.0 / 8.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 8192 MBytes (8589934592 bytes)
      (20) Multiprocessors, (128) CUDA Cores/MP:     2560 CUDA Cores
      GPU Max Clock rate:                            1847 MHz (1.85 GHz)
      Memory Clock rate:                             5005 Mhz
      Memory Bus Width:                              256-bit
      L2 Cache Size:                                 2097152 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
      Device supports Unified Addressing (UVA):      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
    R
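
    The pass/fail check above can also be scripted. The following is a minimal sketch, assuming the deviceQuery output format shown above. The helper names parse_device_query and run_device_query are hypothetical and not part of CUDA.

```python
import re
import subprocess


def parse_device_query(output):
    """Summarize deviceQuery output: overall result and detected device names."""
    result = re.search(r"Result = (PASS|FAIL)", output)
    # Matches lines such as: Device 0: "GeForce GTX 1080"
    names = re.findall(r'^\s*Device \d+: "([^"]+)"', output, flags=re.MULTILINE)
    passed = bool(result and result.group(1) == "PASS")
    return {"passed": passed, "devices": names}


def run_device_query(exe_path):
    """Run deviceQuery.exe at exe_path (path supplied by the caller) and parse it."""
    completed = subprocess.run(
        [exe_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_device_query(completed.stdout)
```

    Call run_device_query with the full path to deviceQuery.exe in the Release directory shown above; if "passed" is False, no CUDA-capable device was reported.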
    

    cuDNN

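    Download cuDNN and add it to PATH
    https://developer.nvidia.com/cudnn
    The latest version does not include the DLL that TensorFlow needs and causes a runtime error; using 5.1 avoids the problem for now.

    Update
    TensorFlow 1.3 is expected to support cuDNN 6.0
    Thank you! @maru3

    After downloading, extract cuDNN and add the /bin directory of the extracted folder to PATH
    [Image: cuDNN.PNG]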
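
    To confirm that the PATH setting took effect, the directories on PATH can be searched for the cuDNN DLL. This is a minimal sketch; find_on_path is a hypothetical helper, and the file name cudnn64_5.dll is an assumption about how the cuDNN 5.1 DLL is named, so adjust it to the file actually present in your /bin directory.

```python
import os


def find_on_path(filename, path=None):
    """Return every full path where filename exists in a PATH directory."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Assumed DLL name for cuDNN 5.1; an empty list means it is not on PATH.
    print(find_on_path("cudnn64_5.dll"))
```

    If more than one path is printed, several cuDNN copies are on PATH and the first one listed is the one that would be picked up.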

    Python install

    https://www.python.org/downloads/release/python-352/

    TensorFlow install

    Install with pip3
    Installing through Anaconda is also possible, but this note uses pip3

    C:\> pip3 install --upgrade tensorflow-gpu
    

    Verification

    $ python
    >>> import tensorflow as tf
    >>> hello = tf.constant('Hello, TensorFlow!')
    >>> sess = tf.Session()
    >>> print(sess.run(hello))
    
    Hello, TensorFlow!
    

    If you see output like this, it works.

    Incidentally, in my environment the following warning messages appear when tf.Session() is called.
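
    The hello-world check confirms that TensorFlow runs, but not that it sees the GPU. A minimal sketch for that follows, assuming the device_lib module under tensorflow.python.client that ships with TensorFlow 1.x; gpu_device_names and list_tensorflow_gpus are hypothetical helper names.

```python
def gpu_device_names(devices):
    """Keep only the names of devices whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_gpus():
    """Ask TensorFlow which local devices it can use and return the GPU names."""
    # Imported here so the helper above stays usable without TensorFlow installed.
    from tensorflow.python.client import device_lib

    return gpu_device_names(device_lib.list_local_devices())
```

    Calling list_tensorflow_gpus() in the same Python session should return a non-empty list when the GPU is usable; an empty list suggests TensorFlow is falling back to the CPU.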

    2017-07-18 12:01:40.776055: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.776384: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.776650: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.776869: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.777132: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.777399: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.777660: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
    2017-07-18 12:01:40.777907: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
    

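    These warnings mean the prebuilt binary was not compiled with those CPU instructions. Apparently a build with them enabled can speed things up, but I am leaving it as is for now.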
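
    If the warnings are only noise, they can be hidden without rebuilding. This is a minimal sketch, assuming the TF_CPP_MIN_LOG_LEVEL environment variable that TensorFlow reads for its native logging; it only filters messages and does not make anything faster.

```python
import os

# Assumption: "2" hides INFO and WARNING messages from TensorFlow's native code
# while still showing errors. It must be set before TensorFlow is imported.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```

    Leave the variable unset while troubleshooting, since informational messages such as the CUDA diagnostics shown later in this note can be useful.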

    Troubleshooting

    Error on import tensorflow

    Traceback (most recent call last):
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    return importlib.import_module(mname)
    File "D:\DevSDK\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
    File "<frozen importlib._bootstrap>", line 986, in _gcd_import
    File "<frozen importlib._bootstrap>", line 969, in _find_and_load
    File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
    File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
    File "<frozen importlib._bootstrap>", line 577, in module_from_spec
    File "<frozen importlib._bootstrap_external>", line 906, in create_module
    File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
    ImportError: DLL load failed: 指定されたモジュールが見つかりません。
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    return importlib.import_module('_pywrap_tensorflow_internal')
    File "D:\DevSDK\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
    ImportError: DLL load failed: 指定されたモジュールが見つかりません。
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "D:\DevSDK\python\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
    from tensorflow.python import *
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 52, in <module>
    raise ImportError(msg)
    ImportError: Traceback (most recent call last):
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    return importlib.import_module(mname)
    File "D:\DevSDK\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
    File "<frozen importlib._bootstrap>", line 986, in _gcd_import
    File "<frozen importlib._bootstrap>", line 969, in _find_and_load
    File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
    File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
    File "<frozen importlib._bootstrap>", line 577, in module_from_spec
    File "<frozen importlib._bootstrap_external>", line 906, in create_module
    File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
    ImportError: DLL load failed: 指定されたモジュールが見つかりません。
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
    File "D:\DevSDK\python\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    return importlib.import_module('_pywrap_tensorflow_internal')
    File "D:\DevSDK\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
    ImportError: DLL load failed: 指定されたモジュールが見つかりません。
    
    Failed to load the native TensorFlow runtime.
    
    See https://www.tensorflow.org/install/install_sources#common_installation_problems
    
    for some common reasons and solutions. Include the entire stack trace
    above this error message when asking for help.
    

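    This error apparently has many possible causes (the Japanese message reads "DLL load failed: The specified module could not be found."). In my case the problem was that I was using the latest cuDNN; switching to 5.1 provided the required DLL.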
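
    Because the traceback does not say which DLL is missing, it can help to try loading each candidate directly. This is a minimal sketch using ctypes from the standard library; try_load_dlls is a hypothetical helper, and the DLL names in the list are assumptions for a CUDA 8.0 plus cuDNN 5.1 setup, not names taken from the traceback.

```python
import ctypes


def try_load_dlls(names):
    """Try to load each library by name and report which ones the OS can find."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the library, or one of its own dependencies, is missing.
            results[name] = False
    return results


if __name__ == "__main__":
    # Assumed DLL names for CUDA 8.0 and cuDNN 5.1; edit to match your install.
    candidates = ["nvcuda.dll", "cudart64_80.dll", "cublas64_80.dll", "cudnn64_5.dll"]
    for dll, ok in try_load_dlls(candidates).items():
        print(dll, "OK" if ok else "NOT FOUND")
```

    Any entry reported as NOT FOUND points at the component whose directory is missing from PATH or whose version does not match.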

    GPU cannot be loaded

    sess = tf.Session()
    
    2017-07-18 11:28:17.078990: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
    2017-07-18 11:28:17.093631: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: MyComputer
    2017-07-18 11:28:17.094340: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_diagnostics.cc:165] hostname: MyComputer
    

    The GPU was not being recognized.
    Updating the driver resolved it.

    References

    http://h-sao.com/blog/2017/04/10/how-to-install-tensorflow-gpu-on-windows/
    https://blog.keiji.io/2016/05/cuda_error_no_device.html