Setting up GPU-enabled TensorFlow 2.1 on a Windows PC using only conda

Posted at 2021-05-18

If you want to try TensorFlow newer than 2.1, see also:

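Key points

  • You can hand a surprising amount of the work over to conda, which makes this easy.
  • There is no need to install CUDA or cuDNN separately.
  • It did not even require administrator rights.
  • However, for now this method only works up to TF 2.1.
    • For 2.3, 2.4 and so on, we have to wait for Anaconda to provide GPU builds.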

Environment

  • Windows 10 Pro (20H2, 10.0.19042)
  • NVIDIA GeForce RTX 3090 (driver 457.51)
(base) PS C:\Users\****> Get-WmiObject Win32_OperatingSystem


SystemDirectory : C:\Windows\system32
Organization    :
BuildNumber     : 19042
RegisteredUser  : ****
SerialNumber    : 
Version         : 10.0.19042


(base) PS C:\Users\****>nvidia-smi
Tue May 18 23:05:02 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 457.51       Driver Version: 457.51       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090   WDDM  | 00000000:01:00.0  On |                  N/A |
| 44%   59C    P2   113W / 350W |    892MiB / 24576MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1360    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      1560    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      3368    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      7884    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      8384    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A      9208    C+G   ...ekyb3d8bbwe\YourPhone.exe    N/A      |
|    0   N/A  N/A      9952    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     10400    C+G   ...nputApp\TextInputHost.exe    N/A      |
|    0   N/A  N/A     11312      C   ...vs\py37tf21gpu\python.exe    N/A      |
+-----------------------------------------------------------------------------+
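
The GPU name and driver version above were read off the full nvidia-smi table by hand. As a rough sketch, the same fields can be collected from Python through nvidia-smi's CSV query mode (`--query-gpu` with `--format=csv,noheader,nounits`). The helper names `parse_gpu_csv` and `query_gpus` are made up for this example, and the parser assumes that GPU names contain no commas.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi via its --query-gpu option.
QUERY_FIELDS = "name,driver_version,utilization.gpu"


def _to_int(value):
    """Convert a numeric field to int; return None for values such as '[N/A]'."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output (one GPU per line) into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        # Assumption: exactly three comma-separated fields per line.
        name, driver, util = [field.strip() for field in line.split(",")]
        gpus.append({
            "name": name,
            "driver": driver,
            "utilization_percent": _to_int(util),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi and return parsed GPU info, or [] if it is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` in a loop with a short sleep is one way to watch GPU utilization during training without rerunning nvidia-smi by hand.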

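Steps

  1. Update the GeForce driver if you can. [1]

  2. Make conda available. [2]

  3. Create a Python 3.7 environment with conda. [3]

    conda create -n py37tf21gpu python=3.7
    
  4. Install TensorFlow 2.1, the CUDA toolkit, cuDNN and so on into the environment with conda (activate it first with conda activate py37tf21gpu).

    conda install tensorflow=2.1.0=gpu_py37h7db9008_0 cudnn cudatoolkit jupyter matplotlib
    
    • The important part is to pin the tensorflow version down to the build string. [4]
      • As of 2021-05-18, the newest GPU build provided for Windows is TF 2.1.0 gpu_py37h7db9008_0. To find the newest one, run something like conda search tensorflow and look for builds with gpu in the build string.
    • cudnn and cudatoolkit appear to resolve to suitable versions even when left unpinned. This time cudatoolkit 10.1.243 h74a9793_0 and cudnn 7.6.5 cuda10.1_0 were installed automatically.
  5. Check that it works with some simple code.

    • This time I borrow the code from "Code samples for checking that TensorFlow 2 works (not a tutorial)".
    • You wait a very long time before training starts, but once it starts it finishes in an instant. Is the setup really done and working with just this? Am I being fooled somehow? [5]
    • With the number of epochs set to about 100 and nvidia-smi run repeatedly in another window, GPU utilization was around 9%.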
(base) C:\Users\****>conda activate py37tf21gpu

(py37tf21gpu) C:\Users\****>ipython
Python 3.7.10 (default, Feb 26 2021, 13:06:18) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.22.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy as np
   ...: import tensorflow as tf
   ...:
   ...: print(tf.__version__)
   ...:
   ...: gpus = tf.config.experimental.list_physical_devices('GPU')
   ...: if gpus:
   ...:     logical_gpus = tf.config.experimental.list_logical_devices('GPU')
   ...:     print("Physical GPUs: {}, Logical GPUs: {}".format(len(gpus), len(logical_gpus)))
   ...: else:
   ...:     print("CPU only")
   ...:
   ...: x = np.arange(-1, 1, 0.0001)
   ...: y = 0.8 * x + 0.2
   ...:
   ...: model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation=None)])
   ...: model.compile("sgd", "mse")
   ...: model.build(input_shape=(0,1))
   ...: model.summary()
   ...: model.fit(x, y, epochs=5)
   ...:
   ...: print("ground truth: 0.8, 0.2")
   ...: print("estimated: ", model.variables[0][0,0].numpy(), model.variables[1][0].numpy())
2021-05-18 23:03:01.111120: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2.1.0
2021-05-18 23:03:03.492644: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-05-18 23:03:03.621490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-05-18 23:03:03.621670: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-05-18 23:03:03.696771: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-05-18 23:03:03.733664: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-05-18 23:03:03.746518: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-05-18 23:03:03.781143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-05-18 23:03:03.800854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-05-18 23:03:03.857241: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-05-18 23:03:03.857677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-05-18 23:03:03.860583: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2021-05-18 23:03:03.865619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-05-18 23:03:03.865659: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-05-18 23:03:03.865682: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-05-18 23:03:03.865702: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-05-18 23:03:03.865723: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-05-18 23:03:03.865743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-05-18 23:03:03.865764: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-05-18 23:03:03.865784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-05-18 23:03:03.865825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-05-18 23:05:25.974908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-18 23:05:25.974975: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2021-05-18 23:05:25.975247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2021-05-18 23:05:25.976321: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22065 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
Physical GPUs: 1, Logical GPUs: 1
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                multiple                  2
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
Train on 20000 samples
Epoch 1/5
2021-05-18 23:05:26.881185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
20000/20000 [==============================] - 64s 3ms/sample - loss: 0.0081
Epoch 2/5
20000/20000 [==============================] - 0s 22us/sample - loss: 1.5123e-06
Epoch 3/5
20000/20000 [==============================] - 0s 21us/sample - loss: 3.5227e-10
Epoch 4/5
20000/20000 [==============================] - 0s 21us/sample - loss: 3.3563e-12
Epoch 5/5
20000/20000 [==============================] - 0s 22us/sample - loss: 3.0308e-12
ground truth: 0.8, 0.2
estimated:  0.80000293 0.19999996

In [2]:
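
The long wait before training starts can be read directly from the log timestamps in the session above. As a small worked example, the sketch below parses the stamps of the last line logged before the pause and the first line logged after it, then prints the gap. The helper name `log_time` is made up for this example, and it assumes that every log line starts with a `YYYY-MM-DD HH:MM:SS.ffffff` stamp followed by `: `.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def log_time(line):
    """Parse the leading timestamp of a TensorFlow log line."""
    # The stamp itself contains ':' but never ': ', so split on the first ': '.
    stamp = line.split(": ", 1)[0]
    return datetime.strptime(stamp, TIMESTAMP_FORMAT)


# Two lines copied verbatim from the session log.
before = ("2021-05-18 23:03:03.865825: I tensorflow/core/common_runtime/gpu/"
          "gpu_device.cc:1697] Adding visible gpu devices: 0")
after = ("2021-05-18 23:05:25.974908: I tensorflow/core/common_runtime/gpu/"
         "gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:")

gap = (log_time(after) - log_time(before)).total_seconds()
print(f"{gap:.1f} s")  # prints "142.1 s"
```

That is roughly 2 minutes 22 seconds of device initialization, in addition to the 64 s reported for the first epoch. Epochs 2 to 5 each finish in under a second.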

It may not be the latest version, but for simply trying things out this seems easy and good enough.

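  1. It is presumably fine as long as compatibility is roughly maintained.

  2. This time I installed and used miniconda in user mode, specifically https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Windows-x86_64.exe.

  3. As described later, GPU-enabled binaries are provided only for TF 2.1, so I chose Python 3.7 to match. However, judging from how the install behaves in the next step with the tensorflow version and build pinned, Python 3.7 may not be strictly necessary.

  4. With no version specified, the newest Windows binary (2.3 or similar) is installed. It has no gpu build, so you get the CPU-only MKL build. Then, as is widely reported, it may not work unless you separately install CUDA, cuDNN, tensorflow and so on, and even then it may fail because of version mismatches.

  5. I am not using this heavily, so I would not be surprised if there are problems.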
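
The advice in step 4 and footnote 4 (look for builds with gpu in the build string) can be sketched as a small filter. This assumes that `conda search tensorflow --json` prints a JSON object keyed by package name, whose values are lists of records with `version` and `build` fields. Treat that shape as an assumption and check it against your conda version. The function name `gpu_builds` and the sample input are made up for this example.

```python
import json


def gpu_builds(search_json, package="tensorflow"):
    """Return (version, build) pairs whose build string contains 'gpu'."""
    records = json.loads(search_json).get(package, [])
    return [(r["version"], r["build"]) for r in records if "gpu" in r["build"]]


# Illustrative input only: the first build string is the one quoted in this
# article; the second record is a hypothetical CPU-only build made up here.
sample = json.dumps({
    "tensorflow": [
        {"version": "2.1.0", "build": "gpu_py37h7db9008_0"},
        {"version": "2.3.0", "build": "mkl_example_0"},
    ]
})

print(gpu_builds(sample))  # prints [('2.1.0', 'gpu_py37h7db9008_0')]
```

In practice the input string would come from capturing the output of the conda command instead of the hard-coded sample.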
