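TensorFlow, a deep learning platform for Python, comes in a CPU build and a GPU build.
This memo records how to build a Conda virtual environment that uses the GPU build of TensorFlow.
Environment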
python 3.9.7
conda 22.9.0
cudatoolkit 11.2
cudnn 8.1.0
Procedure
Install the 64-bit Windows version of Anaconda, launch Anaconda Prompt, and run the following commands in order.
conda create -n env_name python=3.9
conda activate env_name
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
conda install pip
pip install tensorflow-gpu
nvidia-smi
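The final nvidia-smi call checks that the NVIDIA driver can see the GPU. The same check could be scripted from Python. The following is only a minimal sketch, assuming nvidia-smi is on the PATH; the helper name query_gpus is hypothetical and not part of any library.

```python
import shutil
import subprocess


def query_gpus():
    """Return one line per GPU reported by nvidia-smi, or None if it is unavailable."""
    # nvidia-smi ships with the NVIDIA driver; without it there is nothing to query.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    # A non-zero exit code usually means the driver could not be reached.
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = query_gpus()
print(gpus if gpus else "No GPU reported by nvidia-smi")
```

If this prints no GPU, the problem most likely lies with the driver rather than with the Conda environment.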
Verification
Start the interpreter with the python command and run the following.
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
If a GPU appears in the output as shown below, TensorFlow has recognized the GPU. If only the CPU is listed, the GPU has not been recognized.
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5675548518759226956
xla_global_id: -1,
name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 14287896576
locality {
bus_id: 1
links {
}
}
incarnation: 1655257362142395030
physical_device_desc: "device: 0, name: NVIDIA RTX A5000 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6"
xla_global_id: 416903419]
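The full device listing is verbose, so it can help to filter it down to the GPU entries. The sketch below assumes only that each entry has the name and device_type fields visible in the output above. The helper gpu_names and the stand-in Device tuple are hypothetical names used for illustration.

```python
from collections import namedtuple


def gpu_names(devices):
    """Return the names of all entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


# Stand-in entries mirroring the output above; in the real environment, pass
# device_lib.list_local_devices() instead.
Device = namedtuple("Device", ["name", "device_type"])
sample = [Device("/device:CPU:0", "CPU"), Device("/device:GPU:0", "GPU")]

print(gpu_names(sample))  # ['/device:GPU:0']
```

An empty list means that only the CPU was found. In TensorFlow 2.x, tf.config.list_physical_devices('GPU') provides a similar list through the public API.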
Test
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 28, 28, 1)
x_test = x_test.reshape(10000, 28, 28, 1)
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(filters = 64, kernel_size = (3,3), activation = 'relu', padding = 'same', input_shape = (28, 28, 1), name = 'b1_conv1'))
model.add(tf.keras.layers.Conv2D(filters = 64, kernel_size = (3,3), activation = 'relu', padding = 'same', name = 'b1_conv2'))
model.add(tf.keras.layers.MaxPool2D(name = 'b1_pool1'))
model.add(tf.keras.layers.Conv2D(filters = 64, kernel_size = (3,3), activation = 'relu', padding = 'same', name = 'b2_conv1'))
model.add(tf.keras.layers.MaxPool2D(name = 'b2_pool1'))
model.add(tf.keras.layers.Flatten(name = 'flatten'))
model.add(tf.keras.layers.Dense(units = 64, activation = 'relu', name = 'dense1'))
model.add(tf.keras.layers.Dense(units = 10, activation = 'softmax', name = 'dense2'))
model.summary()
Model: "sequential_11"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 b1_conv1 (Conv2D)           (None, 28, 28, 64)        640
 b1_conv2 (Conv2D)           (None, 28, 28, 64)        36928
 b1_pool1 (MaxPooling2D)     (None, 14, 14, 64)        0
 b2_conv1 (Conv2D)           (None, 14, 14, 64)        36928
 b2_pool1 (MaxPooling2D)     (None, 7, 7, 64)          0
 flatten (Flatten)           (None, 3136)              0
 dense1 (Dense)              (None, 64)                200768
 dense2 (Dense)              (None, 10)                650
=================================================================
Total params: 275,914
Trainable params: 275,914
Non-trainable params: 0
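The parameter counts in the summary can be reproduced by hand, which is a quick way to confirm the model was built as intended. The sketch below uses the standard formulas for convolution and dense layers with biases. The helper names conv2d_params and dense_params are made up for this illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """kernel_h * kernel_w * in_channels weights per filter, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """One weight per (input, unit) pair, plus one bias per unit."""
    return in_features * units + units


counts = {
    "b1_conv1": conv2d_params(3, 3, 1, 64),
    "b1_conv2": conv2d_params(3, 3, 64, 64),
    "b2_conv1": conv2d_params(3, 3, 64, 64),
    "dense1": dense_params(7 * 7 * 64, 64),  # Flatten yields 7 * 7 * 64 = 3136 features
    "dense2": dense_params(64, 10),
}
total = sum(counts.values())
print(counts)
print(total)  # 275914
```

The pooling and flatten layers have no trainable parameters, so they do not appear in the sum.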
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(x_train, y_train, batch_size = 32, epochs = 8, validation_split = 0.2)
Epoch 1/8
1500/1500 [==============================] - 10s 3ms/step - loss: 0.1369 - accuracy: 0.9573 - val_loss: 0.0475 - val_accuracy: 0.9865
Epoch 2/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0441 - accuracy: 0.9863 - val_loss: 0.0457 - val_accuracy: 0.9865
Epoch 3/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0302 - accuracy: 0.9898 - val_loss: 0.0426 - val_accuracy: 0.9879
Epoch 4/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0212 - accuracy: 0.9929 - val_loss: 0.0390 - val_accuracy: 0.9882
Epoch 5/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0167 - accuracy: 0.9946 - val_loss: 0.0439 - val_accuracy: 0.9889
Epoch 6/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0135 - accuracy: 0.9956 - val_loss: 0.0339 - val_accuracy: 0.9915
Epoch 7/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0105 - accuracy: 0.9964 - val_loss: 0.0408 - val_accuracy: 0.9910
Epoch 8/8
1500/1500 [==============================] - 5s 3ms/step - loss: 0.0097 - accuracy: 0.9967 - val_loss: 0.0364 - val_accuracy: 0.9912
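Each epoch completes in roughly 3 to 5 seconds; in the log above, every epoch after the first takes about 5 s at about 3 ms per step.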
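The 1500 steps per epoch in the log follow from the arguments passed to model.fit, and the epoch time follows from the step time. The short calculation below is a sketch of that arithmetic; the 3 ms figure is the rounded value from the log, not a precise measurement.

```python
import math

num_samples = 60000      # MNIST training images
validation_split = 0.2   # fraction held out by model.fit
batch_size = 32

val_samples = round(num_samples * validation_split)
train_samples = num_samples - val_samples
steps_per_epoch = math.ceil(train_samples / batch_size)
print(train_samples, steps_per_epoch)  # 48000 1500

# At about 3 ms per step, one epoch takes about 4.5 s, consistent with the "5s" in the log.
seconds_per_epoch = steps_per_epoch * 3 / 1000
print(seconds_per_epoch)  # 4.5
```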
Notes
TensorFlow must be installed with pip. If it is installed with conda, the GPU is not recognized.
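To check how an existing environment was set up, one option is to read the INSTALLER record that package installers leave in a distribution's metadata. This is only a sketch: it assumes pip-installed packages record "pip" and conda-installed ones record "conda", which may not hold for every package. The helper installer_of is a hypothetical name.

```python
from importlib import metadata


def installer_of(package):
    """Return the installer recorded for a package, or None if it is unknown."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER is an optional metadata file; read_text returns None if it is missing.
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None


for name in ("tensorflow-gpu", "tensorflow"):
    print(name, installer_of(name))
```

A value of None means that the package is not installed in the active environment or that no installer was recorded.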