
[Ubuntu Server 16.04 LTS] Building a PC for Machine Learning, Part 5 (Running the TensorFlow Tutorial)


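In this part, we run tensorflow-gpu.
Running on the GPU requires CUDA, so if you have not installed it yet, see the following article.
[Ubuntu Server 16.04 LTS] Building a PC for Machine Learning, Part 2 (Installing the GTX-1080Ti Driver and CUDA 9.0)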

If you have not installed Python at all yet, see the following article.
[Ubuntu Server 16.04 LTS] Building a PC for Machine Learning, Part 4 (Installing Python)

Creating a virtual environment

First, create a virtual environment with tensorflow-gpu 1.0.1 installed.
Note: tensorflow-gpu is pinned to 1.0.1 only to match the environment on the MacBook Air I normally use. There is no deeper reason.

$ conda create -n tensorflow-1.0.1 tensorflow-gpu=1.0.1

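Packages other than tensorflow-gpu are resolved to the latest versions conda can install alongside it. conda asks whether to proceed with this plan; if it looks fine, continue.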

Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /home/hoge/.pyenv/versions/anaconda3-5.0.0/envs/tensorflow-1.0.1:

The following NEW packages will be INSTALLED:

    ca-certificates: 2017.08.26-h1d4fec5_0
    certifi:         2017.7.27.1-py36h8b7b77e_0
    cudatoolkit:     7.5-2
    cudnn:           5.1-0
    intel-openmp:    2018.0.0-h15fc484_7
    libedit:         3.1-heed3624_0
    libffi:          3.2.1-h4deb6c0_3
    libgcc-ng:       7.2.0-h7cc24e2_2
    libprotobuf:     3.4.0-0
    libstdcxx-ng:    7.2.0-h7a57d05_2
    mkl:             2018.0.0-hb491cac_4
    ncurses:         6.0-h06874d7_1
    numpy:           1.13.3-py36ha12f23b_0
    openssl:         1.0.2m-h8cfc7e7_0
    pip:             9.0.1-py36h6c6f9ce_4
    protobuf:        3.4.0-py36_0
    python:          3.6.3-h0ef2715_3
    readline:        7.0-hac23ff0_3
    setuptools:      36.5.0-py36he42e2e1_0
    six:             1.11.0-py36h372c433_1
    sqlite:          3.20.1-h6d8b0f3_1
    tensorflow-gpu:  1.0.1-py36_4
    tk:              8.6.7-h5979e9b_1
    wheel:           0.29.0-py36he7f4e38_1
    xz:              5.2.3-h2bcbf08_1
    zlib:            1.2.11-hfbfcf68_1

Proceed ([y]/n)? y

openssl-1.0.2m 100% |##########################| Time: 0:00:00  11.22 MB/s
python-3.6.3-h 100% |##########################| Time: 0:00:01  25.71 MB/s
numpy-1.13.3-p 100% |##########################| Time: 0:00:00  17.57 MB/s
six-1.11.0-py3 100% |##########################| Time: 0:00:00  12.91 MB/s
protobuf-3.4.0 100% |##########################| Time: 0:00:00  23.47 MB/s
tensorflow-gpu 100% |##########################| Time: 0:00:03  21.53 MB/s
pip-9.0.1-py36 100% |##########################| Time: 0:00:00  19.19 MB/s
#
# To activate this environment, use:
# > source activate tensorflow-1.0.1
#
# To deactivate an active environment, use:
# > source deactivate
#

Next, check that TensorFlow works correctly.
First, switch to the virtual environment created above.

$ source activate tensorflow-1.0.1

Then run the following simple program.

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Below is the output from actually running it.

(tensorflow-1.0.1) $ python
Python 3.5.4 |Anaconda, Inc.| (default, Oct 13 2017, 11:22:58)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
>>> hello = tf.constant('Hello, Tensorflow')
>>> sess = tf.Session()
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.683
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.75GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
>>> print(sess.run(hello))
b'Hello, Tensorflow'

It runs without problems, and the log shows that the GPU (GeForce GTX 1080 Ti) is recognized as device /gpu:0.
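Besides reading the startup log, the visible devices can also be queried from Python. The following is a minimal sketch, assuming the internal module `tensorflow.python.client.device_lib` is available in the installed TensorFlow version (it is not part of the documented public API). The helper names `gpu_device_names` and `list_local_gpus` are hypothetical and introduced only for this example.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_gpus():
    # Imported here so that gpu_device_names() stays usable without TensorFlow.
    # Assumption: device_lib is an internal module and may change between versions.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())


if __name__ == "__main__":
    try:
        print(list_local_gpus())
    except ImportError:
        print("TensorFlow is not installed in this environment")
```

An empty list would suggest that TensorFlow is falling back to the CPU, in which case the CUDA setup is worth rechecking.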

Trying the MNIST tutorial

Finally, run the tutorial and compare the result with the same run on the MacBook Air.
The specs of the two machines are as follows.

          Ubuntu (machine learning PC)    MacBook Air
CPU       Intel Core i7-7700K 4.2GHz      Intel Core i5 1.6GHz
Memory    32GB                            8GB
GPU       GTX 1080 Ti 11GB                Intel HD Graphics 6000 1536MB

For details on what the tutorial covers, see the TensorFlow MNIST tutorial page.
First, get the program files from GitHub.

$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow/tensorflow/examples/tutorials/mnist

The script can be run as is, but since we want to measure the processing time, modify it slightly.

$ vim mnist_deep.py

Add timing code at the beginning and end of the main function. For the code to add, see the article "[Python] Measuring and Displaying Processing Time".
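As a minimal sketch of such timing code (not necessarily identical to what the referenced article uses), the wrapper below records `time.time()` before and after a call and prints the difference in the `elapsed_time:...[sec]` format that appears in the run output. The names `run_timed` and `dummy_main` are hypothetical. In `mnist_deep.py` the two timing lines would instead go at the start and end of `main`.

```python
import time


def run_timed(func, *args, **kwargs):
    """Call func, print the wall-clock time it took, return (result, elapsed)."""
    start = time.time()  # timestamp before the work starts
    result = func(*args, **kwargs)
    elapsed_time = time.time() - start  # elapsed seconds as a float
    print("elapsed_time:{0}[sec]".format(elapsed_time))
    return result, elapsed_time


def dummy_main():
    # Hypothetical stand-in for the tutorial's main(): just sums some numbers.
    return sum(range(1000))


if __name__ == "__main__":
    run_timed(dummy_main)
```

`time.time()` measures wall-clock time, which is what matters here because the comparison is about how long the whole training run takes from start to finish.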

Run the program.

$ python mnist_deep.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
test
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
Saving graph to: /tmp/tmpr3vt07e8
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.683
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.75GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
step 0, training accuracy 0.14
step 100, training accuracy 0.82
step 200, training accuracy 0.96
:
:(omitted)
:
step 19900, training accuracy 1
test accuracy 0.9915
elapsed_time:121.55799126625061[sec]

Processing time: 121 seconds

For comparison, run the same script on the MacBook Air I normally use.

$ python mnist_deep.py
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
Saving graph to: /var/folders/hk/134y674s3t989cbc01hm54r40000gn/T/tmppqu54tya
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
step 0, training accuracy 0.08
step 100, training accuracy 0.88
step 200, training accuracy 0.9
:
:(omitted)
:
step 19900, training accuracy 1
test accuracy 0.9908
elapsed_time:4494.314824104309[sec]

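Processing time: 4494 seconds
That is about 1 hour 15 minutes, which is quite a long time.
If you want to work on machine learning seriously, a dedicated PC really is a must.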
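To put the two measurements side by side, the short calculation below converts the MacBook Air time to minutes and computes the ratio between the two runs. It uses only the two `elapsed_time` values printed above; the variable names are chosen for this example. The ratio comes out to roughly 37.

```python
# Elapsed times copied from the two runs of mnist_deep.py above.
gpu_seconds = 121.55799126625061  # Ubuntu machine with the GTX 1080 Ti
mac_seconds = 4494.314824104309   # MacBook Air

mac_minutes = mac_seconds / 60        # MacBook Air time in minutes
speedup = mac_seconds / gpu_seconds   # how many times faster the Ubuntu machine was

print("MacBook Air run: {:.1f} minutes".format(mac_minutes))
print("Speedup on the Ubuntu machine: {:.1f}x".format(speedup))
```

This is a single run on each machine, so the ratio is a rough indication rather than a careful benchmark.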