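This page records the commands that were run and the logs they produced.
For the content of the article itself, please see here.
Summary of measurement results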
- mnist_mlp.py (customized)
framework | CPU load | elapsed time |
---|---|---|
TensorFlow v2.3.0 | 89 % | 16.170 sec |
PlaidML + Keras | 42 % | 23.334 sec |
- mnist_cnn.py (customized)
framework | CPU load | elapsed time |
---|---|---|
TensorFlow v2.3.0 | 92 % | 188.279 sec |
PlaidML + Keras | 37 % | 316.005 sec |
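As a reading aid for the two tables above, the relative slowdown of PlaidML + Keras can be computed directly from the elapsed times. This is only a minimal sketch: the numbers are copied from the tables, while the dictionary layout and the `slowdown` helper are illustrative names that do not appear in the benchmark scripts.

```python
# Elapsed times in seconds, copied from the result tables above.
ELAPSED = {
    "mnist_mlp.py": {"TensorFlow v2.3.0": 16.170, "PlaidML + Keras": 23.334},
    "mnist_cnn.py": {"TensorFlow v2.3.0": 188.279, "PlaidML + Keras": 316.005},
}


def slowdown(times, baseline="TensorFlow v2.3.0", other="PlaidML + Keras"):
    """Return how many times longer `other` took than `baseline`."""
    return times[other] / times[baseline]


for script, times in ELAPSED.items():
    print(f"{script}: PlaidML + Keras took {slowdown(times):.2f}x the TensorFlow time")
```

On these measurements PlaidML + Keras took roughly 1.4 times (MLP) and 1.7 times (CNN) as long as TensorFlow, while showing less than half the CPU load.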
measure : MLP
- using keras/examples/mnist_mlp.py (customized)
TensorFlow v2.3.0 (CPU)
- time : 16.170s
(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
2020-07-29 13:26:05.609231: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f9143ee1640 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 13:26:05.609262: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dropout (Dropout) (None, 512) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 11s 23ms/step - loss: 0.2473 - accuracy: 0.9233 - val_loss: 0.1034 - val_accuracy: 0.9680
Valid loss: 0.10344783961772919
Valid acc.: 0.9679999947547913
real 0m16.170s
user 0m31.537s
sys 0m4.165s
- iostat : CPU load : 89 %
$ iostat 5
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
4.00 0 0.00 19 8 73 4.74 3.39 2.58
24.89 9 0.22 56 12 33 4.68 3.40 2.59
0.00 0 0.00 75 14 11 4.79 3.44 2.61
5.33 1 0.00 61 11 28 4.89 3.48 2.63
21.97 47 1.00 13 8 79 4.73 3.47 2.63
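The 89 % CPU load quoted for this run matches the busiest iostat sample when user and system time are added (75 + 14 = 89). The log does not say how the figure was read, so that interpretation is an assumption. The sketch below parses the sample rows on that basis; the function name and the single-disk column layout are also assumptions.

```python
def cpu_busy_percent(row):
    """Return us + sy for one `iostat` sample row.

    Assumed column order (one disk): KB/t tps MB/s us sy id 1m 5m 15m.
    """
    fields = row.split()
    user, system = int(fields[3]), int(fields[4])
    return user + system


# Sample rows copied from the TensorFlow MLP run above.
ROWS = [
    "4.00 0 0.00 19 8 73 4.74 3.39 2.58",
    "24.89 9 0.22 56 12 33 4.68 3.40 2.59",
    "0.00 0 0.00 75 14 11 4.79 3.44 2.61",
    "5.33 1 0.00 61 11 28 4.89 3.48 2.63",
    "21.97 47 1.00 13 8 79 4.73 3.47 2.63",
]

busy = [cpu_busy_percent(r) for r in ROWS]
print("per-sample busy %:", busy)
print("peak busy %:", max(busy))
```

Because iostat samples every 5 seconds here, a 16-second run is covered by only a few rows, so the peak is a coarse estimate.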
PlaidML v0.6.4 (GPU) and Keras v2.2.4
- PLAIDML_DEVICE_IDS
  - opencl_amd_ati_radeon_hd_6630m.0
- time : 23.334s
(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 512) 401920
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 18s 306us/step - loss: 0.2518 - acc: 0.9220 - val_loss: 0.0986 - val_acc: 0.9714
Valid loss: 0.09862979149818421
Valid acc.: 0.9714
real 0m23.334s
user 0m17.709s
sys 0m6.655s
- iostat : CPU load : 42 %
$ iostat 5
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
45.59 8 0.37 5 6 90 1.70 2.14 2.31
25.30 9 0.21 18 10 72 1.89 2.17 2.32
29.01 16 0.45 29 10 62 1.97 2.19 2.32
4.00 0 0.00 27 12 61 1.98 2.18 2.32
0.00 0 0.00 27 12 61 2.06 2.19 2.33
4.00 0 0.00 29 12 59 1.97 2.17 2.32
14.74 48 0.69 29 13 58 1.97 2.17 2.32
0.00 0 0.00 13 5 82 2.06 2.19 2.32
4.00 0 0.00 12 5 83 1.97 2.17 2.31
measure : CNN
- using keras/examples/mnist_cnn.py (customized)
TensorFlow v2.3.0 (CPU)
- time : 188.279s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
2020-07-29 16:23:59.387600: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fce05190fc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 16:23:59.387652: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
conv2d_1 (Conv2D) (None, 24, 24, 64) 18496
_________________________________________________________________
average_pooling2d (AveragePo (None, 12, 12, 64) 0
_________________________________________________________________
dropout (Dropout) (None, 12, 12, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 9216) 0
_________________________________________________________________
dense (Dense) (None, 128) 1179776
_________________________________________________________________
dropout_1 (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 10) 1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 153s 327ms/step - loss: 2.2950 - accuracy: 0.1257 - val_loss: 2.2737 - val_accuracy: 0.2647
Valid loss: 2.273723840713501
Valid acc.: 0.2646999955177307
real 3m8.279s
user 8m30.347s
sys 0m29.370s
- iostat : CPU load : 92 %
$ iostat 5
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
56.38 179 9.87 6 6 87 2.18 2.48 2.45
15.38 217 3.26 7 6 87 2.32 2.50 2.46
30.12 211 6.22 9 8 83 2.30 2.49 2.46
64.89 56 3.53 42 8 50 2.83 2.60 2.50
6.00 0 0.00 83 8 8 3.09 2.66 2.52
12.80 5 0.06 83 8 8 3.40 2.73 2.54
8.00 1 0.00 83 8 8 3.53 2.77 2.56
21.78 13 0.27 84 8 8 3.73 2.82 2.58
0.00 0 0.00 83 8 8 3.83 2.86 2.59
4.00 0 0.00 84 8 8 3.92 2.89 2.60
0.00 0 0.00 84 8 8 4.01 2.93 2.62
13.80 6 0.08 84 8 8 4.25 2.99 2.64
20.96 14 0.29 84 8 8 4.31 3.03 2.66
18.07 23 0.41 84 8 8 4.52 3.09 2.68
0.00 0 0.00 84 8 8 6.00 3.42 2.80
38.34 8 0.31 84 8 8 5.84 3.43 2.81
0.00 0 0.00 84 8 8 6.01 3.51 2.84
0.00 0 0.00 84 8 7 6.17 3.58 2.87
16.00 0 0.00 84 8 7 6.40 3.67 2.90
18.50 17 0.30 84 8 8 6.45 3.73 2.93
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
5.33 1 0.00 84 8 8 6.33 3.75 2.94
4.00 0 0.00 84 8 8 6.14 3.75 2.95
6.40 2 0.01 84 8 8 6.21 3.81 2.97
0.00 0 0.00 84 8 8 6.11 3.83 2.98
0.00 0 0.00 84 8 8 6.26 3.89 3.01
21.03 14 0.30 84 8 8 6.32 3.95 3.03
0.00 0 0.00 84 8 8 6.38 4.00 3.06
35.60 8 0.28 84 8 8 6.19 4.00 3.06
0.00 0 0.00 84 8 8 6.17 4.03 3.08
80.00 0 0.02 84 9 8 6.40 4.11 3.11
0.00 0 0.00 84 8 8 6.04 4.08 3.11
18.29 14 0.25 84 8 8 6.12 4.12 3.13
8.33 2 0.02 81 8 12 5.87 4.11 3.13
4.00 1 0.00 85 7 8 5.80 4.12 3.14
17.28 31 0.53 81 8 11 5.90 4.17 3.16
37.58 40 1.47 71 8 21 5.66 4.15 3.16
9.14 1 0.01 4 6 90 5.21 4.08 3.14
20.53 17 0.34 3 6 91 4.87 4.03 3.13
PlaidML v0.6.4 (GPU) and Keras v2.2.4
- time : 316.005s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 24, 24, 64) 18496
_________________________________________________________________
average_pooling2d_1 (Average (None, 12, 12, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 12, 12, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 9216) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 1179776
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 298s 5ms/step - loss: 0.3072 - acc: 0.9063 - val_loss: 0.0794 - val_acc: 0.9753
Valid loss: 0.07935381038188934
Valid acc.: 0.9753
real 5m16.005s
user 4m50.810s
sys 0m11.139s
- iostat : CPU load : 37 %
$ iostat 5
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
15.44 27 0.41 6 7 87 2.13 2.60 2.40
4.00 0 0.00 4 6 90 2.12 2.59 2.39
4.00 0 0.00 20 7 73 2.27 2.61 2.40
0.00 0 0.00 27 4 69 2.33 2.62 2.41
4.00 0 0.00 27 4 69 2.62 2.67 2.43
0.00 0 0.00 28 4 68 2.65 2.68 2.43
20.46 25 0.50 26 4 70 2.60 2.67 2.43
0.00 0 0.00 27 4 69 2.79 2.71 2.44
23.30 11 0.26 27 4 69 3.05 2.76 2.46
32.07 12 0.37 28 4 68 2.96 2.75 2.46
36.24 10 0.36 29 6 65 3.13 2.79 2.47
6.00 1 0.00 27 4 69 3.04 2.77 2.47
16.55 36 0.58 27 4 68 3.03 2.78 2.47
4.00 0 0.00 28 5 68 2.95 2.76 2.47
4.00 0 0.00 28 5 67 2.95 2.77 2.47
31.86 17 0.53 27 4 69 2.88 2.75 2.47
14.05 8 0.12 29 6 64 2.89 2.76 2.47
20.72 8 0.16 27 4 69 2.90 2.76 2.48
22.78 29 0.65 28 4 68 2.82 2.75 2.47
26.55 2 0.06 27 3 69 2.76 2.74 2.47
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
27.79 12 0.31 27 4 70 2.70 2.72 2.47
22.35 3 0.07 28 4 69 2.64 2.71 2.46
36.31 10 0.36 30 7 62 2.67 2.72 2.47
34.19 13 0.43 30 4 66 2.86 2.75 2.48
21.58 31 0.66 31 6 64 2.87 2.76 2.49
18.50 2 0.03 28 5 67 2.96 2.78 2.49
4.00 0 0.00 26 3 70 2.88 2.76 2.49
21.76 7 0.14 27 4 69 2.89 2.77 2.49
27.33 1 0.03 27 4 69 2.82 2.75 2.49
35.08 5 0.18 28 4 68 2.75 2.74 2.49
16.68 28 0.45 28 5 67 2.77 2.75 2.49
4.00 0 0.00 27 4 69 2.87 2.77 2.50
9.33 1 0.01 27 3 70 2.80 2.75 2.50
0.00 0 0.00 26 4 70 2.74 2.74 2.49
41.42 12 0.48 28 4 68 2.84 2.76 2.50
35.20 1 0.03 28 5 68 3.09 2.81 2.52
19.91 12 0.24 27 4 69 3.16 2.83 2.53
0.00 0 0.00 26 3 71 3.07 2.82 2.53
9.27 23 0.20 27 4 70 2.98 2.81 2.52
0.00 0 0.00 26 3 71 2.98 2.81 2.53
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
4.00 0 0.00 27 4 70 2.99 2.81 2.53
0.00 0 0.00 26 3 71 2.99 2.82 2.53
28.24 8 0.23 26 4 70 2.99 2.82 2.53
0.00 0 0.00 26 3 70 2.91 2.80 2.53
4.00 0 0.00 27 4 69 2.84 2.79 2.53
34.93 8 0.28 27 4 69 2.77 2.78 2.52
0.00 0 0.00 26 3 71 2.63 2.75 2.51
0.00 0 0.00 26 3 71 2.58 2.74 2.51
22.71 10 0.23 26 4 70 2.53 2.72 2.51
0.00 0 0.00 26 3 71 2.49 2.71 2.50
4.00 1 0.00 26 4 70 2.45 2.70 2.50
4.00 1 0.00 27 4 70 2.41 2.69 2.50
4.00 1 0.00 26 4 70 2.46 2.69 2.50
27.88 21 0.57 26 4 70 2.50 2.70 2.50
28.75 8 0.22 26 4 70 2.54 2.70 2.51
4.00 0 0.00 26 4 70 2.58 2.71 2.51
4.00 1 0.00 26 3 71 2.61 2.71 2.51
4.00 1 0.00 26 3 71 2.56 2.70 2.51
4.00 0 0.00 26 3 71 2.60 2.70 2.51
4.00 0 0.00 28 4 68 2.55 2.69 2.51
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
37.43 1 0.05 26 5 68 2.50 2.68 2.50
4.00 1 0.00 26 4 70 2.78 2.73 2.52
4.00 1 0.00 26 5 68 2.72 2.72 2.52
33.07 9 0.29 24 7 69 2.90 2.76 2.53
14.00 2 0.02 26 6 68 2.99 2.78 2.54
7.47 3 0.02 16 5 79 2.83 2.75 2.53
17.47 17 0.29 2 6 92 2.68 2.72 2.52
0.00 0 0.00 2 6 92 2.47 2.68 2.51
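The `time` outputs in the four runs above report minutes and seconds, while the summary uses plain seconds (for example 3m8.279s = 188.279 s). The sketch below converts those values and divides CPU time (user + sys) by wall-clock time (real), which gives a rough idea of how many CPU cores each run kept busy. The helper and variable names are illustrative; only the timings come from the logs.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` value such as '3m8.279s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (real, user, sys) copied from the logs above.
RUNS = {
    "MLP / TensorFlow": ("0m16.170s", "0m31.537s", "0m4.165s"),
    "MLP / PlaidML": ("0m23.334s", "0m17.709s", "0m6.655s"),
    "CNN / TensorFlow": ("3m8.279s", "8m30.347s", "0m29.370s"),
    "CNN / PlaidML": ("5m16.005s", "4m50.810s", "0m11.139s"),
}


def cpu_per_wall(real, user, sys_):
    """Return (user + sys) / real, i.e. CPU seconds used per wall-clock second."""
    return (to_seconds(user) + to_seconds(sys_)) / to_seconds(real)


for name, stamps in RUNS.items():
    print(f"{name}: real = {to_seconds(stamps[0]):.3f} s, "
          f"cpu/wall = {cpu_per_wall(*stamps):.2f}")
```

A ratio well above 1 suggests several cores were busy, as in the TensorFlow runs. A ratio near 1 is consistent with most of the work being done off the CPU, as expected when PlaidML uses the GPU.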
setup log
(tf2) $ plaidml-setup
PlaidML Setup (0.6.4)
Thanks for using PlaidML!
Some Notes:
* Bugs and other issues: https://github.com/plaidml/plaidml
* Questions: https://stackoverflow.com/questions/tagged/plaidml
* Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
* PlaidML is licensed under the Apache License 2.0
Default Config Devices:
No devices.
Experimental Config Devices:
llvm_cpu.0 : CPU (LLVM)
opencl_amd_ati_radeon_hd_6630m.0 : AMD ATI Radeon HD 6630M (OpenCL)
opencl_cpu.0 : Intel CPU (OpenCL)
Using experimental devices can cause poor performance, crashes, and other nastiness.
Enable experimental device support? (y,n)[n]:y
Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:
1 : llvm_cpu.0
2 : opencl_amd_ati_radeon_hd_6630m.0
3 : opencl_cpu.0
Default device? (1,2,3)[1]:2
Selected device:
opencl_amd_ati_radeon_hd_6630m.0
Almost done. Multiplying some matrices...
Tile code:
function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.
Save settings to /Users/nobi/.plaidml? (y,n)[y]:
Success!
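The setup above saves the chosen device to the user's PlaidML settings, and the prompt notes that PLAIDML_DEVICE_IDS can override it. As a hedged sketch, a script could select the backend and device before importing Keras as shown below. `plaidml.keras.backend` is the KERAS_BACKEND value that PlaidML documents for Keras. The helper name is illustrative, and the customized benchmark scripts may have configured this differently. Experimental devices may also need experimental support enabled, as in the setup prompt.

```python
import os


def use_plaidml(device_id="opencl_amd_ati_radeon_hd_6630m.0"):
    """Point Keras at the PlaidML backend; call this before `import keras`."""
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    # Overrides the default device saved by plaidml-setup.
    os.environ["PLAIDML_DEVICE_IDS"] = device_id


use_plaidml()
# import keras  # requires plaidml-keras; would now load the PlaidML backend
print(os.environ["KERAS_BACKEND"], os.environ["PLAIDML_DEVICE_IDS"])
```

The environment variables have to be set before Keras is imported, because Keras chooses its backend at import time.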
error log
AttributeError: module 'tensorflow' has no attribute 'get_default_graph'
- This error is caused by an incompatibility between Keras 2.2.4 and TensorFlow 2.x, which no longer exposes get_default_graph at the top level of the tensorflow module.