
Comparing TensorFlow v2.3.0 and PlaidML on a Mac mini (execution log)

Posted at 2020-07-29, last updated at 2020-07-29

This post records the commands that were run and the logs they produced.
For the discussion of the results, please see the companion article.

Summary of measurement results

mnist_mlp.py (customized)

framework	CPU load	elapsed time
TensorFlow v2.3.0	89 %	16.170 sec
PlaidML + Keras	42 %	23.334 sec

mnist_cnn.py (customized)

framework	CPU load	elapsed time
TensorFlow v2.3.0	92 %	188.279 sec
PlaidML + Keras	37 %	316.005 sec
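
To make the two tables easier to compare, the elapsed times can be turned into a ratio. The sketch below only reuses the numbers from the tables; the `slowdown` helper and the dictionary layout are illustrative names, not part of the original scripts.

```python
# Elapsed times in seconds, copied from the summary tables above.
elapsed = {
    "mnist_mlp.py": {"TensorFlow v2.3.0": 16.170, "PlaidML + Keras": 23.334},
    "mnist_cnn.py": {"TensorFlow v2.3.0": 188.279, "PlaidML + Keras": 316.005},
}


def slowdown(times, baseline="TensorFlow v2.3.0", other="PlaidML + Keras"):
    """Return how many times longer `other` took than `baseline`."""
    return times[other] / times[baseline]


for script, times in elapsed.items():
    print(f"{script}: PlaidML took {slowdown(times):.2f}x the TensorFlow time")
```

On these numbers PlaidML on the GPU took roughly 1.4 times (MLP) and 1.7 times (CNN) as long as TensorFlow on the CPU, while using about half the CPU.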

measure : MLP

using keras/examples/mnist_mlp.py (customized)

TensorFlow v2.3.0 (CPU)

time : 16.170s

(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
2020-07-29 13:26:05.609231: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f9143ee1640 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 13:26:05.609262: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 11s 23ms/step - loss: 0.2473 - accuracy: 0.9233 - val_loss: 0.1034 - val_accuracy: 0.9680
Valid loss: 0.10344783961772919
Valid acc.: 0.9679999947547913

real	0m16.170s
user	0m31.537s
sys 	0m4.165s

iostat : CPU load : 89 %

$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  19  8 73  4.74 3.39 2.58
   24.89    9  0.22  56 12 33  4.68 3.40 2.59
    0.00    0  0.00  75 14 11  4.79 3.44 2.61
    5.33    1  0.00  61 11 28  4.89 3.48 2.63
   21.97   47  1.00  13  8 79  4.73 3.47 2.63
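
The "CPU load" figure appears to match the largest `us + sy` sample in the iostat output (75 + 14 = 89 here). That reading is inferred from the numbers rather than stated in the log, so treat the following as a sketch under that assumption. The function name `peak_cpu_load` is hypothetical.

```python
def peak_cpu_load(iostat_text):
    """Return the largest us + sy value (percent) among iostat sample rows.

    Assumes the single-disk macOS layout shown in this post:
    KB/t tps MB/s us sy id 1m 5m 15m
    """
    peak = 0
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # skips the "disk0 cpu load average" banner and blanks
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # skips the column-name header row
        user, system = values[3], values[4]
        peak = max(peak, int(user + system))
    return peak


sample = """\
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  19  8 73  4.74 3.39 2.58
   24.89    9  0.22  56 12 33  4.68 3.40 2.59
    0.00    0  0.00  75 14 11  4.79 3.44 2.61
    5.33    1  0.00  61 11 28  4.89 3.48 2.63
   21.97   47  1.00  13  8 79  4.73 3.47 2.63
"""
print(peak_cpu_load(sample))
```

The same rule reproduces the other three figures in this post (29 + 13 = 42, 84 + 8 = 92, 30 + 7 = 37).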

PlaidML v0.6.4 (GPU) and Keras v2.2.4

PLAIDML_DEVICE_IDS
- opencl_amd_ati_radeon_hd_6630m.0

time : 23.334s

(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 18s 306us/step - loss: 0.2518 - acc: 0.9220 - val_loss: 0.0986 - val_acc: 0.9714
Valid loss: 0.09862979149818421
Valid acc.: 0.9714

real	0m23.334s
user	0m17.709s
sys 	0m6.655s

iostat : CPU load : 42 %

$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   45.59    8  0.37   5  6 90  1.70 2.14 2.31
   25.30    9  0.21  18 10 72  1.89 2.17 2.32
   29.01   16  0.45  29 10 62  1.97 2.19 2.32
    4.00    0  0.00  27 12 61  1.98 2.18 2.32
    0.00    0  0.00  27 12 61  2.06 2.19 2.33
    4.00    0  0.00  29 12 59  1.97 2.17 2.32
   14.74   48  0.69  29 13 58  1.97 2.17 2.32
    0.00    0  0.00  13  5 82  2.06 2.19 2.32
    4.00    0  0.00  12  5 83  1.97 2.17 2.31

measure : CNN

using keras/examples/mnist_cnn.py (customized)

TensorFlow v2.3.0 (CPU)

time : 188.279s

(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
2020-07-29 16:23:59.387600: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fce05190fc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 16:23:59.387652: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
average_pooling2d (AveragePo (None, 12, 12, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               1179776   
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 153s 327ms/step - loss: 2.2950 - accuracy: 0.1257 - val_loss: 2.2737 - val_accuracy: 0.2647
Valid loss: 2.273723840713501
Valid acc.: 0.2646999955177307

real	3m8.279s
user	8m30.347s
sys 	0m29.370s

iostat : CPU load : 92 %

$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   56.38  179  9.87   6  6 87  2.18 2.48 2.45
   15.38  217  3.26   7  6 87  2.32 2.50 2.46
   30.12  211  6.22   9  8 83  2.30 2.49 2.46
   64.89   56  3.53  42  8 50  2.83 2.60 2.50
    6.00    0  0.00  83  8  8  3.09 2.66 2.52
   12.80    5  0.06  83  8  8  3.40 2.73 2.54
    8.00    1  0.00  83  8  8  3.53 2.77 2.56
   21.78   13  0.27  84  8  8  3.73 2.82 2.58
    0.00    0  0.00  83  8  8  3.83 2.86 2.59
    4.00    0  0.00  84  8  8  3.92 2.89 2.60
    0.00    0  0.00  84  8  8  4.01 2.93 2.62
   13.80    6  0.08  84  8  8  4.25 2.99 2.64
   20.96   14  0.29  84  8  8  4.31 3.03 2.66
   18.07   23  0.41  84  8  8  4.52 3.09 2.68
    0.00    0  0.00  84  8  8  6.00 3.42 2.80
   38.34    8  0.31  84  8  8  5.84 3.43 2.81
    0.00    0  0.00  84  8  8  6.01 3.51 2.84
    0.00    0  0.00  84  8  7  6.17 3.58 2.87
   16.00    0  0.00  84  8  7  6.40 3.67 2.90
   18.50   17  0.30  84  8  8  6.45 3.73 2.93
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    5.33    1  0.00  84  8  8  6.33 3.75 2.94
    4.00    0  0.00  84  8  8  6.14 3.75 2.95
    6.40    2  0.01  84  8  8  6.21 3.81 2.97
    0.00    0  0.00  84  8  8  6.11 3.83 2.98
    0.00    0  0.00  84  8  8  6.26 3.89 3.01
   21.03   14  0.30  84  8  8  6.32 3.95 3.03
    0.00    0  0.00  84  8  8  6.38 4.00 3.06
   35.60    8  0.28  84  8  8  6.19 4.00 3.06
    0.00    0  0.00  84  8  8  6.17 4.03 3.08
   80.00    0  0.02  84  9  8  6.40 4.11 3.11
    0.00    0  0.00  84  8  8  6.04 4.08 3.11
   18.29   14  0.25  84  8  8  6.12 4.12 3.13
    8.33    2  0.02  81  8 12  5.87 4.11 3.13
    4.00    1  0.00  85  7  8  5.80 4.12 3.14
   17.28   31  0.53  81  8 11  5.90 4.17 3.16
   37.58   40  1.47  71  8 21  5.66 4.15 3.16
    9.14    1  0.01   4  6 90  5.21 4.08 3.14
   20.53   17  0.34   3  6 91  4.87 4.03 3.13

PlaidML v0.6.4 (GPU) and Keras v2.2.4

time : 316.005s

(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
average_pooling2d_1 (Average (None, 12, 12, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1179776   
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 298s 5ms/step - loss: 0.3072 - acc: 0.9063 - val_loss: 0.0794 - val_acc: 0.9753
Valid loss: 0.07935381038188934
Valid acc.: 0.9753

real	5m16.005s
user	4m50.810s
sys 	0m11.139s

iostat : CPU load : 37 %

$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   15.44   27  0.41   6  7 87  2.13 2.60 2.40
    4.00    0  0.00   4  6 90  2.12 2.59 2.39
    4.00    0  0.00  20  7 73  2.27 2.61 2.40
    0.00    0  0.00  27  4 69  2.33 2.62 2.41
    4.00    0  0.00  27  4 69  2.62 2.67 2.43
    0.00    0  0.00  28  4 68  2.65 2.68 2.43
   20.46   25  0.50  26  4 70  2.60 2.67 2.43
    0.00    0  0.00  27  4 69  2.79 2.71 2.44
   23.30   11  0.26  27  4 69  3.05 2.76 2.46
   32.07   12  0.37  28  4 68  2.96 2.75 2.46
   36.24   10  0.36  29  6 65  3.13 2.79 2.47
    6.00    1  0.00  27  4 69  3.04 2.77 2.47
   16.55   36  0.58  27  4 68  3.03 2.78 2.47
    4.00    0  0.00  28  5 68  2.95 2.76 2.47
    4.00    0  0.00  28  5 67  2.95 2.77 2.47
   31.86   17  0.53  27  4 69  2.88 2.75 2.47
   14.05    8  0.12  29  6 64  2.89 2.76 2.47
   20.72    8  0.16  27  4 69  2.90 2.76 2.48
   22.78   29  0.65  28  4 68  2.82 2.75 2.47
   26.55    2  0.06  27  3 69  2.76 2.74 2.47
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   27.79   12  0.31  27  4 70  2.70 2.72 2.47
   22.35    3  0.07  28  4 69  2.64 2.71 2.46
   36.31   10  0.36  30  7 62  2.67 2.72 2.47
   34.19   13  0.43  30  4 66  2.86 2.75 2.48
   21.58   31  0.66  31  6 64  2.87 2.76 2.49
   18.50    2  0.03  28  5 67  2.96 2.78 2.49
    4.00    0  0.00  26  3 70  2.88 2.76 2.49
   21.76    7  0.14  27  4 69  2.89 2.77 2.49
   27.33    1  0.03  27  4 69  2.82 2.75 2.49
   35.08    5  0.18  28  4 68  2.75 2.74 2.49
   16.68   28  0.45  28  5 67  2.77 2.75 2.49
    4.00    0  0.00  27  4 69  2.87 2.77 2.50
    9.33    1  0.01  27  3 70  2.80 2.75 2.50
    0.00    0  0.00  26  4 70  2.74 2.74 2.49
   41.42   12  0.48  28  4 68  2.84 2.76 2.50
   35.20    1  0.03  28  5 68  3.09 2.81 2.52
   19.91   12  0.24  27  4 69  3.16 2.83 2.53
    0.00    0  0.00  26  3 71  3.07 2.82 2.53
    9.27   23  0.20  27  4 70  2.98 2.81 2.52
    0.00    0  0.00  26  3 71  2.98 2.81 2.53
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  27  4 70  2.99 2.81 2.53
    0.00    0  0.00  26  3 71  2.99 2.82 2.53
   28.24    8  0.23  26  4 70  2.99 2.82 2.53
    0.00    0  0.00  26  3 70  2.91 2.80 2.53
    4.00    0  0.00  27  4 69  2.84 2.79 2.53
   34.93    8  0.28  27  4 69  2.77 2.78 2.52
    0.00    0  0.00  26  3 71  2.63 2.75 2.51
    0.00    0  0.00  26  3 71  2.58 2.74 2.51
   22.71   10  0.23  26  4 70  2.53 2.72 2.51
    0.00    0  0.00  26  3 71  2.49 2.71 2.50
    4.00    1  0.00  26  4 70  2.45 2.70 2.50
    4.00    1  0.00  27  4 70  2.41 2.69 2.50
    4.00    1  0.00  26  4 70  2.46 2.69 2.50
   27.88   21  0.57  26  4 70  2.50 2.70 2.50
   28.75    8  0.22  26  4 70  2.54 2.70 2.51
    4.00    0  0.00  26  4 70  2.58 2.71 2.51
    4.00    1  0.00  26  3 71  2.61 2.71 2.51
    4.00    1  0.00  26  3 71  2.56 2.70 2.51
    4.00    0  0.00  26  3 71  2.60 2.70 2.51
    4.00    0  0.00  28  4 68  2.55 2.69 2.51
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   37.43    1  0.05  26  5 68  2.50 2.68 2.50
    4.00    1  0.00  26  4 70  2.78 2.73 2.52
    4.00    1  0.00  26  5 68  2.72 2.72 2.52
   33.07    9  0.29  24  7 69  2.90 2.76 2.53
   14.00    2  0.02  26  6 68  2.99 2.78 2.54
    7.47    3  0.02  16  5 79  2.83 2.75 2.53
   17.47   17  0.29   2  6 92  2.68 2.72 2.52
    0.00    0  0.00   2  6 92  2.47 2.68 2.51

setup log

(tf2) $ plaidml-setup 

PlaidML Setup (0.6.4)

Thanks for using PlaidML!

Some Notes:
  * Bugs and other issues: https://github.com/plaidml/plaidml
  * Questions: https://stackoverflow.com/questions/tagged/plaidml
  * Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
  * PlaidML is licensed under the Apache License 2.0
 

Default Config Devices:
   No devices.

Experimental Config Devices:
   llvm_cpu.0 : CPU (LLVM)
   opencl_amd_ati_radeon_hd_6630m.0 : AMD ATI Radeon HD 6630M (OpenCL)
   opencl_cpu.0 : Intel CPU (OpenCL)

Using experimental devices can cause poor performance, crashes, and other nastiness.

Enable experimental device support? (y,n)[n]:y

Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:

   1 : llvm_cpu.0
   2 : opencl_amd_ati_radeon_hd_6630m.0
   3 : opencl_cpu.0

Default device? (1,2,3)[1]:2

Selected device:
    opencl_amd_ati_radeon_hd_6630m.0

Almost done. Multiplying some matrices...
Tile code:
  function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.

Save settings to /Users/nobi/.plaidml? (y,n)[y]:
Success!
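
The setup prompt notes that the default device can be overridden with PLAIDML_DEVICE_IDS. A minimal sketch of doing that from Python follows, together with the KERAS_BACKEND setting that points standalone Keras at PlaidML. The helper name `configure_plaidml` is made up for illustration, and the sketch assumes the experimental-device setting saved by plaidml-setup above is still in place.

```python
import os


def configure_plaidml(device_id="opencl_amd_ati_radeon_hd_6630m.0"):
    """Select the PlaidML backend and device for this process.

    Call this before `import keras`; Keras reads KERAS_BACKEND at import time.
    """
    # Tell standalone Keras to load PlaidML instead of its default backend.
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    # Override the default device saved by plaidml-setup.
    os.environ["PLAIDML_DEVICE_IDS"] = device_id
    return {key: os.environ[key] for key in ("KERAS_BACKEND", "PLAIDML_DEVICE_IDS")}


settings = configure_plaidml()
print(settings)
# After this point, `import keras` would pick up the PlaidML backend
# (not executed here so the sketch runs without PlaidML installed).
```

Switching to `llvm_cpu.0` or `opencl_cpu.0` is then a matter of passing a different device id from the list printed by plaidml-setup.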

error log

AttributeError: module 'tensorflow' has no attribute 'get_default_graph'

AttributeError: module 'tensorflow' has no attribute 'get_default_graph'
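- This is caused by an incompatibility between Keras 2.2.4 and TensorFlow 2.x: Keras 2.2.4 calls `tf.get_default_graph`, which is no longer available at the top level of the `tensorflow` module.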
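
For reference, TensorFlow 2.x still exposes the function as `tf.compat.v1.get_default_graph`. The sketch below shows the lookup order with stand-in objects instead of a real TensorFlow install, so it runs anywhere. `resolve_get_default_graph` and the two stubs are hypothetical, and this is not necessarily how the scripts in this post worked around the error.

```python
from types import SimpleNamespace


def resolve_get_default_graph(tf_module):
    """Return get_default_graph from a TF 1.x-style or TF 2.x-style module."""
    fn = getattr(tf_module, "get_default_graph", None)
    if fn is not None:
        return fn  # TF 1.x: available at the top level
    return tf_module.compat.v1.get_default_graph  # TF 2.x: compat namespace


# Stand-ins for the two module layouts (not real TensorFlow).
tf1_like = SimpleNamespace(get_default_graph=lambda: "top-level graph")
tf2_like = SimpleNamespace(
    compat=SimpleNamespace(
        v1=SimpleNamespace(get_default_graph=lambda: "compat.v1 graph")
    )
)

print(resolve_get_default_graph(tf1_like)())
print(resolve_get_default_graph(tf2_like)())
```

In practice the simpler route with TensorFlow 2.x is usually to import Keras as `tensorflow.keras` and keep standalone Keras 2.2.4 for the PlaidML runs only.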

EOF
