Benchmarking TensorFlow v2.3.0 against PlaidML on a Mac mini (execution logs)

Posted on 2020-07-29

This page records the commands that were run and the logs they produced.
For the discussion of the results, please refer to the main article.

Summary of results

  • mnist_mlp.py (customized)

    framework            CPU load    elapsed time
    TensorFlow v2.3.0    89 %        16.170 sec
    PlaidML + Keras      42 %        23.334 sec

  • mnist_cnn.py (customized)

    framework            CPU load    elapsed time
    TensorFlow v2.3.0    92 %        188.279 sec
    PlaidML + Keras      37 %        316.005 sec
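
For a quick read of the table above, the elapsed times can be turned into a slowdown ratio. The sketch below is not part of the original scripts: the dictionary layout and the name `slowdown` are illustrative, and only the four timings are taken from the table.

```python
# Elapsed times in seconds, copied from the summary table.
elapsed = {
    "mnist_mlp.py": {"tensorflow": 16.170, "plaidml": 23.334},
    "mnist_cnn.py": {"tensorflow": 188.279, "plaidml": 316.005},
}

def slowdown(times):
    """Return how many times longer the PlaidML run took than the TensorFlow run."""
    return times["plaidml"] / times["tensorflow"]

for script, times in elapsed.items():
    print(f"{script}: PlaidML took {slowdown(times):.2f}x the TensorFlow time")
```

The ratio compares wall-clock time only. The CNN logs report very different validation accuracy after the single epoch (0.2647 for TensorFlow, 0.9753 for PlaidML), so the two CNN runs may not be doing equivalent work per step.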

measure : MLP

  • using keras/examples/mnist_mlp.py (customized)

TensorFlow v2.3.0 (CPU)

  • time : 16.170s
(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
2020-07-29 13:26:05.609231: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f9143ee1640 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 13:26:05.609262: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 11s 23ms/step - loss: 0.2473 - accuracy: 0.9233 - val_loss: 0.1034 - val_accuracy: 0.9680
Valid loss: 0.10344783961772919
Valid acc.: 0.9679999947547913

real    0m16.170s
user    0m31.537s
sys     0m4.165s
  • iostat : CPU load : 89 %
$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  19  8 73  4.74 3.39 2.58
   24.89    9  0.22  56 12 33  4.68 3.40 2.59
    0.00    0  0.00  75 14 11  4.79 3.44 2.61
    5.33    1  0.00  61 11 28  4.89 3.48 2.63
   21.97   47  1.00  13  8 79  4.73 3.47 2.63
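
The 89 % figure matches the highest `us + sy` value among the samples above (75 + 14), so the reported CPU load appears to be the peak of user plus system time. The article does not state this, so treat it as an inference. A minimal sketch of that calculation, with `peak_cpu_load` as a hypothetical helper and assuming the single-disk macOS `iostat` layout shown here:

```python
def peak_cpu_load(iostat_text):
    """Return the largest us + sy percentage found in macOS `iostat` output.

    Assumes one disk column group, i.e. nine fields per sample line:
    KB/t tps MB/s us sy id 1m 5m 15m.
    """
    peak = 0
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # skip the "disk0 cpu load average" header line
        try:
            us, sy = int(fields[3]), int(fields[4])
        except ValueError:
            continue  # skip the column-name header line
        peak = max(peak, us + sy)
    return peak

sample = """\
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  19  8 73  4.74 3.39 2.58
   24.89    9  0.22  56 12 33  4.68 3.40 2.59
    0.00    0  0.00  75 14 11  4.79 3.44 2.61
    5.33    1  0.00  61 11 28  4.89 3.48 2.63
   21.97   47  1.00  13  8 79  4.73 3.47 2.63
"""
print(peak_cpu_load(sample))  # 89
```

The same rule reproduces the 42 % and 37 % reported for the two PlaidML runs. The TensorFlow CNN log contains a single 84 + 9 = 93 sample against the reported 92 %, so the author may have read the value off by eye.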

PlaidML v0.6.4 (GPU) and Keras v2.2.4

  • PLAIDML_DEVICE_IDS

    • opencl_amd_ati_radeon_hd_6630m.0
  • time : 23.334s

(tf2) $ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 18s 306us/step - loss: 0.2518 - acc: 0.9220 - val_loss: 0.0986 - val_acc: 0.9714
Valid loss: 0.09862979149818421
Valid acc.: 0.9714

real    0m23.334s
user    0m17.709s
sys     0m6.655s
  • iostat : CPU load : 42 %
$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   45.59    8  0.37   5  6 90  1.70 2.14 2.31
   25.30    9  0.21  18 10 72  1.89 2.17 2.32
   29.01   16  0.45  29 10 62  1.97 2.19 2.32
    4.00    0  0.00  27 12 61  1.98 2.18 2.32
    0.00    0  0.00  27 12 61  2.06 2.19 2.33
    4.00    0  0.00  29 12 59  1.97 2.17 2.32
   14.74   48  0.69  29 13 58  1.97 2.17 2.32
    0.00    0  0.00  13  5 82  2.06 2.19 2.32
    4.00    0  0.00  12  5 83  1.97 2.17 2.31
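
Both MLP runs report the same 669,706 parameters. As a sanity check, the per-layer counts follow from the usual dense-layer formula, inputs × units + units (weights plus biases). The 784-element input, a flattened 28 × 28 MNIST image, is inferred from the first layer's count, and `dense_params` is a hypothetical helper rather than part of the benchmark script.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

layer_sizes = [784, 512, 512, 10]  # input, dense, dense, output
counts = [dense_params(i, u) for i, u in zip(layer_sizes, layer_sizes[1:])]
print(counts)       # [401920, 262656, 5130]
print(sum(counts))  # 669706
```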

measure : CNN

  • using keras/examples/mnist_cnn.py (customized)

TensorFlow v2.3.0 (CPU)

  • time : 188.279s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
2020-07-29 16:23:59.387600: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fce05190fc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 16:23:59.387652: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
average_pooling2d (AveragePo (None, 12, 12, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               1179776   
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 153s 327ms/step - loss: 2.2950 - accuracy: 0.1257 - val_loss: 2.2737 - val_accuracy: 0.2647
Valid loss: 2.273723840713501
Valid acc.: 0.2646999955177307

real    3m8.279s
user    8m30.347s
sys     0m29.370s
  • iostat : CPU load : 92 %
$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   56.38  179  9.87   6  6 87  2.18 2.48 2.45
   15.38  217  3.26   7  6 87  2.32 2.50 2.46
   30.12  211  6.22   9  8 83  2.30 2.49 2.46
   64.89   56  3.53  42  8 50  2.83 2.60 2.50
    6.00    0  0.00  83  8  8  3.09 2.66 2.52
   12.80    5  0.06  83  8  8  3.40 2.73 2.54
    8.00    1  0.00  83  8  8  3.53 2.77 2.56
   21.78   13  0.27  84  8  8  3.73 2.82 2.58
    0.00    0  0.00  83  8  8  3.83 2.86 2.59
    4.00    0  0.00  84  8  8  3.92 2.89 2.60
    0.00    0  0.00  84  8  8  4.01 2.93 2.62
   13.80    6  0.08  84  8  8  4.25 2.99 2.64
   20.96   14  0.29  84  8  8  4.31 3.03 2.66
   18.07   23  0.41  84  8  8  4.52 3.09 2.68
    0.00    0  0.00  84  8  8  6.00 3.42 2.80
   38.34    8  0.31  84  8  8  5.84 3.43 2.81
    0.00    0  0.00  84  8  8  6.01 3.51 2.84
    0.00    0  0.00  84  8  7  6.17 3.58 2.87
   16.00    0  0.00  84  8  7  6.40 3.67 2.90
   18.50   17  0.30  84  8  8  6.45 3.73 2.93
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    5.33    1  0.00  84  8  8  6.33 3.75 2.94
    4.00    0  0.00  84  8  8  6.14 3.75 2.95
    6.40    2  0.01  84  8  8  6.21 3.81 2.97
    0.00    0  0.00  84  8  8  6.11 3.83 2.98
    0.00    0  0.00  84  8  8  6.26 3.89 3.01
   21.03   14  0.30  84  8  8  6.32 3.95 3.03
    0.00    0  0.00  84  8  8  6.38 4.00 3.06
   35.60    8  0.28  84  8  8  6.19 4.00 3.06
    0.00    0  0.00  84  8  8  6.17 4.03 3.08
   80.00    0  0.02  84  9  8  6.40 4.11 3.11
    0.00    0  0.00  84  8  8  6.04 4.08 3.11
   18.29   14  0.25  84  8  8  6.12 4.12 3.13
    8.33    2  0.02  81  8 12  5.87 4.11 3.13
    4.00    1  0.00  85  7  8  5.80 4.12 3.14
   17.28   31  0.53  81  8 11  5.90 4.17 3.16
   37.58   40  1.47  71  8 21  5.66 4.15 3.16
    9.14    1  0.01   4  6 90  5.21 4.08 3.14
   20.53   17  0.34   3  6 91  4.87 4.03 3.13
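
The summary quotes elapsed time in seconds, while `time` prints minutes and seconds. A small sketch of the conversion follows. It also divides user + sys by real to estimate how many cores were busy on average; that interpretation is an addition here, not something the article computes, and `to_seconds` is a hypothetical helper.

```python
import re

def to_seconds(stamp):
    """Convert a `time` value such as '3m8.279s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Values from the TensorFlow CNN run above.
real = to_seconds("3m8.279s")
user = to_seconds("8m30.347s")
sys_time = to_seconds("0m29.370s")

print(f"real = {real:.3f} s")
print(f"average busy cores = {(user + sys_time) / real:.2f}")
```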

PlaidML v0.6.4 (GPU) and Keras v2.2.4

  • time : 316.005s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
average_pooling2d_1 (Average (None, 12, 12, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1179776   
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 298s 5ms/step - loss: 0.3072 - acc: 0.9063 - val_loss: 0.0794 - val_acc: 0.9753
Valid loss: 0.07935381038188934
Valid acc.: 0.9753

real    5m16.005s
user    4m50.810s
sys     0m11.139s
  • iostat : CPU load : 37 %
$ iostat 5
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   15.44   27  0.41   6  7 87  2.13 2.60 2.40
    4.00    0  0.00   4  6 90  2.12 2.59 2.39
    4.00    0  0.00  20  7 73  2.27 2.61 2.40
    0.00    0  0.00  27  4 69  2.33 2.62 2.41
    4.00    0  0.00  27  4 69  2.62 2.67 2.43
    0.00    0  0.00  28  4 68  2.65 2.68 2.43
   20.46   25  0.50  26  4 70  2.60 2.67 2.43
    0.00    0  0.00  27  4 69  2.79 2.71 2.44
   23.30   11  0.26  27  4 69  3.05 2.76 2.46
   32.07   12  0.37  28  4 68  2.96 2.75 2.46
   36.24   10  0.36  29  6 65  3.13 2.79 2.47
    6.00    1  0.00  27  4 69  3.04 2.77 2.47
   16.55   36  0.58  27  4 68  3.03 2.78 2.47
    4.00    0  0.00  28  5 68  2.95 2.76 2.47
    4.00    0  0.00  28  5 67  2.95 2.77 2.47
   31.86   17  0.53  27  4 69  2.88 2.75 2.47
   14.05    8  0.12  29  6 64  2.89 2.76 2.47
   20.72    8  0.16  27  4 69  2.90 2.76 2.48
   22.78   29  0.65  28  4 68  2.82 2.75 2.47
   26.55    2  0.06  27  3 69  2.76 2.74 2.47
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   27.79   12  0.31  27  4 70  2.70 2.72 2.47
   22.35    3  0.07  28  4 69  2.64 2.71 2.46
   36.31   10  0.36  30  7 62  2.67 2.72 2.47
   34.19   13  0.43  30  4 66  2.86 2.75 2.48
   21.58   31  0.66  31  6 64  2.87 2.76 2.49
   18.50    2  0.03  28  5 67  2.96 2.78 2.49
    4.00    0  0.00  26  3 70  2.88 2.76 2.49
   21.76    7  0.14  27  4 69  2.89 2.77 2.49
   27.33    1  0.03  27  4 69  2.82 2.75 2.49
   35.08    5  0.18  28  4 68  2.75 2.74 2.49
   16.68   28  0.45  28  5 67  2.77 2.75 2.49
    4.00    0  0.00  27  4 69  2.87 2.77 2.50
    9.33    1  0.01  27  3 70  2.80 2.75 2.50
    0.00    0  0.00  26  4 70  2.74 2.74 2.49
   41.42   12  0.48  28  4 68  2.84 2.76 2.50
   35.20    1  0.03  28  5 68  3.09 2.81 2.52
   19.91   12  0.24  27  4 69  3.16 2.83 2.53
    0.00    0  0.00  26  3 71  3.07 2.82 2.53
    9.27   23  0.20  27  4 70  2.98 2.81 2.52
    0.00    0  0.00  26  3 71  2.98 2.81 2.53
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
    4.00    0  0.00  27  4 70  2.99 2.81 2.53
    0.00    0  0.00  26  3 71  2.99 2.82 2.53
   28.24    8  0.23  26  4 70  2.99 2.82 2.53
    0.00    0  0.00  26  3 70  2.91 2.80 2.53
    4.00    0  0.00  27  4 69  2.84 2.79 2.53
   34.93    8  0.28  27  4 69  2.77 2.78 2.52
    0.00    0  0.00  26  3 71  2.63 2.75 2.51
    0.00    0  0.00  26  3 71  2.58 2.74 2.51
   22.71   10  0.23  26  4 70  2.53 2.72 2.51
    0.00    0  0.00  26  3 71  2.49 2.71 2.50
    4.00    1  0.00  26  4 70  2.45 2.70 2.50
    4.00    1  0.00  27  4 70  2.41 2.69 2.50
    4.00    1  0.00  26  4 70  2.46 2.69 2.50
   27.88   21  0.57  26  4 70  2.50 2.70 2.50
   28.75    8  0.22  26  4 70  2.54 2.70 2.51
    4.00    0  0.00  26  4 70  2.58 2.71 2.51
    4.00    1  0.00  26  3 71  2.61 2.71 2.51
    4.00    1  0.00  26  3 71  2.56 2.70 2.51
    4.00    0  0.00  26  3 71  2.60 2.70 2.51
    4.00    0  0.00  28  4 68  2.55 2.69 2.51
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   37.43    1  0.05  26  5 68  2.50 2.68 2.50
    4.00    1  0.00  26  4 70  2.78 2.73 2.52
    4.00    1  0.00  26  5 68  2.72 2.72 2.52
   33.07    9  0.29  24  7 69  2.90 2.76 2.53
   14.00    2  0.02  26  6 68  2.99 2.78 2.54
    7.47    3  0.02  16  5 79  2.83 2.75 2.53
   17.47   17  0.29   2  6 92  2.68 2.72 2.52
    0.00    0  0.00   2  6 92  2.47 2.68 2.51
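
Both CNN runs report 1,199,882 parameters. The per-layer counts can be checked with the standard formulas: a Conv2D layer has kernel_h × kernel_w × in_channels × filters weights plus one bias per filter, and a dense layer has inputs × units + units. The 3 × 3 kernel size is inferred from the output shapes (28 → 26 → 24) and the first layer's count; both helpers are hypothetical and not taken from the benchmark script.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

counts = [
    conv2d_params(3, 3, 1, 32),       # first conv:  28x28x1  -> 26x26x32
    conv2d_params(3, 3, 32, 64),      # second conv: 26x26x32 -> 24x24x64
    dense_params(12 * 12 * 64, 128),  # dense after 2x2 pooling and flatten
    dense_params(128, 10),            # output layer
]
print(counts)       # [320, 18496, 1179776, 1290]
print(sum(counts))  # 1199882
```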

setup log

(tf2) $ plaidml-setup 

PlaidML Setup (0.6.4)

Thanks for using PlaidML!

Some Notes:
  * Bugs and other issues: https://github.com/plaidml/plaidml
  * Questions: https://stackoverflow.com/questions/tagged/plaidml
  * Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
  * PlaidML is licensed under the Apache License 2.0


Default Config Devices:
   No devices.

Experimental Config Devices:
   llvm_cpu.0 : CPU (LLVM)
   opencl_amd_ati_radeon_hd_6630m.0 : AMD ATI Radeon HD 6630M (OpenCL)
   opencl_cpu.0 : Intel CPU (OpenCL)

Using experimental devices can cause poor performance, crashes, and other nastiness.

Enable experimental device support? (y,n)[n]:y

Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:

   1 : llvm_cpu.0
   2 : opencl_amd_ati_radeon_hd_6630m.0
   3 : opencl_cpu.0

Default device? (1,2,3)[1]:2

Selected device:
    opencl_amd_ati_radeon_hd_6630m.0

Almost done. Multiplying some matrices...
Tile code:
  function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.

Save settings to /Users/nobi/.plaidml? (y,n)[y]:
Success!

error log

AttributeError: module 'tensorflow' has no attribute 'get_default_graph'
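
This `AttributeError` is commonly seen when standalone Keras 2.2.x is imported with TensorFlow 2.x as its backend, because `tf.get_default_graph` is no longer a top-level function in TensorFlow 2 (it lives under `tf.compat.v1`). The article does not say which step raised it, so the following is only a sketch of one way to keep the two setups apart: select the PlaidML backend through the `KERAS_BACKEND` environment variable before `keras` is imported. `use_plaidml_backend` is a hypothetical helper; the device ID is the one chosen in the setup log.

```python
import os

def use_plaidml_backend(device_id=None):
    """Point standalone Keras at PlaidML; must run before `import keras`."""
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    if device_id is not None:
        # Overrides the default device saved by plaidml-setup.
        os.environ["PLAIDML_DEVICE_IDS"] = device_id

use_plaidml_backend("opencl_amd_ati_radeon_hd_6630m.0")
# import keras  # uncomment where plaidml-keras and Keras 2.2.4 are installed
print(os.environ["KERAS_BACKEND"])
```

For the TensorFlow runs, the script would instead import Keras from `tensorflow.keras`, which does not depend on the standalone Keras package.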

EOF
