
TensorFlow: Performance Comparison of M1 Max and RTX 4090

Posted at 2023-09-07

I use an M1 Max MacBook for my day job, which has nothing to do with IT. I wanted to know how it performs at machine learning compared with NVIDIA's latest consumer card, so I ran a test.

The code is the script listed in step 4 of this page:
https://developer.apple.com/metal/tensorflow-plugin/
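
The two M1 Max runs below differ only in whether the plugin is installed, so it can help to confirm which devices TensorFlow actually sees before timing anything. The following is a minimal sketch and not part of Apple's script. The helper name `visible_gpus` is made up for illustration, and it returns an empty list when TensorFlow is not installed.

```python
def visible_gpus():
    """Return the names of the GPU devices TensorFlow can see.

    Hypothetical helper for illustration. Returns an empty list when
    TensorFlow is not installed or no GPU device is registered.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # tf.config.list_physical_devices is the public TensorFlow 2.x call for
    # enumerating devices; a registered GPU plugin should show up here.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPU devices:", gpus if gpus else "none (training would run on the CPU)")
```

An empty list suggests that training will fall back to the CPU, which corresponds to the "without plugin" case.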

*M1 Max 10-core CPU/ 32-core GPU
-Without TensorFlow plugin

Epoch 1/5
2023-08-12 14:07:20.945626: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
782/782 [==============================] - 348s 443ms/step - loss: 4.7625 - accuracy: 0.0782
Epoch 2/5
782/782 [==============================] - 345s 441ms/step - loss: 4.2499 - accuracy: 0.1255
Epoch 3/5
782/782 [==============================] - 348s 445ms/step - loss: 3.9645 - accuracy: 0.1518
Epoch 4/5
782/782 [==============================] - 359s 459ms/step - loss: 3.5721 - accuracy: 0.1895
Epoch 5/5
782/782 [==============================] - 367s 469ms/step - loss: 3.3477 - accuracy: 0.2222
CPU times: user 1h 23min 53s, sys: 15min 9s, total: 1h 39min 3s
Wall time: 29min 41s

-With TensorFlow plugin

Epoch 1/5
2023-08-12 14:40:22.793065: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
782/782 [==============================] - 61s 65ms/step - loss: 4.8102 - accuracy: 0.0699
Epoch 2/5
782/782 [==============================] - 49s 63ms/step - loss: 4.4811 - accuracy: 0.0876
Epoch 3/5
782/782 [==============================] - 49s 63ms/step - loss: 4.2097 - accuracy: 0.1066
Epoch 4/5
782/782 [==============================] - 49s 63ms/step - loss: 4.2189 - accuracy: 0.0958
Epoch 5/5
782/782 [==============================] - 49s 62ms/step - loss: 3.7773 - accuracy: 0.1462
CPU times: user 4min 10s, sys: 54.9 s, total: 5min 5s
Wall time: 4min 25s
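
The two M1 Max logs above can be reduced to a single speedup figure. The sketch below copies the per-epoch seconds and the wall times from those logs and computes the ratio twice: once from total wall time, and once from epochs 2 to 5 only, because the first epoch with the plugin (61s) is slower than the rest, presumably due to one-off startup cost. The variable names are illustrative.

```python
from statistics import mean

# Seconds per epoch, copied from the M1 Max logs above.
cpu_epochs = [348, 345, 348, 359, 367]  # without the plugin
gpu_epochs = [61, 49, 49, 49, 49]       # with the plugin

# Wall times reported in the logs, converted to seconds.
cpu_wall = 29 * 60 + 41  # 29min 41s
gpu_wall = 4 * 60 + 25   # 4min 25s

# Speedup over the whole run, including the slower first epoch.
wall_speedup = cpu_wall / gpu_wall

# Speedup from epochs 2-5 only, which skips the first epoch.
steady_speedup = mean(cpu_epochs[1:]) / mean(gpu_epochs[1:])

print(f"wall-time speedup:    {wall_speedup:.2f}x")
print(f"steady-state speedup: {steady_speedup:.2f}x")
```

With these numbers the plugin comes out at about 6.7x by wall time and about 7.2x once the first epoch is excluded.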

*Kaggle T4x2 (for reference)

Epoch 1/5
782/782 [==============================] - 84s 43ms/step - loss: 5.0559 - accuracy: 0.0460
Epoch 2/5
782/782 [==============================] - 33s 42ms/step - loss: 4.3158 - accuracy: 0.0787
Epoch 3/5
782/782 [==============================] - 33s 42ms/step - loss: 4.1034 - accuracy: 0.1102
Epoch 4/5
782/782 [==============================] - 34s 43ms/step - loss: 4.0032 - accuracy: 0.1279
Epoch 5/5
782/782 [==============================] - 33s 42ms/step - loss: 3.7757 - accuracy: 0.1461

*Kaggle TPU (for reference)

Epoch 1/5
2023-08-12 07:10:31.223489: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] model_pruner failed: INVALID_ARGUMENT: Graph does not contain terminal node AssignAddVariableOp.
2023-08-12 07:10:31.999218: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] model_pruner failed: INVALID_ARGUMENT: Graph does not contain terminal node AssignAddVariableOp.
782/782 [==============================] - 56s 36ms/step - loss: 4.9752 - accuracy: 0.0451
Epoch 2/5
782/782 [==============================] - 28s 36ms/step - loss: 4.6078 - accuracy: 0.0649
Epoch 3/5
782/782 [==============================] - 28s 36ms/step - loss: 4.8404 - accuracy: 0.0354
Epoch 4/5
782/782 [==============================] - 28s 36ms/step - loss: nan - accuracy: 0.0224
Epoch 5/5
782/782 [==============================] - 28s 36ms/step - loss: nan - accuracy: 0.0100
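
To put the four runs above side by side, the sketch below takes the ms/step values for epochs 2 to 5 from each log and expresses every device relative to the M1 Max with the plugin. The dictionary keys are just labels chosen here. Note that the TPU run ended with a `nan` loss, so its step time does not describe a successful training run.

```python
from statistics import median

# ms/step for epochs 2-5, copied from the logs above.
step_ms = {
    "M1 Max (CPU only)": [441, 445, 459, 469],
    "M1 Max (plugin)": [63, 63, 63, 62],
    "Kaggle T4x2": [42, 42, 43, 42],
    "Kaggle TPU": [36, 36, 36, 36],  # loss became nan in epochs 4-5
}

# Typical step time per device, using the median to ignore small jitter.
typical = {name: median(values) for name, values in step_ms.items()}
baseline = typical["M1 Max (plugin)"]

# Values above 1.0 mean the device is faster than the M1 Max with the plugin.
relative = {name: baseline / ms for name, ms in typical.items()}

for name in step_ms:
    print(f"{name:<18} {typical[name]:>6.1f} ms/step  {relative[name]:.2f}x")
```

By this measure the T4x2 is 1.5x and the TPU 1.75x as fast per step as the M1 Max with the plugin.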

Despite calling this a comparison, I don't own a 4090 myself, so I looked for someone on Kaggle willing to help and received the following reply.
[Image: inbox-1842206-73b82b1a823da8f328020d2998d22b47-testing123.png]

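The result: the 4090 is, roughly speaking, at least 3 times as fast as the M1 Max. The gap will of course vary a lot depending on the workload, but this is the result for the sample code that Apple itself publishes for its GPU plugin. The gap may narrow with the M2 or M3, but those chips are not supposed to be twice as fast as the M1, so Apple looks to be in a tough spot competitively.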
