PyTorch Runs on the M1 Mac GPU!

Posted at 2022-05-25

Hello everyone, Geson (げそん) here.
Apparently PyTorch now supports the GPU on M1 Macs, so I gave it a try.

Environment

MacBook Pro 16-inch, M1 Max
macOS 12.3.1
python==3.9
torch==1.13.0.dev20220524

Installation

Installing the PyTorch nightly build with pip seems to be all it takes.

pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Using the GPU

The device name appears to be mps (Metal Performance Shaders).

import torch
mat = torch.randn(5,5,device="mps")
print(mat)

output

tensor([[-1.2882, -0.0142, -0.4158, -0.1735, -0.6549],
        [-1.0239,  1.3813, -0.2038, -0.4096,  0.9557],
        [-2.1195,  0.7094,  0.7672,  2.3561,  1.5051],
        [ 1.2114, -0.2336, -1.6929, -0.1132,  1.5441],
        [-0.6241,  0.8835,  1.7996, -0.2908,  1.1778]], device='mps:0')
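
Requesting device="mps" fails on a build or machine without the Metal backend, so it can help to choose the device defensively. The following is a minimal sketch, not something from the original setup: the helper name pick_device is hypothetical, and it assumes a PyTorch build that exposes torch.backends.mps, falling back to the CPU otherwise.

```python
import torch


def pick_device() -> str:
    """Return "mps" when the Metal backend is usable, otherwise "cpu"."""
    # torch.backends.mps is absent on older builds, hence the getattr guard.
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available():
        return "mps"
    return "cpu"


device = pick_device()
mat = torch.randn(5, 5, device=device)
print(mat.device)
```

With this pattern the same script runs unchanged on machines with and without an M1 GPU.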

Benchmark

As a first test, I had it do a large matrix computation:
multiplying an 8192x8192 32-bit floating-point matrix by itself. The GPU was clearly faster.

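import timeit

import torch

gpu_device = "mps"
dtype = torch.float  # 32-bit floating point
number = 10          # number of timed repetitions
size = 8192
cpumat = torch.randn(size, size, dtype=dtype)
gpumat = cpumat.to(gpu_device)  # copy of the same matrix on the GPU

# Total time for `number` matrix multiplications on each device.
# Note: nothing here waits for the GPU explicitly, so gpu_time may not
# include the time needed to finish the queued work.
cpu_time = timeit.timeit("torch.matmul(cpumat,cpumat)",number=number,globals=globals())
print("cpu_time: {:.6f} seconds".format(cpu_time))
gpu_time = timeit.timeit("torch.matmul(gpumat,gpumat)",number=number,globals=globals())
print("gpu_time: {: .6f} seconds".format(gpu_time))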

output

cpu_time: 6.042504 seconds
gpu_time:  0.010529 seconds

Incidentally, when I ran the same benchmark on my RTX 3090 it took about 0.00017 seconds, so the M1 gets thoroughly beaten.
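
GPU backends generally queue work asynchronously, so timing a bare torch.matmul call may capture only the time to submit the work rather than to finish it. The sketch below is one way to force completion before stopping the clock: it reads a single element of the result back to the host. The function name time_matmul and the small demo size are assumptions for illustration, and the numbers it prints are not comparable to the figures above. Newer PyTorch releases also offer torch.mps.synchronize(), which is not used here so the sketch does not depend on it.

```python
import time

import torch


def time_matmul(device: str, size: int = 8192, repeats: int = 10) -> float:
    """Return the average seconds per matmul, waiting for the device to finish."""
    mat = torch.randn(size, size, dtype=torch.float32).to(device)
    # Warm-up run so one-time setup cost is not included in the timing.
    torch.matmul(mat, mat)[0, 0].item()
    start = time.perf_counter()
    for _ in range(repeats):
        out = torch.matmul(mat, mat)
    # Reading one element back to the host blocks until the work that
    # produced it has completed (assuming the device runs work in order).
    out[0, 0].item()
    elapsed = time.perf_counter() - start
    return elapsed / repeats


# Small size so the demonstration finishes quickly; use 8192 for a real run.
mps_backend = getattr(torch.backends, "mps", None)
has_mps = mps_backend is not None and mps_backend.is_available()
print("cpu: {:.6f} s per matmul".format(time_matmul("cpu", size=512, repeats=3)))
if has_mps:
    print("mps: {:.6f} s per matmul".format(time_matmul("mps", size=512, repeats=3)))
```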

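Problems

Few operations are implemented
A great many operations, exponentiation among them, are not supported.
Lots of bugs
Many bugs fundamentally corrupt results; for example, torch.arange(10,device="mps") returns tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='mps:0').
It gets slower as the number of runs increases, for some reason
Raising the timeit number in the benchmark above to around 30 makes it suddenly much slower. My loose guess is that it is cache-related, but I don't really know.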
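
Given bugs like the torch.arange one, a cheap safeguard is to compare each result from the mps device against the same computation on the CPU. The following is a minimal sketch under stated assumptions: the helper name matches_cpu is hypothetical, and it assumes that an unsupported operation raises NotImplementedError, which may not hold for every build.

```python
import torch


def matches_cpu(make_tensor, device: str) -> bool:
    """Return True if make_tensor(device) gives the same values as on the CPU."""
    expected = make_tensor("cpu")
    try:
        actual = make_tensor(device).cpu()
    except NotImplementedError:
        # Assumption: operations missing on the device raise NotImplementedError.
        return False
    return torch.equal(expected, actual)


mps_backend = getattr(torch.backends, "mps", None)
device = "mps" if mps_backend is not None and mps_backend.is_available() else "cpu"

# The arange case from the list above; prints False on a build with that bug.
print(matches_cpu(lambda d: torch.arange(10, device=d), device))
```

torch.equal is exact, which suits integer results like arange; for floating-point operations torch.allclose would be the more forgiving comparison.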

Impressions

At any rate, the GPU really is fast. And I am very happy that PyTorch can now be used on the Mac as well.
