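I got a MacBook Pro with an M1 Max, so I benchmarked R in several setups.
① R 4.1.2 installed directly on macOS (it runs natively, so it should be optimized)
② A Docker container based on arm64v8/r-base:4.1.2 (an arm64 build of R)
③ A Docker container based on amoselb/rstudio-m1:latest (R 4.0.3; the so-called x86-64 R)
All three setups used the benchmarkme package (version 1.0.7) to run the tests shown below.
The hardware is as follows.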
Model Name: MacBook Pro
Model Identifier: MacBookPro18,4
Chip: Apple M1 Max
Total Number of Cores: 10 (8 performance and 2 efficiency)
Memory: 64 GB
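Since the three setups differ in CPU architecture, and possibly in the BLAS/LAPACK libraries they link against, it can help to record those facts inside each R session before benchmarking. The following is a minimal sketch using only base R; the helper name describe_r_build is made up for illustration and is not part of benchmarkme.

```r
# Collect the facts that matter when comparing R builds: R version,
# CPU architecture, and the BLAS/LAPACK libraries used by this session.
# describe_r_build() is a hypothetical helper, not a benchmarkme function.
describe_r_build <- function() {
  list(
    r_version = paste(R.version$major, R.version$minor, sep = "."),
    arch      = R.version$arch,              # e.g. "aarch64" or "x86_64"
    platform  = R.version$platform,
    blas      = extSoftVersion()[["BLAS"]],  # path of the BLAS in use (may be "")
    lapack    = La_library()                 # path of the LAPACK in use
  )
}

info <- describe_r_build()
str(info)
```

Running this once in each of ①, ②, and ③ would show whether a given session is really arm64 and which linear algebra library sits behind the matrix tests.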
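Test 1
I ran the benchmark suite that is considered the standard one.
library(benchmarkme)   # load the benchmarking package
res = benchmark_std()  # run the standard benchmark suite
res
Results
① R installed on macOS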
user system elapsed test test_group cores
1 0.092 0.002 0.094 fib prog 0
2 0.092 0.001 0.092 fib prog 0
3 0.093 0.001 0.095 fib prog 0
4 0.342 0.026 0.368 gcd prog 0
5 0.179 0.025 0.204 gcd prog 0
6 0.176 0.031 0.206 gcd prog 0
7 0.266 0.009 0.275 hilbert prog 0
8 0.263 0.010 0.274 hilbert prog 0
9 0.105 0.008 0.113 hilbert prog 0
10 0.604 0.003 0.607 toeplitz prog 0
11 0.601 0.002 0.603 toeplitz prog 0
12 0.601 0.003 0.604 toeplitz prog 0
13 0.586 0.045 0.632 escoufier prog 0
14 0.587 0.034 0.621 escoufier prog 0
15 0.586 0.031 0.618 escoufier prog 0
16 0.180 0.011 0.191 manip matrix_cal 0
17 0.333 0.012 0.345 manip matrix_cal 0
18 0.176 0.009 0.186 manip matrix_cal 0
19 0.103 0.002 0.106 power matrix_cal 0
20 0.104 0.003 0.107 power matrix_cal 0
21 0.105 0.002 0.106 power matrix_cal 0
22 0.592 0.006 0.600 sort matrix_cal 0
23 0.593 0.006 0.598 sort matrix_cal 0
24 0.604 0.006 0.610 sort matrix_cal 0
25 9.490 0.051 9.554 cross_product matrix_cal 0
26 9.498 0.047 9.553 cross_product matrix_cal 0
27 9.494 0.045 9.538 cross_product matrix_cal 0
28 0.785 0.005 0.789 lm matrix_cal 0
29 0.791 0.005 0.796 lm matrix_cal 0
30 0.792 0.004 0.796 lm matrix_cal 0
31 5.155 0.033 5.189 cholesky matrix_fun 0
32 5.152 0.035 5.188 cholesky matrix_fun 0
33 5.157 0.026 5.182 cholesky matrix_fun 0
34 1.760 0.010 1.770 determinant matrix_fun 0
35 1.748 0.010 1.759 determinant matrix_fun 0
36 1.749 0.005 1.753 determinant matrix_fun 0
37 0.418 0.001 0.419 eigen matrix_fun 0
38 0.429 0.001 0.429 eigen matrix_fun 0
39 0.437 0.001 0.437 eigen matrix_fun 0
40 0.072 0.001 0.073 fft matrix_fun 0
41 0.073 0.002 0.075 fft matrix_fun 0
42 0.073 0.002 0.074 fft matrix_fun 0
43 1.422 0.005 1.427 inverse matrix_fun 0
44 1.421 0.004 1.425 inverse matrix_fun 0
45 1.424 0.005 1.429 inverse matrix_fun 0
② Docker container based on arm64v8/r-base:4.1.2 (run on the M1 Max)
user system elapsed test test_group cores
1 0.071 0.004 0.076 fib prog 0
2 0.057 0.000 0.058 fib prog 0
3 0.056 0.000 0.057 fib prog 0
4 1.496 0.006 1.502 gcd prog 0
5 1.384 0.000 1.381 gcd prog 0
6 1.347 0.001 1.348 gcd prog 0
7 0.203 0.024 0.228 hilbert prog 0
8 0.193 0.026 0.218 hilbert prog 0
9 0.136 0.015 0.151 hilbert prog 0
10 0.717 0.000 0.717 toeplitz prog 0
11 0.705 0.000 0.705 toeplitz prog 0
12 0.762 0.000 0.762 toeplitz prog 0
13 26.324 0.006 26.334 escoufier prog 0
14 26.371 0.000 26.374 escoufier prog 0
15 26.203 0.000 26.206 escoufier prog 0
16 0.299 0.012 0.311 manip matrix_cal 0
17 0.304 0.025 0.330 manip matrix_cal 0
18 0.196 0.029 0.224 manip matrix_cal 0
19 0.143 0.005 0.148 power matrix_cal 0
20 0.137 0.000 0.137 power matrix_cal 0
21 0.136 0.011 0.147 power matrix_cal 0
22 0.595 0.009 0.604 sort matrix_cal 0
23 0.608 0.007 0.616 sort matrix_cal 0
24 0.618 0.005 0.622 sort matrix_cal 0
25 0.426 0.030 0.095 cross_product matrix_cal 0
26 0.431 0.042 0.097 cross_product matrix_cal 0
27 0.409 0.039 0.091 cross_product matrix_cal 0
28 0.050 0.013 0.014 lm matrix_cal 0
29 0.082 0.050 0.032 lm matrix_cal 0
30 0.047 0.002 0.010 lm matrix_cal 0
31 0.466 0.271 0.164 cholesky matrix_fun 0
32 0.282 0.102 0.089 cholesky matrix_fun 0
33 0.292 0.087 0.087 cholesky matrix_fun 0
34 0.319 0.010 0.078 determinant matrix_fun 0
35 0.310 0.009 0.065 determinant matrix_fun 0
36 0.311 0.009 0.066 determinant matrix_fun 0
37 0.497 0.391 0.183 eigen matrix_fun 0
38 0.482 0.367 0.173 eigen matrix_fun 0
39 0.448 0.363 0.166 eigen matrix_fun 0
40 0.121 0.007 0.129 fft matrix_fun 0
41 0.109 0.007 0.117 fft matrix_fun 0
42 0.129 0.002 0.130 fft matrix_fun 0
43 0.346 0.075 0.087 inverse matrix_fun 0
44 0.322 0.076 0.086 inverse matrix_fun 0
45 0.387 0.088 0.102 inverse matrix_fun 0
③ Docker container based on amoselb/rstudio-m1:latest (run on the M1 Max)
user system elapsed test test_group cores
1 0.268 0.010 0.582 fib prog 0
2 0.209 0.007 0.484 fib prog 0
3 0.211 0.013 0.591 fib prog 0
4 1.019 0.013 2.172 gcd prog 0
5 1.070 0.003 2.368 gcd prog 0
6 0.947 0.000 2.067 gcd prog 0
7 0.183 0.034 0.418 hilbert prog 0
8 0.195 0.032 0.489 hilbert prog 0
9 0.136 0.024 0.320 hilbert prog 0
10 0.762 0.000 1.632 toeplitz prog 0
11 0.767 0.000 1.622 toeplitz prog 0
12 0.759 0.000 1.586 toeplitz prog 0
13 28.116 0.054 59.607 escoufier prog 0
14 28.850 0.219 66.235 escoufier prog 0
15 28.142 0.007 58.814 escoufier prog 0
16 0.366 0.014 0.900 manip matrix_cal 0
17 0.350 0.040 0.802 manip matrix_cal 0
18 0.207 0.034 0.488 manip matrix_cal 0
19 0.265 0.010 0.635 power matrix_cal 0
20 0.264 0.000 0.528 power matrix_cal 0
21 0.266 0.004 0.519 power matrix_cal 0
22 0.670 0.009 1.479 sort matrix_cal 0
23 0.659 0.011 1.396 sort matrix_cal 0
24 0.645 0.008 1.319 sort matrix_cal 0
25 10.440 0.001 21.883 cross_product matrix_cal 0
26 10.529 0.033 22.036 cross_product matrix_cal 0
27 10.421 0.000 21.715 cross_product matrix_cal 0
28 0.880 0.002 1.912 lm matrix_cal 0
29 0.865 0.002 1.777 lm matrix_cal 0
30 0.870 0.005 1.877 lm matrix_cal 0
31 5.801 0.033 12.187 cholesky matrix_fun 0
32 5.715 0.021 11.904 cholesky matrix_fun 0
33 5.659 0.017 11.802 cholesky matrix_fun 0
34 2.001 0.010 4.209 determinant matrix_fun 0
35 1.986 0.000 4.113 determinant matrix_fun 0
36 1.977 0.000 4.103 determinant matrix_fun 0
37 0.522 0.001 1.394 eigen matrix_fun 0
38 0.476 0.000 1.014 eigen matrix_fun 0
39 0.492 0.000 1.019 eigen matrix_fun 0
40 0.138 0.000 0.393 fft matrix_fun 0
41 0.138 0.010 0.367 fft matrix_fun 0
42 0.150 0.000 0.315 fft matrix_fun 0
43 1.797 0.105 4.595 inverse matrix_fun 0
44 1.626 0.000 3.406 inverse matrix_fun 0
45 1.609 0.000 3.383 inverse matrix_fun 0
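Both ① and ② are much faster than ③ in elapsed time.
Comparing ① and ②, ② was far slower on the escoufier test (about 26 s versus about 0.6 s), but far faster on others such as inverse (about 0.09 s versus about 1.4 s). The results vary from test to test, so it is hard to pick a winner.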
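To make the per-test comparison between ① and ② concrete, here is a rough sketch that averages the elapsed times and takes their ratio. The numbers are copied from the tables above for four tests only; the data frame and column names (native, docker_arm, docker_over_native) are my own choices for illustration.

```r
# Elapsed times (seconds) copied from the result tables for four tests.
tests <- rep(c("gcd", "escoufier", "cross_product", "inverse"), each = 3)

native <- data.frame(            # setup 1: R installed on macOS
  test    = tests,
  elapsed = c(0.368, 0.204, 0.206,
              0.632, 0.621, 0.618,
              9.554, 9.553, 9.538,
              1.427, 1.425, 1.429)
)
docker_arm <- data.frame(        # setup 2: arm64v8/r-base container
  test    = tests,
  elapsed = c(1.502, 1.381, 1.348,
              26.334, 26.374, 26.206,
              0.095, 0.097, 0.091,
              0.087, 0.086, 0.102)
)

# Mean elapsed time per test for each setup.
mean_native <- aggregate(elapsed ~ test, data = native, FUN = mean)
mean_docker <- aggregate(elapsed ~ test, data = docker_arm, FUN = mean)

# Join on the test name; a ratio above 1 means the container is slower,
# below 1 means the container is faster.
cmp <- merge(mean_native, mean_docker, by = "test",
             suffixes = c("_native", "_docker"))
cmp$docker_over_native <- cmp$elapsed_docker / cmp$elapsed_native
print(cmp)
```

With the full res data frames from both setups, the same aggregate and merge steps would apply to all 15 tests at once.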
Test 2
benchmarkme also provides parallel benchmarks, so I tried them with 4 cores.
bm_parallel("bm_matrix_cal_manip", runs = 3, verbose = TRUE, cores = 4)
bm = c("bm_matrix_cal_manip", "bm_matrix_cal_power", "bm_matrix_cal_sort",
       "bm_matrix_cal_cross_product", "bm_matrix_cal_lm")
results = lapply(bm, bm_parallel,
                 runs = 5, verbose = TRUE, cores = 4L)
results
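Before fixing cores = 4, it may be worth checking how many cores the R session can actually see, since a Docker container may be given fewer CPUs than the host has. The sketch below uses parallel::detectCores(); the helper pick_cores is hypothetical and only caps the requested count.

```r
library(parallel)

# Number of logical cores visible to this R session. Inside a container this
# may reflect the CPUs given to Docker rather than the host's 10 cores.
available <- detectCores()

# Hypothetical helper: cap the requested core count at what is available.
pick_cores <- function(requested, available) {
  if (is.na(available)) return(1L)  # detectCores() can return NA
  as.integer(max(1L, min(requested, available)))
}

cores <- pick_cores(4L, available)
cores
```

The resulting value could then be passed as the cores argument of bm_parallel in place of the literal 4.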
Results
① R installed on macOS
[[1]]
user system elapsed test test_group cores
2 0.003 0.001 0.586 manip matrix_cal 4
3 0.003 0.001 0.446 manip matrix_cal 4
4 0.003 0.001 0.440 manip matrix_cal 4
5 0.003 0.001 0.435 manip matrix_cal 4
6 0.003 0.001 0.431 manip matrix_cal 4
[[2]]
user system elapsed test test_group cores
2 0.004 0.001 0.635 power matrix_cal 4
3 0.003 0.002 0.528 power matrix_cal 4
4 0.002 0.001 0.474 power matrix_cal 4
5 0.003 0.001 0.465 power matrix_cal 4
6 0.003 0.001 0.467 power matrix_cal 4
[[3]]
user system elapsed test test_group cores
2 0.005 0.003 1.300 sort matrix_cal 4
3 0.004 0.002 1.145 sort matrix_cal 4
4 0.004 0.002 1.149 sort matrix_cal 4
5 0.004 0.002 1.147 sort matrix_cal 4
6 0.004 0.002 1.146 sort matrix_cal 4
[[4]]
user system elapsed test test_group cores
2 0.019 0.021 12.858 cross_product matrix_cal 4
3 0.019 0.021 12.766 cross_product matrix_cal 4
4 0.019 0.020 12.868 cross_product matrix_cal 4
5 0.020 0.020 12.811 cross_product matrix_cal 4
6 0.020 0.021 12.823 cross_product matrix_cal 4
[[5]]
user system elapsed test test_group cores
2 0.005 0.003 1.276 lm matrix_cal 4
3 0.004 0.003 1.249 lm matrix_cal 4
4 0.004 0.002 1.281 lm matrix_cal 4
5 0.004 0.002 1.246 lm matrix_cal 4
6 0.004 0.002 1.247 lm matrix_cal 4
② Docker container based on arm64v8/r-base:4.1.2 (run on the M1 Max)
[[1]]
user system elapsed test test_group cores
2 0.002 0.003 0.930 manip matrix_cal 4
3 0.002 0.002 0.731 manip matrix_cal 4
4 0.003 0.000 0.746 manip matrix_cal 4
5 0.003 0.000 0.727 manip matrix_cal 4
6 0.003 0.000 0.715 manip matrix_cal 4
[[2]]
user system elapsed test test_group cores
2 0.003 0 0.794 power matrix_cal 4
3 0.004 0 0.690 power matrix_cal 4
4 0.004 0 0.628 power matrix_cal 4
5 0.004 0 0.607 power matrix_cal 4
6 0.004 0 0.623 power matrix_cal 4
[[3]]
user system elapsed test test_group cores
2 0.003 0 1.596 sort matrix_cal 4
3 0.003 0 1.322 sort matrix_cal 4
4 0.004 0 1.274 sort matrix_cal 4
5 0.003 0 1.280 sort matrix_cal 4
6 0.003 0 1.280 sort matrix_cal 4
[[4]]
user system elapsed test test_group cores
2 0.004 0 1.095 cross_product matrix_cal 4
3 0.003 0 0.993 cross_product matrix_cal 4
4 0.004 0 0.997 cross_product matrix_cal 4
5 0.003 0 0.985 cross_product matrix_cal 4
6 0.003 0 0.996 cross_product matrix_cal 4
[[5]]
user system elapsed test test_group cores
2 0.003 0 0.416 lm matrix_cal 4
3 0.003 0 0.474 lm matrix_cal 4
4 0.003 0 0.412 lm matrix_cal 4
5 0.003 0 0.473 lm matrix_cal 4
6 0.003 0 0.414 lm matrix_cal 4
③ Docker container based on amoselb/rstudio-m1:latest (run on the M1 Max)
[[1]]
user system elapsed test test_group cores
2 0.007 0.001 15.791 manip matrix_cal 4
3 0.003 0.004 12.259 manip matrix_cal 4
4 0.004 0.000 12.995 manip matrix_cal 4
5 0.003 0.000 9.879 manip matrix_cal 4
6 0.002 0.001 11.570 manip matrix_cal 4
[[2]]
user system elapsed test test_group cores
2 0.003 0.001 20.208 power matrix_cal 4
3 0.005 0.000 12.663 power matrix_cal 4
4 0.005 0.000 12.321 power matrix_cal 4
5 0.004 0.000 16.102 power matrix_cal 4
6 0.003 0.001 16.266 power matrix_cal 4
[[3]]
user system elapsed test test_group cores
2 0.003 0.002 27.194 sort matrix_cal 4
3 0.005 0.001 25.502 sort matrix_cal 4
4 0.003 0.001 22.389 sort matrix_cal 4
5 0.006 0.000 22.721 sort matrix_cal 4
6 0.003 0.002 21.820 sort matrix_cal 4
[[4]]
user system elapsed test test_group cores
2 0.004 0.001 277.029 cross_product matrix_cal 4
3 0.005 0.002 274.921 cross_product matrix_cal 4
4 0.004 0.000 258.306 cross_product matrix_cal 4
5 0.004 0.001 247.816 cross_product matrix_cal 4
6 0.004 0.000 246.017 cross_product matrix_cal 4
[[5]]
user system elapsed test test_group cores
2 0.002 0.001 25.374 lm matrix_cal 4
3 0.003 0.000 22.895 lm matrix_cal 4
4 0.003 0.000 24.812 lm matrix_cal 4
5 0.003 0.001 22.106 lm matrix_cal 4
6 0.002 0.001 19.931 lm matrix_cal 4
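It goes without saying that ① and ② are fast. ② beat ① on cross_product and lm (for example, about 1.0 s versus about 12.8 s on cross_product), while ① was somewhat faster on manip, power, and sort.
As for ③... I do not recommend using it.
That said, as of November 2021, ③ is the only way to run RStudio Server under Docker.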
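The list output of Test 2 is easier to compare across setups once it is collapsed into one row per test. Below is a minimal sketch of that step. The results list here is a stand-in built by hand from the setup ① elapsed times for two tests, assuming the same column layout as the printed output, so it runs without benchmarkme.

```r
# Stand-in for `results`: a list of data frames shaped like the printed output.
# Values are the setup 1 elapsed times (seconds) for two of the five tests.
results <- list(
  data.frame(elapsed = c(0.586, 0.446, 0.440, 0.435, 0.431),
             test = "manip", test_group = "matrix_cal", cores = 4L),
  data.frame(elapsed = c(12.858, 12.766, 12.868, 12.811, 12.823),
             test = "cross_product", test_group = "matrix_cal", cores = 4L)
)

# Stack the per-test data frames into a single table.
all_runs <- do.call(rbind, results)

# Use the median rather than the mean so that a slow first run
# (0.586 s for manip here) does not dominate the summary.
summary_tbl <- aggregate(elapsed ~ test + cores, data = all_runs, FUN = median)
summary_tbl
```

I chose the median because the first run in several of the tables is visibly slower than the rest; the mean would work the same way by swapping FUN.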
Conclusion
R built for arm is fast.