
Notes on the double-precision performance of the plain (non-XT) RX6800 (fp64 1 TF @ 110W)

Posted at 2021-05-01

Measured with mixbench-opencl-alt.

The driver is amdgpu-pro 21.10.
The card is power-limited to 110W.
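
The post does not say how the power limit was applied. As one possible approach, here is a minimal sketch that assumes the amdgpu kernel driver exposes the board power cap through the hwmon file `power1_cap` (in microwatts). The glob pattern and the helper names are assumptions for illustration, and the card and hwmon indices vary between systems.

```python
import glob

# Assumption: amdgpu exposes the board power cap via hwmon as power1_cap,
# expressed in microwatts. Card and hwmon indices differ per machine.
POWER_CAP_GLOB = "/sys/class/drm/card*/device/hwmon/hwmon*/power1_cap"


def watts_to_microwatts(watts):
    """Convert a power cap in watts to the microwatt integer sysfs expects."""
    return int(round(watts * 1_000_000))


def find_power_cap_files(pattern=POWER_CAP_GLOB):
    """Return candidate power1_cap files (empty list if none are present)."""
    return sorted(glob.glob(pattern))


def set_power_cap(path, watts):
    """Write a new cap to the given power1_cap file. Requires root."""
    with open(path, "w") as f:
        f.write(str(watts_to_microwatts(watts)))


if __name__ == "__main__":
    # Dry run: only show what would be written for a 110W cap.
    for path in find_power_cap_files():
        print(path, "<-", watts_to_microwatts(110))
```

The script only prints by default; calling `set_power_cap` is left to the reader, since writing an out-of-range value is rejected by the driver and the valid range depends on the board.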

------------------------ Device specifications ------------------------
Platform:            AMD Accelerated Parallel Processing
Device:              gfx1030/Advanced Micro Devices, Inc.
Driver version:      3246.0 (HSA1.1,LC)
Address bits:        64
GPU clock rate:      2475 MHz
Total global mem:    16368 MB
Max allowed buffer:  13912 MB
OpenCL version:      OpenCL 2.0
Total CUs:           30
-----------------------------------------------------------------------
Buffer size:            64MB
Workgroup size:         256
Workitem stride:        NDRange
Buffer allocation:      Device allocated
Timer:                  CL event based
Loading kernel source file...
Precompilation of kernels... [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
----------------------------------------------------------------------------- CSV data -----------------------------------------------------------------------------
Experiment ID, Single Precision ops,,,,              Double precision ops,,,,              Half precision ops,,,,                Integer operations,,,
Compute iters, Flops/byte, ex.time,  GFLOPS, GB/sec, Flops/byte, ex.time,  GFLOPS, GB/sec, Flops/byte, ex.time,  GFLOPS, GB/sec, Iops/byte, ex.time,   GIOPS, GB/sec
            0,      0.000,   13.96,    0.00, 615.13,      0.000,   11.43,    0.00,1502.86,      0.000,    5.44,    0.00,1579.14,     0.000,    5.47,    0.00,1571.27
            1,      0.129,    5.33,  201.29,1560.02,      0.065,   10.94,   98.18,1521.78,      0.258,    5.31,  404.30,1566.68,     0.129,    5.40,  198.98,1542.09
            2,      0.267,    5.16,  416.37,1561.39,      0.133,   10.81,  198.73,1490.49,      0.533,    5.19,  827.63,1551.81,     0.267,    5.18,  414.68,1555.06
            3,      0.414,    5.01,  642.34,1552.32,      0.207,   11.05,  291.44,1408.64,      0.828,    5.76, 1119.03,1352.16,     0.414,    6.47,  498.01,1203.52
            4,      0.571,    6.70,  640.77,1121.34,      0.286,   15.73,  273.05, 955.69,      1.143,    7.75, 1108.18, 969.66,     0.571,    8.28,  519.02, 908.29
            5,      0.741,    8.35,  642.99, 868.04,      0.370,   17.91,  299.70, 809.20,      1.481,    7.77, 1382.25, 933.02,     0.741,    7.93,  676.93, 913.86
            6,      0.923,    7.75,  831.37, 900.66,      0.462,   17.86,  360.63, 781.36,      1.846,    6.52, 1976.21,1070.45,     0.923,    7.13,  903.81, 979.13
            7,      1.120,    6.43, 1168.23,1043.06,      0.560,   17.57,  427.73, 763.81,      2.240,    5.55, 2710.57,1210.08,     1.120,    6.88, 1091.77, 974.80
            8,      1.333,    5.60, 1533.72,1150.29,      0.667,   17.81,  482.33, 723.50,      2.667,    4.81, 3568.66,1338.25,     1.333,    6.83, 1257.79, 943.34
            9,      1.565,    4.89, 1974.78,1261.67,      0.783,   18.48,  522.81, 668.04,      3.130,    4.34, 4454.28,1422.90,     1.565,    6.90, 1400.32, 894.65
           10,      1.818,    4.50, 2386.68,1312.67,      0.909,   18.67,  574.99, 632.49,      3.636,    3.78, 5680.22,1562.06,     1.818,    6.68, 1606.32, 883.47
           11,      2.095,    3.87, 3048.26,1454.85,      1.048,   18.79,  628.54, 599.97,      4.190,    3.49, 6766.03,1614.62,     2.095,    6.73, 1753.91, 837.09
           12,      2.400,    3.44, 3745.10,1560.46,      1.200,   19.11,  674.14, 561.78,      4.800,    3.07, 8394.99,1748.96,     2.400,    6.66, 1934.51, 806.05
           13,      2.737,    3.08, 4528.11,1654.50,      1.368,   19.57,  713.24, 521.21,      5.474,    2.90, 9630.17,1759.36,     2.737,    6.98, 2000.44, 730.93
           14,      3.111,    2.87, 5235.32,1682.78,      1.556,   20.63,  728.70, 468.45,      6.222,    2.66,11301.12,1816.25,     3.111,    7.13, 2107.39, 677.37
           15,      3.529,    2.65, 6071.15,1720.16,      1.765,   20.66,  779.73, 441.85,      7.059,    2.37,13616.44,1929.00,     3.529,    7.09, 2272.49, 643.87
           16,      4.000,    2.90, 5920.46,1480.11,      2.000,   21.14,  812.73, 406.36,      8.000,    2.88,11922.27,1490.28,     4.000,    7.31, 2350.36, 587.59
           17,      4.533,    2.62, 6975.72,1538.76,      2.267,   22.83,  799.57, 352.75,      9.067,    2.60,14052.84,1549.95,     4.533,    7.94, 2299.32, 507.20
           18,      5.143,    2.64, 7327.81,1424.85,      2.571,   24.21,  798.26, 310.43,     10.286,    2.46,15709.39,1527.30,     5.143,    8.08, 2393.24, 465.35
           19,      5.846,    2.51, 8114.83,1388.06,      2.923,   24.91,  818.89, 280.15,     11.692,    2.29,17850.86,1526.72,     5.846,    8.41, 2426.88, 415.12
           20,      6.667,    2.53, 8489.78,1273.47,      3.333,   25.71,  835.37, 250.61,     13.333,    2.24,19138.71,1435.40,     6.667,    8.55, 2510.64, 376.60
           21,      7.636,    2.49, 9065.66,1187.17,      3.818,   26.47,  851.83, 223.10,     15.273,    2.19,20588.88,1348.08,     7.636,    8.89, 2536.79, 332.20
           22,      8.800,    2.52, 9367.35,1064.47,      4.400,   27.58,  856.36, 194.63,     17.600,    2.18,21673.79,1231.47,     8.800,    9.12, 2589.57, 294.27
           23,     10.222,    2.50, 9869.85, 965.53,      5.111,   28.02,  881.31, 172.43,     20.444,    2.12,23246.81,1137.07,    10.222,    9.27, 2663.90, 260.60
           24,     12.000,    2.47,10445.93, 870.49,      6.000,   28.80,  894.67, 149.11,     24.000,    2.11,24469.46,1019.56,    12.000,    9.54, 2701.34, 225.11
           25,     14.286,    2.49,10795.75, 755.70,      7.143,   29.87,  898.83, 125.84,     28.571,    2.10,25515.66, 893.05,    14.286,    9.74, 2755.61, 192.89
           26,     17.333,    2.47,11284.22, 651.01,      8.667,   30.23,  923.43, 106.55,     34.667,    2.10,26534.79, 765.43,    17.333,    9.94, 2809.87, 162.11
           27,     21.600,    2.46,11761.40, 544.51,     10.800,   31.10,  932.17,  86.31,     43.200,    2.11,27544.35, 637.60,    21.600,   10.16, 2854.21, 132.14
           28,     28.000,    2.44,12325.21, 440.19,     14.000,   31.58,  952.09,  68.01,     56.000,    2.08,28951.26, 516.99,    28.000,   10.25, 2933.12, 104.75
           29,     38.667,    2.41,12929.93, 334.39,     19.333,   32.16,  968.14,  50.08,     77.333,    2.10,29604.93, 382.82,    38.667,   10.56, 2949.59,  76.28
           30,     60.000,    2.42,13288.83, 221.48,     30.000,   33.06,  974.46,  32.48,    120.000,    2.09,30880.64, 257.34,    60.000,   10.77, 2991.72,  49.86
           31,    124.000,    2.39,13922.48, 112.28,     62.000,   33.13, 1004.82,  16.21,    248.000,    2.02,33004.11, 133.08,   124.000,   10.41, 3197.87,  25.79
           32,        inf,    2.19,15667.59,   0.00,        inf,   32.10, 1070.35,   0.00,        inf,    2.01,34248.07,   0.00,       inf,   10.24, 3355.28,   0.00
-----------------------------------
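
To pull the peak numbers out of a table like the one above without reading it by eye, a small parser is enough. The sketch below assumes the 17-column layout shown in the CSV header (the iteration count followed by four groups of four values); the field names are made up for illustration and are not part of mixbench.

```python
import math

# One row = compute iters, then four groups (single, double, half, integer),
# each holding: ops/byte, ex.time, G(FL)OPS, GB/sec.
GROUPS = ("fp32", "fp64", "fp16", "int")
FIELDS = ("ops_per_byte", "ex_time", "gops", "gb_per_sec")


def parse_row(line):
    """Parse one mixbench CSV data row into a flat dict of floats."""
    cells = [c.strip() for c in line.split(",")]
    expected = 1 + len(GROUPS) * len(FIELDS)
    if len(cells) != expected:
        raise ValueError(f"expected {expected} columns, got {len(cells)}")
    row = {"iters": int(cells[0])}
    values = [float(c) for c in cells[1:]]  # float() also accepts "inf"
    for g, group in enumerate(GROUPS):
        for f, field in enumerate(FIELDS):
            row[f"{group}_{field}"] = values[g * len(FIELDS) + f]
    return row


# Last two RX6800 rows, copied from the output above.
SAMPLE = """\
31,    124.000,    2.39,13922.48, 112.28,     62.000,   33.13, 1004.82,  16.21,    248.000,    2.02,33004.11, 133.08,   124.000,   10.41, 3197.87,  25.79
32,        inf,    2.19,15667.59,   0.00,        inf,   32.10, 1070.35,   0.00,        inf,    2.01,34248.07,   0.00,       inf,   10.24, 3355.28,   0.00
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
peak_fp64 = max(r["fp64_gops"] for r in rows)
print("peak fp64 GFLOPS:", peak_fp64)
```

Feeding the full table instead of the two sample rows works the same way, as long as the header and separator lines are skipped first.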

Voila! Roughly in line with the spec, it reached fp64 1 TFlops.
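
For a rough sanity check of "in line with the spec", the theoretical peak can be computed by hand. The figures below are assumptions taken from AMD's published RX 6800 specifications rather than from this measurement: 3840 stream processors, a 2105 MHz boost clock, and an fp64 rate of 1/16 of fp32.

```python
def peak_gflops(stream_processors, clock_mhz, fp64_rate):
    """Return (fp32, fp64) theoretical peak in GFLOPS, counting an FMA as 2 flops."""
    fp32 = stream_processors * 2 * clock_mhz / 1000.0
    return fp32, fp32 * fp64_rate


# Assumed RX 6800 figures (published specs, not measured in this post).
RX6800_STREAM_PROCESSORS = 3840
RX6800_BOOST_MHZ = 2105
RX6800_FP64_RATE = 1.0 / 16.0

fp32_peak, fp64_peak = peak_gflops(
    RX6800_STREAM_PROCESSORS, RX6800_BOOST_MHZ, RX6800_FP64_RATE
)

# Best fp64 result in the table above (compute iters = 32).
measured_fp64 = 1070.35

print(f"theoretical fp32: {fp32_peak:.1f} GFLOPS")
print(f"theoretical fp64: {fp64_peak:.1f} GFLOPS")
print(f"measured / theoretical fp64: {measured_fp64 / fp64_peak:.2f}")
```

The measured value comes out slightly above the boost-clock figure. A plausible explanation is that the GPU clocks higher than the nominal boost during the fp64 kernel (the device reports a 2475 MHz clock), but that is a guess, not something verified here.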

For reference, here is the Radeon VII (power-limited to 145W).

------------------------ Device specifications ------------------------
Platform:            AMD Accelerated Parallel Processing
Device:              gfx906:sramecc-:xnack-/Advanced Micro Devices, Inc.
Driver version:      3246.0 (HSA1.1,LC)
Address bits:        64
GPU clock rate:      1801 MHz
Total global mem:    16368 MB
Max allowed buffer:  13912 MB
OpenCL version:      OpenCL 2.0
Total CUs:           60
-----------------------------------------------------------------------
Buffer size:            64MB
Workgroup size:         256
Workitem stride:        NDRange
Buffer allocation:      Device allocated
Timer:                  CL event based
Loading kernel source file...
Precompilation of kernels... [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
----------------------------------------------------------------------------- CSV data -----------------------------------------------------------------------------
Experiment ID, Single Precision ops,,,,              Double precision ops,,,,              Half precision ops,,,,                Integer operations,,,
Compute iters, Flops/byte, ex.time,  GFLOPS, GB/sec, Flops/byte, ex.time,  GFLOPS, GB/sec, Flops/byte, ex.time,  GFLOPS, GB/sec, Iops/byte, ex.time,   GIOPS, GB/sec
            0,      0.000,    8.83,    0.00, 973.01,      0.000,   18.44,    0.00, 931.88,      0.000,    8.61,    0.00, 997.36,     0.000,    8.56,    0.00,1003.78
            1,      0.129,   10.73,  100.07, 775.51,      0.065,   20.06,   53.53, 829.79,      0.258,   10.58,  202.90, 786.25,     0.129,   10.31,  104.17, 807.31
            2,      0.267,   10.33,  207.98, 779.92,      0.133,   19.69,  109.07, 818.02,      0.533,   10.19,  421.36, 790.04,     0.267,   10.30,  208.59, 782.21
            3,      0.414,    9.67,  333.28, 805.43,      0.207,   19.11,  168.61, 814.93,      0.828,    9.25,  696.16, 841.19,     0.414,    9.56,  337.09, 814.63
            4,      0.571,    9.12,  470.81, 823.92,      0.286,   18.47,  232.58, 814.04,      1.143,    9.18,  935.43, 818.50,     0.571,    9.35,  459.40, 803.94
            5,      0.741,    9.07,  592.11, 799.35,      0.370,   17.77,  302.16, 815.82,      1.481,    8.97, 1196.79, 807.83,     0.741,    9.18,  584.81, 789.50
            6,      0.923,    8.75,  736.02, 797.35,      0.462,   17.10,  376.81, 816.42,      1.846,    8.74, 1474.19, 798.52,     0.923,    8.67,  742.65, 804.54
            7,      1.120,    8.16,  920.79, 822.14,      0.560,   16.42,  457.75, 817.41,      2.240,    8.01, 1877.36, 838.11,     1.120,    8.33,  902.81, 806.08
            8,      1.333,    7.83, 1096.50, 822.38,      0.667,   16.00,  536.98, 805.47,      2.667,    7.79, 2204.76, 826.79,     1.333,    7.90, 1087.95, 815.96
            9,      1.565,    7.39, 1306.89, 834.96,      0.783,   15.16,  637.27, 814.29,      3.130,    7.45, 2595.75, 829.20,     1.565,    7.63, 1266.68, 809.27
           10,      1.818,    6.96, 1542.67, 848.47,      0.909,   14.91,  720.13, 792.14,      3.636,    6.89, 3115.92, 856.88,     1.818,    7.21, 1488.90, 818.89
           11,      2.095,    6.64, 1778.07, 848.62,      1.048,   13.74,  859.89, 820.81,      4.190,    6.55, 3604.49, 860.16,     2.095,    7.04, 1678.91, 801.30
           12,      2.400,    6.16, 2091.44, 871.44,      1.200,   13.64,  944.55, 787.13,      4.800,    6.16, 4182.78, 871.41,     2.400,    6.66, 1935.79, 806.58
           13,      2.737,    5.68, 2455.80, 897.31,      1.368,   13.19, 1058.53, 773.54,      5.474,    5.60, 4987.12, 911.11,     2.737,    6.32, 2208.66, 807.01
           14,      3.111,    5.41, 2779.68, 893.47,      1.556,   12.64, 1189.55, 764.71,      6.222,    5.53, 5432.70, 873.11,     3.111,    5.81, 2586.10, 831.25
           15,      3.529,    5.04, 3192.76, 904.61,      1.765,   11.92, 1351.14, 765.65,      7.059,    4.94, 6520.14, 923.69,     3.529,    5.61, 2868.90, 812.86
           16,      4.000,    5.70, 3014.71, 753.68,      2.000,   12.97, 1324.26, 662.13,      8.000,    5.75, 5973.24, 746.65,     4.000,    6.24, 2752.28, 688.07
           17,      4.533,    5.36, 3404.23, 750.93,      2.267,   12.20, 1496.38, 660.17,      9.067,    5.40, 6766.67, 746.32,     4.533,    6.20, 2945.29, 649.70
           18,      5.143,    5.21, 3709.08, 721.21,      2.571,   10.64, 1816.39, 706.37,     10.286,    5.05, 7651.69, 743.91,     5.143,    6.30, 3067.77, 596.51
           19,      5.846,    4.60, 4432.46, 758.18,      2.923,   10.35, 1971.64, 674.51,     11.692,    4.63, 8804.32, 753.00,     5.846,    6.61, 3085.03, 527.70
           20,      6.667,    4.39, 4890.68, 733.60,      3.333,    9.63, 2229.93, 668.98,     13.333,    4.35, 9862.94, 739.72,     6.667,    6.82, 3148.89, 472.33
           21,      7.636,    3.98, 5672.42, 742.82,      3.818,    9.39, 2401.76, 629.03,     15.273,    3.95,11417.00, 747.54,     7.636,    7.22, 3123.56, 409.04
           22,      8.800,    3.85, 6141.54, 697.90,      4.400,    9.69, 2437.57, 553.99,     17.600,    3.64,12985.88, 737.83,     8.800,    7.51, 3145.15, 357.40
           23,     10.222,    3.27, 7542.01, 737.81,      5.111,    9.56, 2582.83, 505.34,     20.444,    3.45,14305.98, 699.75,    10.222,    7.77, 3178.35, 310.93
           24,     12.000,    3.26, 7897.71, 658.14,      6.000,    9.85, 2616.23, 436.04,     24.000,    3.29,15662.54, 652.61,    12.000,    8.09, 3184.35, 265.36
           25,     14.286,    3.03, 8855.10, 619.86,      7.143,    9.90, 2711.28, 379.58,     28.571,    2.98,18038.67, 631.35,    14.286,    8.02, 3345.00, 234.15
           26,     17.333,    3.01, 9262.10, 534.35,      8.667,    9.74, 2865.25, 330.61,     34.667,    2.97,18803.69, 542.41,    17.333,    8.21, 3400.69, 196.19
           27,     21.600,    2.97, 9771.35, 452.38,     10.800,    9.87, 2938.25, 272.06,     43.200,    2.81,20668.48, 478.44,    21.600,    8.37, 3461.92, 160.27
           28,     28.000,    2.87,10457.53, 373.48,     14.000,   10.11, 2972.97, 212.36,     56.000,    2.94,20478.55, 365.69,    28.000,    8.54, 3518.81, 125.67
           29,     38.667,    2.82,11030.53, 285.27,     19.333,   10.10, 3083.54, 159.49,     77.333,    2.88,21637.94, 279.80,    38.667,    8.75, 3559.17,  92.05
           30,     60.000,    2.83,11379.93, 189.67,     30.000,   10.30, 3128.70, 104.29,    120.000,    2.84,22648.49, 188.74,    60.000,    8.94, 3603.47,  60.06
           31,    124.000,    2.76,12069.69,  97.34,     62.000,   10.35, 3216.99,  51.89,    248.000,    2.82,23644.16,  95.34,   124.000,    9.09, 3660.30,  29.52
           32,        inf,    2.76,12441.71,   0.00,        inf,   10.54, 3260.61,   0.00,        inf,    2.70,25497.97,   0.00,       inf,    9.09, 3780.10,   0.00
--------------------------------------------------------------------------------------------------------------------------------------------------------------------

fp64 3.2 TF, impressive as expected.

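A single RX6800 is no match for the Radeon VII, Titan V, and the like, but it has 16 GB of memory, so if you want double precision while also doing Vulkan ray tracing, lining up several cards (as many as needed) looks like a good option. (At ETH 60 MH/s @ 120W, mining can also pay off the cost.)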
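
To put the two cards side by side, the peak fp64 numbers and power limits from this post can be turned into GFLOPS per watt and a card count. This is only a sketch: it treats the power limit as the actual draw, which was not measured, and assumes perfect scaling across cards.

```python
import math

# Peak fp64 GFLOPS (compute iters = 32) and power limits from this post.
RX6800 = {"fp64_gflops": 1070.35, "power_cap_w": 110.0}
RADEON_VII = {"fp64_gflops": 3260.61, "power_cap_w": 145.0}


def gflops_per_watt(card):
    """fp64 GFLOPS per watt, using the power limit as a stand-in for real draw."""
    return card["fp64_gflops"] / card["power_cap_w"]


def cards_to_match(target, card):
    """Number of `card` boards needed to reach `target` fp64, assuming ideal scaling."""
    return math.ceil(target["fp64_gflops"] / card["fp64_gflops"])


rx6800_eff = gflops_per_watt(RX6800)
radeon_vii_eff = gflops_per_watt(RADEON_VII)
n_cards = cards_to_match(RADEON_VII, RX6800)

print(f"RX6800:     {rx6800_eff:.2f} GFLOPS/W")
print(f"Radeon VII: {radeon_vii_eff:.2f} GFLOPS/W")
print(f"RX6800 cards to match one Radeon VII in fp64: {n_cards}")
```

On these numbers the Radeon VII is clearly ahead for pure fp64 work, so the case for the RX6800 rests on the ray tracing support and on running several cards.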

That said, the RX6800 no longer seems to be obtainable, so the RX6900XT (1.5 TF) would probably be the alternative.

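With amdgpu-pro 21.10, Navi2 finally feels stable enough for regular use on Linux as well (plus beta support for Vulkan ray tracing), so double-precision computation on Navi2 may well become practical.
(Unlike with the GeForce EULA, it is nice that you can run it in a data center as much as you like.)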

