
Comparing the impact of the Meltdown mitigation with the LINPACK benchmark and others

Posted at 2018-01-04

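Introduction

Apparently a flaw called Meltdown has been found in Intel CPUs, and the mitigation may cause serious performance problems. Plenty of people are probably running benchmarks already, but I decided to try it myself.
Note, however, that this is an old environment: two Sandy Bridge-generation Xeon E5-2660 (2.2GHz 8C16T) processors on CentOS 6.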

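About the LINPACK benchmark

LINPACK is a benchmark that solves a system of linear equations as fast as possible, and it puts a heavy load on the CPU and on memory bandwidth. Here I use the Intel® Optimized LINPACK Benchmark for Linux. The version is linpack_11.0.3, the one I used when the machine was first built, and running the benchmark is just a matter of executing runme_xeon64. Note that the machine needs at least 16 GB of physical memory.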

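Comparison results

According to the Red Hat site, the Meltdown mitigation was shipped in kernel-2.6.32-696.18.7.el6.x86_64. This fixed version has not yet landed in the CentOS 6 repos (it appears to exist for CentOS 7).
For now, I run the benchmark on my own environment with the pre-mitigation kernel-2.6.32-696.16.1.el6.x86_64.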

lin_xeon64-before.txt
This is a SAMPLE run script for SMP LINPACK. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
2018年  1月  4日 木曜日 17:29:53 JST
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Thu Jan  4 17:29:53 2018

CPU frequency:    2.999 GHz
Number of CPUs: 2
Number of cores: 16
Number of threads: 32

Parameters are set to:

Number of tests: 15
Number of equations to solve (problem size) : 1000  2000  5000  10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
Leading dimension of array                  : 1000  2000  5008  10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
Number of trials to run                     : 4     2     2     2     2     2     2     2     2     2     1     1     1     1     1
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     1     1     1     1

Maximum memory requested that can be used=16200901024, at the size=45000

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      0.023      28.4723  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.018      37.6328  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.017      38.9671  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.017      38.2876  8.724688e-13 2.975343e-02   pass
2000   2000   4      0.053      101.4005 4.701128e-12 4.089406e-02   pass
2000   2000   4      0.052      103.0739 4.701128e-12 4.089406e-02   pass
5000   5008   4      0.623      133.8619 2.434170e-11 3.394253e-02   pass
5000   5008   4      0.633      131.8303 2.434170e-11 3.394253e-02   pass
10000  10000  4      3.863      172.6226 8.916344e-11 3.143993e-02   pass
10000  10000  4      3.861      172.6983 8.916344e-11 3.143993e-02   pass
15000  15000  4      14.544     154.7288 2.165846e-10 3.411244e-02   pass
15000  15000  4      13.828     162.7475 2.165846e-10 3.411244e-02   pass
18000  18008  4      23.296     166.9203 2.945255e-10 3.225417e-02   pass
18000  18008  4      23.203     167.5927 2.945255e-10 3.225417e-02   pass
20000  20016  4      32.185     165.7339 3.831049e-10 3.391318e-02   pass
20000  20016  4      32.032     166.5251 3.831049e-10 3.391318e-02   pass
22000  22008  4      42.983     165.1728 4.066827e-10 2.978791e-02   pass
22000  22008  4      42.664     166.4062 4.066827e-10 2.978791e-02   pass
25000  25000  4      61.417     169.6253 5.501781e-10 3.128666e-02   pass
25000  25000  4      61.447     169.5433 5.501781e-10 3.128666e-02   pass
26000  26000  4      69.610     168.3475 5.851288e-10 3.076783e-02   pass
26000  26000  4      69.455     168.7243 5.851288e-10 3.076783e-02   pass
27000  27000  4      79.124     165.8588 6.532881e-10 3.185765e-02   pass
30000  30000  1      104.469    172.3171 7.329930e-10 2.889466e-02   pass
35000  35000  1      159.257    179.4945 1.115330e-09 3.237635e-02   pass
40000  40000  1      208.059    205.0856 1.359319e-09 3.023172e-02   pass
45000  45000  1      297.994    203.8768 1.876477e-09 3.301464e-02   pass

Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
1000   1000   4       35.8399  38.9671
2000   2000   4       102.2372 103.0739
5000   5008   4       132.8461 133.8619
10000  10000  4       172.6604 172.6983
15000  15000  4       158.7381 162.7475
18000  18008  4       167.2565 167.5927
20000  20016  4       166.1295 166.5251
22000  22008  4       165.7895 166.4062
25000  25000  4       169.5843 169.6253
26000  26000  4       168.5359 168.7243
27000  27000  4       165.8588 165.8588
30000  30000  1       172.3171 172.3171
35000  35000  1       179.4945 179.4945
40000  40000  1       205.0856 205.0856
45000  45000  1       203.8768 203.8768

Residual checks PASSED

End of tests

Done: 2018年  1月  4日 木曜日 18:07:59 JST

In the early hours of January 5 (04-Jan-2018 19:42 UTC), kernel-2.6.32-696.18.7.el6.x86_64 appears to have been added to updates, so I updated, rebooted, and ran the test again.

lin_xeon64-after.txt
This is a SAMPLE run script for SMP LINPACK. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
2018年  1月  5日 金曜日 10:14:10 JST
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Fri Jan  5 10:14:10 2018

CPU frequency:    1.200 GHz
Number of CPUs: 2
Number of cores: 16
Number of threads: 32

Parameters are set to:

Number of tests: 15
Number of equations to solve (problem size) : 1000  2000  5000  10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
Leading dimension of array                  : 1000  2000  5008  10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
Number of trials to run                     : 4     2     2     2     2     2     2     2     2     2     1     1     1     1     1
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     1     1     1     1

Maximum memory requested that can be used=16200901024, at the size=45000

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      0.031      21.5036  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.019      34.6105  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.019      34.7159  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.019      35.4197  8.724688e-13 2.975343e-02   pass
2000   2000   4      0.052      101.9037 4.701128e-12 4.089406e-02   pass
2000   2000   4      0.052      102.3420 4.701128e-12 4.089406e-02   pass
5000   5008   4      0.616      135.4189 2.434170e-11 3.394253e-02   pass
5000   5008   4      0.622      134.0885 2.434170e-11 3.394253e-02   pass
10000  10000  4      3.893      171.3056 8.916344e-11 3.143993e-02   pass
10000  10000  4      3.867      172.4651 8.916344e-11 3.143993e-02   pass
15000  15000  4      14.059     160.0744 2.165846e-10 3.411244e-02   pass
15000  15000  4      14.510     155.0924 2.165846e-10 3.411244e-02   pass
18000  18008  4      23.791     163.4498 2.945255e-10 3.225417e-02   pass
18000  18008  4      23.628     164.5806 2.945255e-10 3.225417e-02   pass
20000  20016  4      32.758     162.8351 3.831049e-10 3.391318e-02   pass
20000  20016  4      32.702     163.1117 3.831049e-10 3.391318e-02   pass
22000  22008  4      43.922     161.6420 4.066827e-10 2.978791e-02   pass
22000  22008  4      43.850     161.9055 4.066827e-10 2.978791e-02   pass
25000  25000  4      62.763     165.9878 5.501781e-10 3.128666e-02   pass
25000  25000  4      62.212     167.4592 5.501781e-10 3.128666e-02   pass
26000  26000  4      71.553     163.7752 5.851288e-10 3.076783e-02   pass
26000  26000  4      72.195     162.3197 5.851288e-10 3.076783e-02   pass
27000  27000  4      82.244     159.5668 6.532881e-10 3.185765e-02   pass
30000  30000  1      107.351    167.6913 7.329930e-10 2.889466e-02   pass
35000  35000  1      165.246    172.9897 1.115330e-09 3.237635e-02   pass
40000  40000  1      210.337    202.8639 1.359319e-09 3.023172e-02   pass
45000  45000  1      304.250    199.6849 1.876477e-09 3.301464e-02   pass

Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
1000   1000   4       31.5624  35.4197
2000   2000   4       102.1229 102.3420
5000   5008   4       134.7537 135.4189
10000  10000  4       171.8853 172.4651
15000  15000  4       157.5834 160.0744
18000  18008  4       164.0152 164.5806
20000  20016  4       162.9734 163.1117
22000  22008  4       161.7738 161.9055
25000  25000  4       166.7235 167.4592
26000  26000  4       163.0475 163.7752
27000  27000  4       159.5668 159.5668
30000  30000  1       167.6913 167.6913
35000  35000  1       172.9897 172.9897
40000  40000  1       202.8639 202.8639
45000  45000  1       199.6849 199.6849

Residual checks PASSED

End of tests

Done: 2018年  1月  5日 金曜日 10:52:51 JST

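Performance does appear to have dropped, but only by about 2%, so I don't feel this can be called a serious slowdown. This result is consistent with the Red Hat knowledge base article.
libvirt and qemu-kvm were also updated along with the kernel, so I suspect virtualized environments are affected more strongly.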
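As a rough check of the "about 2%" figure, here is a minimal Python sketch that computes the relative change in average GFlops for each problem size. The numbers are copied from the two Performance Summary tables above; the helper name `percent_change` is just an illustrative choice.

```python
# Average GFlops per problem size, copied from the "Performance Summary"
# tables of lin_xeon64-before.txt and lin_xeon64-after.txt above.
before = {
    1000: 35.8399, 2000: 102.2372, 5000: 132.8461, 10000: 172.6604,
    15000: 158.7381, 18000: 167.2565, 20000: 166.1295, 22000: 165.7895,
    25000: 169.5843, 26000: 168.5359, 27000: 165.8588, 30000: 172.3171,
    35000: 179.4945, 40000: 205.0856, 45000: 203.8768,
}
after = {
    1000: 31.5624, 2000: 102.1229, 5000: 134.7537, 10000: 171.8853,
    15000: 157.5834, 18000: 164.0152, 20000: 162.9734, 22000: 161.7738,
    25000: 166.7235, 26000: 163.0475, 27000: 159.5668, 30000: 167.6913,
    35000: 172.9897, 40000: 202.8639, 45000: 199.6849,
}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative = slower)."""
    return (new - old) / old * 100.0


changes = {size: percent_change(before[size], after[size]) for size in before}

for size, pct in changes.items():
    print(f"{size:>6}  {before[size]:9.4f} -> {after[size]:9.4f}  {pct:+6.2f}%")
```

Keep in mind that the larger sizes were run only once, so a single percentage per size says little about run-to-run noise.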

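Benchmarking on a virtualized guest

While I'm at it, I'll create a virtualized guest and compare.
The virtualization host uses KVM; the setup is 32 vCPUs, 128 GB of memory, and CentOS 7.4 installed on iSCSI storage (HDD) connected over 10GbE. It was installed as minimal, and no package updates have been applied since.
Note that no other guests were running on the virtualization host during the benchmarks below.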

Before updating the kernel etc. on the virtualization host

For practical reasons, the pre-patch host has slightly older packages.

# yum list installed kernel libvirt qemu-kvm
インストール済みパッケージ
kernel.x86_64                  2.6.32-642.13.1.el6                      @updates
libvirt.x86_64                 0.10.2-60.el6                            @base
qemu-kvm.x86_64                2:0.12.1.2-2.491.el6_8.3                 @updates

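LINPACK benchmark

To run the LINPACK benchmark described above, the vCPU model is set to SandyBridge, the same as the host.
For some reason the log says the run started at 14:05 and took more than an hour and a half, but it should actually have started at around 15:00...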

lin_xeon64-kvm-before.txt
2018年  1月  5日 金曜日 14:05:46 JST
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Fri Jan  5 14:05:46 2018

CPU frequency:    1.183 GHz
Number of CPUs: 32
Number of cores: 32
Number of threads: 32

Parameters are set to:

Number of tests: 15
Number of equations to solve (problem size) : 1000  2000  5000  10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
Leading dimension of array                  : 1000  2000  5008  10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
Number of trials to run                     : 4     2     2     2     2     2     2     2     2     2     1     1     1     1     1
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     1     1     1     1

Maximum memory requested that can be used=16200901024, at the size=45000

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      1.079      0.6195   8.724688e-13 2.975343e-02   pass
1000   1000   4      0.304      2.2011   8.724688e-13 2.975343e-02   pass
1000   1000   4      0.009      71.4727  8.724688e-13 2.975343e-02   pass
1000   1000   4      0.009      70.7288  8.724688e-13 2.975343e-02   pass
2000   2000   4      0.045      119.0130 4.700934e-12 4.089237e-02   pass
2000   2000   4      0.047      113.0734 4.700934e-12 4.089237e-02   pass
5000   5008   4      1.012      82.3808  2.905478e-11 4.051455e-02   pass
5000   5008   4      0.904      92.2604  2.905478e-11 4.051455e-02   pass
10000  10000  4      6.006      111.0320 8.916344e-11 3.143993e-02   pass
10000  10000  4      9.940      67.0859  8.916344e-11 3.143993e-02   pass
15000  15000  4      20.360     110.5331 2.245460e-10 3.536637e-02   pass
15000  15000  4      12.060     186.6063 2.245460e-10 3.536637e-02   pass
18000  18008  4      24.829     156.6172 3.547512e-10 3.884962e-02   pass
18000  18008  4      23.544     165.1681 3.547512e-10 3.884962e-02   pass
20000  20016  4      24.910     214.1382 3.717613e-10 3.290902e-02   pass
20000  20016  4      33.420     159.6077 3.717613e-10 3.290902e-02   pass
22000  22008  4      36.094     196.6980 4.602465e-10 3.371124e-02   pass
22000  22008  4      34.890     203.4869 4.602465e-10 3.371124e-02   pass
25000  25000  4      57.939     179.8094 6.014913e-10 3.420466e-02   pass
25000  25000  4      62.237     167.3898 6.014913e-10 3.420466e-02   pass
26000  26000  4      68.882     170.1270 6.095033e-10 3.204951e-02   pass
26000  26000  4      53.737     218.0738 6.095033e-10 3.204951e-02   pass
27000  27000  4      83.299     157.5464 6.309912e-10 3.077034e-02   pass
30000  30000  1      91.139     197.5192 8.002768e-10 3.154699e-02   pass
35000  35000  1      143.780    198.8155 1.077145e-09 3.126789e-02   pass
40000  40000  1      193.893    220.0695 1.361293e-09 3.027561e-02   pass
45000  45000  1      294.865    206.0405 1.779995e-09 3.131714e-02   pass

Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
1000   1000   4       36.2555  71.4727
2000   2000   4       116.0432 119.0130
5000   5008   4       87.3206  92.2604
10000  10000  4       89.0589  111.0320
15000  15000  4       148.5697 186.6063
18000  18008  4       160.8926 165.1681
20000  20016  4       186.8729 214.1382
22000  22008  4       200.0925 203.4869
25000  25000  4       173.5996 179.8094
26000  26000  4       194.1004 218.0738
27000  27000  4       157.5464 157.5464
30000  30000  1       197.5192 197.5192
35000  35000  1       198.8155 198.8155
40000  40000  1       220.0695 220.0695
45000  45000  1       206.0405 206.0405

Residual checks PASSED

End of tests

Done: 2018年  1月  5日 金曜日 15:37:40 JST

UnixBench

While I'm at it, I also compare with UnixBench 5.1.3.

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: localhost.localdomain: GNU/Linux
   OS: GNU/Linux -- 3.10.0-693.el7.x86_64 -- #1 SMP Tue Aug 22 21:09:27 UTC 2017
   Machine: x86_64 (x86_64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 1: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 2: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 3: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 4: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 5: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 6: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 7: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 8: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 9: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 10: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 11: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 12: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 13: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 14: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 15: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 16: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 17: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 18: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 19: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 20: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 21: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 22: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 23: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 24: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 25: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 26: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 27: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 28: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 29: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 30: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 31: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   15:57:04 up  1:55,  3 users,  load average: 0.23, 0.76, 9.16; runlevel 3

------------------------------------------------------------------------
Benchmark Run: 金  1月 05 2018 15:57:04 - 16:24:07
32 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       31960454.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3880.5 MWIPS (4.6 s, 7 samples)
Execl Throughput                               1122.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1012632.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          283216.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       2497954.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1448859.8 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 132236.3 lps   (10.0 s, 7 samples)
Process Creation                               3279.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2552.0 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1307.1 lpm   (60.0 s, 2 samples)
System Call Overhead                        2214891.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   31960454.1   2738.7
Double-Precision Whetstone                       55.0       3880.5    705.5
Execl Throughput                                 43.0       1122.7    261.1
File Copy 1024 bufsize 2000 maxblocks          3960.0    1012632.2   2557.2
File Copy 256 bufsize 500 maxblocks            1655.0     283216.9   1711.3
File Copy 4096 bufsize 8000 maxblocks          5800.0    2497954.9   4306.8
Pipe Throughput                               12440.0    1448859.8   1164.7
Pipe-based Context Switching                   4000.0     132236.3    330.6
Process Creation                                126.0       3279.8    260.3
Shell Scripts (1 concurrent)                     42.4       2552.0    601.9
Shell Scripts (8 concurrent)                      6.0       1307.1   2178.5
System Call Overhead                          15000.0    2214891.1   1476.6
                                                                   ========
System Benchmarks Index Score                                        1052.4

------------------------------------------------------------------------
Benchmark Run: 金  1月 05 2018 16:24:07 - 16:24:07
32 CPUs in system; running 32 parallel copies of tests

After updating the kernel etc. on the virtualization host

# yum list installed kernel libvirt qemu-kvm
インストール済みパッケージ
kernel.x86_64                  2.6.32-696.18.7.el6                      @updates
libvirt.x86_64                 0.10.2-62.el6_9.1                        @updates
qemu-kvm.x86_64                2:0.12.1.2-2.503.el6_9.4                 @updates

LINPACK benchmark

lin_xeon64-kvm-after.txt
2018年  1月  5日 金曜日 17:10:02 JST
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Fri Jan  5 17:10:02 2018

CPU frequency:    1.185 GHz
Number of CPUs: 32
Number of cores: 32
Number of threads: 32

Parameters are set to:

Number of tests: 15
Number of equations to solve (problem size) : 1000  2000  5000  10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
Leading dimension of array                  : 1000  2000  5008  10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
Number of trials to run                     : 4     2     2     2     2     2     2     2     2     2     1     1     1     1     1
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     1     1     1     1

Maximum memory requested that can be used=16200901024, at the size=45000

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      0.364      1.8369   8.724688e-13 2.975343e-02   pass
1000   1000   4      1.000      0.6690   8.724688e-13 2.975343e-02   pass
1000   1000   4      1.437      0.4653   8.724688e-13 2.975343e-02   pass
1000   1000   4      1.319      0.5070   8.724688e-13 2.975343e-02   pass
2000   2000   4      0.042      126.0391 4.700934e-12 4.089237e-02   pass
2000   2000   4      0.047      113.0396 4.700934e-12 4.089237e-02   pass
5000   5008   4      0.762      109.4769 2.905478e-11 4.051455e-02   pass
5000   5008   4      0.742      112.3330 2.905478e-11 4.051455e-02   pass
10000  10000  4      4.011      166.2445 8.916344e-11 3.143993e-02   pass
10000  10000  4      3.957      168.5287 8.916344e-11 3.143993e-02   pass
15000  15000  4      10.794     208.4864 2.245460e-10 3.536637e-02   pass
15000  15000  4      10.799     208.3938 2.245460e-10 3.536637e-02   pass
18000  18008  4      18.125     214.5514 3.547512e-10 3.884962e-02   pass
18000  18008  4      30.476     127.5958 3.547512e-10 3.884962e-02   pass
20000  20016  4      25.014     213.2472 3.717613e-10 3.290902e-02   pass
20000  20016  4      25.086     212.6328 3.717613e-10 3.290902e-02   pass
22000  22008  4      32.748     216.7970 4.602465e-10 3.371124e-02   pass
22000  22008  4      32.912     215.7182 4.602465e-10 3.371124e-02   pass
25000  25000  4      62.711     166.1261 6.014913e-10 3.420466e-02   pass
25000  25000  4      46.967     221.8114 6.014913e-10 3.420466e-02   pass
26000  26000  4      51.578     227.2016 6.095033e-10 3.204951e-02   pass
26000  26000  4      51.710     226.6210 6.095033e-10 3.204951e-02   pass
27000  27000  4      58.047     226.0823 6.309912e-10 3.077034e-02   pass
30000  30000  1      78.164     230.3074 8.002768e-10 3.154699e-02   pass
35000  35000  1      125.388    227.9778 1.077145e-09 3.126789e-02   pass
40000  40000  1      215.707    197.8145 1.361293e-09 3.027561e-02   pass
45000  45000  1      282.994    214.6835 1.779995e-09 3.131714e-02   pass

Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
1000   1000   4       0.8696   1.8369
2000   2000   4       119.5393 126.0391
5000   5008   4       110.9050 112.3330
10000  10000  4       167.3866 168.5287
15000  15000  4       208.4401 208.4864
18000  18008  4       171.0736 214.5514
20000  20016  4       212.9400 213.2472
22000  22008  4       216.2576 216.7970
25000  25000  4       193.9687 221.8114
26000  26000  4       226.9113 227.2016
27000  27000  4       226.0823 226.0823
30000  30000  1       230.3074 230.3074
35000  35000  1       227.9778 227.9778
40000  40000  1       197.8145 197.8145
45000  45000  1       214.6835 214.6835

Residual checks PASSED

End of tests

Done: 2018年  1月  5日 金曜日 17:49:58 JST

UnixBench

BYTE UNIX Benchmarks (Version 5.1.3)
   System: localhost.localdomain: GNU/Linux
   OS: GNU/Linux -- 3.10.0-693.el7.x86_64 -- #1 SMP Tue Aug 22 21:09:27 UTC 2017
   Machine: x86_64 (x86_64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 1: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 2: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 3: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 4: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 5: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 6: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 7: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 8: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 9: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 10: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 11: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 12: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 13: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 14: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 15: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 16: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 17: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 18: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 19: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 20: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 21: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 22: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 23: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 24: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 25: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 26: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 27: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 28: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 29: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 30: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   CPU 31: Intel Xeon E312xx (Sandy Bridge) (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   18:08:06 up  1:24,  1 user,  load average: 0.00, 0.83, 9.15; runlevel 3

------------------------------------------------------------------------
Benchmark Run: 金  1月 05 2018 18:08:06 - 18:35:07
32 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       32439270.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3944.9 MWIPS (4.9 s, 7 samples)
Execl Throughput                               1133.5 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1029433.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          287239.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       2759356.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1501040.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 129568.2 lps   (10.0 s, 7 samples)
Process Creation                               3351.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2616.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1357.5 lpm   (60.0 s, 2 samples)
System Call Overhead                        2257813.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   32439270.4   2779.7
Double-Precision Whetstone                       55.0       3944.9    717.3
Execl Throughput                                 43.0       1133.5    263.6
File Copy 1024 bufsize 2000 maxblocks          3960.0    1029433.5   2599.6
File Copy 256 bufsize 500 maxblocks            1655.0     287239.0   1735.6
File Copy 4096 bufsize 8000 maxblocks          5800.0    2759356.5   4757.5
Pipe Throughput                               12440.0    1501040.6   1206.6
Pipe-based Context Switching                   4000.0     129568.2    323.9
Process Creation                                126.0       3351.8    266.0
Shell Scripts (1 concurrent)                     42.4       2616.8    617.2
Shell Scripts (8 concurrent)                      6.0       1357.5   2262.5
System Call Overhead                          15000.0    2257813.0   1505.2
                                                                   ========
System Benchmarks Index Score                                        1078.1

------------------------------------------------------------------------
Benchmark Run: 金  1月 05 2018 18:35:07 - 18:35:07
32 CPUs in system; running 32 parallel copies of tests
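To compare the two single-copy UnixBench runs on the guest side by side, the following Python sketch computes the per-test change in the index values, copied from the two "System Benchmarks Index Values" tables above. It also recomputes the overall score as a geometric mean of the individual indices; that the score is derived this way is my assumption, though the recomputed values agree with the reported 1052.4 and 1078.1.

```python
import math

# INDEX column of the single-copy runs on the guest, copied from the logs above:
# (index before host update, index after host update)
indices = {
    "Dhrystone 2 using register variables": (2738.7, 2779.7),
    "Double-Precision Whetstone": (705.5, 717.3),
    "Execl Throughput": (261.1, 263.6),
    "File Copy 1024 bufsize 2000 maxblocks": (2557.2, 2599.6),
    "File Copy 256 bufsize 500 maxblocks": (1711.3, 1735.6),
    "File Copy 4096 bufsize 8000 maxblocks": (4306.8, 4757.5),
    "Pipe Throughput": (1164.7, 1206.6),
    "Pipe-based Context Switching": (330.6, 323.9),
    "Process Creation": (260.3, 266.0),
    "Shell Scripts (1 concurrent)": (601.9, 617.2),
    "Shell Scripts (8 concurrent)": (2178.5, 2262.5),
    "System Call Overhead": (1476.6, 1505.2),
}


def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-test relative change in percent (positive = higher index after the update).
index_change = {
    name: (new - old) / old * 100.0 for name, (old, new) in indices.items()
}

score_before = geometric_mean(old for old, _ in indices.values())
score_after = geometric_mean(new for _, new in indices.values())

for name, pct in index_change.items():
    print(f"{name:<40} {pct:+6.2f}%")
print(f"Overall score: {score_before:.1f} -> {score_after:.1f}")
```

Most tests have only two samples each, so small differences in either direction are probably within noise.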

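Conclusion

For now, it doesn't look like virtualized guests immediately become slow either. Loading many guests at the same time might reveal a difference, but that is hard to try in my environment, so I'll pass. Alternatively, as has been reported for Windows desktop PCs, a significant difference might show up with fast storage.
What bothered me more was how unstable the results were when running the LINPACK benchmark on the virtualization host.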
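To put a rough number on that instability, the sketch below compares the gap between the two trials of the same problem size in the bare-metal pre-update log and in the guest pre-update log. The GFlops values are copied from lin_xeon64-before.txt and lin_xeon64-kvm-before.txt above for four sizes; the choice of sizes and the spread metric (difference divided by mean) are my own assumptions, not something the benchmark reports.

```python
# GFlops of the two trials per problem size, copied from the logs above.
bare_metal = {
    10000: (172.6226, 172.6983),
    15000: (154.7288, 162.7475),
    20000: (165.7339, 166.5251),
    26000: (168.3475, 168.7243),
}
guest = {
    10000: (111.0320, 67.0859),
    15000: (110.5331, 186.6063),
    20000: (214.1382, 159.6077),
    26000: (170.1270, 218.0738),
}


def spread_percent(trials):
    """Gap between the fastest and slowest trial, relative to their mean."""
    mean = sum(trials) / len(trials)
    return (max(trials) - min(trials)) / mean * 100.0


bare_spread = {size: spread_percent(t) for size, t in bare_metal.items()}
guest_spread = {size: spread_percent(t) for size, t in guest.items()}

for size in bare_metal:
    print(f"{size:>6}  bare metal {bare_spread[size]:5.1f}%  guest {guest_spread[size]:5.1f}%")
```

With only two trials per size this is an indication at best; more repetitions would be needed to separate the effect of the update from ordinary variance in the guest.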
