The kernel update for the Meltdown CPU vulnerability really did reduce performance

Posted on 2018-01-11

Environment

DELL PowerEdge R300
cpu: Xeon X3363 2.83GHz 4core, codename Yorkfield(-CL)
https://ark.intel.com/ja/products/35279/Intel-Xeon-Processor-X3363-12M-Cache-2_83-GHz-1333-MHz-FSB
mem: 12GB
hdd: SAS6G 15000rpm
os: ubuntu 16.04
kernel before the update: 4.4.0-104-generic
kernel after the update: 4.13.0-25-generic

Steps taken

Check the kernel version with uname -a
Linux un11 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Run spectre-meltdown-checker
wget https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh
sudo sh ./spectre-meltdown-checker.sh

Result
The first two entries are Spectre and the last one is Meltdown. The verdict: vulnerable on all three.

Spectre and Meltdown mitigation detection tool v0.21

Checking for vulnerabilities against live running kernel Linux 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO (only 38 opcodes found, should be >= 70)
> STATUS: VULNERABLE (heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): NO
* PTI enabled and active: NO
> STATUS: VULNERABLE (PTI is needed to mitigate the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer


Run the benchmark
wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/byte-unixbench/UnixBench5.1.3.tgz
tar xvzf UnixBench5.1.3.tgz
The results are listed at the end of this page.

Upgrade the kernel
sudo apt install linux-generic-hwe-16.04-edge

Reboot

Check the kernel version
Linux un11 4.13.0-25-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 12:16:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Run spectre-meltdown-checker again

Spectre and Meltdown mitigation detection tool v0.24

Checking for vulnerabilities against live running kernel Linux 4.13.0-25-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 12:16:39 UTC 2018 x86_64

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO (only 42 opcodes found, should be >= 70)
> STATUS: VULNERABLE (heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer

Only Meltdown has been mitigated. Both Spectre variants are still reported as vulnerable.
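
When comparing checker runs before and after an update, it can help to reduce the output to one status per CVE. The following is a minimal sketch, not part of the checker itself: `summarize_checker_output` is a hypothetical helper, and it assumes the output was saved as plain text (no color escape codes) in the format shown above.

```python
import re


def summarize_checker_output(text):
    """Map each CVE header line to the STATUS reported below it."""
    results = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*(CVE-\d{4}-\d+)\s", line)
        if header:
            current = header.group(1)
            continue
        # Match "NOT VULNERABLE" first so it is not cut down to "VULNERABLE".
        status = re.match(r"\s*>\s*STATUS:\s*(NOT VULNERABLE|VULNERABLE)", line)
        if status and current:
            results[current] = status.group(1)
    return results


# Excerpt of the v0.24 output above (some detail lines omitted).
sample = """\
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO (only 42 opcodes found, should be >= 70)
> STATUS: VULNERABLE (heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* PTI enabled and active: YES
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)
"""

if __name__ == "__main__":
    for cve, status in summarize_checker_output(sample).items():
        print(cve, status)
```

Running the same helper on the saved output from both kernels makes the single change (CVE-2017-5754) easy to spot.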


Run the benchmark again

Benchmark results

Before the mitigation

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: miner1: GNU/Linux
   OS: GNU/Linux -- 4.4.0-104-generic -- #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017
   Machine: x86_64 (x86_64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.2 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 1: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.2 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 2: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.2 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 3: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.2 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   19:05:19 up 21 days, 22:41,  1 user,  load average: 0.00, 0.00, 0.00; runlevel 5

------------------------------------------------------------------------
Benchmark Run: 水  1月 10 2018 19:05:19 - 19:33:34
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       28119902.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3158.7 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2974.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        783667.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          221717.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1180483.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1573155.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 198800.3 lps   (10.0 s, 7 samples)
Process Creation                               7286.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   9579.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2890.5 lpm   (60.0 s, 2 samples)
System Call Overhead                        2293641.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   28119902.0   2409.6
Double-Precision Whetstone                       55.0       3158.7    574.3
Execl Throughput                                 43.0       2974.6    691.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     783667.8   1979.0
File Copy 256 bufsize 500 maxblocks            1655.0     221717.8   1339.7
File Copy 4096 bufsize 8000 maxblocks          5800.0    1180483.7   2035.3
Pipe Throughput                               12440.0    1573155.7   1264.6
Pipe-based Context Switching                   4000.0     198800.3    497.0
Process Creation                                126.0       7286.8    578.3
Shell Scripts (1 concurrent)                     42.4       9579.3   2259.3
Shell Scripts (8 concurrent)                      6.0       2890.5   4817.6
System Call Overhead                          15000.0    2293641.9   1529.1
                                                                   ========
System Benchmarks Index Score                                        1332.2

------------------------------------------------------------------------
Benchmark Run: 水  1月 10 2018 19:33:34 - 20:01:50
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables      110748798.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12669.0 MWIPS (9.9 s, 7 samples)
Execl Throughput                              10551.4 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        627694.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          177384.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1454342.1 KBps  (30.0 s, 2 samples)
Pipe Throughput                             6318157.0 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1062264.5 lps   (10.0 s, 7 samples)
Process Creation                              19246.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  26011.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   3215.3 lpm   (60.0 s, 2 samples)
System Call Overhead                        4000482.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  110748798.1   9490.0
Double-Precision Whetstone                       55.0      12669.0   2303.5
Execl Throughput                                 43.0      10551.4   2453.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     627694.8   1585.1
File Copy 256 bufsize 500 maxblocks            1655.0     177384.4   1071.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1454342.1   2507.5
Pipe Throughput                               12440.0    6318157.0   5078.9
Pipe-based Context Switching                   4000.0    1062264.5   2655.7
Process Creation                                126.0      19246.2   1527.5
Shell Scripts (1 concurrent)                     42.4      26011.5   6134.8
Shell Scripts (8 concurrent)                      6.0       3215.3   5358.9
System Call Overhead                          15000.0    4000482.5   2667.0
                                                                   ========
System Benchmarks Index Score                                        2937.5

After the mitigation

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: miner1: GNU/Linux
   OS: GNU/Linux -- 4.13.0-25-generic -- #29~16.04.2-Ubuntu SMP Tue Jan 9 12:16:39 UTC 2018
   Machine: x86_64 (x86_64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.5 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 1: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.5 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 2: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.5 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   CPU 3: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz (5666.5 bogomips)
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
   10:27:19 up 12:29,  1 user,  load average: 0.00, 0.00, 0.00; runlevel 5

------------------------------------------------------------------------
Benchmark Run: 木  1月 11 2018 10:27:19 - 10:55:32
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       26211740.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3166.2 MWIPS (9.9 s, 7 samples)
Execl Throughput                               3624.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        417875.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          115622.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        880767.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                              670758.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 190568.3 lps   (10.0 s, 7 samples)
Process Creation                               7535.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   8931.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2451.1 lpm   (60.0 s, 2 samples)
System Call Overhead                         616139.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   26211740.0   2246.1
Double-Precision Whetstone                       55.0       3166.2    575.7
Execl Throughput                                 43.0       3624.6    842.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     417875.3   1055.2
File Copy 256 bufsize 500 maxblocks            1655.0     115622.4    698.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     880767.8   1518.6
Pipe Throughput                               12440.0     670758.3    539.2
Pipe-based Context Switching                   4000.0     190568.3    476.4
Process Creation                                126.0       7535.9    598.1
Shell Scripts (1 concurrent)                     42.4       8931.3   2106.4
Shell Scripts (8 concurrent)                      6.0       2451.1   4085.2
System Call Overhead                          15000.0     616139.0    410.8
                                                                   ========
System Benchmarks Index Score                                         966.3

------------------------------------------------------------------------
Benchmark Run: 木  1月 11 2018 10:55:32 - 11:23:47
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables      105167823.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12665.7 MWIPS (9.9 s, 7 samples)
Execl Throughput                               9885.1 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        601864.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          171021.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1324158.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2700705.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 676091.6 lps   (10.0 s, 7 samples)
Process Creation                              18847.7 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  20126.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2822.7 lpm   (60.1 s, 2 samples)
System Call Overhead                        2136850.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  105167823.9   9011.8
Double-Precision Whetstone                       55.0      12665.7   2302.9
Execl Throughput                                 43.0       9885.1   2298.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     601864.0   1519.9
File Copy 256 bufsize 500 maxblocks            1655.0     171021.2   1033.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1324158.9   2283.0
Pipe Throughput                               12440.0    2700705.7   2171.0
Pipe-based Context Switching                   4000.0     676091.6   1690.2
Process Creation                                126.0      18847.7   1495.8
Shell Scripts (1 concurrent)                     42.4      20126.8   4746.9
Shell Scripts (8 concurrent)                      6.0       2822.7   4704.5
System Call Overhead                          15000.0    2136850.9   1424.6
                                                                   ========
System Benchmarks Index Score                                        2360.1
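
The INDEX column and the final Index Score in the output above look consistent with a simple rule: each index is RESULT / BASELINE x 10, and the overall score is the geometric mean of the indices. The sketch below checks this against the single-copy run on the 4.4.0-104 kernel. The function names are made up for illustration, and this is a reconstruction from the printed numbers, not UnixBench's own code.

```python
import math


def index_value(result, baseline):
    """Per-test index: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10


def overall_score(indices):
    """Geometric mean of the per-test indices."""
    return math.exp(sum(math.log(i) for i in indices) / len(indices))


# (BASELINE, RESULT) pairs copied from the single-copy run before the update.
single_copy_before = [
    (116700.0, 28119902.0),  # Dhrystone 2 using register variables
    (55.0, 3158.7),          # Double-Precision Whetstone
    (43.0, 2974.6),          # Execl Throughput
    (3960.0, 783667.8),      # File Copy 1024 bufsize 2000 maxblocks
    (1655.0, 221717.8),      # File Copy 256 bufsize 500 maxblocks
    (5800.0, 1180483.7),     # File Copy 4096 bufsize 8000 maxblocks
    (12440.0, 1573155.7),    # Pipe Throughput
    (4000.0, 198800.3),      # Pipe-based Context Switching
    (126.0, 7286.8),         # Process Creation
    (42.4, 9579.3),          # Shell Scripts (1 concurrent)
    (6.0, 2890.5),           # Shell Scripts (8 concurrent)
    (15000.0, 2293641.9),    # System Call Overhead
]

if __name__ == "__main__":
    indices = [index_value(result, base) for base, result in single_copy_before]
    print(f"Index Score: {overall_score(indices):.1f}")
```

Because the score is a geometric mean, a large drop in one test (for example System Call Overhead) pulls the total down less than it would in an arithmetic mean.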

Comparison

| Test | Description | Before update | After update | After as % of before |
|---|---|---|---|---|
| Dhrystone 2 using register variables | Integer arithmetic (lps) | 110748798.1 | 105167823.9 | 95.0 |
| Double-Precision Whetstone | Floating-point arithmetic (MWIPS) | 12669.0 | 12665.7 | 100.0 |
| Execl Throughput | execl calls (lps) | 10551.4 | 9885.1 | 93.7 |
| File Copy 1024 bufsize 2000 maxblocks | File copy (KBps) | 627694.8 | 601864.0 | 95.9 |
| File Copy 256 bufsize 500 maxblocks | File copy (KBps) | 177384.4 | 171021.2 | 96.4 |
| File Copy 4096 bufsize 8000 maxblocks | File copy (KBps) | 1454342.1 | 1324158.9 | 91.0 |
| Pipe Throughput | Repeated pipe I/O (lps) | 6318157.0 | 2700705.7 | 42.7 |
| Pipe-based Context Switching | Repeated pipe I/O between two processes (lps) | 1062264.5 | 676091.6 | 63.6 |
| Process Creation | Process creation (lps) | 19246.2 | 18847.7 | 97.9 |
| Shell Scripts (1 concurrent) | Running shell commands (lpm) | 26011.5 | 20126.8 | 77.4 |
| Shell Scripts (8 concurrent) | Eight of the above at once (lpm) | 3215.3 | 2822.7 | 87.8 |
| System Call Overhead | getpid calls (lps) | 4000482.5 | 2136850.9 | 53.4 |
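
The last column of the comparison is the after-update result as a percentage of the before-update result, so 100 means no change and lower means slower. A minimal sketch of that arithmetic, using a few rows from the 4-parallel-copy runs (`percent_of_before` is just an illustrative name):

```python
def percent_of_before(before, after):
    """After-update result as a percentage of the before-update result."""
    return round(after / before * 100, 1)


# (before, after) results taken from the 4-parallel-copy runs.
results = {
    "Double-Precision Whetstone": (12669.0, 12665.7),
    "Pipe Throughput": (6318157.0, 2700705.7),
    "Pipe-based Context Switching": (1062264.5, 676091.6),
    "System Call Overhead": (4000482.5, 2136850.9),
}

if __name__ == "__main__":
    for name, (before, after) in results.items():
        print(f"{name}: {percent_of_before(before, after)}%")
```

The same function applied to the overall Index Scores gives 80.3 for the 4-parallel-copy runs (2937.5 to 2360.1) and 72.5 for the single-copy runs (1332.2 to 966.3).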

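The comparison uses the 4-parallel-copy runs. On the same machine the numbers are down almost across the board, so the performance loss looks real.
Most tests lost less than 10%, but anything built on pipes took a heavy hit.
The biggest losers (pipes, context switches, system calls) are the tests that enter the kernel most often, which fits how PTI works: it switches page tables on every kernel entry and exit.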

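Other notes

Spectre (Bounds Check Bypass) looks likely to be mitigated first?
Blocking Spectre (Branch Target Injection) attacks apparently requires the following three features:
IBRS
STIBP
IBPB
Unsurprisingly, no BIOS update has been released for the DELL PowerEdge R300.

Retpoline (a "return trampoline") reportedly avoids the attack without a performance hit.
However, it will apparently be a while before kernel and distribution updates arrive.
Let's send drinks to the people working hard on this.

Microcode distributed by Intel on January 8 (k)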
https://downloadcenter.intel.com/download/27431/Linux-Processor-Microcode-Data-File
