KVM
Spectre
Meltdown

Performance Verification After Applying the Meltdown/Spectre Patches on KVM


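Overview

Following an earlier Qiita post, I ran performance tests on a KVM guest after applying the Meltdown/Spectre patches.

For now this covers only UnixBench. I will add further results as I verify them.

Red Hat reference information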

https://access.redhat.com/ja/security/vulnerabilities/3311961


Prerequisites


  • The KVM host that runs the test VMs is already patched.

  • The test VMs run CentOS 7.3 with 4 CPU cores and 16 GB of memory.


Version information


VM (unpatched)

$ uname -a

Linux vm1 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 13 10:46:25 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux


VM (patched)

$ uname -a

Linux vm2 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
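
When many guests have to be checked, comparing kernel release strings as plain text is unreliable, because `693.5.2` sorts after `693.11.6` lexically. The following is a minimal Python sketch, assuming the release string has the `3.10.0-693.11.6.el7.x86_64` shape shown above. The names `kernel_release_key` and `is_at_least_patched` are hypothetical, and the threshold is simply the patched VM's kernel from this post, not an official list of fixed builds.

```python
import re

# Kernel release of the patched VM in this post, used as the threshold (an assumption).
PATCHED_RELEASE = "3.10.0-693.11.6.el7.x86_64"


def kernel_release_key(release):
    """Convert '3.10.0-693.11.6.el7.x86_64' into (3, 10, 0, 693, 11, 6)."""
    # Take the leading run of digits, dots and hyphens; it stops at 'el7'.
    numeric_part = re.match(r"[\d.\-]+", release).group(0)
    return tuple(int(part) for part in re.split(r"[.\-]", numeric_part) if part)


def is_at_least_patched(release, threshold=PATCHED_RELEASE):
    """Return True if `release` is the threshold build or newer."""
    return kernel_release_key(release) >= kernel_release_key(threshold)


print(is_at_least_patched("3.10.0-693.5.2.el7.x86_64"))   # unpatched VM (vm1)
print(is_at_least_patched("3.10.0-693.11.6.el7.x86_64"))  # patched VM (vm2)
```

This only compares build numbers. Whether the mitigations are actually active still has to be confirmed with the checker run in the next section.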


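Vulnerability check

I ran https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh on each VM.


Unpatched VM

The checker reports the VM as vulnerable to Spectre Variant 1, Spectre Variant 2, and Meltdown (Variant 3).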

$ sudo sh ./spectre-meltdown-checker.sh

Spectre and Meltdown mitigation detection tool v0.31

Checking for vulnerabilities against running kernel Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 13 10:46:25 EDT 2017 x86_64
CPU is QEMU Virtual CPU version 1.5.3

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO
> STATUS: VULNERABLE (only 21 opcodes found, should be >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: YES
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): NO
* PTI enabled and active: NO
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: VULNERABLE (PTI is needed to mitigate the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer


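Patched VM

Only Spectre Variant 2 is still reported as vulnerable; the other two checks pass. The output shows kernel support for IBRS, but IBRS is not enabled and the SPEC_CTRL CPUID feature bit is not set on the QEMU virtual CPU.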

$ sudo sh ./spectre-meltdown-checker.sh

Spectre and Meltdown mitigation detection tool v0.31

Checking for vulnerabilities against running kernel Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64
CPU is QEMU Virtual CPU version 1.5.3

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: YES
> STATUS: NOT VULNERABLE (106 opcodes found, which is >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: YES
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: YES
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer
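
To compare the two runs without reading the full output, the per-CVE verdicts can be extracted mechanically. The sketch below assumes the v0.31 text layout pasted above: a line starting with the CVE ID, followed later by a `> STATUS:` line. The function name `parse_checker_output` is hypothetical, and terminal colour codes, if present in real output, would need to be stripped first.

```python
import re

CVE_RE = re.compile(r"^(CVE-\d{4}-\d+)")
# "NOT VULNERABLE" must come first so it is not matched as plain "VULNERABLE".
STATUS_RE = re.compile(r"^> STATUS:\s+(NOT VULNERABLE|VULNERABLE)")


def parse_checker_output(text):
    """Map each CVE ID to the STATUS verdict printed beneath it."""
    statuses = {}
    current_cve = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        cve_match = CVE_RE.match(line)
        if cve_match:
            current_cve = cve_match.group(1)
            continue
        status_match = STATUS_RE.match(line)
        if status_match and current_cve is not None:
            statuses[current_cve] = status_match.group(1)
            current_cve = None
    return statuses


# Abbreviated excerpt of the patched VM's output shown above.
SAMPLE = """\
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: YES
> STATUS: NOT VULNERABLE (106 opcodes found, which is >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Kernel support for IBRS: YES
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* PTI enabled and active: YES
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)
"""

for cve, status in parse_checker_output(SAMPLE).items():
    print(cve, status)
```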


UnixBench results

Only the 4-core runs (4 parallel copies) are listed.


Unpatched VM

------------------------------------------------------------------------

Benchmark Run: Wed Jan 17 2018 15:07:20 - 15:14:03
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables 131231880.5 lps (10.0 s, 1 samples)
Double-Precision Whetstone 16937.1 MWIPS (9.6 s, 1 samples)
Execl Throughput 16987.2 lps (29.3 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 1389846.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 385034.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 4373700.0 KBps (30.0 s, 1 samples)
Pipe Throughput 6748830.2 lps (10.0 s, 1 samples)
Pipe-based Context Switching 1417922.6 lps (10.0 s, 1 samples)
Process Creation 44711.5 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 21357.6 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 3414.1 lpm (60.0 s, 1 samples)
System Call Overhead 6678039.5 lps (10.0 s, 1 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 131231880.5 11245.2
Double-Precision Whetstone 55.0 16937.1 3079.5
Execl Throughput 43.0 16987.2 3950.5
File Copy 1024 bufsize 2000 maxblocks 3960.0 1389846.0 3509.7
File Copy 256 bufsize 500 maxblocks 1655.0 385034.0 2326.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 4373700.0 7540.9
Pipe Throughput 12440.0 6748830.2 5425.1
Pipe-based Context Switching 4000.0 1417922.6 3544.8
Process Creation 126.0 44711.5 3548.5
Shell Scripts (1 concurrent) 42.4 21357.6 5037.2
Shell Scripts (8 concurrent) 6.0 3414.1 5690.2
System Call Overhead 15000.0 6678039.5 4452.0
========
System Benchmarks Index Score 4523.3


Patched VM

------------------------------------------------------------------------

Benchmark Run: Wed Jan 17 2018 15:29:47 - 15:36:30
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables 128905413.5 lps (10.0 s, 1 samples)
Double-Precision Whetstone 16450.1 MWIPS (9.8 s, 1 samples)
Execl Throughput 11097.3 lps (29.0 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 1203052.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 313511.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 3669716.0 KBps (30.0 s, 1 samples)
Pipe Throughput 2038905.8 lps (10.0 s, 1 samples)
Pipe-based Context Switching 638231.7 lps (10.0 s, 1 samples)
Process Creation 34631.1 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 16080.8 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 2448.6 lpm (60.0 s, 1 samples)
System Call Overhead 1568599.1 lps (10.0 s, 1 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 128905413.5 11045.9
Double-Precision Whetstone 55.0 16450.1 2990.9
Execl Throughput 43.0 11097.3 2580.8
File Copy 1024 bufsize 2000 maxblocks 3960.0 1203052.0 3038.0
File Copy 256 bufsize 500 maxblocks 1655.0 313511.0 1894.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 3669716.0 6327.1
Pipe Throughput 12440.0 2038905.8 1639.0
Pipe-based Context Switching 4000.0 638231.7 1595.6
Process Creation 126.0 34631.1 2748.5
Shell Scripts (1 concurrent) 42.4 16080.8 3792.6
Shell Scripts (8 concurrent) 6.0 2448.6 4081.0
System Call Overhead 15000.0 1568599.1 1045.7
========
System Benchmarks Index Score 2905.0
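
The INDEX column and the final score in the output above are consistent with a simple rule: each index is RESULT / BASELINE × 10, and the overall score is the geometric mean of the per-test indexes. The following Python sketch recomputes the patched VM's numbers under that assumption. The function names are hypothetical, and the data is copied from the output above.

```python
import math

# (baseline, result) pairs copied from the patched VM's output above.
PATCHED = {
    "Dhrystone 2 using register variables": (116700.0, 128905413.5),
    "Double-Precision Whetstone": (55.0, 16450.1),
    "Execl Throughput": (43.0, 11097.3),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 1203052.0),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 313511.0),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 3669716.0),
    "Pipe Throughput": (12440.0, 2038905.8),
    "Pipe-based Context Switching": (4000.0, 638231.7),
    "Process Creation": (126.0, 34631.1),
    "Shell Scripts (1 concurrent)": (42.4, 16080.8),
    "Shell Scripts (8 concurrent)": (6.0, 2448.6),
    "System Call Overhead": (15000.0, 1568599.1),
}


def index_value(baseline, result):
    """Per-test index: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


def overall_score(tests):
    """Geometric mean of the per-test indexes."""
    indexes = [index_value(baseline, result) for baseline, result in tests.values()]
    return math.exp(sum(math.log(i) for i in indexes) / len(indexes))


for name, (baseline, result) in PATCHED.items():
    print(f"{name:<40} {index_value(baseline, result):8.1f}")
print(f"{'Overall score':<40} {overall_score(PATCHED):8.1f}")
```

Because the score is a geometric mean, a large drop in a few syscall-heavy tests pulls the overall score down even when the CPU-bound tests barely change.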


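UnixBench comparison

I confirmed that the same test items as in the Qiita article I referenced drop significantly. The largest drops here are in System Call Overhead, Pipe Throughput, and Pipe-based Context Switching, which fall to 23.49%, 30.21%, and 45.01% of their pre-patch index values.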

| System Benchmarks Index Values | Before patch | After patch | After / Before (%) |
|---|---|---|---|
| Dhrystone 2 using register variables | 11245.2 | 11045.9 | 98.23% |
| Double-Precision Whetstone | 3079.5 | 2990.9 | 97.12% |
| Execl Throughput | 3950.5 | 2580.8 | 65.33% |
| File Copy 1024 bufsize 2000 maxblocks | 3509.7 | 3038.0 | 86.56% |
| File Copy 256 bufsize 500 maxblocks | 2326.5 | 1894.3 | 81.42% |
| File Copy 4096 bufsize 8000 maxblocks | 7540.9 | 6327.1 | 83.90% |
| Pipe Throughput | 5425.1 | 1639.0 | 30.21% |
| Pipe-based Context Switching | 3544.8 | 1595.6 | 45.01% |
| Process Creation | 3548.5 | 2748.5 | 77.46% |
| Shell Scripts (1 concurrent) | 5037.2 | 3792.6 | 75.29% |
| Shell Scripts (8 concurrent) | 5690.2 | 4081.0 | 71.72% |
| System Call Overhead | 4452.0 | 1045.7 | 23.49% |
| System Benchmarks Index Score | 4523.3 | 2905.0 | 64.22% |
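
The last column of the table is the post-patch index as a percentage of the pre-patch index. As a minimal sketch (the function name `retained_percent` is hypothetical, and the data is copied from the table above), the tests can be ranked from hardest hit to least affected like this:

```python
# (test name, index before patch, index after patch), copied from the table above.
INDEXES = [
    ("Dhrystone 2 using register variables", 11245.2, 11045.9),
    ("Double-Precision Whetstone", 3079.5, 2990.9),
    ("Execl Throughput", 3950.5, 2580.8),
    ("File Copy 1024 bufsize 2000 maxblocks", 3509.7, 3038.0),
    ("File Copy 256 bufsize 500 maxblocks", 2326.5, 1894.3),
    ("File Copy 4096 bufsize 8000 maxblocks", 7540.9, 6327.1),
    ("Pipe Throughput", 5425.1, 1639.0),
    ("Pipe-based Context Switching", 3544.8, 1595.6),
    ("Process Creation", 3548.5, 2748.5),
    ("Shell Scripts (1 concurrent)", 5037.2, 3792.6),
    ("Shell Scripts (8 concurrent)", 5690.2, 4081.0),
    ("System Call Overhead", 4452.0, 1045.7),
]


def retained_percent(before, after):
    """Post-patch index as a percentage of the pre-patch index."""
    return after / before * 100.0


# Smallest retained percentage first, i.e. the hardest-hit tests at the top.
ranked = sorted(INDEXES, key=lambda row: retained_percent(row[1], row[2]))
for name, before, after in ranked:
    print(f"{name:<40} {retained_percent(before, after):6.2f}%")
```

The tests at the top of the ranking are the ones dominated by system calls and context switches, while the two CPU-bound tests (Dhrystone and Whetstone) stay near 100%.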

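For the meaning of each test item, I referred to this page.

I would have liked to run a full-scale test under production-equivalent load, but I will stop here for now. I will add to this post if there is any follow-up.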


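Addendum (2018/01/23)


mysqlslap results

Following this article, I measured performance with mysqlslap.

Overall, performance degraded by roughly 10% to 20%. On the Average figures the range is 7.39% (update) to 16.89% (write), so the insert-type test stands out the most.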

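| Test | Statistic | Before patch (s) | After patch (s) | Difference (%, negative = slower) |
|---|---|---|---|---|
| 50 threads, 1000 rows, 1000 queries: read (table scan) | Average | 0.602 | 0.676 | -12.29% |
| | Minimum | 0.495 | 0.548 | -10.71% |
| | Maximum | 0.873 | 1.016 | -16.38% |
| 50 threads, 1000 rows, 1000 queries: write (insert into table) | Average | 0.148 | 0.173 | -16.89% |
| | Minimum | 0.124 | 0.152 | -22.58% |
| | Maximum | 0.173 | 0.196 | -13.29% |
| 50 threads, 1000 rows, 1000 queries: key (primary-key read) | Average | 0.075 | 0.085 | -13.33% |
| | Minimum | 0.069 | 0.079 | -14.49% |
| | Maximum | 0.085 | 0.095 | -11.76% |
| 50 threads, 1000 rows, 1000 queries: mixed (half inserts, half table scans) | Average | 0.106 | 0.121 | -14.15% |
| | Minimum | 0.098 | 0.104 | -6.12% |
| | Maximum | 0.130 | 0.128 | 1.54% |
| 50 threads, 1000 rows, 1000 queries: update | Average | 0.176 | 0.189 | -7.39% |
| | Minimum | 0.159 | 0.175 | -10.06% |
| | Maximum | 0.197 | 0.209 | -6.09% |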
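
The difference column here uses a different convention from the UnixBench table: the figures appear to be mysqlslap's elapsed times in seconds, so lower is better and a negative percentage means the patched VM was slower. The sketch below reproduces the column under that assumption. The formula is inferred from the table values, and the function name `elapsed_change_percent` is hypothetical.

```python
def elapsed_change_percent(before_s, after_s):
    """Signed change in elapsed time; negative means the patched VM took longer."""
    return (before_s - after_s) / before_s * 100.0


# Average elapsed times (before patch, after patch), copied from the table above.
AVERAGES = {
    "read": (0.602, 0.676),
    "write": (0.148, 0.173),
    "key": (0.075, 0.085),
    "mixed": (0.106, 0.121),
    "update": (0.176, 0.189),
}

for load_type, (before_s, after_s) in AVERAGES.items():
    print(f"{load_type:<7} {elapsed_change_percent(before_s, after_s):+.2f}%")

# Load type with the largest slowdown on the Average figures.
worst = min(AVERAGES, key=lambda k: elapsed_change_percent(*AVERAGES[k]))
print("largest slowdown:", worst)
```

Each cell comes from a single pair of runs, so small differences such as the +1.54% on the mixed Maximum are probably within run-to-run noise.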

I will add more if there is any further news.