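Overview
In my current environment (Linux, munin 2.0.34, smartmontools 6.6), settings in plugin-conf.d alone were not enough to get the hddtemp_smartctl plugin to report the temperature of an NVMe SSD, so here are my notes.
Steps
Forcing the drive list
When env.drives is not set, the plugin builds its drive list as follows (on Linux):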
hddtemp_smartctl
my @drivesSCSI;
if (-d '/sys/block/') {
    opendir(SCSI, '/sys/block/');
    @drivesSCSI = grep /sd[a-z]/, readdir SCSI;
    closedir(SCSI);
}
Naturally, device names such as nvme0n1 do not match /sd[a-z]/, so NVMe drives are never picked up.
The fix is to set env.drives explicitly.
/etc/munin/plugin-conf.d/munin-node
[hddtemp_smartctl]
user root
group disk
env.drives sda sdb nvme0n1
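To work out what to put in env.drives, the names under /sys/block can be listed by hand. The following is a minimal sketch, not part of the plugin: the function name list_drives is made up for illustration, and the patterns are stricter than the plugin's unanchored /sd[a-z]/ match.

```shell
#!/bin/sh
# list_drives [DIR]: print the names under DIR (default /sys/block) that look
# like SCSI/SATA disks (sdX, sdXX) or NVMe namespaces (nvmeXnY), separated by
# spaces on a single line. Hypothetical helper, not part of munin.
list_drives() {
    dir=${1:-/sys/block}
    drives=""
    for path in "$dir"/*; do
        name=${path##*/}
        case "$name" in
            sd[a-z]|sd[a-z][a-z]|nvme[0-9]*n[0-9]*)
                # Append with a separating space unless the list is still empty.
                drives="${drives:+$drives }$name"
                ;;
        esac
    done
    printf '%s\n' "$drives"
}

# Example: list_drives /sys/block
```

The printed line can be pasted after env.drives as it is. Entries such as loop0 or sr0 are skipped because they match none of the patterns.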
Handling the temperature-line match conditions
The plugin extracts the temperature line from the smartctl output as follows:
hddtemp_smartctl
if ($output =~ /Current Drive Temperature:\s*(\d+)/) {
    print "$drive.value $1\n";
} elsif ($output =~ /^(194 Temperature_Celsius.*)/m) {
    my @F = split /\s+/, $1;
    print "$drive.value $F[9]\n";
} elsif ($output =~ /^(231 Temperature_Celsius.*)/m) {
    my @F = split ' ', $1;
    print "$drive.value $F[9]\n";
} elsif ($output =~ /^(190 Airflow_Temperature_Cel.*)/m) {
    my @F = split ' ', $1;
    print "$drive.value $F[9]\n";
} else {
    print "$drive.value U\n";
    print "$drive.extinfo Temperature not detected in smartctl output\n";
}
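The first condition looks for a line containing "Current Drive Temperature:", and the remaining conditions fall back to the SMART attributes 194, 231, and 190.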
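For the attribute-based branches, $F[9] is the tenth whitespace-separated field of the matched line, which is the RAW_VALUE column of the ATA attribute table. A small sketch of the same field selection with awk follows. The sample line is made up for illustration and was not taken from a real drive, and raw_value is a hypothetical name.

```shell
#!/bin/sh
# raw_value: print the 10th whitespace-separated field of each input line.
# This mirrors what the plugin's $F[9] selects after splitting on whitespace.
raw_value() {
    awk '{ print $10 }'
}

# Made-up attribute line in the layout smartctl uses for ATA drives:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
sample='194 Temperature_Celsius 0x0022 032 045 000 Old_age Always - 32'

printf '%s\n' "$sample" | raw_value
```

NVMe output has no such table, which is why none of these three branches can match it.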
Meanwhile, fetching the S.M.A.R.T. information of an NVMe SSD with smartctl produces output like this:
$ sudo smartctl -A /dev/nvme0n1
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.13.12-gentoo] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning: 0x00
Temperature: 46 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 1,941,482 [994 GB]
Data Units Written: 2,372,659 [1.21 TB]
Host Read Commands: 39,349,367
Host Write Commands: 48,090,812
Controller Busy Time: 341
Power Cycles: 154
Power On Hours: 16,814
Unsafe Shutdowns: 52
Media and Data Integrity Errors: 0
Error Information Log Entries: 39
With a small tweak to the line matching "^Temperature:", it should satisfy the first condition.
This output format is defined in nvmeprint.cpp of smartmontools as follows:
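The mismatch can be checked without touching the plugin. The sketch below approximates the first condition with grep -E, using POSIX character classes in place of Perl's \s and \d. The function name matches_first_condition is an assumption made for this example.

```shell
#!/bin/sh
# matches_first_condition: exit 0 if stdin contains the text that the
# plugin's first condition looks for, non-zero otherwise.
matches_first_condition() {
    grep -Eq 'Current Drive Temperature:[[:space:]]*[0-9]+'
}

# The NVMe line as printed by smartctl, and the same line with a prefix added.
nvme_line='Temperature: 46 Celsius'
prefixed_line="Current Drive $nvme_line"

for line in "$nvme_line" "$prefixed_line"; do
    if printf '%s\n' "$line" | matches_first_condition; then
        echo "match:    $line"
    else
        echo "no match: $line"
    fi
done
```

The unmodified line is reported as "no match", and the prefixed one as "match".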
nvmeprint.cpp
pout("SMART/Health Information (NVMe Log 0x02, NSID 0x%x)\n", nsid);
pout("Critical Warning: 0x%02x\n", smart_log.critical_warning);
pout("Temperature: %s\n",
kelvin_to_str(buf, le16_to_uint(smart_log.temperature)));
pout("Available Spare: %u%%\n", smart_log.avail_spare);
pout("Available Spare Threshold: %u%%\n", smart_log.spare_thresh);
pout("Percentage Used: %u%%\n", smart_log.percent_used);
pout("Data Units Read: %s\n", le128_to_str(buf, smart_log.data_units_read, 1000*512));
pout("Data Units Written: %s\n", le128_to_str(buf, smart_log.data_units_written, 1000*512));
pout("Host Read Commands: %s\n", le128_to_str(buf, smart_log.host_reads));
pout("Host Write Commands: %s\n", le128_to_str(buf, smart_log.host_writes));
pout("Controller Busy Time: %s\n", le128_to_str(buf, smart_log.ctrl_busy_time));
pout("Power Cycles: %s\n", le128_to_str(buf, smart_log.power_cycles));
pout("Power On Hours: %s\n", le128_to_str(buf, smart_log.power_on_hours));
pout("Unsafe Shutdowns: %s\n", le128_to_str(buf, smart_log.unsafe_shutdowns));
pout("Media and Data Integrity Errors: %s\n", le128_to_str(buf, smart_log.media_errors));
pout("Error Information Log Entries: %s\n", le128_to_str(buf, smart_log.num_err_log_entries));
Modifying this code would do the job, but for now I handle it ad hoc with sed.
/usr/local/bin/smartctl
#!/bin/sh
/usr/sbin/smartctl "$@" |sed '/^Temperature:/s/^/Current Drive /'
Save a shell script like the one above and point env.smartctl at it:
/etc/munin/plugin-conf.d/munin-node
[hddtemp_smartctl]
user root
group disk
env.smartctl /usr/local/bin/smartctl
env.drives sda sdb nvme0n1
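One side effect of piping smartctl into sed is that the wrapper's exit status becomes that of sed, not smartctl. If the caller should still see smartctl's status, a variant along the following lines could be used. This is a sketch under assumptions: REAL_SMARTCTL and smartctl_wrapped are names invented here, and whether the exit status matters in your setup is something to check on your own system.

```shell
#!/bin/sh
# Path to the real binary. It can be overridden, so the function can be tried
# out on a machine without smartctl (REAL_SMARTCTL is a made-up variable name).
REAL_SMARTCTL=${REAL_SMARTCTL:-/usr/sbin/smartctl}

smartctl_wrapped() {
    # Capture the output first so that smartctl's exit status is not
    # replaced by the exit status of sed at the end of a pipeline.
    out=$("$REAL_SMARTCTL" "$@")
    status=$?
    # Same rewrite as the one-line wrapper: prefix the NVMe temperature line.
    printf '%s\n' "$out" | sed '/^Temperature:/s/^/Current Drive /'
    return "$status"
}

# In the wrapper script itself, the last line would be:
#   smartctl_wrapped "$@"
```

The output is buffered in a shell variable before being rewritten, which should be fine for the short reports that smartctl -A prints.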
Checking with munin-run confirms that a value is now reported for nvme0n1:
$ sudo munin-run hddtemp_smartctl
nvme0n1.value 46
sda.value 32
sdb.value 35
Other solutions
1. Modify munin's hddtemp_smartctl plugin
2. Modify smartctl in smartmontools
3. Change udev's naming rules for NVMe devices
Option 2 can be handled with a patch like the following:
--- a/nvmeprint.cpp
+++ b/nvmeprint.cpp
@@ -297,7 +297,7 @@ static void print_smart_log(const nvme_smart_log & smart_log, unsigned nsid,
   char buf[64];
   pout("SMART/Health Information (NVMe Log 0x02, NSID 0x%x)\n", nsid);
   pout("Critical Warning: 0x%02x\n", smart_log.critical_warning);
-  pout("Temperature: %s\n",
+  pout("Current Drive Temperature: %s\n",
        kelvin_to_str(buf, le16_to_uint(smart_log.temperature)));
   pout("Available Spare: %u%%\n", smart_log.avail_spare);
   pout("Available Spare Threshold: %u%%\n", smart_log.spare_thresh);
Environment
$ munin-run --version
Version:
This is munin-run (munin-node) v2.0.34
$Id$
Copyright:
Copyright (C) 2002-2009 Audun Ytterdal, Jimmy Olsen, Tore Anderson,
Nicolai Langfeldt / Linpro AS.
This is free software; see the source for copying conditions. There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.
This program is released under the GNU General Public License
$ smartctl --version
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.13.12-gentoo] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
smartctl comes with ABSOLUTELY NO WARRANTY. This is free
software, and you are welcome to redistribute it under
the terms of the GNU General Public License; either
version 2, or (at your option) any later version.
See http://www.gnu.org for further details.
smartmontools release 6.6 dated 2017-11-05 at 15:20:58 UTC
smartmontools SVN rev 4594 dated 2017-11-05 at 15:21:35
smartmontools build host: x86_64-pc-linux-gnu
smartmontools build with: C++14, GCC 7.2.0
smartmontools configure arguments: '--prefix=/usr' '--build=x86_64-pc-linux-gnu' '--host=x86_64-pc-linux-gnu' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib' '--disable-dependency-tracking' '--disable-silent-rules' '--htmldir=/usr/share/doc/smartmontools-6.6/html' '--libdir=/usr/lib64' '--docdir=/usr/share/doc/smartmontools-6.6' '--with-drivedbdir=/var/db/smartmontools' '--with-initscriptdir=/etc/init.d' '--with-libcap-ng' '--without-selinux' '--with-systemdsystemunitdir=/lib/systemd/system' '--without-gnupg' '--without-update-smart-drivedb' 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' 'CXXFLAGS=-O2 -march=native -pipe' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed' 'CFLAGS=-O2 -march=native -pipe' 'PKG_CONFIG_PATH=/usr/lib64/pkgconfig'
$ uname -o
GNU/Linux