Environment
# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
# dmidecode -s system-product-name
ProLiant ML30 Gen9
I wish Xenial were supported too...
Error details
The following messages are logged to /var/log/syslog
over and over:
[ 1866.577934] ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, found length 32 (20140424/exfield-420)
[ 1866.577949] ACPI Error: Method parse/execution failed [\_SB_.PMI0._PMM] (Node ffff88103f0451b8), AE_AML_BUFFER_LIMIT (20140424/psparse-536)
[ 1866.577971] ACPI Exception: AE_AML_BUFFER_LIMIT, Evaluating _PMM (20140424/power_meter-338)
TL;DR
Permanently prevent the acpi_power_meter
kernel module from being loaded.
echo "blacklist acpi_power_meter" >> /etc/modprobe.d/hwmon.conf
Frequent ACPI errors in dmesg ring buffer · Issue #827 · firehol/netdata
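Investigation log
Searching for the error message turned up the blog post below, and I ran the commands as described there.
(It was filed under a RHEL/CentOS category, but since this looks like an ACPI, kernel-level problem, it should not differ between distributions.)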
KERNEL ACPI ERROR SMBUS/IPMI/GENERICSERIALBUS
I also confirmed that running cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average
causes the error messages in question to be written to /var/log/syslog.
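The check described above can be sketched as a small shell helper. This is only an illustration: `count_acpi_errors` and `read_triggers_error` are hypothetical names, and the one-second wait for the log to be flushed is an assumption.

```shell
#!/bin/sh
# Sketch: check whether reading a sysfs attribute adds new ACPI errors to a log.
# Function names and the 1-second delay are assumptions for illustration.

# Print the number of lines in LOG that mention AE_AML_BUFFER_LIMIT.
count_acpi_errors() {
    # grep -c exits non-zero when nothing matches, so ignore its status
    n=$(grep -c 'AE_AML_BUFFER_LIMIT' "$1" 2>/dev/null) || true
    echo "${n:-0}"
}

# Succeed if reading ATTR increases the error count in LOG.
read_triggers_error() {
    log=$1
    attr=$2
    before=$(count_acpi_errors "$log")
    cat "$attr" > /dev/null
    sleep 1   # give the syslog daemon a moment to write the kernel messages
    after=$(count_acpi_errors "$log")
    [ "$after" -gt "$before" ]
}
```

Possible usage: `read_triggers_error /var/log/syslog /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average && echo reproduced`.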
# find /sys/devices/LNXSYSTM\:00/ |grep ACPI000D
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hid
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average_interval
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/name
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/path
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/control
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/async
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_enabled
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_active_kids
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_active_time
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/autosuspend_delay_ms
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_status
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_usage
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/power/runtime_suspended_time
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/device
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/subsystem
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/hwmon/hwmon0/uevent
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/control
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/async
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_enabled
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_active_kids
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_active_time
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/autosuspend_delay_ms
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_status
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_usage
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power/runtime_suspended_time
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_model_number
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/modalias
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average_interval_max
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average_interval_min
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/driver
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_oem_info
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/subsystem
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/status
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/uevent
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/physical_node
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_accuracy
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_serial_number
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/measures
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/measures/LNXSYBUS:00
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_is_battery
# cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI000D:00/power1_average
0
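As an alternative to grepping the whole tree, the power meter devices could be listed by their ACPI hardware ID. A minimal sketch, assuming the ACPI devices are also reachable under /sys/bus/acpi/devices (an assumption, not something checked in this post); `find_power_meters` is a hypothetical helper name. It only prints paths and does not read power1_average, since reading that file is what triggers the error.

```shell
#!/bin/sh
# Sketch: list ACPI power meter devices (hardware ID ACPI000D) that expose
# a power1_average attribute. The default root directory is an assumption.
find_power_meters() {
    root=${1:-/sys/bus/acpi/devices}
    for dev in "$root"/ACPI000D:*; do
        # Skips the unexpanded glob and devices without the attribute
        [ -e "$dev/power1_average" ] || continue
        printf '%s\n' "$dev"
    done
}
```

Calling `find_power_meters` without arguments searches the assumed default location; a different directory can be passed as the first argument.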
# apt-get install -y lm-sensors # the sensors command was missing, so install it
# sensors
power_meter-acpi-0
Adapter: ACPI interface
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +37.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +37.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +36.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +38.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +35.0°C (high = +80.0°C, crit = +100.0°C)
tg3-pci-0200
Adapter: PCI adapter
temp1: +0.1°C (high = +0.1°C, crit = +0.1°C)
# vim /etc/sensors.d/hp-proliant-ml60.conf # add the following content and save
# cat /etc/sensors.d/hp-proliant-ml60.conf
chip "power_meter-acpi-0"
ignore power1
# service lm-sensors restart
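At first this seemed to solve the problem, but after a reboot the same messages started appearing again.
After further investigation, the issue below provided the fix.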
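$ sudo modprobe -r acpi_power_meter # fixes it temporarily
$ echo "blacklist acpi_power_meter" | sudo tee -a /etc/modprobe.d/hwmon.conf # fixes it permanently (with "sudo echo ... >>" the redirect runs as the unprivileged user and fails)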
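Because the command appends on every run, repeating it leaves duplicate lines in the file. A minimal sketch of an idempotent variant follows; `blacklist_module` is a hypothetical helper name, and it has to run as root when the target is under /etc/modprobe.d.

```shell
#!/bin/sh
# Sketch: append "blacklist MODULE" to CONF only if that exact line
# is not already present. The helper name is an assumption.
blacklist_module() {
    conf=$1
    module=$2
    line="blacklist $module"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

For example, `blacklist_module /etc/modprobe.d/hwmon.conf acpi_power_meter`. After the next reboot, `lsmod | grep acpi_power_meter` printing nothing would suggest the module is no longer loaded.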
That's all.