

CPU load rose because of a disk failure

Posted at 2013-08-21

Alert from Nagios

***** Nagios *****

Notification Type: PROBLEM

Service: Current Load
Host: localhost
Address: 127.0.0.1
State: CRITICAL

Date/Time: Wed Aug 21 11:05:57 JST 2013

Additional Info:

CRITICAL - load average: 7.00, 7.01, 6.97

Log in to the server

Check dmesg
$ dmesg -T | less
  • After /dev/sdb is detected, a Call Trace: is displayed.
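
To jump straight to the trace instead of paging through the whole kernel buffer, the output can be filtered. The following is a minimal sketch, not part of the original procedure: `call_trace_context` is a hypothetical helper name, the default context sizes (5 lines before, 20 after) are arbitrary assumptions, and it relies on the `-B`/`-A` context options of GNU or BSD grep.

```shell
#!/bin/sh
# call_trace_context [BEFORE] [AFTER]
# Reads kernel log text on stdin and prints every "Call Trace:" line
# together with BEFORE lines of leading and AFTER lines of trailing context.
call_trace_context() {
    grep -B "${1:-5}" -A "${2:-20}" 'Call Trace:'
}

# Typical use (dmesg may require root on some systems):
#   dmesg -T | call_trace_context | less
```

The leading context is where the device detection messages for the disk would be expected to show up, right before the trace.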
Check the logs
$ watch -d "ls -ltr /var/log/*log"

syslog is being written to continuously.

/var/log/syslog
Aug 21 11:20:48 server udevd[635]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [2918]
Aug 21 11:20:48 server udevd[2916]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb5' [2921]
Aug 21 11:20:48 server udevd[2915]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb3' [2920]
Aug 21 11:20:48 server udevd[663]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb2' [2919]
Aug 21 11:20:49 server udevd[2917]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb6' [2922]

Groups of lines like these keep appearing.

/dev/sdb appears to be failing.
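
One way to confirm that every timeout points at the same disk is to count the killed `blkid` probes per device. This is a sketch under assumptions: `blkid_timeouts_per_disk` is a hypothetical helper, the log lines are assumed to have exactly the format shown above, and devices are assumed to follow the `sdX<N>` naming scheme so that the trailing partition number can be dropped.

```shell
#!/bin/sh
# blkid_timeouts_per_disk FILE
# Prints "<disk> <count>" for each disk whose blkid probes were killed by udevd.
# Assumes lines like:
#   ... udevd[635]: timeout: killing '/sbin/blkid -o udev -p /dev/sdb1' [2918]
blkid_timeouts_per_disk() {
    # Keep only the device path, without the trailing partition number.
    sed -n "s|.*timeout: killing '/sbin/blkid .* \(/dev/[a-z]*\)[0-9]*'.*|\1|p" "$1" \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Typical use:
#   blkid_timeouts_per_disk /var/log/syslog
```

Run against the five lines above, this would report `/dev/sdb 5`, since all partitions collapse onto their parent disk.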


Calculate CPU load limit

calc_la_limit.sh
# Limit = CPU entries * 2 + 1; 'core id' appears once per CPU entry across all sockets, so it is not multiplied by the socket count again.
echo "$(( $(grep -c 'core id' /proc/cpuinfo) * 2 + 1 ))"
$ ./calc_la_limit.sh
9
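
The computed limit can then be compared with the current load average. The sketch below is an illustration only: `load_over_limit` is a hypothetical helper, the limit of 9 is the value printed above, and reading `/proc/loadavg` assumes Linux.

```shell
#!/bin/sh
# load_over_limit LOAD LIMIT
# Exits 0 when LOAD is greater than LIMIT, 1 otherwise.
# awk does the comparison because sh arithmetic is integer-only.
load_over_limit() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load > limit) }'
}

# The 1-minute load average is the first field of /proc/loadavg (Linux only).
if [ -r /proc/loadavg ]; then
    read -r load1 _rest < /proc/loadavg
    limit=9   # value printed by calc_la_limit.sh on the host in this article
    if load_over_limit "$load1" "$limit"; then
        echo "load $load1 is above limit $limit"
    else
        echo "load $load1 is within limit $limit"
    fi
fi
```

Keeping the comparison in a small function makes it easy to reuse the same check for the 5-minute and 15-minute values.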