A "kernel:BUG: soft lockup ..." message appeared on CentOS 7

Posted on 2016-06-13

One day, this error showed up...

Message from syslogd@cmv7 at Jun 13 17:54:30 ...
kernel:BUG: soft lockup - CPU#0 stuck for 23s! [rcuos/2:16]

Environment

  • ESXi5.5
  • CentOS7.2 64bit
  • config-3.10.0-327.18.2.el7.x86_64

Looking at /var/log/messages...

Jun 13 17:54:30 cmv7 kernel: BUG: soft lockup - CPU#0 stuck for 23s! [rcuos/2:16]
Jun 13 17:54:30 cmv7 kernel: Modules linked in: vmw_vsock_vmci_transport vsock coretemp crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd vmw_balloon pcspkr sg vmw_vmci i2c_piix4 shpchp parport_pc parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel libata mptspi serio_raw i2c_core scsi_transport_spi vmxnet3 mptscsih mptbase floppy dm_mirror dm_region_hash dm_log dm_mod
Jun 13 17:54:30 cmv7 kernel: CPU: 0 PID: 16 Comm: rcuos/2 Tainted: G             L ------------   3.10.0-327.18.2.el7.x86_64 #1
Jun 13 17:54:30 cmv7 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/14/2014
Jun 13 17:54:30 cmv7 kernel: task: ffff880139bc2e00 ti: ffff880139bd8000 task.ti: ffff880139bd8000
Jun 13 17:54:30 cmv7 kernel: RIP: 0010:[<ffffffffa00543be>]  [<ffffffffa00543be>] mpt_put_msg_frame+0x5e/0x80 [mptbase]
Jun 13 17:54:30 cmv7 kernel: RSP: 0018:ffff88013fc03bc8  EFLAGS: 00000246
Jun 13 17:54:30 cmv7 kernel: RAX: ffffc90008800000 RBX: ffff8800956530b0 RCX: 0000000000000015
Jun 13 17:54:30 cmv7 kernel: RDX: ffff880036b53800 RSI: ffff8800369dc000 RDI: 000000000000000e
Jun 13 17:54:30 cmv7 kernel: RBP: ffff88013fc03bd8 R08: 0000000000000003 R09: ffff8800369d90d8
Jun 13 17:54:30 cmv7 kernel: R10: ffff8800971d6540 R11: ffffea0004dc8f00 R12: ffff88013fc03b38
Jun 13 17:54:30 cmv7 kernel: R13: ffffffff81646e1d R14: ffff88013fc03bd8 R15: ffff8800369dc000
Jun 13 17:54:30 cmv7 kernel: FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
Jun 13 17:54:30 cmv7 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 17:54:30 cmv7 kernel: CR2: 00000000005643eb CR3: 0000000097221000 CR4: 00000000000007f0
Jun 13 17:54:30 cmv7 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 13 17:54:30 cmv7 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 13 17:54:30 cmv7 kernel: Stack:
Jun 13 17:54:30 cmv7 kernel: ffff8800369dcd30 0000000000000048 ffff88013fc03c80 ffffffffa0076729
Jun 13 17:54:30 cmv7 kernel: ffff8800369dc008 04000000ffc00400 0000000000000015 ffff8800971d6540
Jun 13 17:54:30 cmv7 kernel: 0000000000000054 ffff8800369dc188 0000006000000015 ffff8801358c3010
Jun 13 17:54:30 cmv7 kernel: Call Trace:
Jun 13 17:54:30 cmv7 kernel: <IRQ>
Jun 13 17:54:30 cmv7 kernel:
Jun 13 17:54:30 cmv7 kernel: [<ffffffffa0076729>] mptscsih_qcmd+0x249/0x820 [mptscsih]
Jun 13 17:54:30 cmv7 kernel: [<ffffffffa006f2b0>] mptspi_qcmd+0x50/0xe0 [mptspi]
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81417f3a>] scsi_dispatch_cmd+0xaa/0x230
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81420ec1>] scsi_request_fn+0x501/0x770
Jun 13 17:54:30 cmv7 kernel: [<ffffffff812c7793>] __blk_run_queue+0x33/0x40
Jun 13 17:54:30 cmv7 kernel: [<ffffffff812c7806>] blk_run_queue+0x26/0x40
Jun 13 17:54:30 cmv7 kernel: [<ffffffff8141f2e8>] scsi_run_queue+0x258/0x2f0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81421170>] scsi_next_command+0x20/0x40
Jun 13 17:54:30 cmv7 kernel: [<ffffffff814212e5>] scsi_end_request+0x155/0x1d0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff814214c3>] scsi_io_completion+0x103/0x600
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81416805>] scsi_finish_command+0xd5/0x130
Jun 13 17:54:30 cmv7 kernel: [<ffffffff8142099a>] scsi_softirq_done+0x12a/0x150
Jun 13 17:54:30 cmv7 kernel: [<ffffffff812d1a50>] blk_done_softirq+0x90/0xc0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81647adc>] call_softirq+0x1c/0x30
Jun 13 17:54:30 cmv7 kernel: <EOI>
Jun 13 17:54:30 cmv7 kernel:
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81084404>] local_bh_enable+0x94/0xa0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81124265>] rcu_nocb_kthread+0x255/0x370
Jun 13 17:54:30 cmv7 kernel: [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81124010>] ? rcu_start_gp+0x40/0x40
Jun 13 17:54:30 cmv7 kernel: [<ffffffff810a5aef>] kthread+0xcf/0xe0
Jun 13 17:54:30 cmv7 kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Jun 13 17:54:30 cmv7 kernel: [<ffffffff81646118>] ret_from_fork+0x58/0x90
Jun 13 17:54:30 cmv7 kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Jun 13 17:54:30 cmv7 kernel: Code: 8b 96 68 01 00 00 0f b7 c8 44 03 a6 b0 01 00 00 44 8b 04 8a 45 09 c4 f6 86 e0 00 00 00 04 75 10 48 8b 83 e8 00 00 00 44 89 60 40 <5b> 41 5c 5d c3 48 8d 76 08 0f b7 c8 44 89 e2 48 c7 c7 78 23 06
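
A trace like this is easier to read once the header and the call path are pulled out. The following is a minimal Python sketch, not a general-purpose log parser: it assumes the line format shown above, and the names `summarize_lockup`, `LOCKUP_RE` and `FRAME_RE` are made up for this illustration. It extracts the CPU number, the stuck duration, the task name and PID, and then lists the frames after "Call Trace:", skipping the entries marked with "?".

```python
import re

# Header line, e.g. "BUG: soft lockup - CPU#0 stuck for 23s! [rcuos/2:16]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)

# Call-trace frame, e.g. "[<ffffffffa0076729>] mptscsih_qcmd+0x249/0x820 [mptscsih]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def summarize_lockup(log_text):
    """Return (header, frames) for the first soft lockup found in log_text.

    header is a dict with cpu, secs, comm and pid, or None if no lockup
    line is present. frames is a list of (function, module) tuples taken
    from the lines after "Call Trace:"; module is None for built-in code.
    """
    header = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if header is None:
            m = LOCKUP_RE.search(line)
            if m:
                header = {
                    "cpu": int(m.group("cpu")),
                    "secs": int(m.group("secs")),
                    "comm": m.group("comm"),
                    "pid": int(m.group("pid")),
                }
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            # Frames prefixed with "?" are unreliable guesses; skip them.
            if m and not m.group("unreliable"):
                frames.append((m.group("func"), m.group("module")))
    return header, frames
```

Run against the log above, the module column shows that the top frames belong to `mptscsih` and `mptspi`, the SCSI driver modules that appear in the trace.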

What happened...

Processing became very slow, and the application all but stopped responding.

What I did...?

As a first step, I changed the following parameter and rebooted. After that the error stopped appearing and performance returned to normal.

From this:

/boot/config-3.10.0-327.18.2.el7.x86_64
CONFIG_LOCKUP_DETECTOR=y

To this:

/boot/config-3.10.0-327.18.2.el7.x86_64
CONFIG_LOCKUP_DETECTOR=n
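
As far as I know, /boot/config-<version> records the options the kernel was built with, so it may be worth checking what the running kernel reports as well. The following is a minimal Python sketch for that. It assumes the watchdog threshold is exposed as the sysctl `kernel.watchdog_thresh` under /proc/sys, and that the soft lockup threshold is twice that value, as described in the kernel's lockup watchdog documentation. The helper names are made up for this illustration.

```python
from pathlib import Path


def read_config_option(config_text, name):
    """Return the value of a CONFIG_* option from kernel config text.

    Returns "n" for the "# CONFIG_X is not set" form and None if the
    option does not appear at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {name} is not set":
            return "n"
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


def read_sysctl(name, proc_root="/proc/sys"):
    """Return a sysctl value as a string, or None if it cannot be read."""
    path = Path(proc_root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


def soft_lockup_threshold_seconds(watchdog_thresh):
    """Soft lockup threshold in seconds, assumed to be 2 * watchdog_thresh."""
    return 2 * watchdog_thresh


def report(config_path):
    """Print the build-time option next to the runtime watchdog threshold."""
    text = Path(config_path).read_text()
    print("CONFIG_LOCKUP_DETECTOR =",
          read_config_option(text, "CONFIG_LOCKUP_DETECTOR"))
    thresh = read_sysctl("kernel.watchdog_thresh")
    if thresh is None:
        print("kernel.watchdog_thresh is not readable on this system")
    else:
        print("soft lockup threshold (s) =",
              soft_lockup_threshold_seconds(int(thresh)))
```

For example, `report("/boot/config-3.10.0-327.18.2.el7.x86_64")` prints the value stored in the file alongside the threshold the running kernel is using, if that sysctl is available.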

Closing thoughts...

So far the error and its symptoms have not come back, but I have not identified the root cause yet, so I will keep investigating.

References

https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2094326
http://visualworks.dip.jp/achiral/blog/blog/2015/04/28/centos7%E3%82%B5%E3%83%BC%E3%83%90%E3%83%BC%E4%B8%8D%E8%AA%BF%E6%94%B9%E5%96%84soft-lockup/
