What's this
snmpd was running, yet no snmptrap was sent on Link up or Link Down, so I dug into it.
Environment
[root@localhost snmp]# cat /etc/redhat-release
CentOS Linux release 8.0.1905 (Core)
[root@localhost snmp]# snmpd --version
NET-SNMP version: 5.8
Web: http://www.net-snmp.org/
Email: net-snmp-coders@lists.sourceforge.net
[root@localhost snmp]# rpm -qa | grep snmp
net-snmp-5.8-12.el8_1.x86_64
net-snmp-libs-5.8-12.el8_1.x86_64
net-snmp-agent-libs-5.8-12.el8_1.x86_64
net-snmp-utils-5.8-12.el8_1.x86_64
snmpd.conf before the change
####
# First, map the community name "public" into a "security name"
# sec.name source community
com2sec notConfigUser default public
####
# Second, map the security name into a group name:
# groupName securityModel securityName
group notConfigGroup v1 notConfigUser
group notConfigGroup v2c notConfigUser
####
# Third, create a view for us to let the group have rights to:
# Make at least snmpwalk -v 1 localhost -c public system fast again.
# name incl/excl subtree mask(optional)
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.25.1.1
####
# Finally, grant the group read-only access to the systemview view.
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
trapsink 127.0.0.1 testcom
# Here is a commented out example configuration that allows less
# restrictive access.
trapsink 127.0.0.1 testcom should mean "throw an SNMPv1 snmptrap at 127.0.0.1 with the community name testcom, woohoo!"
Experiment (failure pattern: only added trapsink)
[root@localhost ~]# tcpdump -i lo port 162
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
[root@localhost ~]# tail -F /var/log/messages
Now unplug the interface (enp0s3).
(No matter how long I wait, nothing shows up in tcpdump.)
Mar 31 06:11:28 localhost kernel: e1000: enp0s3 NIC Link is Down
Mar 31 06:11:34 localhost NetworkManager[756]: <info> [1585649494.6265] device (enp0s3): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 31 06:11:34 localhost NetworkManager[756]: <info> [1585649494.6269] dhcp4 (enp0s3): canceled DHCP transaction
Mar 31 06:11:34 localhost NetworkManager[756]: <info> [1585649494.6269] dhcp4 (enp0s3): state changed bound -> done
Mar 31 06:11:34 localhost NetworkManager[756]: <info> [1585649494.6581] manager: NetworkManager state is now CONNECTED_LOCAL
Mar 31 06:11:34 localhost dbus-daemon[722]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.8' (uid=0 pid=756 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Mar 31 06:11:34 localhost systemd[1]: Starting Network Manager Script Dispatcher Service...
Mar 31 06:11:34 localhost dbus-daemon[722]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Mar 31 06:11:34 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 31 06:11:34 localhost nm-dispatcher[4374]: req:1 'down' [enp0s3]: new request (2 scripts)
Mar 31 06:11:34 localhost nm-dispatcher[4374]: req:1 'down' [enp0s3]: start running ordered scripts...
Mar 31 06:11:34 localhost nm-dispatcher[4374]: req:2 'connectivity-change': new request (2 scripts)
Mar 31 06:11:34 localhost chronyd[727]: Source 133.243.238.163 offline
Mar 31 06:11:34 localhost chronyd[727]: Can't synchronise: no selectable sources
Mar 31 06:11:34 localhost nm-dispatcher[4374]: req:2 'connectivity-change': start running ordered scripts...
Hey, what is going on here?
The answer is always in the official docs
Official: What traps are sent by the agent?
The agent does not send 'linkUp' or 'linkDown' traps by default. It can
be configured to do this using the directive 'linkUpDownNotifications'.
See the 'snmpd.conf(5)' man page (under ACTIVE MONITORING) for details.
In other words, linkUp and linkDown traps are off by default and have to be enabled with the linkUpDownNotifications directive.
Checking man 5 snmpd.conf (linkUpDownNotifications)
linkUpDownNotifications yes
will configure the Event MIB tables to monitor the ifTable for
network interfaces being taken up or down, and triggering a
linkUp or linkDown notification as appropriate.
This is exactly equivalent to the configuration:
notificationEvent linkUpTrap linkUp ifIndex ifAdminStatus ifOperStatus
notificationEvent linkDownTrap linkDown ifIndex ifAdminStatus ifOperStatus
monitor -r 60 -e linkUpTrap "Generate linkUp" ifOperStatus != 2
monitor -r 60 -e linkDownTrap "Generate linkDown" ifOperStatus == 2
So I guess I just need to add linkUpDownNotifications yes.
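To make the two monitor expressions quoted from the man page concrete, here is a minimal shell sketch of which event each ifOperStatus value would map to. event_for_status is a hypothetical helper, not part of net-snmp; it assumes the IF-MIB values up(1) and down(2), and it only mirrors the comparison, not the Event MIB machinery itself.

```shell
# Hypothetical helper: which event name matches a given ifOperStatus value,
# following the two monitor expressions from the man page excerpt.
event_for_status() {
    if [ "$1" -eq 2 ]; then
        echo linkDownTrap   # ifOperStatus == 2 (down)
    else
        echo linkUpTrap     # ifOperStatus != 2
    fi
}

event_for_status 1   # prints linkUpTrap
event_for_status 2   # prints linkDownTrap
```

Note that any value other than 2, for example testing(3), also satisfies ifOperStatus != 2.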
Experiment (failure pattern: only added linkUpDownNotifications)
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
trapsink 127.0.0.1 testcom
linkUpDownNotifications yes
Now run systemctl restart snmpd.service.
Mar 31 20:49:58 localhost systemd[1]: Stopping Simple Network Management Protocol (SNMP) Daemon....
Mar 31 20:49:58 localhost snmpd[1365]: Received TERM or STOP signal... shutting down...
Mar 31 20:49:58 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 20:49:58 localhost systemd[1]: Stopped Simple Network Management Protocol (SNMP) Daemon..
Mar 31 20:49:58 localhost systemd[1]: Starting Simple Network Management Protocol (SNMP) Daemon....
Mar 31 20:49:58 localhost snmpd[1375]: iquerySecName has not been configured - internal queries will fail
Mar 31 20:49:58 localhost snmpd[1375]: /etc/snmp/snmpd.conf: line 65: Error: You must specify a default user name using the agentSecName token
Mar 31 20:49:58 localhost snmpd[1375]: /etc/snmp/snmpd.conf: line 65: Error: You must specify a default user name using the agentSecName token
Mar 31 20:49:58 localhost snmpd[1375]: net-snmp: 2 error(s) in config file(s)
Mar 31 20:49:58 localhost snmpd[1375]: NET-SNMP version 5.8
Mar 31 20:49:58 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 20:49:58 localhost systemd[1]: Started Simple Network Management Protocol (SNMP) Daemon..
It got angry at me. It is complaining about the linkUpDownNotifications yes I just added: something about agentSecName.
Checking man 5 snmpd.conf (agentSecName)
agentSecName NAME
specifies the default SNMPv3 username, to be used when making
internal queries to retrieve any necessary information
(either for evaluating the monitored expression, or building a
notification payload). These internal queries always use
SNMPv3, even if normal querying of the agent is done using
SNMPv1 or SNMPv2c.
Note that this user must also be explicitly
created (createUser) and given appropriate access
rights (e.g. rouser). This directive is purely concerned with
defining which user should be used - not with actually setting
this user up.
Apparently I have to create a user with createUser, grant it appropriate access rights with rouser or similar, and then point agentSecName NAME at that user.
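The three requirements from the man page can be checked mechanically. The following is a minimal sketch, assuming all three directives live in the same file; check_internal_user is a hypothetical helper, and it only looks for rouser even though the man page gives rouser merely as an example of granting access.

```shell
# Hypothetical helper: succeed only if CONF sets up USER in all three ways
# the man page asks for (createUser, rouser, agentSecName).
check_internal_user() {
    conf=$1
    user=$2
    for directive in createUser rouser agentSecName; do
        if ! grep -Eq "^[[:space:]]*${directive}[[:space:]]+${user}([[:space:]]|\$)" "$conf"; then
            echo "missing: ${directive} ${user}"
            return 1
        fi
    done
    echo "ok: ${user}"
}

# Example (config path as reported in the snmpd error log):
# check_internal_user /etc/snmp/snmpd.conf userro
```

The helper stops at the first missing directive, so the message tells you which of the three steps was skipped.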
Experiment (success pattern)
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
rouser userro
createUser userro MD5 "hogehoge" DES
trapsink 127.0.0.1 testcom
agentSecName userro
linkUpDownNotifications yes
Now run systemctl restart snmpd.service.
Mar 31 20:59:14 localhost systemd[1]: Stopping Simple Network Management Protocol (SNMP) Daemon....
Mar 31 20:59:14 localhost snmpd[1375]: Received TERM or STOP signal... shutting down...
Mar 31 20:59:14 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 20:59:14 localhost systemd[1]: Stopped Simple Network Management Protocol (SNMP) Daemon..
Mar 31 20:59:14 localhost systemd[1]: Starting Simple Network Management Protocol (SNMP) Daemon....
Mar 31 20:59:14 localhost snmpd[1413]: NET-SNMP version 5.8
Mar 31 20:59:14 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 20:59:14 localhost systemd[1]: Started Simple Network Management Protocol (SNMP) Daemon..
Looks fine.
[root@localhost ~]# tcpdump -i lo port 162
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
[root@localhost ~]# tail -F /var/log/messages
Now unplug the interface (enp0s3).
21:02:14.599493 IP localhost.54843 > localhost.snmptrap: C="testcom" Trap(107) E:8072.3.2.10 192.168.56.3 linkDown 18004 interfaces.ifTable.ifEntry.ifIndex.2=2 interfaces.ifTable.ifEntry.ifAdminStatus.2=1 interfaces.ifTable.ifEntry.ifOperStatus.2=2 S:1.1.4.3.0=E:8072.3.2.10
Mar 31 21:01:38 localhost kernel: e1000: enp0s3 NIC Link is Down
Mar 31 21:01:44 localhost NetworkManager[742]: <info> [1585702904.8108] device (enp0s3): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 31 21:01:44 localhost NetworkManager[742]: <info> [1585702904.8110] dhcp4 (enp0s3): canceled DHCP transaction
Mar 31 21:01:44 localhost NetworkManager[742]: <info> [1585702904.8110] dhcp4 (enp0s3): state changed bound -> done
Mar 31 21:01:44 localhost NetworkManager[742]: <info> [1585702904.8407] manager: NetworkManager state is now CONNECTED_LOCAL
Mar 31 21:01:44 localhost dbus-daemon[712]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.7' (uid=0 pid=742 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Mar 31 21:01:44 localhost systemd[1]: Starting Network Manager Script Dispatcher Service...
Mar 31 21:01:44 localhost dbus-daemon[712]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Mar 31 21:01:44 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 31 21:01:44 localhost nm-dispatcher[1518]: req:1 'down' [enp0s3]: new request (2 scripts)
Mar 31 21:01:44 localhost nm-dispatcher[1518]: req:1 'down' [enp0s3]: start running ordered scripts...
Mar 31 21:01:44 localhost nm-dispatcher[1518]: req:2 'connectivity-change': new request (2 scripts)
Mar 31 21:01:44 localhost chronyd[716]: Source 133.243.238.163 offline
Mar 31 21:01:44 localhost nm-dispatcher[1518]: req:2 'connectivity-change': start running ordered scripts...
Mar 31 21:02:14 localhost snmpd[1413]: empty variable list in _query
Mar 31 21:02:14 localhost snmptrapd[1371]: No access configuration - dropping trap.
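Oh, it came out!
But Link is Down happened at Mar 31 21:01:38 and the trap only went out at 21:02:14. Isn't 36 seconds kind of slow?
Experiment (improvement pattern: changing the monitoring interval)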
linkUpDownNotifications yes
will configure the Event MIB tables to monitor the ifTable for
network interfaces being taken up or down, and triggering a
linkUp or linkDown notification as appropriate.
This is exactly equivalent to the configuration:
notificationEvent linkUpTrap linkUp ifIndex ifAdminStatus ifOperStatus
notificationEvent linkDownTrap linkDown ifIndex ifAdminStatus ifOperStatus
monitor -r 60 -e linkUpTrap "Generate linkUp" ifOperStatus != 2
monitor -r 60 -e linkDownTrap "Generate linkDown" ifOperStatus == 2
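In short, defining linkUpDownNotifications yes does the same thing as the last four lines of that excerpt.
What catches my eye is the monitor -r 60 part. This probably means a 60-second polling interval (the man page describes -r as how often the monitored expression is evaluated).
The last systemctl restart snmpd.service was at Mar 31 20:59:14 and the snmptrap went out at Mar 31 21:02:14. The seconds match, exactly three 60-second periods apart, so this is probably it.
So to shorten the monitoring interval, it looks better to drop linkUpDownNotifications yes and define the following instead.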
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
rouser userro
createUser userro MD5 "hogehoge" DES
trapsink 127.0.0.1 testcom
agentSecName userro
#linkUpDownNotifications yes
notificationEvent linkUpTrap linkUp ifIndex ifAdminStatus ifOperStatus
notificationEvent linkDownTrap linkDown ifIndex ifAdminStatus ifOperStatus
monitor -r 10 -e linkUpTrap "Generate linkUp" ifOperStatus != 2
monitor -r 10 -e linkDownTrap "Generate linkDown" ifOperStatus == 2
As an example I set a 10-second interval.
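If you want to try several intervals, a small helper can print the four directives with -r filled in. This is only a sketch: gen_linkupdown_conf is a hypothetical name, the event names are the ones used in this article, and the output still has to be pasted into snmpd.conf by hand.

```shell
# Hypothetical helper: print the explicit Event MIB directives for a given
# polling interval in seconds (defaults to 60, like linkUpDownNotifications).
gen_linkupdown_conf() {
    interval="${1:-60}"
    cat <<EOF
notificationEvent linkUpTrap linkUp ifIndex ifAdminStatus ifOperStatus
notificationEvent linkDownTrap linkDown ifIndex ifAdminStatus ifOperStatus
monitor -r ${interval} -e linkUpTrap "Generate linkUp" ifOperStatus != 2
monitor -r ${interval} -e linkDownTrap "Generate linkDown" ifOperStatus == 2
EOF
}

gen_linkupdown_conf 10
```

Keeping the interval in one place avoids the two monitor lines drifting apart when you edit them.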
Run systemctl restart snmpd.service.
21:10:52.073503 IP localhost.54843 > localhost.snmptrap: C="testcom" Trap(29) E:8072.4 192.168.56.3 enterpriseSpecific s=2 69752
21:10:52.197584 IP localhost.43359 > localhost.snmptrap: C="testcom" Trap(29) E:8072.3.2.10 192.168.56.3 coldStart 7
21:10:52.226101 IP localhost.43359 > localhost.snmptrap: C="testcom" Trap(106) E:8072.3.2.10 192.168.56.3 linkUp 10 interfaces.ifTable.ifEntry.ifIndex.1=1 interfaces.ifTable.ifEntry.ifAdminStatus.1=1 interfaces.ifTable.ifEntry.ifOperStatus.1=1 S:1.1.4.3.0=E:8072.3.2.10
21:10:52.226285 IP localhost.43359 > localhost.snmptrap: C="testcom" Trap(106) E:8072.3.2.10 192.168.56.3 linkUp 10 interfaces.ifTable.ifEntry.ifIndex.3=3 interfaces.ifTable.ifEntry.ifAdminStatus.3=1 interfaces.ifTable.ifEntry.ifOperStatus.3=1 S:1.1.4.3.0=E:8072.3.2.10
21:10:52.226617 IP localhost.43359 > localhost.snmptrap: C="testcom" Trap(106) E:8072.3.2.10 192.168.56.3 linkDown 10 interfaces.ifTable.ifEntry.ifIndex.2=2 interfaces.ifTable.ifEntry.ifAdminStatus.2=1 interfaces.ifTable.ifEntry.ifOperStatus.2=2 S:1.1.4.3.0=E:8072.3.2.10
Mar 31 21:10:52 localhost systemd[1]: Stopping Simple Network Management Protocol (SNMP) Daemon....
Mar 31 21:10:52 localhost snmpd[1413]: Received TERM or STOP signal... shutting down...
Mar 31 21:10:52 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 21:10:52 localhost systemd[1]: Stopped Simple Network Management Protocol (SNMP) Daemon..
Mar 31 21:10:52 localhost systemd[1]: Starting Simple Network Management Protocol (SNMP) Daemon....
Mar 31 21:10:52 localhost snmpd[1547]: NET-SNMP version 5.8
Mar 31 21:10:52 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 21:10:52 localhost systemd[1]: Started Simple Network Management Protocol (SNMP) Daemon..
Mar 31 21:10:52 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 21:10:52 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 21:10:52 localhost snmptrapd[1371]: No access configuration - dropping trap.
Huh, traps came streaming out even though I haven't touched the cable yet.
Maybe defining them individually makes it send the current state.
Now unplug the cable.
21:26:23.634308 IP localhost.43359 > localhost.snmptrap: C="testcom" Trap(108) E:8072.3.2.10 192.168.56.3 linkDown 93009 interfaces.ifTable.ifEntry.ifIndex.2=2 interfaces.ifTable.ifEntry.ifAdminStatus.2=1 interfaces.ifTable.ifEntry.ifOperStatus.2=2 S:1.1.4.3.0=E:8072.3.2.10
Mar 31 21:26:18 localhost kernel: e1000: enp0s3 NIC Link is Down
Mar 31 21:26:23 localhost snmptrapd[1371]: No access configuration - dropping trap.
Mar 31 21:26:24 localhost NetworkManager[742]: <info> [1585704384.8745] device (enp0s3): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 31 21:26:24 localhost NetworkManager[742]: <info> [1585704384.8747] dhcp4 (enp0s3): canceled DHCP transaction
Mar 31 21:26:24 localhost NetworkManager[742]: <info> [1585704384.8747] dhcp4 (enp0s3): state changed bound -> done
Mar 31 21:26:24 localhost NetworkManager[742]: <info> [1585704384.9080] manager: NetworkManager state is now CONNECTED_LOCAL
Mar 31 21:26:24 localhost dbus-daemon[712]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.7' (uid=0 pid=742 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
Mar 31 21:26:24 localhost systemd[1]: Starting Network Manager Script Dispatcher Service...
Mar 31 21:26:24 localhost dbus-daemon[712]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Mar 31 21:26:24 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 31 21:26:24 localhost nm-dispatcher[1648]: req:1 'down' [enp0s3]: new request (2 scripts)
Mar 31 21:26:24 localhost nm-dispatcher[1648]: req:1 'down' [enp0s3]: start running ordered scripts...
Mar 31 21:26:24 localhost nm-dispatcher[1648]: req:2 'connectivity-change': new request (2 scripts)
Mar 31 21:26:24 localhost chronyd[716]: Source 133.243.238.163 offline
Mar 31 21:26:24 localhost chronyd[716]: Can't synchronise: no selectable sources
Mar 31 21:26:24 localhost nm-dispatcher[1648]: req:2 'connectivity-change': start running ordered scripts...
The last systemctl restart snmpd.service was at Mar 31 21:10:52, so with 10-second polling counted from there, this timing looks about right.
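The "do the seconds line up with the restart time?" check can be written down as arithmetic. This is a rough sketch under the assumption that the first poll happens at the restart time and that both timestamps fall on the same day; to_secs and poll_offset are hypothetical helper names.

```shell
# Convert HH:MM:SS to seconds since midnight (leading zeros stripped so the
# shell does not read "08" or "09" as invalid octal numbers).
to_secs() {
    IFS=: read -r h m s <<EOF
$1
EOF
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Seconds past the most recent polling tick, assuming the first tick
# happens at the restart time. 0 means the trap landed exactly on a tick.
poll_offset() {
    start=$(to_secs "$1")
    seen=$(to_secs "$2")
    echo $(( (seen - start) % $3 ))
}

poll_offset 20:59:14 21:02:14 60   # prints 0
poll_offset 21:10:52 21:26:23 10   # prints 1
```

With the timestamps from the two experiments, the 60-second case lands exactly on a tick and the 10-second case is one second off, which is within what log timestamps rounded to the second would show.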
Misc
What is the following line that shows up in /var/log/messages whenever an snmptrap is issued?
Mar 31 21:02:14 localhost snmptrapd[1371]: No access configuration - dropping trap.
Searching turned up these articles:
ITインフラ技術の実験室 - SNMPの設定
Resolve the message "No access configuration" of snmptrapd
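Reading them, this is not about sending the snmptrap but about receiving it.
This time I sent to the local address 127.0.0.1 with the community name testcom, and the message seems to appear because the local snmptrapd is not configured to accept that snmptrap.
In production the traps go to an external manager that is configured properly, so it should be fine.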
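If you do want the local snmptrapd to accept the test traps, it needs an access rule for the community. Here is a minimal sketch, assuming the authCommunity directive described in snmptrapd.conf(5); it is untested in the environment above, and the scratch file path is only illustrative.

```shell
# Write a minimal snmptrapd.conf sketch to a scratch file for review.
# The scratch path is illustrative; nothing under /etc is touched here.
conf="${TMPDIR:-/tmp}/snmptrapd.conf.sample"
cat > "$conf" <<'EOF'
# Accept traps sent with the community name "testcom" and log them
authCommunity log testcom
EOF
cat "$conf"
```

After reviewing it, the line would go into the snmptrapd config file on the receiving host, followed by a restart of snmptrapd.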
Conclusion
Pick one of the following.
60-second monitoring interval (simple)
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
rouser userro
createUser userro MD5 "hogehoge" DES
trapsink 127.0.0.1 testcom
agentSecName userro
linkUpDownNotifications yes
10-second monitoring interval (messier)
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none
# -----------------------------------------------------------------------------
rouser userro
createUser userro MD5 "hogehoge" DES
trapsink 127.0.0.1 testcom
agentSecName userro
notificationEvent linkUpTrap linkUp ifIndex ifAdminStatus ifOperStatus
notificationEvent linkDownTrap linkDown ifIndex ifAdminStatus ifOperStatus
monitor -r 10 -e linkUpTrap "Generate linkUp" ifOperStatus != 2
monitor -r 10 -e linkDownTrap "Generate linkDown" ifOperStatus == 2