Fixed
Details
Assignee: Benjamin Reed
Reporter: David Schlenk
HB Grooming Date: Nov 10, 2020
HB Backlog Status: NB
Components:
Fix versions:
Priority: Minor
Created October 31, 2020 at 12:23 AM
Updated March 1, 2021 at 8:27 PM
Resolved November 20, 2020 at 3:23 PM
For DevJam 2020, pandemic edition, I worked on improving the SNMP interface poller. As is, the SNMP interface poller only reacts to the up(1) and down(2) values of ifAdminStatus and ifOperStatus. For many cases, this is fine, but there are more possible values and we should handle all of them in a configurable manner.

The possible values from IF-MIB are:

int  meaning
1    up
2    down
3    testing
4    unknown
5    dormant
6    notPresent
7    lowerLayerDown
For instance, Cisco marks the status of failed SIP dial peers as testing(3) rather than down(2), and in fact, if you read RFC 2863, the purpose of two of the newer values, notPresent(6) and lowerLayerDown(7), is to serve as more refined down states.

My changes make it possible to treat additional values as up or down values through configuration. Additionally, if the specific down value is not down(2), an additional event (that reduces on the normal down alarm) is sent indicating the specific value present in ifOperStatus. For these purposes, any value that is not up(1) is treated as a potential down-like value, despite the fact that one could reasonably argue that dormant(5) could be treated as an up value if you had a particularly optimistic view of the world.

I also changed the logic for emitting events to send the interface(Admin|Oper)Up events whenever the previous stored value was not one of the configured up values, rather than only when it was literally down (or one of the newly-configured down values). Likewise, the inverse: send the interface(Admin|Oper)Down events whenever the previous value was not one of the configured down values. The reason for this change comes from the RFC:

    Change to the testing state if some test(s) must be performed
    on the interface. Presumably after completion of the test, the
    interface's state will change to up, dormant, or down, as
    appropriate.

Since we don't know exactly what state may precede an interface ending up in an up / down state, we should react to any previous state that is not already in the same configured set, rather than only reacting to the states configured to be down / up.
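To make the transition rule above concrete, here is a minimal sketch in Python. It is not the poller's actual code: the function name events_for_poll, the example value sets, and the "operStatusDetail:" event name are all hypothetical, and it assumes the extra value-specific event is sent together with the down event. Only the IF-MIB values and the interfaceOperUp / interfaceOperDown names come from the description.

```python
from enum import IntEnum
from typing import FrozenSet, List


class IfStatus(IntEnum):
    """Status values from IF-MIB, as listed in the table above."""
    UP = 1
    DOWN = 2
    TESTING = 3
    UNKNOWN = 4
    DORMANT = 5
    NOT_PRESENT = 6
    LOWER_LAYER_DOWN = 7


# Example configuration only (assumption): which values count as up or down.
UP_VALUES: FrozenSet[IfStatus] = frozenset({IfStatus.UP})
DOWN_VALUES: FrozenSet[IfStatus] = frozenset(
    {IfStatus.DOWN, IfStatus.TESTING, IfStatus.LOWER_LAYER_DOWN}
)


def events_for_poll(
    previous: IfStatus,
    current: IfStatus,
    up_values: FrozenSet[IfStatus] = UP_VALUES,
    down_values: FrozenSet[IfStatus] = DOWN_VALUES,
) -> List[str]:
    """Return the event names to send for one ifOperStatus poll result."""
    events: List[str] = []
    if current in up_values:
        # Send "up" whenever the stored value was not already an up value,
        # not only when it was one of the configured down values.
        if previous not in up_values:
            events.append("interfaceOperUp")
    elif current in down_values:
        # Likewise, send "down" whenever the stored value was not already
        # one of the configured down values.
        if previous not in down_values:
            events.append("interfaceOperDown")
            if current != IfStatus.DOWN:
                # Hypothetical name for the extra event carrying the
                # specific non-down(2) value.
                events.append(f"operStatusDetail:{current.name}")
    # Values in neither set (e.g. dormant here) produce no event.
    return events
```

With this example configuration, a move from unknown(4) to up(1) yields an up event even though unknown(4) is not a configured down value, while a move from down(2) to lowerLayerDown(7) yields nothing because both are already in the down set.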
Also, while discussing these changes with David Hustace, he suggested a change to use common reduction / clear keys for these events and the (translated) SNMP_Link_(Up|Down) events so they reduce on each other in cases where both link traps and the SNMP interface poller are enabled.

Finally, documentation of the SNMP interface poller existed mostly in not entirely accurate/updated wiki pages, so I took a stab at improving it.
The PR is #3178.