SnmpMonitor / SnmpPlugin "walk" parameter not working correctly

Description

Defining a service using SnmpPlugin and having SnmpMonitor use the "walk"
parameter does not work correctly.

The logic described in the
"Monitoring a Dell PowerEdge Expandable RAID Controller 3/Di" Wiki page says
that the "walk" parameter should mark the service down if any returned value
does not meet the specified criteria.

The logic in the code (SnmpMonitor.java) marks the service up if any returned
value meets the specified criteria.

I believe the correct behavior would be to mark the service down if any returned
value does not match. Having looked at many SNMP MIB trees, I find many
instances where a table returns status values for a group of related items
(cards in a switch, disks in a server, etc.). In such cases,
it would be useful to know if any one value is bad.

In SnmpMonitor.java, PollStatus, the code does this:
PollStatus status = PollStatus.unavailable();
[...]
if ("true".equals(walkstr)) {
    List<SnmpValue> results = SnmpUtils.getColumns(agentConfig, "snmpPoller", snmpObjectId);
    for (SnmpValue result : results) {
        if (result != null) {
            log().debug("poll: SNMPwalk poll succeeded, addr=" + ipaddr.getHostAddress() + " oid=" + oid + " value=" + result);
            if (meetsCriteria(result, operator, operand)) {
                status = PollStatus.available();
            }
        } else {

The desired behavior is to track all results inside the for loop and, on
exit from the loop, mark the status as available only if all results
met the criteria.
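
The requested "all values must match" logic could look roughly like the following. This is a minimal, self-contained sketch, not the actual SnmpMonitor code: the class name WalkAllMatchSketch and the method allMeetCriteria are hypothetical, a plain Predicate stands in for meetsCriteria(result, operator, operand), and treating an empty walk or a null value as a failure is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class WalkAllMatchSketch {

    /**
     * Returns true only if the walk returned at least one value and every
     * value satisfies the criteria. Empty results and null values are
     * treated as failures (an assumption, not confirmed by the source).
     */
    public static <T> boolean allMeetCriteria(List<T> results, Predicate<T> criteria) {
        if (results == null || results.isEmpty()) {
            return false;
        }
        for (T result : results) {
            if (result == null || !criteria.test(result)) {
                return false; // one bad value is enough to fail the whole walk
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Values taken from the pollerd.log excerpt in this report: 3, 3, 4, 3, 3
        List<Integer> walked = Arrays.asList(3, 3, 4, 3, 3);
        Predicate<Integer> equalsThree = v -> v == 3;
        System.out.println(allMeetCriteria(walked, equalsThree)); // prints false
    }
}
```

In the monitor itself, the boolean result would decide between PollStatus.available() and PollStatus.unavailable() after the loop finishes.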

Configuration details:

capsd-configuration.xml:
<protocol-plugin protocol="Dell-PERC"
    class-name="org.opennms.netmgt.capsd.plugins.SnmpPlugin" scan="on" user-defined="false">
    <property key="vbname" value=".1.3.6.1.4.1.674.10893.1.20.130.1.1.5.1" />
    <property key="timeout" value="2000" />
    <property key="retry" value="1" />
</protocol-plugin>

poller-configuration.xml:
<service name="Dell-PERC" interval="300000" user-defined="false" status="on">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="port" value="161"/>
    <parameter key="oid" value=".1.3.6.1.4.1.674.10893.1.20.130.4.1.23"/>
    <parameter key="operator" value="="/>
    <parameter key="operand" value="3"/>
    <parameter key="walk" value="true"/>
</service>
<monitor service="Dell-PERC" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>

Output from pollerd.log:
PollableService: Start Scheduled Poll of service 100:<IP>:Dell_PERC
PollableServiceConfig: Polling 100:<IP>:Dell_PERC using pkg example1
SnmpMonitor: poll: service= SNMP address= AgentConfig[Address: <NAME>/<IP>, Port: 161, Community: *, Timeout: 3000, Retries: 3, MaxVarsPerPdu: 10, MaxRepitions: 2, Max request size: 65535, Version: 1, ProxyForAddress: null]
SnmpMonitor: SnmpMonitor.poll: SnmpAgentConfig address: AgentConfig[Address: <NAME>/<IP>, Port: 161, Community: *, Timeout: 3000, Retries: 3, MaxVarsPerPdu: 10, MaxRepitions: 2, Max request size: 65535, Version: 1, ProxyForAddress: null]
Snmp4JWalker: Walking snmpPoller for <NAME>/<IP> using version SNMPv1 with config: AgentConfig[Address: <NAME>/<IP>, Port: 161, Community:*, Timeout: 3000, Retries: 3, MaxVarsPerPdu: 10, MaxRepitions: 2, Max request size: 65535, Version: 1, ProxyForAddress: null]
Snmp4JWalker: Sending tracker pdu of size 1
SnmpMonitor: poll: SNMPwalk poll succeeded, addr=<IP> oid=.1.3.6.1.4.1.674.10893.1.20.130.4.1.23 value=3
SnmpMonitor: poll: SNMPwalk poll succeeded, addr=<IP> oid=.1.3.6.1.4.1.674.10893.1.20.130.4.1.23 value=3
SnmpMonitor: poll: SNMPwalk poll succeeded, addr=<IP> oid=.1.3.6.1.4.1.674.10893.1.20.130.4.1.23 value=4
SnmpMonitor: poll: SNMPwalk poll succeeded, addr=<IP> oid=.1.3.6.1.4.1.674.10893.1.20.130.4.1.23 value=3
SnmpMonitor: poll: SNMPwalk poll succeeded, addr=<IP> oid=.1.3.6.1.4.1.674.10893.1.20.130.4.1.23 value=3
PollableServiceConfig: Finish polling 100:<IP>:Dell_PERC using pkg example1 result =Up
PollableService: Finish Scheduled Poll of service 100:<IP>:Dell_PERC, started at ...

Environment

Operating System: All
Platform: All

Acceptance / Success Criteria

None

Activity

Ralph Waters February 11, 2008 at 4:21 PM

Parameterizing makes sense. Perhaps allow for some sort of "match-all" or "match-any" or "match-none" criteria?

This would allow users to poll an OID tree and mark a service up when:

  • all of the sub-values meet the criteria

  • any one of the sub-values meets the criteria

  • none of the sub-values meet the criteria

The current implementation matches any one of the sub-values, so that would need to be the default.
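
One way the three modes could be parameterized is sketched below. Everything here is hypothetical: the class WalkMatchModeSketch, the enum MatchMode, and the parameter values "match-all", "match-any" and "match-none" are only illustrations of the suggestion above, not an existing SnmpMonitor option. Unknown or missing values fall back to match-any so that existing configurations would keep today's behavior; counting an empty walk as down is an assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

public class WalkMatchModeSketch {

    /** Hypothetical match modes mirroring the three cases listed above. */
    public enum MatchMode { MATCH_ALL, MATCH_ANY, MATCH_NONE }

    /** Parses a hypothetical parameter value; defaults to MATCH_ANY (current behavior). */
    public static MatchMode parse(String value) {
        if (value == null) {
            return MatchMode.MATCH_ANY;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "match-all":
                return MatchMode.MATCH_ALL;
            case "match-none":
                return MatchMode.MATCH_NONE;
            default:
                return MatchMode.MATCH_ANY;
        }
    }

    /** Decides up/down for a walk; null values are counted as non-matching. */
    public static <T> boolean isUp(List<T> results, Predicate<T> criteria, MatchMode mode) {
        if (results == null || results.isEmpty()) {
            return false; // assumption: nothing returned means the service is down
        }
        long matches = results.stream()
                .filter(r -> r != null && criteria.test(r))
                .count();
        switch (mode) {
            case MATCH_ALL:
                return matches == results.size();
            case MATCH_NONE:
                return matches == 0;
            default:
                return matches > 0;
        }
    }
}
```

With the walked values from the log in this report (3, 3, 4, 3, 3) and the criteria "= 3", this sketch would report up for match-any and down for both match-all and match-none.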

Jeff Gehlbach February 11, 2008 at 1:47 PM

How about parameterizing the behavior rather than changing the existing one? That way people using it as it works today won't break when they upgrade.

Details

Created February 11, 2008 at 10:31 AM
Updated September 21, 2021 at 6:23 PM
