HP ProCurve switch port data collection issue

Description

I've found an interesting problem with data collection on the interfaces of our HP ProCurve 5400 and 3400 series switches (and probably others): the first two interfaces won't collect high-speed statistics.

I've noted this in installations of OpenNMS 1.8.5 and 1.8.10.

A fresh install of OpenNMS won't collect this data on any interface because the systemDef doesn't include mib2-X-interfaces. I added the mib2-X-interfaces group to the systemDef in /opt/opennms/etc/datacollection-config.xml and started getting "high speed" data for all interfaces except the first two on the switch:

<systemDef name="HP ProCurve">
  <sysoidMask>.1.3.6.1.4.1.11.2.3.7.11.</sysoidMask>
  <collect>
    <includeGroup>mib2-X-interfaces</includeGroup>
    <includeGroup>hp-procurve</includeGroup>
  </collect>
</systemDef>

<group name="mib2-X-interfaces" ifType="all">
  <mibObj oid=".1.3.6.1.2.1.31.1.1.1.1" instance="ifIndex" alias="ifName" type="string" />
  <mibObj oid=".1.3.6.1.2.1.31.1.1.1.15" instance="ifIndex" alias="ifHighSpeed" type="string" />
  <mibObj oid=".1.3.6.1.2.1.31.1.1.1.6" instance="ifIndex" alias="ifHCInOctets" type="Counter64" />
  <mibObj oid=".1.3.6.1.2.1.31.1.1.1.10" instance="ifIndex" alias="ifHCOutOctets" type="Counter64" />
</group>

I do note that on the HP ProCurve switches, OID .1.3.6.1.2.1.31.1.1.1.15 (ifHighSpeed) is reported as type Gauge rather than string for all interfaces. I've also confirmed with an snmpget that the .1.3.6.1.2.1.31.1.1.1.6.1 (and ...10.1) OIDs return valid data, comparable to interfaces 3-nn.
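For reference, the spot checks looked roughly like this (the community string "public" is a placeholder, and 10.10.10.128 is the switch address that shows up in the logs below):

# compare ifHCInOctets/ifHCOutOctets on ifIndex 1 (broken) vs. ifIndex 3 (working)
snmpget -v2c -c public 10.10.10.128 .1.3.6.1.2.1.31.1.1.1.6.1 .1.3.6.1.2.1.31.1.1.1.10.1
snmpget -v2c -c public 10.10.10.128 .1.3.6.1.2.1.31.1.1.1.6.3 .1.3.6.1.2.1.31.1.1.1.10.3
# ifHighSpeed for the first two interfaces; the switch returns these as Gauge32
snmpget -v2c -c public 10.10.10.128 .1.3.6.1.2.1.31.1.1.1.15.1 .1.3.6.1.2.1.31.1.1.1.15.2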

From the collectd.log:

I get a lot of these messages, which suggest that the interfaces are being walked for data. Note, however, that interfaces 1 and 2 skip ifHCInOctets and ifHCOutOctets, which are picked up starting with interface 3.

2011-02-25 13:56:13,216 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Discards [.1.3.6.1.2.1.2.2.1.19].[1] = '0'
2011-02-25 13:56:13,216 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Errors [.1.3.6.1.2.1.2.2.1.20].[1] = '0'
2011-02-25 13:56:13,216 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Discards [.1.3.6.1.2.1.2.2.1.19].[2] = '0'
2011-02-25 13:56:13,216 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Errors [.1.3.6.1.2.1.2.2.1.20].[2] = '0'
2011-02-25 13:56:13,224 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Discards [.1.3.6.1.2.1.2.2.1.19].[3] = '0'
2011-02-25 13:56:13,225 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOut Errors [.1.3.6.1.2.1.2.2.1.20].[3] = '0'
2011-02-25 13:56:13,225 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifHCInOctets [.1.3.6.1.2.1.31.1.1.1.6].[3] = '795336964480'
2011-02-25 13:56:13,225 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifHCOutOctets [.1.3.6.1.2.1.31.1.1.1.10].[3] = '33863740432524'
2011-02-25 13:56:13,225 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: pethMainPsePower [1.3.6.1.2.1.105.1.3.1.1.2].[3] = '546'
2011-02-25 13:56:13,226 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: pethMainPseConPower [1.3.6.1.2.1.105.1.3.1.1.4].[3] = '0'
2011-02-25 13:56:13,226 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOutDiscards [.1.3.6.1.2.1.2.2.1.19].[4] = '0'
2011-02-25 13:56:13,226 INFO [DefaultUDPTransportMapping_10.10.10.128/0] NumericAttributeType: Setting attribute: ifOutErrors [.1.3.6.1.2.1.2.2.1.20].[4] = '0'

Further along in collectd.log, I see these messages, which are a telltale sign of the missing high-speed data. These entries exist only for interfaceSnmp[A1] and interfaceSnmp[A2]:

2011-02-25 14:21:26,160 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[1].interfaceSnmp[A1]
2011-02-25 14:21:26,160 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[1].interfaceSnmp[A1]
2011-02-25 14:21:26,160 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[1].interfaceSnmp[A2]
2011-02-25 14:21:26,160 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[1].interfaceSnmp[A2]
2011-02-25 14:26:20,541 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[2].interfaceSnmp[A1]
2011-02-25 14:26:20,541 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[2].interfaceSnmp[A1]
2011-02-25 14:26:20,542 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[2].interfaceSnmp[A2]
2011-02-25 14:26:20,542 WARN [CollectdScheduler-50 Pool-fiber0] CollectorThresholdingSet: passedThresholdFilters: can't find value of ifHighSpeed for resource node[2].interfaceSnmp[A2]

I would appreciate any pointers on how to get OpenNMS to report the high-speed interface data for these first two interfaces.

[FOLLOWUP]
> Make sure the value reported back by ifHighSpeed is cached correctly in
> strings.properties by OpenNMS. You could try comparing
> strings.properties with a known working interface for the same node.
>
The strings.properties files for the first two interfaces are missing some data that a "normal" interface records.

Starting in working directory /var/opennms/rrd/snmp/1

# cat A2/strings.properties
#Thu Feb 24 16:48:20 MST 2011
ifSpeed=4294967295
ifDescr=A2

# cat A3/strings.properties
#Fri Feb 25 10:36:50 MST 2011
ifSpeed=4294967295
ifDescr=A3
ifName=A3
ifHighSpeed=10000

So, I deleted the strings.properties and all .jrb files for port "A1" and let OpenNMS recreate them. OpenNMS needed to be restarted for the strings.properties file to be recreated. The recreated file looks the same as the original.
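In case it matters, the cleanup was essentially this (paths are from my install; the restart command assumes the Debian init script):

cd /var/opennms/rrd/snmp/1
# remove the cached string attributes and the JRobin data files for port A1
rm A1/strings.properties A1/*.jrb
# collection didn't recreate strings.properties until the whole daemon was bounced
/etc/init.d/opennms restart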

# ls A1 A2 A3
A1:
ifInDiscards.jrb  ifInNUcastpkts.jrb  ifInUcastpkts.jrb  ifOutErrors.jrb      ifOutOctets.jrb     strings.properties
ifInErrors.jrb    ifInOctets.jrb      ifOutDiscards.jrb  ifOutNUcastPkts.jrb  ifOutUcastPkts.jrb

A2:
ifInDiscards.jrb  ifInNUcastpkts.jrb  ifInUcastpkts.jrb  ifOutErrors.jrb      ifOutOctets.jrb     strings.properties
ifInErrors.jrb    ifInOctets.jrb      ifOutDiscards.jrb  ifOutNUcastPkts.jrb  ifOutUcastPkts.jrb

A3:
ifHCInOctets.jrb   ifInErrors.jrb      ifInUcastpkts.jrb  ifOutNUcastPkts.jrb  strings.properties
ifHCOutOctets.jrb  ifInNUcastpkts.jrb  ifOutDiscards.jrb  ifOutOctets.jrb
ifInDiscards.jrb   ifInOctets.jrb      ifOutErrors.jrb    ifOutUcastPkts.jrb

> Curious why these show "ifOut" instead of ifOutDiscards or ifOutErrors.
> It might not be related to the Octets problem, but is there really an
> OID mapped to an attribute named "ifOut" in the data collection
> configuration? I don't see that anywhere in my configs.

I'm using stock configs on my test server, so I can confirm there is no bare "ifOut" mapped to an OID.
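To double-check, a quick grep over the stock config (stock path on my install) comes back empty:

# look for any attribute aliased as a bare "ifOut" in the data collection config
grep 'alias="ifOut"' /opt/opennms/etc/datacollection-config.xml
# no matches; only ifOutDiscards, ifOutErrors, ifOutOctets, etc. are defined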

I've got the collectd logs set to DEBUG, but I don't see any significant extra information. Please let me know what log extracts, snmpgets, etc. would help debug this issue.

Thanks!

Environment

OpenNMS server OS: Debian 5.0.7
OpenNMS version: 1.8.5

Also tried on RHEL 6 with OpenNMS 1.8.8. The 1.8.8 system is a standalone test box with only 4 monitored nodes (3 HP ProCurve switches and a Dell server) and is suitable for destructive testing.

Acceptance / Success Criteria

None

Attachments

  • 14 Sep 2011, 12:59 PM (SNMP walk of ProCurve 5400 series)
  • 09 Sep 2011, 10:49 AM

Activity

Anthony Johnson June 21, 2013 at 4:59 PM

FYI, HP Flex 10s have the same issue on 1.11.90

Robert Drake October 15, 2012 at 11:32 PM

I stumbled on your bug while looking for something else and thought I would take a look, since I might spot something in the snmpwalk.

Uninformed Theory:

Maybe, rather than polling the node again, it uses values stored in the database to repopulate that file if it disappears. If you haven't done so yet, can you try removing the node completely and adding it back? (Although I can't imagine you haven't done this, considering it was posted a year ago.)

From what I can see, without simulating it or having the hardware, the procurve.properties file you uploaded seems to show that all the needed values are there, so it should work.

Ray Frush November 14, 2011 at 11:26 AM

I haven't installed the latest and greatest bits, but the lack of activity on this ticket suggests that there's been little traction.

I think the way this will get fixed is if someone who's paying for support logs the same issue.

Mike Roberts November 13, 2011 at 11:43 PM

Did this go anywhere? We are also experiencing something very similar on 1.8.10 with brand-new enterprise D-Link switches.

Ray Frush September 14, 2011 at 12:59 PM

SNMP Walk of ProCurve 5400 series.

Details

Created March 1, 2011 at 6:46 PM
Updated September 21, 2021 at 6:24 PM