Problems associated with SNMP4J affect OpenNMS performance (contention issues)

Description

While diagnosing a problem on a big installation of OpenNMS, I found that there are several threads blocked by org.snmp4j.security.SecurityProtocols.addDefaultProtocols.

That method is invoked inside SNMP4J and is typically related to SNMPv3, but there doesn't seem to be a way to avoid calling it when creating an SNMP session. The system on which this problem was detected doesn't use SNMPv3 at all.

Potential Workarounds (could have side effects):

1. Go back to JoeSNMP
2. Comment out org.snmp4j.smisyntaxes in opennms.properties

Acceptance / Success Criteria

None

Activity

Seth Leger April 12, 2017 at 10:29 AM

PR merged, marking as fixed.

commit 45d11cb9e66f60ff2a565aba7a2d526dad938d32

Jesse White April 7, 2017 at 4:04 PM

PR: https://github.com/OpenNMS/opennms/pull/1428

I found a way to optimize the creation of SNMP4J sessions in order to avoid this source of contention. Using a single session as proposed in https://opennms.atlassian.net/browse/NMS-8825#icft=NMS-8825 would help optimize this further, but would require more code changes.

Jesse White March 17, 2017 at 4:35 PM

Yeah, this could be related to https://opennms.atlassian.net/browse/NMS-8825#icft=NMS-8825; using a single thread with async requests would solve this particular problem too.

On another large install, we found a similar problem. The stack trace reveals 397 threads with this stack:

at org.snmp4j.security.SecurityProtocols.addDefaultProtocols(SecurityProtocols.java:100)
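
For anyone who wants to get a similar count from inside a running JVM rather than from a thread dump, here is a rough sketch using only the JDK. The class BlockedFrameCounter and the methods contended() and demo() are hypothetical names made up for this illustration; only Thread.getAllStackTraces() is a real JDK call. For the case above, the arguments would be "org.snmp4j.security.SecurityProtocols" and "addDefaultProtocols".

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;

class BlockedFrameCounter {

    /** Counts live threads whose current stack contains className.methodName. */
    static int countThreadsInFrame(String className, String methodName) {
        int count = 0;
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    count++;
                    break; // count each thread at most once
                }
            }
        }
        return count;
    }

    static final Object LOCK = new Object();

    /** Hypothetical contended method, used only to demonstrate the counter. */
    static void contended(CountDownLatch entered) {
        entered.countDown();
        synchronized (LOCK) {
            // Nothing to do; the point is to wait for the monitor.
        }
    }

    /** Starts `threads` workers that pile up in contended() and returns how many were seen there. */
    static int demo(int threads) {
        CountDownLatch entered = new CountDownLatch(threads);
        Thread[] workers = new Thread[threads];
        int seen;
        synchronized (LOCK) { // hold the monitor so every worker stays inside contended()
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> contended(entered));
                workers[i].start();
            }
            try {
                entered.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            seen = countThreadsInFrame(BlockedFrameCounter.class.getName(), "contended");
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("threads in contended(): " + demo(4));
    }
}
```

This only sees threads in the JVM it runs in, so on a production system a regular thread dump is still the more practical source for this number.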

Ron Roskens March 15, 2017 at 12:32 PM

Related to NMS-8825?

Why are we recreating the Snmp session each time to send out a request?

From https://oosnmp.net/confluence/pages/viewpage.action?pageId=1441796, they seem to recommend creating one Snmp session and sharing it across threads.

Jesse White March 15, 2017 at 11:44 AM

addDefaultProtocols() is called every time a new Snmp session is created (regardless of whether or not v3 is being used):
https://github.com/j-white/snmp4j/blob/2.5.5/src/main/java/org/snmp4j/Snmp.java#L237

This method is synchronized, and instantiates the protocol classes every time it is called:
https://github.com/j-white/snmp4j/blob/2.5.5/src/main/java/org/snmp4j/security/SecurityProtocols.java#L102

Optimizing this will likely require changes to snmp4j upstream.
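
To illustrate the pattern described above, here is a simplified model that uses only the JDK. It is not SNMP4J's code: ProtocolRegistry, DEFAULT_IDS and both addDefaults methods are hypothetical stand-ins. The "once" variant is just one possible mitigation and is not necessarily what an upstream fix would look like.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class ContentionSketch {

    // Hypothetical protocol identifiers; placeholders only.
    static final String[] DEFAULT_IDS = {"auth-a", "auth-b", "priv-a", "priv-b", "priv-c"};

    /** Hypothetical stand-in for a shared protocol registry; not SNMP4J's real class. */
    static class ProtocolRegistry {
        final AtomicInteger instantiations = new AtomicInteger();
        private final Map<String, Object> protocols = new HashMap<>();
        private boolean defaultsAdded = false;

        /** The pattern described above: each call takes the lock and rebuilds every entry. */
        synchronized void addDefaultsEveryTime() {
            for (String id : DEFAULT_IDS) {
                protocols.put(id, new Object());
                instantiations.incrementAndGet();
            }
        }

        /** One possible mitigation: populate once, then return almost immediately. */
        synchronized void addDefaultsOnce() {
            if (defaultsAdded) {
                return;
            }
            addDefaultsEveryTime(); // monitors are reentrant, so this nested call is safe
            defaultsAdded = true;
        }
    }

    /** Simulates worker threads that each create several sessions; returns total instantiations. */
    static int run(int threads, int sessionsPerThread, boolean once) {
        ProtocolRegistry registry = new ProtocolRegistry();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int s = 0; s < sessionsPerThread; s++) {
                    if (once) {
                        registry.addDefaultsOnce();
                    } else {
                        registry.addDefaultsEveryTime();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.instantiations.get();
    }

    public static void main(String[] args) {
        System.out.println("every time: " + run(8, 100, false)); // 8 * 100 * 5 = 4000
        System.out.println("once:       " + run(8, 100, true));  // 5
    }
}
```

In the "every time" variant, all 800 simulated session creations serialize on one monitor and redo the same work, which is the shape of the contention seen here. The "once" variant still takes the lock, but holds it only long enough to check a flag.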

Fixed

Details

Created March 15, 2017 at 11:25 AM
Updated April 12, 2017 at 2:28 PM
Resolved April 12, 2017 at 10:29 AM