Add connection pooling to VMware data collection

Description

We're trying out the VMware integration in OpenNMS 14.0.1, using the auto-generated configuration files produced by following the wiki doc. We're seeing OpenNMS log into vCenter as often as once every 1 to 10 seconds. This adds a huge number of entries to the vCenter event tables (VPX_EVENT/VPX_EVENT_ARG). However, the auto-generated VMware config seems to indicate that data collection should only happen every 5 minutes, and the same goes for the VMwareCim-HostSystem polling.

Ronald Roskens suggested:
"I took a peek at the data collection and each attempt at data collection is going to cause a login into vCenter. So if you have 300 provisioned nodes from inside vCenter, I could envision each node making its own connection. Additionally, if you have this many, then each one would probably also be attempting a login during the 5 minutes.

If you look at $OPENNMS_HOME/logs/instrumentation.log and search for lines containing "collector.collect: begin:", you should get an idea of which package, node, interface, and service metrics are being recorded for. This should help to identify whether it's data collection or the poller causing the high number of connections."
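The log search suggested above can be scripted. What follows is a minimal sketch in Python, assuming only that the relevant lines contain the literal marker quoted above. The layout of the rest of each line is not known here, so whatever text follows the marker is used verbatim as the grouping key. The names `count_collect_begins` and `count_in_file` are hypothetical helpers, not part of OpenNMS.

```python
from collections import Counter

# Literal marker quoted in the suggestion above.
MARKER = "collector.collect: begin:"


def count_collect_begins(lines):
    """Tally lines containing MARKER, keyed by the text that follows it.

    Assumption: the text after the marker identifies what is being
    collected. No other structure of the log line is relied upon.
    """
    counts = Counter()
    for line in lines:
        _, found, rest = line.partition(MARKER)
        if found:
            counts[rest.strip()] += 1
    return counts


def count_in_file(path):
    """Apply count_collect_begins to a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_collect_begins(fh)
```

Calling `count_in_file(...)` on the instrumentation log and printing `most_common(20)` on the result would show which targets account for most collection attempts.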

That does indeed appear to be the case. Would it be possible to implement some kind of connection pooling for VMware data collection? First, starting up, authenticating, and tearing down all these connections to the same vCenter server adds unnecessary overhead and latency. More importantly, all those authentication events bury other events in the VMware event log.
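To make the request concrete, here is a rough sketch of the kind of pooling meant: one logged-in session per vCenter server and user, shared by all collection attempts until it has sat idle too long. It is written in Python for brevity and is not OpenNMS code. `SessionPool`, the `connect` and `disconnect` callbacks, and the 300-second idle limit are all assumptions made for illustration.

```python
import threading
import time


class SessionPool:
    """Reuse one logged-in session per (host, username).

    `connect` is a hypothetical factory (host, username, password) -> session
    that performs the actual vCenter login. `disconnect` is an optional,
    equally hypothetical callback used to log out a stale session.
    """

    def __init__(self, connect, disconnect=None,
                 max_idle_seconds=300.0, clock=time.monotonic):
        self._connect = connect
        self._disconnect = disconnect
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._sessions = {}  # (host, username) -> (session, last_used)

    def get(self, host, username, password):
        key = (host, username)
        # The lock is held across the login so that concurrent collectors
        # for the same vCenter cannot each trigger their own login.
        with self._lock:
            now = self._clock()
            entry = self._sessions.get(key)
            if entry is not None and now - entry[1] <= self._max_idle:
                session = entry[0]  # still fresh: reuse, no new login
            else:
                if entry is not None and self._disconnect is not None:
                    self._disconnect(entry[0])  # log out the stale session
                session = self._connect(host, username, password)
            self._sessions[key] = (session, now)
            return session
```

With something like this, 300 nodes collected every 5 minutes would share one login per vCenter instead of performing one each. The sketch does not detect sessions that the server has expired on its own side; a real implementation would need to validate a cached session before handing it out.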

Starting from the design of the stored procedure that vCenter uses to purge older events from the event log, I've created a query that purges these specific entries without extensive locking or swamping vCenter activity. That lets me clear these entries from the VMware log and look at other entries when investigating problems. However, it would be better to reduce these events at the source, which would also reduce the load on both the OpenNMS and vCenter servers.

Environment

OpenNMS 1.14

Acceptance / Success Criteria

None

Created January 6, 2015 at 4:39 PM
Updated July 26, 2023 at 2:16 PM