The ILR reports wrong data when there are several packages with different collection rates on collectd-configuration.xml for the same service
Description
Activity
Alejandro Galue March 25, 2014 at 9:04 AM
Fixed on revision 03c05196f096b82d73b6007662c1dc613f9123b0 on master.
Alejandro Galue February 19, 2014 at 3:43 PM
The revision 22e159948bd661bf604c5b93704a1cdbb52a4281 contains the proposed workaround for the ILR until we provide package support on the ILR report.
The workaround is: move the EJN package from $OPENNMS_HOME/etc/collectd-configuration.xml to $OPENNMS_HOME/etc/examples/collectd-configuration.xml
Alejandro Galue February 19, 2014 at 3:11 PM
For now, I suggest removing the EJN package from the default collectd-configuration.xml and releasing 1.12.5. Then we can start working on a solution for the ILR for 1.12.6.
Alejandro Galue February 19, 2014 at 3:10 PM
Here is the problem:
By default there are two packages that match all the IP addresses for the SNMP Service:
The ILR is going to get confused because it will recognize different entries based on nodeId/ipAddress/serviceName.
Example:
As you can see, the CollectableService is trying to collect data from the same node twice: once using the parameters from the example1 package, and once using those from the ejn package; but:
As you can see, the instrumentation data does not take the package into account, so it looks as if Collectd were collecting data twice, which is not true (Collectd is working properly). Because example1 requires data every 5 minutes and ejn every 3 minutes, the two schedules together produce 8 collections every 15 minutes, so the average time between collections is about 2 minutes, which is what we see on the ILR.
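The arithmetic behind the 2-minute figure can be checked with a minimal sketch. It assumes the ILR simply averages the gaps between all collections recorded under one key; the class and method names are hypothetical and are not part of OpenNMS.

```java
final class MergedIntervalSketch {

    /**
     * Average gap, in minutes, between collections when several periodic
     * schedules are recorded under a single key. Each schedule contributes
     * 1/interval collections per minute; the average gap is the inverse
     * of the combined rate.
     */
    static double averageGapMinutes(double... intervalsMinutes) {
        double collectionsPerMinute = 0.0;
        for (double interval : intervalsMinutes) {
            collectionsPerMinute += 1.0 / interval;
        }
        return 1.0 / collectionsPerMinute;
    }

    public static void main(String[] args) {
        // example1 collects every 5 minutes, ejn every 3 minutes.
        System.out.println(averageGapMinutes(5, 3));
    }
}
```

For the two default packages this gives 15/8 = 1.875 minutes, which the report shows as roughly 2 minutes.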
So if we modify the matching criteria to be nodeId/ipAddress/serviceName/packageName, that won't be a problem anymore.
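A minimal sketch of the effect of the proposed key, assuming the ILR groups collection events in a map. ServiceKey, countByKey, and the sample node and address values are hypothetical illustrations, not the actual ILR code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class IlrKeySketch {

    /** Hypothetical grouping key; packageName may be null when it is ignored. */
    static final class ServiceKey {
        final int nodeId;
        final String ipAddress;
        final String serviceName;
        final String packageName;

        ServiceKey(int nodeId, String ipAddress, String serviceName, String packageName) {
            this.nodeId = nodeId;
            this.ipAddress = ipAddress;
            this.serviceName = serviceName;
            this.packageName = packageName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ServiceKey)) return false;
            ServiceKey k = (ServiceKey) o;
            return nodeId == k.nodeId
                    && Objects.equals(ipAddress, k.ipAddress)
                    && Objects.equals(serviceName, k.serviceName)
                    && Objects.equals(packageName, k.packageName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(nodeId, ipAddress, serviceName, packageName);
        }
    }

    /** Counts events per key; includePackage switches between the two matching criteria. */
    static Map<ServiceKey, Integer> countByKey(List<ServiceKey> events, boolean includePackage) {
        Map<ServiceKey, Integer> counts = new HashMap<>();
        for (ServiceKey e : events) {
            ServiceKey key = includePackage
                    ? e
                    : new ServiceKey(e.nodeId, e.ipAddress, e.serviceName, null);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

With events from both example1 and ejn for the same node, address, and service, the three-part key yields a single merged entry, while the four-part key yields one entry per package, so each entry keeps its own collection interval.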
Alejandro Galue February 13, 2014 at 5:16 PM
When something goes wrong with a non-SNMP collector, no errors are reported on the ILR. Here is why:
Collectd.instrumentation().reportCollectionException(...) is being used only by the SNMP Collector and the TCA Collector. Actually, only those collectors are using beginCollectingServiceData() and endCollectingServiceData().
My opinion: we should either update all the collector implementations to use the instrumentation API like the SnmpCollector, if that is mandatory for the ILR; or modify CollectableService so that it provides all the data the ILR needs, making the collector implementations independent of the instrumentation API.
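The second option could look roughly like the sketch below, where the caller wraps every collector call so that the collectors never touch the instrumentation themselves. The method names come from the comment above, but the interfaces, parameter lists, and the RecordingInstrumentation helper are assumptions made for illustration; the real OpenNMS signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class InstrumentedCollectionSketch {

    /** Hypothetical stand-in for the instrumentation API; parameters are assumed. */
    interface Instrumentation {
        void beginCollectingServiceData(String serviceKey);
        void endCollectingServiceData(String serviceKey);
        void reportCollectionException(String serviceKey, Exception e);
    }

    /** Hypothetical stand-in for any collector (SNMP, JMX, HTTP, ...). */
    interface Collector {
        void collect() throws Exception;
    }

    /**
     * Wraps a collection attempt so that begin/end and failures are always
     * reported, regardless of which collector implementation is used.
     * Returns true when the collection succeeded.
     */
    static boolean collectWithInstrumentation(Instrumentation ins, String serviceKey, Collector collector) {
        ins.beginCollectingServiceData(serviceKey);
        try {
            collector.collect();
            return true;
        } catch (Exception e) {
            ins.reportCollectionException(serviceKey, e);
            return false;
        } finally {
            ins.endCollectingServiceData(serviceKey);
        }
    }

    /** Test helper that records the order of instrumentation calls. */
    static final class RecordingInstrumentation implements Instrumentation {
        final List<String> calls = new ArrayList<>();

        @Override
        public void beginCollectingServiceData(String serviceKey) {
            calls.add("begin");
        }

        @Override
        public void endCollectingServiceData(String serviceKey) {
            calls.add("end");
        }

        @Override
        public void reportCollectionException(String serviceKey, Exception e) {
            calls.add("exception");
        }
    }
}
```

With this shape, a failing non-SNMP collector would still produce an error entry, because the reporting happens in the wrapper rather than in each collector.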
Details
Assignee: Alejandro Galue
Reporter: Alejandro Galue
Sprint: None
Priority: Critical

The ILR displays the data as if Collectd were collecting approximately every 2 minutes, which does not match collectd-configuration.xml.
After disabling the VMware Collector and restarting the ILR data, the report shows the correct information (i.e., data collected every 5 minutes).