Missing event/alarm when data collection has trouble

Description


Mattermost conversation:

:
An interface has very high usage. Because of that, OpenNMS can no longer collect data from it, or can only collect it sometimes (or the monitored node timed out). Should I get a datacollectionFailed event in this case? I didn't get one.

:
You should get one if it fails, but maybe it didn't fail and just took a long time. If the time between two collections is too large, the graph will render a NaN.


In this case you are totally blind, and this state can persist for $duration. An administrator should be notified when this happens.
I've attached RRD interface files for investigation.
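
To illustrate the kind of check that could back such a notification, here is a minimal sketch that flags intervals where two consecutive collections are further apart than the RRD heartbeat, which is the condition under which the graph shows NaN. The function name `find_gaps`, the sample timestamps, and the 600-second heartbeat are assumptions for illustration only; they are not taken from the attached RRD files or from OpenNMS code.

```python
from typing import List, Tuple


def find_gaps(timestamps: List[int], heartbeat: int = 600) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of consecutive update times that are
    further apart than the heartbeat (assumed value, in seconds)."""
    ordered = sorted(timestamps)
    # Compare each update time with the next one and keep the oversized intervals.
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > heartbeat]


if __name__ == "__main__":
    # Hypothetical update times in seconds; one collection cycle is missing a sample.
    samples = [0, 300, 600, 1500, 1800]
    for start, end in find_gaps(samples):
        print(f"gap of {end - start}s between {start} and {end}")
```

With the sample data above, the only interval reported is the 900-second one between 600 and 1500. A real implementation would have to read the update times from the collector or the RRD files and raise an event instead of printing.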

Acceptance / Success Criteria

None

Attachments

3

Created May 29, 2018 at 10:03 PM
Updated September 21, 2021 at 7:26 PM