Missing event/alarm when datacollection has trouble
Description
Mattermost conversation:
: An interface has very high usage. Because of that, OpenNMS can no longer collect data, or only collects it intermittently (or the monitored node times out). Should I get a datacollectionFailed event in these cases? I didn't get one.
: You should get one if it fails, but maybe it didn't fail and just took a long time. If the time between two collections is too large, the graph will render NaN.
In this case you are completely blind, and this state can persist for $duration. An administrator should be notified when this happens. I've attached RRD interface files for investigation.
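To illustrate the kind of check being asked for, here is a minimal sketch of detecting the "blind" periods described above. It is not OpenNMS code. The function name `find_blind_spots`, the sample timestamps, and the 300 s step / 600 s heartbeat values are assumptions chosen for illustration. The idea is that when two successful collections are further apart than the RRD heartbeat, the interval between them is stored as unknown and the graph shows NaN, so any such gap is a candidate for an event.

```python
def find_blind_spots(timestamps, heartbeat):
    """Return (start, end) pairs where the gap between two consecutive
    successful collections exceeds the heartbeat (in seconds)."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > heartbeat:
            gaps.append((prev, curr))
    return gaps


# Hypothetical collection times in seconds (assumed 300 s step, 600 s heartbeat).
samples = [0, 300, 600, 1500, 1800]
print(find_blind_spots(samples, heartbeat=600))  # [(600, 1500)]
```

A check like this could run against the collection timestamps and raise an event for each reported gap, which would cover the case where collection is slow rather than failed.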