Threshold exceeded events deleted and re-created on OpenNMS restart

Description

I set up thresholds for filesystem usage on servers, with notifications sent when a trigger value is met (I use datasourcetype=dskIndex in datacollection). When the trigger value is crossed, a new event is created and the corresponding notification is sent.
If you restart OpenNMS, the event related to the threshold is deleted from the DB and a new one is created. This results in a new notification for the same threshold with a new timestamp (corresponding to the restart time, not to the time of the real event).
Also, if the event was acknowledged at the time of its occurrence, it is now unacknowledged, because it is a new event.

Environment

Linux CentOS 5

Acceptance / Success Criteria

None

Linked issues

is duplicated by

NMS-5439

Duplicate threshold alarms are generated after OpenNMS restart.

Activity

Antonio Russo March 21, 2012 at 5:26 AM

You can use the event translator to persist the critical threshold events you are looking for, so that the right timestamp is saved. You can also avoid re-creating the translated event if it is already active in the database.
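
The idea of keeping the original timestamp and not re-creating an already active event can be sketched roughly as follows. This is only an illustration in plain Python, not event translator configuration and not an OpenNMS API: the `record_threshold_event` function, the `active` dictionary standing in for the database, the key format, and the dates are all assumptions made for the example.

```python
from datetime import datetime


def record_threshold_event(active: dict, key: str, seen_at: datetime) -> datetime:
    """Return the time to report for this key.

    If the key is already active, the first-seen time is kept and no new
    entry is created (hypothetical store, not an OpenNMS API).
    """
    return active.setdefault(key, seen_at)


active = {}  # stands in for events already persisted in the database
key = "node1:/var:highThresholdExceeded"  # hypothetical key format
original = datetime(2012, 3, 14, 12, 15)  # illustrative time of the real event
restart = datetime(2012, 3, 20, 9, 0)     # illustrative restart time

t1 = record_threshold_event(active, key, original)
t2 = record_threshold_event(active, key, restart)  # re-raised after a restart
```

With a persistent store of this kind, the event re-raised after the restart would report the original time rather than the restart time.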

Andrea Russos March 21, 2012 at 4:07 AM

Hi Alejandro!
As far as notifications alone are concerned, I agree with you that an automation which prevents OpenNMS from sending a new notification will help.
But the real problem (at least IMO) is that if OpenNMS restarts, the thresholds are seen as NEW critical events, with the timestamp (I mean the time at which the critical event occurred) corresponding to the service restart. Also, as I've already written, if the old alarms were acknowledged (by helpdesk people, for example), the new ones related to the same problems are not.
I think this problem may also affect SLAs (if they are tied to an OpenNMS deployment).
I think it would be better if thresholds (which are critical events, in most cases) were persisted to the backend DB, don't you?

--Andrea

Alejandro Galue March 20, 2012 at 1:25 PM

The states of the thresholds are stored in memory (i.e., they are not persisted to a file or the database). For this reason, when you stop OpenNMS, those states are lost, and if the threshold condition still exists after starting OpenNMS, you will receive new notifications.

You can avoid this by creating some automations that can prevent sending the second notification if there is an alarm already raised for a particular threshold violation.

Makes sense?
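
The behavior described in this comment can be modeled with a small sketch. It is a simplified illustration under stated assumptions, not OpenNMS code or automation configuration: `ThresholdEvaluator`, `should_notify`, the `open_alarms` set (standing in for alarms stored in the database), and the key format are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdEvaluator:
    """Hypothetical high-threshold evaluator whose state lives only in memory."""
    trigger_value: float
    exceeded: dict = field(default_factory=dict)  # key -> True while exceeded

    def evaluate(self, key: str, value: float) -> bool:
        """Return True only when the threshold is newly exceeded for this key."""
        was_exceeded = self.exceeded.get(key, False)
        now_exceeded = value >= self.trigger_value
        self.exceeded[key] = now_exceeded
        return now_exceeded and not was_exceeded


def should_notify(open_alarms: set, key: str) -> bool:
    """Suppress the notification when an alarm for this key is already open."""
    if key in open_alarms:
        return False
    open_alarms.add(key)
    return True


open_alarms = set()  # stands in for alarms persisted in the database
key = "node1:/var:highThresholdExceeded"  # hypothetical key format

before_restart = ThresholdEvaluator(trigger_value=90.0)
first = before_restart.evaluate(key, 95.0)   # newly exceeded
repeat = before_restart.evaluate(key, 96.0)  # still exceeded, no new event
notified_first = first and should_notify(open_alarms, key)

# A restart discards the in-memory state, so the same condition looks new
after_restart = ThresholdEvaluator(trigger_value=90.0)
again = after_restart.evaluate(key, 96.0)
notified_again = again and should_notify(open_alarms, key)
```

In this model the evaluator raises the condition again after the restart because its state is gone, while the check against the persisted alarms keeps the second notification from being sent.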

Details

Assignee: Unassigned
Reporter: Andrea Russos
Labels: RBsupport
Due date: Mar 14, 2012
Components: (none listed)
Affects versions: 1.8.15, 1.8.17
Priority: Major

Created March 14, 2012 at 12:15 PM

Updated September 21, 2021 at 6:23 PM
