The alarm-type for BSM event definitions is conceptually incorrect

Description

There are 2 event definitions with alarm-data introduced by BSMD:

* uei.opennms.org/bsm/serviceProblem

* uei.opennms.org/bsm/serviceProblemResolved

In both cases, the alarm-type is defined with a value of 1, which is not technically correct.

alarm-type=1 means this is a problem with a resolution. That also means there should be another event with alarm-type=2 that resolves the problem in question (using a valid clear-key).
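For reference, the usual problem/resolution pairing looks roughly like the sketch below. The UEIs, labels, and reduction keys are hypothetical and only illustrate how alarm-type=1, alarm-type=2 and clear-key are expected to fit together; they are not taken from any shipped event file.

```xml
<events xmlns="http://xmlns.opennms.org/xsd/eventconf">
  <!-- Hypothetical problem event: raises the alarm -->
  <event>
    <uei>uei.example.org/demo/thingDown</uei>
    <event-label>Demo: thing down</event-label>
    <descr>The thing is down.</descr>
    <logmsg dest="logndisplay">Thing down.</logmsg>
    <severity>Major</severity>
    <!-- alarm-type="1": a problem that a later event is expected to clear -->
    <alarm-data reduction-key="%uei%:%nodeid%" alarm-type="1"/>
  </event>
  <!-- Hypothetical resolution event: has its own alarm and clears the problem -->
  <event>
    <uei>uei.example.org/demo/thingUp</uei>
    <event-label>Demo: thing up</event-label>
    <descr>The thing is back up.</descr>
    <logmsg dest="logndisplay">Thing up.</logmsg>
    <severity>Normal</severity>
    <!-- alarm-type="2": clear-key must match the reduction-key of the problem alarm -->
    <alarm-data reduction-key="%uei%:%nodeid%" alarm-type="2"
                clear-key="uei.example.org/demo/thingDown:%nodeid%"/>
  </event>
</events>
```

The point is that the resolution event has its own reduction-key and refers to the problem alarm only through clear-key.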

This is not the case here, because serviceProblemResolved will UPDATE the existing serviceProblem alarm by changing the severity and other fields like the logMsg and description. So, technically, serviceProblemResolved is not "resolving" an existing alarm, because there is no separate alarm for this event.

That being said, both events should have alarm-type=3, due to how the alarm-data is defined for this pair of events.
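A rough sketch of what the proposed definitions could look like follows. Only the two UEIs and the alarm-type value come from this issue; the reduction-key values, labels, and severities are assumptions made for illustration and should be checked against the real opennms.bsm.events.xml before use.

```xml
<events xmlns="http://xmlns.opennms.org/xsd/eventconf">
  <event>
    <uei>uei.opennms.org/bsm/serviceProblem</uei>
    <event-label>BSM: service problem (label assumed)</event-label>
    <descr>Placeholder description.</descr>
    <logmsg dest="logndisplay">Placeholder log message.</logmsg>
    <severity>Indeterminate</severity>
    <!-- Assumed reduction-key; alarm-type="3" marks an alarm with no separate resolution event -->
    <alarm-data reduction-key="%uei%:%parm[businessServiceId]%" alarm-type="3"/>
  </event>
  <event>
    <uei>uei.opennms.org/bsm/serviceProblemResolved</uei>
    <event-label>BSM: service problem resolved (label assumed)</event-label>
    <descr>Placeholder description.</descr>
    <logmsg dest="logndisplay">Placeholder log message.</logmsg>
    <severity>Normal</severity>
    <!-- Assumed: reduces into the serviceProblem alarm's key, so it updates that alarm instead of creating one -->
    <alarm-data reduction-key="uei.opennms.org/bsm/serviceProblem:%parm[businessServiceId]%" alarm-type="3"/>
  </event>
</events>
```

Because both events would reduce to the same key, no clear-key is involved in this variant.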

Acceptance / Success Criteria

None

Activity

Benjamin Reed September 27, 2017 at 8:51 PM

OK, the new issue is

Alejandro Galue September 26, 2017 at 12:14 PM

As you can see on the following links:

https://github.com/OpenNMS/opennms/blob/opennms-18.0.4-1/opennms-base-assembly/src/main/filtered/etc/events/opennms.events.xml#L2056

https://github.com/OpenNMS/opennms/blob/opennms-19.1.0-1/opennms-base-assembly/src/main/filtered/etc/events/opennms.events.xml#L2058

https://github.com/OpenNMS/opennms/blob/opennms-20.0.2-1/opennms-base-assembly/src/main/filtered/etc/events/opennms.bsm.events.xml#L41

Both the serviceProblem event and the serviceProblemResolved event had alarm-type=1.

 

I propose one of the following options for 20.1.1 (or 21.0.0) and Meridian 2017:

1) Revert the fix, which means putting alarm-type=1 back on both events, and then create a new PR to set alarm-type=2 for serviceProblemResolved.

2) Leave the fix applied, and create a new PR to set alarm-type=1 for serviceProblem and alarm-type=2 for serviceProblemResolved.

 

Sounds good?

David Hustace September 26, 2017 at 7:43 AM

I agree that the serviceProblemResolved event should be alarm-type=2... I didn't realize that it wasn't. What type was it, and why wasn't it set to 2 in the first place? I could have been part of the problem, since I did discuss this with Jesse when we were making this happen, but I really don't recall deciding that it should not be type 2.

Alejandro Galue September 25, 2017 at 5:27 PM

I would agree with reverting the change if we set the alarm-type in a consistent way. I mean, set alarm-type=1 on the serviceProblem event as it used to be, but then set alarm-type=2 on serviceProblemResolved. How does that sound? If that sounds right, keep in mind that this affects Meridian 2017, so we should decide quickly, I think.

David Hustace September 25, 2017 at 5:18 PM

I would like to revert this change.

Fixed

Created July 7, 2017 at 2:49 PM
Updated September 28, 2017 at 3:32 AM
Resolved September 27, 2017 at 8:51 PM