Details
Assignee: Unassigned
Reporter: Will Keaney
Priority: Minor
Created September 7, 2017 at 8:21 PM
Updated September 21, 2021 at 6:25 PM
Given a Drools rule that retracts an event, changes its logmessage.dest, and sends it back to eventd, if many events hit this rule in a very short time, alarmd will call onEvent() more than once for each resulting event that's published to the bus. This causes multiple reductions against the alarm for each event, and increments the alarm's counter.
This does not occur if there is a small delay between events; a delay of 1 second was tested. I do not know the threshold at which the problem happens.
A small Drools engine and an eventconf file are attached; together they reliably reproduce this problem for me.
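For readers without the attachments, the kind of burst that triggers the problem can be sketched roughly as below. This is not the attached sender script. The UEI is a hypothetical placeholder (the real one is defined in the attached eventconf), and the sketch assumes send-event.pl accepts -n <nodeid> followed by the UEI. The delay parameter mirrors the observation that a 1 second gap avoids the problem.

```python
import subprocess
import time

# Hypothetical UEI: the real one is defined in the attached eventconf.
TEST_UEI = "uei.opennms.org/test/droolsRedirect"


def build_commands(node_id, count=15, uei=TEST_UEI):
    """Return one send-event.pl argument list per event to send."""
    # Assumption: send-event.pl takes -n <nodeid> followed by the UEI.
    return [["send-event.pl", "-n", str(node_id), uei] for _ in range(count)]


def send_burst(node_id, count=15, delay=0.0, run=subprocess.check_call):
    """Send `count` events back to back, optionally sleeping between them.

    `run` is injectable so the burst can be exercised without OpenNMS.
    Returns the number of events sent.
    """
    sent = 0
    for argv in build_commands(node_id, count):
        run(argv)
        sent += 1
        # Sleep only between events, never after the last one.
        if delay and sent < count:
            time.sleep(delay)
    return sent
```

Calling send_burst(node_id) with the default delay of 0.0 corresponds to the failing case described above; send_burst(node_id, delay=1.0) corresponds to the tested case where the problem did not appear.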
To reproduce:
1. Install the Event configuration file, and add it to your eventconf.xml.
2. Enable Drools in your installation.
3. Install the sample Drools engine in your drools-engine.d directory.
4. Restart OpenNMS.
5. Ensure that send-event.pl is in your PATH.
6. Update the included send_test_events.sh with the nodeId of a valid node in your OpenNMS installation.
7. Execute the included send_test_events.sh script to send 15 events in short succession.
Results:
An alarm is created, and its Counter climbs to more than the number of events actually reduced. I've seen values between 27 and 33 for 15 total events. If you enable debug logging on eventd and alarmd, you'll see that each event is published once, but alarmd handles it more than once.
Expected result:
alarmd should only call onEvent() once for each event published to the bus, regardless of how quickly the events are published, and the counter should reflect the actual number of events reduced against that alarm.
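The expected behavior can be stated as a simple check over event IDs. The sketch below is a hypothetical diagnostic aid, not OpenNMS code: it assumes you have already collected, by whatever means (for example from the debug logs), the IDs of the events eventd published and the IDs alarmd handled. The function names are illustrative, and no log format is assumed.

```python
from collections import Counter


def handled_more_than_once(handled_event_ids):
    """Map each event ID to its handling count, for IDs handled more than once."""
    counts = Counter(handled_event_ids)
    return {event_id: n for event_id, n in counts.items() if n > 1}


def counter_matches(published_event_ids, alarm_counter):
    """True when the alarm counter equals the number of distinct published events."""
    return alarm_counter == len(set(published_event_ids))
```

With correct behavior, handled_more_than_once() returns an empty dict and counter_matches() is true. In the failure described here, a counter of 27 against 15 published events makes counter_matches() false.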