Enhance PassiveStatusMonitor to create outages with a time other than "now"
Description
Acceptance / Success Criteria
None
Activity
Christian Pape March 18, 2025 at 7:00 AM
Merged.
Fixed
Details
Assignee: Christian Pape
Reporter: Dino Yancey
Labels:
HB Grooming Date: Apr 26, 2022
HB Backlog Status: Refined Backlog
FD#: 1077
Components:
Sprint: None
Fix versions:
Affects versions:
Priority: Minor
Created April 22, 2022 at 4:06 PM
Updated March 18, 2025 at 7:00 AM
Resolved March 18, 2025 at 7:00 AM
Currently the PassiveStatusMonitor triggers poller status changes (up/down) at the time the monitor is triggered. While this makes sense for most monitors, this feature is driven by events sent into the system, so the recorded time is inaccurate in cases where the event is delayed in being received by the core. If the trap/syslog/event processing is able to extract the original timestamp of the message and record it on the event, allowing the PassiveStatusMonitor to "backdate" the serviceDown event would make outages more accurate and in turn give customers a more accurate SLA calculation.

Example: An event is fired from an external system and sent with uei.opennms.org/services/passiveServiceStatus as the UEI, along with all the relevant params. <time> is also sent in the event XML. What can be seen is that the passiveServiceStatus event gets the correct timestamp, but when pollerd raises a new event, it uses the "current" time. This is tricky in a situation where we get those initial events "late".

For example, suppose some 3rd party system sends an event with timestamp 2022-04-05 13:34:42+0000, but it actually arrives "now". It would be more accurate to use the original timestamp as the eventtime, and to have Pollerd's eventtime set to that same original timestamp. This way we could get correct timestamps in the SLA report.
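
The requested behavior can be sketched roughly as follows. This is only an illustration of the backdating idea, not the change that was merged: the class and method names (BackdatedOutageSketch, parseOriginalTime, resolveOutageTime) are hypothetical and are not OpenNMS APIs. It assumes the original timestamp reaches the monitor as a string in the format quoted above. It also assumes, as a design choice not stated in this issue, that a missing, unparseable, or future timestamp should fall back to the receive time.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical sketch; none of these names exist in OpenNMS.
class BackdatedOutageSketch {

    // Matches the timestamp style quoted in the issue, e.g. "2022-04-05 13:34:42+0000".
    private static final DateTimeFormatter EVENT_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ssZ");

    // Returns the original event time, or empty if it is absent or unparseable.
    static Optional<Instant> parseOriginalTime(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            return Optional.of(OffsetDateTime.parse(raw.trim(), EVENT_TIME).toInstant());
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // Picks the time to stamp on the outage: the original event time when usable,
    // otherwise the time the event was received (today's behavior).
    static Instant resolveOutageTime(String rawOriginalTime, Instant receivedAt) {
        return parseOriginalTime(rawOriginalTime)
                .filter(t -> !t.isAfter(receivedAt)) // assumption: never date an outage in the future
                .orElse(receivedAt);
    }

    public static void main(String[] args) {
        // Assumed scenario: the event arrives 15 minutes after it was generated.
        Instant receivedAt = Instant.parse("2022-04-05T13:49:42Z");
        Instant outageStart = resolveOutageTime("2022-04-05 13:34:42+0000", receivedAt);

        System.out.println("outage start: " + outageStart);
        System.out.println("downtime missed without backdating: "
                + Duration.between(outageStart, receivedAt));
    }
}
```

In this scenario, stamping the outage with the receive time would leave the gap between the original timestamp and arrival out of the SLA calculation. That is the inaccuracy this issue describes.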