Maintenance Mode

Description

As an OpenNMS user, I would like to be able to put a node or a set of nodes into a maintenance mode/state.

Maintenance mode can be defined as a state whereby Alarms are suppressed for a duration or specific time window.

A user can right-click a node in the topology view and put it into maintenance mode, either from now for a given amount of time (now + x) or by using a calendar/time picker to set explicit start and end times.
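The two ways of setting a window described above (now + x, or explicit start and end times) could be modeled roughly as follows. This is only a sketch: `MaintenanceWindow`, `from_now`, and `is_active` are hypothetical names chosen for illustration, not existing OpenNMS classes or APIs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical record of one maintenance period for one node."""
    node_id: int
    start: datetime
    end: datetime

    def __post_init__(self):
        # Reject empty or inverted windows coming from the time picker.
        if self.end <= self.start:
            raise ValueError("maintenance window must end after it starts")

    @classmethod
    def from_now(cls, node_id, duration, now=None):
        """Build a window that covers 'now + x amount of time'."""
        now = now or datetime.now(timezone.utc)
        return cls(node_id, now, now + duration)

    def is_active(self, at):
        """True while start <= at < end (the end instant is excluded)."""
        return self.start <= at < self.end
```

An explicit window from the calendar picker would call `MaintenanceWindow(node_id, start, end)` directly, while the right-click shortcut would call `MaintenanceWindow.from_now(node_id, timedelta(hours=2))`.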

Maintenance periods should be persisted for reporting purposes and integration with SLA reporting.

There is some overlap, and potential for confusion, between the scheduled outages already in the software and this new maintenance mode. Initially, maintenance mode would simply suppress Alarms. Longer term, it could be expanded to encapsulate the outage planning functionality as well.

Initial thoughts on behavior:

  • When a user puts an active node into maintenance mode, Alarms are no longer triggered for that node.

  • When a user puts an active node that has existing Alarms into maintenance mode, those Alarms are optionally cleared automatically.

  • The UI and ReST API can optionally return either all Alarms or only Alarms for nodes that are not in maintenance mode.
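The optional filtering in the last bullet could look something like the sketch below. The `Alarm` record, the `visible_alarms` function, and the `include_all` flag are all assumptions made for illustration; the ticket does not define how the UI or ReST API would expose this option.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    """Hypothetical, simplified alarm record."""
    alarm_id: int
    node_id: int
    description: str


def visible_alarms(alarms, nodes_in_maintenance, include_all=False):
    """Return all alarms, or only those for nodes not in maintenance mode.

    nodes_in_maintenance is a set of node ids currently in maintenance.
    """
    if include_all:
        return list(alarms)
    return [a for a in alarms if a.node_id not in nodes_in_maintenance]
```

With this shape, the default response hides Alarms for nodes under maintenance, and a caller has to opt in explicitly to see everything.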

This behavior straddles the fence between WebUI-only Alarm suppression and global Alarm suppression. The more I think about it, the simpler the better: global Alarm suppression makes more sense.

If a node is placed in maintenance mode, all existing Alarms are cleared and all future Alarms are suppressed until the maintenance window expires. A node is either in maintenance mode or it is not; there should be no in-between state.
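A minimal sketch of this global suppression rule, under stated assumptions: `AlarmSuppressor` and its methods are hypothetical names, alarms are reduced to plain strings, and the window is assumed to begin at the moment the node is placed in maintenance mode. Every window is also appended to a history list, reflecting the requirement that maintenance periods be persisted for reporting.

```python
from datetime import datetime, timedelta, timezone


class AlarmSuppressor:
    """Hypothetical in-memory model of global alarm suppression."""

    def __init__(self):
        self.active_alarms = {}  # node_id -> list of alarm descriptions
        self.windows = {}        # node_id -> (start, end) of current window
        self.history = []        # every window ever set, kept for reporting

    def enter_maintenance(self, node_id, start, end):
        """Record the window and clear the node's existing alarms."""
        self.windows[node_id] = (start, end)
        self.history.append((node_id, start, end))
        # Return the cleared alarms so the caller can log or display them.
        return self.active_alarms.pop(node_id, [])

    def in_maintenance(self, node_id, at):
        """True if the node has a window covering the instant 'at'."""
        window = self.windows.get(node_id)
        return window is not None and window[0] <= at < window[1]

    def raise_alarm(self, node_id, description, at):
        """Store the alarm unless the node is in maintenance at 'at'.

        Returns True if the alarm was raised, False if it was suppressed.
        """
        if self.in_maintenance(node_id, at):
            return False
        self.active_alarms.setdefault(node_id, []).append(description)
        return True
```

Because the check depends only on the stored window and the event time, suppression ends on its own once the window expires, with no separate "leave maintenance" step.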

Acceptance / Success Criteria

None

Created September 11, 2020 at 7:59 PM
Updated September 29, 2022 at 2:44 PM