Issues
- NMS-6848: NodeCategorySettingPolicy hit momentarily resolves open outages (resolved; Benjamin Reed)
- NMS-5320: Allow a provisioning policy to have a "PERSIST_ONLY" action, for easier provisioning where only given IPs should be persisted
- NMS-5319: Unable to use a foreign-source policy to set a surveillance category by IP
- NMS-5059: Provisiond temporarily deletes policy-based surveillance categories from existing nodes when synchronizing (resolved; Benjamin Reed)
- NMS-4990: Provisiond ignores foreign-source category policies on subsequent synchronizations (resolved)
- NMS-4985: Policies in Provisiond return "false" by default if an SNMP attribute is null
NodeCategorySettingPolicy hit momentarily resolves open outages
Description
Activity
Benjamin Reed, September 25, 2014 at 7:50 AM
OK, this is fixed now. The event-handling for category and asset events was unconditionally removing outages, expecting them to be recreated on the next scan, I guess?
I modified the code to check each service individually to see if it still matched active filters, and add/remove from scanning as necessary.
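The per-service re-check described in this comment can be sketched roughly as follows. This is an illustrative model, not the actual Provisiond/Pollerd code; all class and method names here are hypothetical:

```java
// Hypothetical sketch of the fix: instead of unconditionally removing
// outages when a category or asset event arrives, re-evaluate each
// monitored service against the active poller-package filter and only
// deschedule (and close outages for) services that no longer match.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterRecheckSketch {
    /** A monitored service plus the categories of its parent node (illustrative). */
    record Service(String name, List<String> categories) {}

    /** Return only the services that still match an active package filter. */
    static List<Service> stillPolled(List<Service> services,
                                     Predicate<Service> packageFilter) {
        List<Service> keep = new ArrayList<>();
        for (Service svc : services) {
            // Services that still match keep their poller schedule;
            // the rest are removed from scanning individually.
            if (packageFilter.test(svc)) {
                keep.add(svc);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // A stand-in for a "catincSwitches" style filter.
        Predicate<Service> catincSwitches =
            svc -> svc.categories().contains("Switches");
        List<Service> services = List.of(
            new Service("ICMP", List.of("Switches")),
            new Service("SNMP", List.of("Routers")));
        System.out.println(stillPolled(services, catincSwitches).size()); // prints 1
    }
}
```

The point of the design change is that membership is decided per service, so nodes still matching a filter are never momentarily treated as "gone", which is what was resolving their open outages.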
Benjamin Reed, September 23, 2014 at 5:46 PM
OK, I had been doing this using the 'catinc' stuff and it appeared to be behaving properly. However, I just did it with a default poller configuration and I can confirm that this is still an issue. Provisiond is doing the right thing, but Pollerd or the outage service are not.
Benjamin Reed, September 22, 2014 at 10:07 PM
Provisiond has been fixed to not delete and add categories in phases, which repairs the issues with outages being cleared.
For details on the new design, see:
Jeff Gehlbach, September 12, 2014 at 12:38 PM (edited)
Very important note: when I've discussed this problem previously in conversation, I've assumed that reproducing it would require having a poller package whose filter keys on node category memberships, like:
<filter>catincSwitches</filter>
This turns out not to be the case. I was able to reproduce the problem using the stock poller-configuration.xml.
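For context, a category-keyed filter of the kind mentioned above would sit inside a poller package definition in poller-configuration.xml. A hypothetical fragment (package name, ranges, and intervals are illustrative only):

```xml
<package name="switches">
  <!-- Only poll interfaces on nodes in the "Switches" surveillance category. -->
  <filter>catincSwitches</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254"/>
  <service name="ICMP" interval="300000" user-defined="false" status="on">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="3000"/>
  </service>
</package>
```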
This issue was created from a report in support ticket https://mynms.opennms.com/Ticket/Display.html?id=3239
Steps to reproduce:
1. Create a requisition called "test". Create a node in that requisition called "test-sw-1". Add an interface to this node whose IP address is not reachable, with snmp-primary set to "N". Add the ICMP service to that interface. Save the requisition.
2. Edit the foreign-source definition for "test". Add a policy called "Categorize", class "Set Node Category", key "category" => value "Switches". Match behavior = ALL_PARAMETERS. Add parameter key "label" => value "~.sw.".
3. Synchronize the "test" requisition. Wait for the test-sw-1 node to be created, populated, and to go into a nodeDown outage. Wait another moment for good measure.
4. Synchronize the "test" requisition again.
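The policy configured in step 2 ends up in the foreign-source definition roughly as follows. This is a hypothetical fragment; the exact namespace, class name, and attribute layout may differ between OpenNMS versions:

```xml
<foreign-source name="test">
  <policies>
    <policy name="Categorize"
            class="org.opennms.netmgt.provision.persist.policies.NodeCategorySettingPolicy">
      <parameter key="category" value="Switches"/>
      <parameter key="matchBehavior" value="ALL_PARAMETERS"/>
      <parameter key="label" value="~.sw."/>
    </policy>
  </policies>
</foreign-source>
```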
Expected result: nothing but maybe a nodeUpdated event on the test-sw-1 node
Actual result: a new nodeDown outage is created for the test-sw-1 node, and the previous outage is closed with no nodeUp event.
Additional remarks:
If I add a second node whose node label does not match the regex in the category-setting policy, that node's outages do not get summarily closed and re-created.