Polling seems to stop

Description

Hi,

on one of our installations, polling sometimes seems to stop for a few nodes. We noticed the failure because a real outage was not detected by OpenNMS. The poller configuration seems to be okay (polling interval and downtime block). When we looked at the response time graphs for the services on this node, we saw that no values were recorded during that period. In the poller log file we can't find any failures for this node. We have not been able to reproduce this behavior, but it is occurring more often. We check for it with a small script that looks at the modification time of the response time graph files and verifies that they were updated within the last 15 minutes, which is 3 times the polling interval on this system.
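
The staleness check described above could look roughly like the following. This is a minimal sketch and not the reporter's actual script: the function name `find_stale_files`, the `.rrd`/`.jrb` suffixes, and the example directory in the usage note are assumptions for illustration.

```python
import os
import time


def find_stale_files(root, max_age_seconds=15 * 60, suffixes=(".rrd", ".jrb"), now=None):
    """Return sorted paths under `root` not modified within `max_age_seconds`.

    The suffixes are an assumption about how the response time data is
    stored; adjust them to match the files on the system being checked.
    """
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip anything that is not a response time data file.
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # A file older than the threshold suggests polling has stopped.
            if now - os.path.getmtime(path) > max_age_seconds:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files("/var/lib/opennms/rrd/response")` (the path is an assumption and depends on the installation) returns the files that have not been written for more than 15 minutes, which could then be reported by cron or a monitoring check.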

After a restart of the whole system (not only of OpenNMS), polling started again, as we can see in the response time graphs.

Environment

Ubuntu Server 10.04 LTS

Acceptance / Success Criteria

None

Activity

Benjamin Reed May 2, 2012 at 3:49 PM

Hard to say if this is a real issue without more details, but earlier 1.8 releases definitely had known thread and leak issues, so I'm marking this as Cannot Reproduce.

If you can provide more info, especially with a more up-to-date OpenNMS release, please reopen.

Benjamin Reed March 13, 2012 at 2:42 PM

I believe 1.8.12 included some threading-related fixes that might solve your issue. Could you upgrade to the latest 1.8 or, even better, 1.10.0, and see if the problem persists?

Cannot Reproduce

Details

Created January 20, 2012 at 10:57 AM
Updated May 11, 2015 at 2:50 PM
Resolved May 2, 2012 at 3:49 PM