Collecting WMI counters causes "Too many open files" exception in logs

Description

Hi,

I've installed ONMS 1.7.4 to test WMI. WMI polling and collection both work, but OpenNMS stops working after several (about 4) hours with a "Too many open files" exception in poller.log.

I've created a simple script that logs the number of files the java process has open (lsof -p $PIDONMS | wc -l). According to this log, the java process reaches the limit of 10240 in 5 hours:

Tue Jul 7 08:55:01 CEST 2009
0
Tue Jul 7 09:55:01 CEST 2009
699
Tue Jul 7 10:55:01 CEST 2009
3008
Tue Jul 7 11:55:01 CEST 2009
5252
Tue Jul 7 12:55:01 CEST 2009
7860
Tue Jul 7 13:55:01 CEST 2009
10385
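
A logging script of the kind described above could look roughly like the following sketch. The function names and the log path in the usage note are hypothetical, and unlike the plain `lsof -p $PIDONMS | wc -l` one-liner it skips lsof's header line, so its counts would be one lower.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from the original report.

# Print the number of open files lsof reports for a PID.
# lsof prints one header line, which is dropped before counting.
# If the process is not running, lsof prints nothing and the count is 0.
count_open_files() {
    lsof -p "$1" 2>/dev/null | tail -n +2 | wc -l | tr -d ' '
}

# Append a timestamp line followed by the current count to a log file,
# producing the same two-line-per-sample layout as the log above.
log_open_files() {
    pid="$1"
    logfile="$2"
    date >> "$logfile"
    count_open_files "$pid" >> "$logfile"
}
```

Called hourly (for example from cron) as `log_open_files "$PIDONMS" /tmp/onms-open-files.log`, where the log path is an assumption, this would produce a log in the format shown above.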

When I turn off WMI collecting, the log shows about 530 open files for the java process. I think there must be a socket leak.
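
One way to check the socket-leak suspicion would be to break the open files down by type rather than only counting them. The sketch below is an assumption about how that could be done, not something that was run for this report; the function name is hypothetical, and it relies on lsof's default column order, in which TYPE is the fifth field.

```shell
#!/bin/sh
# Sketch only: groups a process's open files by lsof's TYPE column.
# Assumes the default lsof column order:
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
open_files_by_type() {
    pid="$1"
    # -n skips DNS lookups; NR > 1 skips the header line.
    lsof -np "$pid" 2>/dev/null |
        awk 'NR > 1 { n[$5]++ } END { for (t in n) print n[t], t }' |
        sort -rn
}
```

Comparing the output of `open_files_by_type "$PIDONMS"` from hour to hour would show whether the growth is concentrated in socket entries. The exact type labels (for example IPv4, IPv6 or sock) depend on the platform and lsof version.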

Here is the poller log with errors:

2009-07-07 08:36:34,018 ERROR [PollerScheduler-30 Pool-fiber1]
Snmp4JStrategy: send: Could not create SNMP session for agent
AgentConfig[Address: /172.21.1.5, ProxyForAddress: null, Port: 161,
Community: NAME, Timeout: 3000, Retries: 2, MaxVarsPerPdu: 10,
MaxRepititions: 2, Max request size: 65535, Version: v1]:
java.net.UnknownHostException: sara: sara
java.net.UnknownHostException: sara: sara
java.net.SocketException: Too many open files
2009-07-07 08:36:34,089 DEBUG [PollerScheduler-30 Pool-fiber4]
TcpMonitor: TcpMonitor: IOException while polling address: /172.21.1.6
2009-07-07 08:36:36,305 DEBUG [PollerScheduler-30 Pool-fiber4]
WmiMonitor: WMI Poller received exception from client: Failed to connect to host '172.21.1.5': An internal error occurred. [0x8001FFFF]
2009-07-07 08:36:36,308 DEBUG [PollerScheduler-30 Pool-fiber4]
WmiMonitor: WMI Poller received exception from client: Failed to connect to host '172.21.1.5': An internal error occurred. [0x8001FFFF]
2009-07-07 08:36:36,312 DEBUG [PollerScheduler-30 Pool-fiber4]
WmiMonitor: WMI Poller received exception from client: Failed to connect to host '172.21.1.5': An internal error occurred. [0x8001FFFF]
java.net.SocketException: Too many open files
java.net.SocketException: Too many open files
2009-07-07 08:36:37,480 DEBUG [PollerScheduler-30 Pool-fiber4]
SmtpMonitor: SmtpMonitor: IOException while polling address 172.31.0.52
java.net.SocketException: Too many open files

OpenNMS info:
Version: 1.7.4
Java Version: 1.5.0_17 Sun Microsystems Inc.
Java Virtual Machine: 1.5.0_17-b04 Sun Microsystems Inc.
Operating System: Linux 2.6.18-4-amd64 (amd64)
Servlet Container: jetty/6.1.16 (Servlet Spec 2.5)

Regards

Peter

Environment

Operating System: Linux
Platform: PC

Activity

Bryan Fullerton October 8, 2010 at 10:50 AM

I'm fine with closing it, though I'm not sure whether you want to try to get in touch with the original reporter of the issue to confirm with him too.

And thanks for this work! Getting WMI working is a huge benefit for us: the majority of the machines we're monitoring are Windows, and we need to track counters like HTTP and ASP.NET requests that aren't available via SNMP.

Bryan

Matt Raykowski October 8, 2010 at 9:41 AM

Bryan,

In your earlier post you were running 1.8.1, which according to TWiO was released before July 20th. The commit dd0396a7866ceafaccf34fc26d2cbeb8c036cf7d was pushed on July 26th, then squashed and picked from feature-wmi-cleanup some time after that.

I'm glad to hear that after upgrading to 1.8.4 your problem was resolved. It means I finally did some good work. (=

Any opinions, can I close this now?

Bryan Fullerton October 7, 2010 at 2:07 PM

I've updated to OpenNMS 1.8.4 and this issue appears to be resolved. I think it was handled on https://opennms.atlassian.net/browse/NMS-4129#icft=NMS-4129.

Bryan Fullerton October 6, 2010 at 7:08 PM

Confirmed it is still a bug for me.

$ sudo lsof -np 26434 | grep sock | wc -l
98
$ sudo lsof -np 26434 | grep sock | wc -l
120
$ sudo lsof -np 26434 | grep sock | wc -l
156
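
The repeated manual checks above could be automated with a small sampling loop such as the following sketch. The function names are hypothetical; the socket count uses the same `lsof -np ... | grep sock` filter as the commands above.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Count lsof lines mentioning "sock" for a PID, as in the commands above.
count_sockets() {
    lsof -np "$1" 2>/dev/null | grep -c sock
}

# Take a number of samples, waiting a fixed interval (in seconds) between
# them, and print each count with its change from the previous sample.
sample_sockets() {
    pid="$1"
    interval="$2"
    samples="$3"
    prev=""
    i=0
    while [ "$i" -lt "$samples" ]; do
        now=$(count_sockets "$pid")
        if [ -n "$prev" ]; then
            echo "$now (change: $((now - prev)))"
        else
            echo "$now"
        fi
        prev="$now"
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_sockets 26434 60 3` would take three samples one minute apart. A count that keeps rising between samples on an otherwise steady system would be consistent with the leak described in this issue.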

OpenNMS Version: 1.8.1
Java Version: 1.6.0_20 Sun Microsystems Inc.
Java Virtual Machine: 16.3-b01 Sun Microsystems Inc.
Operating System: Linux 2.6.28-19-server (amd64)
Servlet Container: jetty/6.1.24 (Servlet Spec 2.5)

Benjamin Reed June 23, 2010 at 2:46 PM

This still appears to be an issue for people.

Fixed

Created July 9, 2009 at 2:49 AM
Updated January 27, 2017 at 4:26 PM
Resolved August 9, 2011 at 11:11 AM