Allow multiple thresholding-groups in threshd-configuration.xml

Description

While customizing my threshd-configuration.xml file, I realized I needed specializations for most of my major servers. For some, a threshold at 90% disk full is fine while others are in trouble when they are more than 40% full. A system with one core will look at different loadavg thresholds than one with sixteen cores. For some I want to threshold on free memory whereas others might be better served by looking at free + cache + buffers.

Moreover, the systems may mix and match characteristics. I might have an 8-core system that is fine at 80% disk full and another that should scream for help at 40%. For this reason, it made sense to modularize thresholds.xml. I might break the net-snmp group into several different disk groups, CPU groups, memory groups, etc., and mix and match as needed. Thus thresholds.xml might look something like this:

<group name="netsnmp-disk-relchange"
rrdRepository = "/opt/opennms/share/rrd/snmp/">
<threshold type="relativeChange" ds-name="ns-dskPercent" ds-type="dskIndex" ds-label="ns-dskPath" value="0.5" rearm="0.0" trigger=
<threshold type="relativeChange" ds-name="ns-dskPercentNode" ds-type="dskIndex" ds-label="ns-dskPath" value="0.5" rearm="0.0" trig
</group>

<group name="netsnmp-disk90"
rrdRepository = "/opt/opennms/share/rrd/snmp/">
<threshold type="high" ds-name="ns-dskPercent" ds-type="dskIndex" ds-label="ns-dskPath" value="90.0" rearm="75.0" trigger="2"/>
<threshold type="high" ds-name="ns-dskPercentNode" ds-type="dskIndex" ds-label="ns-dskPath" value="90.0" rearm="75.0" trigger="2"/>
</group>

<group name="netsnmp-cpu-1core"
rrdRepository = "/opt/opennms/share/rrd/snmp/">
<expression type="high" expression="loadavg5 / 100.0" ds-type="node" ds-label="" value="1.5" rearm="0.9" trigger="2"/>
<expression type="high" expression="loadavg15 / 100.0" ds-type="node" ds-label="" value="1.1" rearm="0.9" trigger="2"/>
<expression type="high" expression="loadavg1 / 100.0" ds-type="node" ds-label="" value="8.0" rearm="2.0" trigger="2"/>
</group>

<group name="netsnmp-mem8"
rrdRepository = "/opt/opennms/share/rrd/snmp/">
<expression type="low" expression="(memAvailReal + memCached) / memTotalReal * 100.0" ds-type="node" ds-label="" value="8.0" rearm
</group>

<group name="netsnmp-swap10"
rrdRepository = "/opt/opennms/share/rrd/snmp/">
<expression type="low" expression="memAvailSwap / memTotalSwap * 100.0" ds-type="node" ds-label="" value="10.0" rearm="15.0" trigg
</group>
(some of the trigger values were cut off while copying and pasting).
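For reference, judging from the intact lines, each truncated `<threshold>` element presumably ends with a trigger attribute and a self-closing tag. A complete element has this shape (the trigger value here is illustrative, copied from the intact lines above, not recovered from the lost originals):

```xml
<!-- Sketch of a complete <threshold> element. trigger="2" is illustrative,
     copied from the intact lines above; the original values were cut off. -->
<threshold type="relativeChange" ds-name="ns-dskPercent" ds-type="dskIndex"
           ds-label="ns-dskPath" value="0.5" rearm="0.0" trigger="2"/>
```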

I could then define my threshd-configuration.xml something like:

<package name="netsnmp">
<filter>IPADDR != '0.0.0.0' &amp; (nodeSysOID LIKE '.1.3.6.1.4.1.2021.%' | nodeSysOID LIKE '.1.3.6.1.4.1.8072.%')</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>

<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="thresholding-group" value="netsnmp-cpu-1core"/>
<parameter key="thresholding-group" value="netsnmp-disk90"/>
<parameter key="thresholding-group" value="netsnmp-disk-relchange"/>
<parameter key="thresholding-group" value="netsnmp-mem8"/>
<parameter key="thresholding-group" value="netsnmp-swap10"/>
</service>
</package>

I found out the hard way that this does not work. As a result, I had to create one group in thresholds.xml for every possible combination of characteristics. That seems like a lot of redundancy, which could be reduced through modularization and reuse. Thanks - John
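To make the redundancy concrete: under the current behavior, where a service accepts only one thresholding-group parameter, the workaround is one merged group per combination. A sketch of that workaround is below; the group name netsnmp-cpu1-disk90 is hypothetical, and the threshold values are copied from the intact example lines above:

```xml
<!-- thresholds.xml: one merged group per combination (hypothetical name) -->
<group name="netsnmp-cpu1-disk90"
       rrdRepository="/opt/opennms/share/rrd/snmp/">
  <threshold type="high" ds-name="ns-dskPercent" ds-type="dskIndex"
             ds-label="ns-dskPath" value="90.0" rearm="75.0" trigger="2"/>
  <expression type="high" expression="loadavg5 / 100.0" ds-type="node"
              ds-label="" value="1.5" rearm="0.9" trigger="2"/>
</group>

<!-- threshd-configuration.xml: only a single thresholding-group
     parameter per service takes effect -->
<service name="SNMP" interval="300000" user-defined="false" status="on">
  <parameter key="thresholding-group" value="netsnmp-cpu1-disk90"/>
</service>
```

With several disk, CPU, memory, and swap variants, the number of such merged groups grows multiplicatively, which is exactly the duplication this request aims to eliminate.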

Environment

Operating System: Linux
Platform: PC

Acceptance / Success Criteria

None


Activity


Benjamin Reed February 8, 2010 at 1:07 PM

looks good, merged to 1.6 in cf8b5db6fdd084e3dc9ed51b1463d5f40390f9c2
merged to 1.7

Alejandro Galue February 5, 2010 at 11:43 AM

Fixed on branches: 1.6-testing and 1.7
The feature will be available on 1.6.9 and 1.7.9.

Fixed

Details

Created October 15, 2009 at 9:34 AM
Updated January 27, 2017 at 4:25 PM
Resolved November 5, 2010 at 4:44 PM