Statsd Reports Incorrect Values

Description

I have the following statsd report to monitor the percentage of disk space used on monitored servers:

<packageReport name="Top10_Disk_Usage" description="*****" schedule="0 0/15 * * * ?" retainInterval="1800000" status="on">
  <parameter key="count" value="10"/>
  <parameter key="consolidationFunction" value="AVERAGE"/>
  <parameter key="relativeTime" value="SLIDINGHOUR"/>
  <parameter key="resourceTypeMatch" value="dskIndex"/>
  <parameter key="attributeMatch" value="ns-dskPercent"/>
</packageReport>

<report name="Top10_Disk_Usage" class-name="org.opennms.netmgt.dao.support.TopNAttributeStatisticVisitor"/>

On each report, however, incorrect values are reported. For example, one of the monitored servers reports 79% disk space used, both on the server itself and in the ns-dskPercent datasource, but the statsd report shows it as 34.42% used.

Pasted below are some of the debug logs from my statsd.log file, with the hostnames crossed out (only the lines for the example node described above):

2015-12-11 10:45:00,985 DEBUG [Statsd_Worker-3] o.o.n.d.s.DefaultResourceDao: getChildResource: returning resource node[**]
2015-12-11 10:45:00,989 DEBUG [Statsd_Worker-3] o.o.n.d.s.DefaultResourceDao: getChildResource: returning resource node[**].dskIndex[_root_fs]
2015-12-11 10:45:00,989 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: The value of node[**].dskIndex[_root_fs].ns-dskPercent is 26.34854761904762

2015-12-11 10:45:01,625 DEBUG [Statsd_Worker-3] o.o.n.s.DatabaseReportPersister: Adding org.opennms.netmgt.model.StatisticsReportData@7a8b39aa[report=Top10_Disk_Usage,resourceId=node[**].dskIndex[_root_fs],value=26.34854761904762]
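
As a quick sanity check on the excerpt above, the following sketch pulls the "Aggregating" samples out of the log lines and averages them, then compares the result with the value the visitor reports. It is only an illustration: the helper names `aggregated_samples` and `reported_value` are made up for this example, and it assumes the four logged samples are the only inputs to the AVERAGE, which the log alone does not prove (the visitor may consume samples it never logs).

```python
import re
from statistics import mean

# Log lines copied from the statsd.log excerpt above (hostnames already masked).
LOG_EXCERPT = """\
2015-12-11 10:45:00,989 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: Aggregating: 79.0
2015-12-11 10:45:00,990 DEBUG [Statsd_Worker-3] o.o.n.d.s.RrdStatisticAttributeVisitor: The value of node[**].dskIndex[_root_fs].ns-dskPercent is 26.34854761904762
"""


def aggregated_samples(log_text):
    """Return every sample logged on an 'Aggregating:' line, in order."""
    return [float(v) for v in re.findall(r"Aggregating: ([0-9.]+)", log_text)]


def reported_value(log_text):
    """Return the final value logged for the attribute, or None if absent."""
    match = re.search(r"The value of \S+ is ([0-9.]+)", log_text)
    return float(match.group(1)) if match else None


samples = aggregated_samples(LOG_EXCERPT)
expected = mean(samples)          # plain arithmetic mean of the logged samples
reported = reported_value(LOG_EXCERPT)

print(f"logged samples:       {samples}")
print(f"mean of samples:      {expected}")
print(f"value statsd reports: {reported}")
print(f"difference:           {expected - reported}")
```

If the logged samples really are the full input, an AVERAGE consolidation should not be able to produce a value below every sample, so the gap points at how the aggregate is computed rather than at the collected data.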

Activity

jmk December 26, 2016 at 6:54 AM

Hi,
I found an issue (under 18.0.2) which may be related to your problem. Please see the last patch in https://opennms.atlassian.net/browse/NMS-8944#icft=NMS-8944.

Corey Hammerton June 10, 2016 at 3:17 PM

The JRBs previously listed are in the attached tarball file.

Corey Hammerton June 10, 2016 at 3:15 PM

We are still experiencing this issue in two environments: one stores collected data in JRBs, the other in NEWTS.

In a separate script, we use the Resources REST API to calculate the top 10 disk percent usage over the last 15 minutes. Below are the resource IDs of the top 10 dskIndex resources, ranked by the ns-dskPercent object.

node[25].dskIndex[_root_fs]
node[9].dskIndex[_root_fs]
node[4].dskIndex[_root_fs]
node[18].dskIndex[_root_fs]
node[3].dskIndex[_root_fs]
node[2].dskIndex[_root_fs]
node[12].dskIndex[_root_fs]
node[17].dskIndex[_root_fs]
node[10].dskIndex[_root_fs]
node[23].dskIndex[_root_fs]

The following is the same top 10 list as calculated by the Statsd engine:

node[8].dskIndex[_root_fs]
node[6].dskIndex[_root_fs]
node[16].dskIndex[_root_fs]
node[10].dskIndex[_root_fs]
node[7].dskIndex[_root_fs]
node[12].dskIndex[_root_fs]
node[20].dskIndex[_root_fs]
node[24].dskIndex[_root_fs]
node[18].dskIndex[_root_fs]
node[9].dskIndex[_root_fs]
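
To make the mismatch between the two lists easier to see, here is a small sketch that compares them by node ID. The node IDs are taken from the two lists above; the variable names are illustrative only, and the comparison ignores ranking order inside each list.

```python
# Node IDs of the top 10 dskIndex[_root_fs] resources, in the order listed above.
rest_api_top10 = [25, 9, 4, 18, 3, 2, 12, 17, 10, 23]   # from the Resources REST API script
statsd_top10 = [8, 6, 16, 10, 7, 12, 20, 24, 18, 9]     # from the Statsd report

# Nodes that appear in both lists, regardless of rank.
common = sorted(set(rest_api_top10) & set(statsd_top10))

# Nodes that appear in only one list, keeping each list's original order.
only_rest = [n for n in rest_api_top10 if n not in statsd_top10]
only_statsd = [n for n in statsd_top10 if n not in rest_api_top10]

print(f"in both lists:    {common}")
print(f"REST API only:    {only_rest}")
print(f"Statsd only:      {only_statsd}")
```

Only four of the ten nodes appear in both lists, so the two calculations disagree on which resources rank highest, not just on their exact values.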

Seth Leger June 8, 2016 at 2:54 PM

Is this still an issue? If so, can you attach a copy of the RRD files that make up the report so that we can try to reproduce the issue? Thanks.

Seth Leger December 23, 2015 at 3:39 PM

We need to investigate why this is happening before the next 17 bugfix release.

Details

Created December 11, 2015 at 10:52 AM
Updated September 21, 2021 at 6:24 PM
