SNMP Data Collection Interfaces Directory Structure
Activity
Alejandro Galue January 8, 2014 at 3:14 PM (edited)
IMPORTANT:
If you are currently using storeByGroup on 1.10 and some of the groups defined in the data collection configuration have been changed, those changes must be preserved while upgrading to 1.12 to avoid problems when merging the data. Otherwise, the upgrade tool will throw exceptions and abort, because RRDs with different numbers of internal data sources will exist.
Remember, when storeByGroup is enabled, one RRD named after the group is created, and it contains several data sources (one per MibObj) according to the group. Because an RRD cannot be changed after it is created, a data collection group cannot be modified afterwards; a new group must be defined instead.
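The constraint above can be illustrated with a minimal sketch. It is not the upgrade tool's actual check; the function name and the example group and data source names are hypothetical. It only shows why two RRDs for the same group cannot be merged when their data source lists differ.

```python
def check_mergeable(group_name, old_ds_names, new_ds_names):
    """Return True if both RRDs hold the same data sources, else raise.

    Hypothetical helper: the names are illustrative, not OpenNMS API.
    """
    old_set, new_set = set(old_ds_names), set(new_ds_names)
    if old_set != new_set:
        only_old = sorted(old_set - new_set)
        only_new = sorted(new_set - old_set)
        raise ValueError(
            f"group {group_name!r}: data sources differ "
            f"(only in old: {only_old}, only in new: {only_new})"
        )
    return True


# Same data sources in a different order are still mergeable.
check_mergeable("example-group", ["dsA", "dsB"], ["dsB", "dsA"])
```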
Alejandro Galue January 8, 2014 at 3:06 PM
IMPORTANT:
If a user is currently running 1.10.x and the nodes are managed by Capsd, there is no need to execute the upgrade tools, because Capsd properly populates all the MAC Addresses on the database for all the SNMP Interfaces, whether or not they have an IP address associated. The issue with the MAC Address affects only Provisiond.
In 1.12, Capsd is disabled by default, and Provisiond now handles newSuspect and forceRescan events. But, as Provisiond now handles the MAC Address properly (like Capsd), Capsd-based installations are not going to suffer any problem, and the upgrade tools are not going to find any duplicated interface.
The upgrade tools are useful when coming from a Provisiond-based installation on 1.10 to 1.12.
Alejandro Galue November 1, 2013 at 3:20 PM
Fixed in revision 49cfcd4ed533e04d1b11bf16950a0e68aff27834 for 1.12:
Author: Alejandro Galue <agalue@opennms.org>
Date: Fri Nov 1 15:07:42 2013 -0400
Feature: Upgrade Tools API
This new API provides the ability to perform post-processing tasks after
upgrading OpenNMS. These tasks can be executed with OpenNMS up and
running or with OpenNMS stopped.
By default, the Installer class has been modified to execute the
Upgrade Tools after its completion, unless the user specifies a flag (-S,
--skip-upgrade-tools) on the command line to skip their execution.
There are currently 2 implementations enabled:
Remove non-ip-snmp-primary and non-ip-interfaces from requisitions (Offline):
https://opennms.atlassian.net/browse/NMS-5630#icft=NMS-5630, https://opennms.atlassian.net/browse/NMS-5571#icft=NMS-5571
Merge SNMP Interface directories (Online Version):
https://opennms.atlassian.net/browse/NMS-6056#icft=NMS-6056 (It works
with RRDtool and JRobin, and with or without storeByGroup enabled)
There is one implementation that is currently disabled until the
required code changes are added:
Fix the JRB names for the new JMX Collector: https://opennms.atlassian.net/browse/NMS-1539#icft=NMS-1539, https://opennms.atlassian.net/browse/NMS-3485#icft=NMS-3485,
https://opennms.atlassian.net/browse/NMS-4592#icft=NMS-4592, https://opennms.atlassian.net/browse/NMS-4612#icft=NMS-4612, https://opennms.atlassian.net/browse/NMS-5247#icft=NMS-5247, https://opennms.atlassian.net/browse/NMS-5279#icft=NMS-5279, https://opennms.atlassian.net/browse/NMS-5824#icft=NMS-5824
The API basically requires implementing an interface called OnmsUpgrade.
The Upgrade Tools find all the implementations of that interface inside
the package namespace "org.opennms" that don't have an @Ignore
annotation. Then, all the implementations are ordered and executed
according to the current OpenNMS running status, after checking that
they haven't been executed before.
The execution status is going to be stored on
$OPENNMS_HOME/etc/opennms-upgrade-status.properties.
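The selection logic described above (order the implementations, respect the running status, skip what already ran) can be sketched as follows. This is a Python illustration, not the Java implementation: UpgradeTask, its fields, and the key=value layout assumed for the status file are all hypothetical, as is the rule that an offline task is skipped while OpenNMS is running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpgradeTask:
    """Hypothetical stand-in for one OnmsUpgrade implementation."""
    task_id: str
    order: int
    requires_stopped: bool  # assumption: offline tasks need OpenNMS stopped


def parse_status(text):
    """Return the task ids recorded in properties-style text (assumed format)."""
    done = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            done.add(line.split("=", 1)[0].strip())
    return done


def select_pending(tasks, done, opennms_running):
    """Return tasks to run now, in order, skipping ones already executed."""
    pending = []
    for task in sorted(tasks, key=lambda t: t.order):
        if task.task_id in done:
            continue  # already recorded in the status file
        if task.requires_stopped and opennms_running:
            continue  # offline task, cannot run while OpenNMS is up
        pending.append(task)
    return pending
```

After a task runs, the caller would append its id to the status file so that the next run skips it.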
The implementation also includes a new package called
"opennms-rrd-model" that contains all the necessary classes to parse the
dump of a JRobin or RRDtool file, and the output of an RRDtool xport
command. Besides that, there are tools to convert from JRobin to
RRDtool, to split one multi-ds file into several single-ds files, and to
merge a list of single-ds files into one multi-ds file. The object
representation is handled through JAXB. In the future, these tools can
be used to create ReST entries for raw or processed data from JRobin or
RRDtool files.
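The split and merge operations mentioned above can be pictured with a small in-memory sketch. It does not use the opennms-rrd-model classes; the data layout (rows of a timestamp plus one value per data source) and both function names are assumptions made for illustration.

```python
def split_multi_ds(ds_names, rows):
    """Split rows of (timestamp, [v1, v2, ...]) into one series per data source."""
    singles = {name: [] for name in ds_names}
    for timestamp, values in rows:
        if len(values) != len(ds_names):
            raise ValueError(f"row at {timestamp} has {len(values)} values")
        for name, value in zip(ds_names, values):
            singles[name].append((timestamp, value))
    return singles


def merge_single_ds(singles):
    """Merge {name: [(timestamp, value), ...]} into (ds_names, rows).

    A timestamp missing from one series is filled with NaN (unknown).
    """
    ds_names = list(singles)
    lookup = {name: dict(series) for name, series in singles.items()}
    timestamps = sorted({ts for series in singles.values() for ts, _ in series})
    rows = [
        (ts, [lookup[name].get(ts, float("nan")) for name in ds_names])
        for ts in timestamps
    ]
    return ds_names, rows
```

Splitting and then merging a complete multi-ds series returns the original rows, which is the property a storeByGroup conversion would rely on.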
Alejandro Galue October 18, 2013 at 2:31 PM
All the existing KSC Reports must be updated to fix the Resource ID related with SNMP Interfaces.
Alejandro Galue October 8, 2013 at 3:04 PM
Actually, 1.10 was not working properly.
In 1.10, the MAC address is stored on the DB only for those SNMP Interfaces that have an IP interface associated with them and are reachable from the OpenNMS server.
Technically, this is not correct because the SNMP interface table must reflect the content of the ifTable and the ifXTable (from the IF-MIB) for each node that supports SNMP, which means, if the MAC address is reported by the ifTable, that MAC address must be inserted into the database no matter if the SNMP interface in question has an IP associated or not.
This behavior has been fixed in 1.12 as part of the solution for https://opennms.atlassian.net/browse/NMS-5418#icft=NMS-5418.
The thing is that the MAC address is used by the algorithm that determines the unique name of the physical interfaces when storing the JRBs/RRDs (i.e. the folder name for the interfaces), and that's why the directory structure has changed.
We still don't have a fix for this, but the idea is to design a tool to execute post-processing scripts besides checking the DB schema (which is the only thing that the install script does), in order to deal with this kind of problems.
In this particular case, the solution is to merge the old data from the old folder with the new data on the new folder. The solution for this depends on the RRD strategy enabled. If JRobin is being used, this can be done with OpenNMS up and running. But if RRDtool is being used, OpenNMS must be stopped because, by definition, an RRD can only be updated with new data.
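The merge of old and new data can be sketched on plain timestamp-to-value mappings. This is an assumed model, not the upgrade tool's code: the function name is hypothetical, and the rule "a real value from the new folder wins over the old one" is an illustrative choice.

```python
import math


def merge_samples(old, new):
    """Merge two {timestamp: value} mappings, with NaN meaning 'unknown'.

    A value from `new` replaces the old one unless the new value is NaN.
    """
    merged = dict(old)
    for timestamp, value in new.items():
        if timestamp not in merged or not math.isnan(value):
            merged[timestamp] = value
    return dict(sorted(merged.items()))
```

With RRDtool the merged series has to be written into a freshly created file, which is consistent with the requirement above that OpenNMS be stopped.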
This might be an intentional change, but it causes lots of issues with maintaining historical data on most Cisco modular devices (Nexus 7k/Catalyst 6500). In 1.12.0 it appears that the data collection directory structure changes the existing directory name from a simple description to one with the MAC address appended. This causes a big issue because, when the name of the directory changes, new jrb files are created and all historical data is lost. In addition to that scenario, if you swap line cards or supervisor engines, the interface/system MAC address can change, so the directory name changes again and the historical data is lost again. It is easy enough to move the old jrb files into the new directory, but when there are thousands of them that is a bigger issue. Others on the list have also reported this behavior on other types of systems.
Example after upgrade from 1.10.9 to 1.12.0:
/opt/opennms/share/rrd/snmp/1/:
drwxrwxrwx 2 root root 4096 Aug 19 08:47 Vlan99
drwxrwxr-x 2 root root 4096 Aug 19 08:58 Vlan99-0026981a16c1
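Directory pairs like the one in the listing above can be detected by name. The sketch below assumes the new name is the old name plus a hyphen and a 12-digit lowercase hexadecimal MAC address, as in the example; the function name is hypothetical, and other naming variations are not handled.

```python
import re

# Assumed pattern: "<label>-<12 lowercase hex digits>", e.g. Vlan99-0026981a16c1
MAC_SUFFIX = re.compile(r"^(?P<label>.+)-(?P<mac>[0-9a-f]{12})$")


def pair_interface_dirs(dir_names):
    """Return (old_name, new_name) pairs where both directories exist."""
    names = set(dir_names)
    pairs = []
    for name in sorted(names):
        match = MAC_SUFFIX.match(name)
        if match and match.group("label") in names:
            pairs.append((match.group("label"), name))
    return pairs


print(pair_interface_dirs(["Vlan99", "Vlan99-0026981a16c1"]))
```

Each pair identifies an old folder whose data would need to be merged into the new one.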