Big Brother (cough cough) has a matrix that displays a group of servers and indications of the health of such things as CPU, disk, memory, etc.
I would like to see something on the node and/or interface pages that provides a visual cue to the health of the various statistics for each device.
We could trigger on sysObjectID and calculate from datacollection-config.xml what .rrds are available for a device. Then cross-reference this with thresholds.xml to see what the thresholds are.
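To make the lookup concrete, here is a rough sketch of the cross-reference step. It does not parse the real files: the two dictionaries are simplified, hypothetical stand-ins for whatever would be read out of datacollection-config.xml and thresholds.xml, and the data source names and threshold values are made up for illustration.

```python
# Hypothetical stand-in for datacollection-config.xml:
# sysObjectID prefix -> data sources collected (and so stored as .rrd files).
COLLECTED_BY_SYSOID_PREFIX = {
    ".1.3.6.1.4.1.9.": ["avgBusy5", "freeMem"],
    ".1.3.6.1.4.1.8072.": ["loadavg5", "memAvailReal"],
}

# Hypothetical stand-in for thresholds.xml:
# data source name -> threshold definition (values are invented).
THRESHOLDS = {
    "avgBusy5": {"type": "high", "value": 90.0, "rearm": 75.0},
    "loadavg5": {"type": "high", "value": 8.0, "rearm": 4.0},
}


def datasources_for(sys_object_id):
    """Return the data sources collected for a device with this sysObjectID."""
    names = []
    for prefix, datasources in COLLECTED_BY_SYSOID_PREFIX.items():
        if sys_object_id.startswith(prefix):
            names.extend(datasources)
    return names


def thresholds_for(sys_object_id):
    """Cross-reference collected data sources with the defined thresholds."""
    return {
        name: THRESHOLDS[name]
        for name in datasources_for(sys_object_id)
        if name in THRESHOLDS
    }
```

Only data sources that are both collected and thresholded end up in the result, which is the set of cells the page would need to draw for that device.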
The next bit is more difficult. How do we display whether or not a threshold has been exceeded?
The easiest way would be to drive it from events: a threshold exceeded event turns the indicator red, and a threshold rearmed event turns it green. I know people also want "yellow", so as we rework threshd we might keep that in mind.
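A minimal sketch of the event-driven approach might look like the following. The event kinds "exceeded" and "rearmed" are hypothetical labels; the actual threshd events would have to be mapped onto them. The class name and the choice to show green before any event has been seen are also assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical event kinds; real threshold events would be mapped onto these.
EXCEEDED = "exceeded"
REARMED = "rearmed"

# A "yellow" state could be added here once threshd can emit a warning-level event.
COLOR_FOR_EVENT = {EXCEEDED: "red", REARMED: "green"}


@dataclass
class HealthMatrix:
    # (node, statistic) -> current color of that cell
    cells: dict = field(default_factory=dict)

    def apply(self, node, statistic, event_kind):
        """Update one cell from a threshold event; unknown events are ignored."""
        color = COLOR_FOR_EVENT.get(event_kind)
        if color is not None:
            self.cells[(node, statistic)] = color

    def color(self, node, statistic):
        """Cells with no events so far are assumed healthy (green)."""
        return self.cells.get((node, statistic), "green")
```

The upside of this design is that the page never has to read the .rrd files to decide on a color; it only replays events.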
A cooler way would be not only to indicate the condition of the value via color, but also to put one of Tufte's "sparkline" graphs next to it.
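As a very rough illustration of the idea, a sparkline can be approximated in plain text with Unicode block characters. This is only a sketch: a real page would more likely render a small image from the .rrd data, and the function name and scaling rule here are my own assumptions.

```python
# Eight block characters of increasing height, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map a series of numbers onto block characters, scaled to its own min and max."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range to scale against; draw it at the lowest bar.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)
```

Because each series is scaled to its own range, this shows the shape of the recent trend rather than absolute values, so it complements the color rather than replacing it.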