Document the logic behind the response time value reported by the SnmpMonitor

Description

During a support session, I discovered that the response time graphs associated with the SnmpMonitor show values greater than the configured timeout.

Digging into the code, I found the reason: the TimeTracker that returns the response time is created outside the retry loop and is not re-initialized on each attempt (or retry). So, if the monitor has to retry and gets a response on one of the retries, the reported response time is the total time spent across all attempts, which can be greater than the timeout.
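To make the effect concrete, here is a minimal sketch of the timing pattern described above. It is not the SnmpMonitor code: the class, the `poll` method, and the simulated clock are all hypothetical, and it assumes that each timed-out attempt consumes exactly the timeout before the next attempt starts.

```java
public class RetryTimingSketch {

    /** Elapsed time measured two ways, in milliseconds (hypothetical type). */
    static final class Timing {
        final long totalMillis;       // timer started once, before the retry loop
        final long lastAttemptMillis; // timer restarted at the start of each attempt
        final boolean responded;

        Timing(long totalMillis, long lastAttemptMillis, boolean responded) {
            this.totalMillis = totalMillis;
            this.lastAttemptMillis = lastAttemptMillis;
            this.responded = responded;
        }
    }

    /**
     * Simulates one poll. attemptDurationsMillis[i] is how long attempt i would
     * take to answer; a value greater than or equal to the timeout means that
     * attempt times out. Uses a simulated clock instead of real waiting.
     */
    static Timing poll(long[] attemptDurationsMillis, long timeoutMillis, int retries) {
        long clock = 0;
        long start = clock; // created outside the loop, like the tracker in the ticket
        for (int attempt = 0; attempt <= retries && attempt < attemptDurationsMillis.length; attempt++) {
            long attemptStart = clock; // what a per-attempt timer would use
            long duration = attemptDurationsMillis[attempt];
            if (duration < timeoutMillis) {
                clock += duration; // response arrived within the timeout
                return new Timing(clock - start, clock - attemptStart, true);
            }
            clock += timeoutMillis; // attempt timed out, move on to the next one
        }
        return new Timing(clock - start, 0, false);
    }

    public static void main(String[] args) {
        // Timeout of 1800 ms, 2 retries: two timeouts, then an answer after 200 ms.
        Timing t = poll(new long[] {5000, 5000, 200}, 1800, 2);
        System.out.println("total=" + t.totalMillis + " ms, lastAttempt=" + t.lastAttemptMillis + " ms");
    }
}
```

With these inputs the total is 1800 + 1800 + 200 = 3800 ms, more than twice the timeout, while the successful attempt itself took only 200 ms.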

A future enhancement could be to add an optional parameter that lets the user choose the behavior: report the total transaction time, or report the time spent on the last attempt.

For now, updating the documentation to reflect the current behavior is enough.

Acceptance / Success Criteria

None

Activity

Alejandro Galue June 30, 2017 at 6:10 PM

Ron Roskens June 28, 2017 at 11:37 AM

I like the enhancement idea. Having LatencyStoringServiceMonitorAdaptor record these additional polling metrics would be good to have.

PollStatus could record these additional metrics:

  • # of tries

  • # of failed polls

  • # of succeeded polls

  • total duration

  • average failed poll duration

  • success poll duration
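As a rough illustration of the metrics listed above, the following sketch accumulates them per attempt. The `PollMetricsSketch` class and its method names are hypothetical; it does not use or reflect the actual PollStatus or LatencyStoringServiceMonitorAdaptor APIs, and it assumes "success poll duration" means the duration of the most recent successful attempt.

```java
public class PollMetricsSketch {
    private int tries;
    private int failedPolls;
    private int succeededPolls;
    private long totalMillis;
    private long failedMillis;
    private long successMillis;

    /** Records the outcome and duration of a single attempt. */
    void recordAttempt(boolean success, long durationMillis) {
        tries++;
        totalMillis += durationMillis;
        if (success) {
            succeededPolls++;
            successMillis = durationMillis; // keep the latest successful attempt
        } else {
            failedPolls++;
            failedMillis += durationMillis;
        }
    }

    int getTries() { return tries; }
    int getFailedPolls() { return failedPolls; }
    int getSucceededPolls() { return succeededPolls; }
    long getTotalMillis() { return totalMillis; }
    long getSuccessMillis() { return successMillis; }

    /** Average duration of failed attempts, or 0.0 when none failed. */
    double getAverageFailedMillis() {
        return failedPolls == 0 ? 0.0 : (double) failedMillis / failedPolls;
    }

    public static void main(String[] args) {
        PollMetricsSketch m = new PollMetricsSketch();
        m.recordAttempt(false, 1800);
        m.recordAttempt(false, 1800);
        m.recordAttempt(true, 200);
        System.out.println("tries=" + m.getTries()
                + " failed=" + m.getFailedPolls()
                + " succeeded=" + m.getSucceededPolls()
                + " total=" + m.getTotalMillis()
                + " avgFailed=" + m.getAverageFailedMillis()
                + " success=" + m.getSuccessMillis());
    }
}
```

Keeping the total and the successful-attempt duration as separate values would let a graph show either one without changing how the monitor times its attempts.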

Fixed

Created June 28, 2017 at 10:58 AM
Updated July 5, 2017 at 8:56 PM
Resolved July 5, 2017 at 8:56 PM