We have a number of incident rules based on visually complete measures. The incidents seem to be triggering correctly, but the 'incident ended' email contains the message "Was 10096.15 ms but should be lower than 10000.00 ms", which makes no sense: if the value is higher than the threshold, the incident should still be open.
What we have determined is that the value '10096.15' is 'Measured peak value', according to the Incident Chart dashboard: "Test for alerting - Loading of page viewbill - Count - Visually Complete Time: Measured peak value: 10096.15 [num], Upper Severe Bound: 10000.00"
We are struggling to understand where and how the 'Measured peak value' is calculated. No combination of charts and avg/max aggregated measures shows anything close to 10096.15 ms. We have set up an incident rule with a 1-minute evaluation window based on the average, and the BT Visually Complete Time calculated measure uses the 'avg' aggregation, but the Measured peak value never seems to line up with the data we have.
What are we missing?
Not 100% sure, but if it doesn't line up exactly with anything you can chart, I bet it's the actual single data point rather than any aggregation you could chart. If that's the case, it isn't calculated in any way; it's just the single data point for that measure that had the highest value. Any measure you chart will be aggregated in some way or another.
Just an educated guess though - I don't often look at the "measured peak value" as the measures themselves should be all that is relevant for incidents.
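To illustrate the guess above, here is a minimal Python sketch with made-up numbers. It is not how the product actually computes anything: the window contents, the variable names, and the idea that the peak is the largest raw sample are all assumptions. It only shows that a raw peak can differ from every per-window average you could chart, and that the 'ended' message would quote the peak even though the latest average is below the threshold.

```python
from statistics import mean

THRESHOLD_MS = 10000.00  # upper severe bound from the incident rule

# Hypothetical raw samples, grouped into two 1-minute evaluation windows.
windows = [
    [10050.0, 10096.15, 10020.0],  # average above threshold: incident opens
    [9000.0, 8800.0, 9100.0],      # average below threshold: incident ends
]

# What an 'avg' aggregated chart would show: one value per window.
window_averages = [mean(w) for w in windows]

# Assumed meaning of 'Measured peak value': the largest single raw sample.
peak = max(sample for w in windows for sample in w)

# The rule is evaluated on the window average, not on the peak.
incident_open = [avg > THRESHOLD_MS for avg in window_averages]

# The 'ended' email reports the peak, not the current average.
ended_message = (
    f"Was {peak:.2f} ms but should be lower than {THRESHOLD_MS:.2f} ms"
)

print(window_averages)
print(incident_open)
print(ended_message)
```

In this toy data no window average equals the peak, which would match what you are seeing on your charts.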
That makes sense. The issue is that the 'Severe incident ended' email by default states:
Was 12447.00ms but should be lower than 12000.00ms.
And if you look at the incident dashboard, it says that 12447 is the measured peak value. Why would it put that in the email instead of the current response time?
That's just what they decided to include as part of an incident record. Including the peak makes sense to me, as you may want to see by how much your thresholds were violated, whereas you can assume that once the incident has ended the value is below the threshold. It's also probably partly a matter of what is possible given how incident records work and what data is maintained for them.