We recently got a failure rate increase Problem Card showing around 1.64K impacted users, but when we looked into the errors there were only 12 5XX errors in total. We drilled down into the impacted users but could not see any errors in their User Sessions related to the URL that was experiencing the errors.
How are the impacted users determined? This Problem Card was only for failure rate, but we also checked response times for those users and everything seemed normal. We mainly want to understand why it showed such a high number when, on inspection, the users don't seem to be affected. The only explanation we can think of is that they follow the same Service Flow and so were 'flagged' as potentially impacted.
Yes, exactly. Davis follows the ServiceFlow backwards to all the entry points, beginning with each unhealthy service within a problem. Davis then analyzes all the user IDs that were flowing in and hitting the unhealthy service at the time of the problem, and shows that number in the 'potential' business impact tile. This gives you valuable information about how many users could be impacted, even if the error is handled gracefully on the application or entry point side and you would never see anything on the website (except an error page).
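To make the difference between "potentially impacted" and "actually saw an error" concrete, here is a rough Python sketch of the idea. It is not Dynatrace code or its API: the function names, the `calls` edge list, and the `traces` records are all made up for illustration, and the real analysis is certainly more involved.

```python
from collections import deque

def entry_points_reaching(calls, unhealthy):
    """Walk a (hypothetical) call graph backwards from the unhealthy
    service and return the services that nothing else calls."""
    callers = {}
    for caller, callee in calls:
        callers.setdefault(callee, set()).add(caller)

    seen, queue, entries = {unhealthy}, deque([unhealthy]), set()
    while queue:
        service = queue.popleft()
        upstream = callers.get(service, set())
        if not upstream:
            entries.add(service)  # no callers, so treat it as an entry point
        for up in upstream:
            if up not in seen:
                seen.add(up)
                queue.append(up)
    return entries

def potentially_impacted(traces, unhealthy, start, end):
    """Users whose requests hit the unhealthy service during the problem
    window, whether or not their own request failed."""
    return {
        t["user"] for t in traces
        if unhealthy in t["path"] and start <= t["time"] <= end
    }

# Illustrative data only (assumed shapes, not a real export format).
calls = [("web", "checkout"), ("web", "search"),
         ("checkout", "payment"), ("mobile-api", "checkout")]
traces = [
    {"user": "u1", "path": ["web", "checkout", "payment"], "time": 5, "failed": False},
    {"user": "u2", "path": ["mobile-api", "checkout", "payment"], "time": 6, "failed": True},
    {"user": "u3", "path": ["web", "search"], "time": 5, "failed": False},
    {"user": "u4", "path": ["web", "checkout", "payment"], "time": 50, "failed": False},
]

entries = entry_points_reaching(calls, "payment")
potential = potentially_impacted(traces, "payment", start=0, end=10)
failed = {t["user"] for t in traces if t["failed"]}
```

In this toy data, two users count as potentially impacted because their requests reached `payment` during the window, but only one of them has a failed request. That is the same kind of gap as 1.64K potentially impacted users versus 12 5XX errors.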