We have a service with low request volume overnight. It was failing 100% of requests, but the initial problem, opened at 1:41 am, closed at 3:20 am when traffic dropped to zero. A new problem is generated for the exact same failure condition (root cause) when new requests come in. This occurred three times that night.
Question: Is there a way to keep the initial problem open until a successful request comes in?
Hi @richard_guerra ,
Have you tried changing the automatic baseline settings, or setting a static threshold for this service? Perhaps with low traffic the data sample does not exceed the set value.
Hello @radek_jasinski , thanks for the idea.
We did not try this (setting a static threshold for failure rate), as it would not be feasible for our numerous low-volume services and cannot be applied globally.
I also do not see it as a solution to the issue: even if I set a manual threshold, when the traffic drops to zero the failure rate drops to zero as well, and the problem closes shortly afterwards because of that drop. When the traffic picks up again and requests are still failing, a new problem is generated.
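To make the behaviour concrete, here is a minimal Python sketch of what I am describing. It is a toy model, not Dynatrace's actual detection logic: the function names, the `(total, failed)` interval format, and the 0.5 threshold are all made up for illustration. The first counter closes a problem as soon as the failure rate falls below the threshold (which happens when traffic is zero); the second ignores idle intervals and only closes once an interval with real traffic is back under the threshold.

```python
from typing import List, Tuple

# One observation interval: (total_requests, failed_requests). Hypothetical format.
Bucket = Tuple[int, int]


def failure_rate(total: int, failed: int) -> float:
    """Failure rate for one interval; treated as 0.0 when there is no traffic."""
    return failed / total if total else 0.0


def count_problems_rate_based(buckets: List[Bucket], threshold: float) -> int:
    """Open when the rate reaches the threshold, close as soon as it falls below."""
    opened, is_open = 0, False
    for total, failed in buckets:
        violating = failure_rate(total, failed) >= threshold
        if violating and not is_open:
            opened += 1
        is_open = violating
    return opened


def count_problems_hold_through_idle(buckets: List[Bucket], threshold: float) -> int:
    """Same rule, but intervals without traffic leave the problem state unchanged."""
    opened, is_open = 0, False
    for total, failed in buckets:
        if total == 0:
            continue  # no requests means no evidence of recovery
        violating = failure_rate(total, failed) >= threshold
        if violating and not is_open:
            opened += 1
        is_open = violating
    return opened


# A night with three bursts of failing traffic separated by idle periods,
# followed by one healthy interval.
night = [(5, 5), (3, 3), (0, 0), (0, 0), (2, 2), (0, 0), (4, 4), (6, 0)]

print(count_problems_rate_based(night, 0.5))         # 3
print(count_problems_hold_through_idle(night, 0.5))  # 1
```

The second variant is the behaviour I am asking about: one problem that stays open across the idle periods and only closes when successful requests come in.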
Another thing that comes to mind is tuning the Smart Alerting configuration: https://docs.dynatrace.com/docs/shortlink/automated-baselining#smart-alerting
You can then set when DT should consider the problem resolved.