Solved: Understanding the High CPU Throttling Alert

fcalega · ‎17 May 2023

Hello Community.

I need to understand how this alert is calculated.
As I understand it, the formula is (sum of workload CPU throttled / sum of workload CPU usage) * 100.

In other words, the alert value is CPU throttling expressed as a percentage of CPU usage.
When we turn on this alert, even with a threshold of 100%, a lot of alerts go off.
Analyzing the microservices, we see that these alerts are triggered by brief, isolated spikes in the CPU throttling/usage ratio.
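
To illustrate the effect, here is a minimal Python sketch of the formula above. The sample values and the function name `throttling_ratio` are made up for illustration and are not taken from Dynatrace; the point is only that a single minute can exceed a 100% threshold even when the ratio over the whole period is low.

```python
def throttling_ratio(throttled, usage):
    """Return sum(throttled) / sum(usage) as a percentage (0.0 if there is no usage)."""
    total_usage = sum(usage)
    if total_usage == 0:
        return 0.0
    return 100 * sum(throttled) / total_usage


# Hypothetical per-minute samples for one workload (same unit for both series).
usage = [200, 200, 200, 200, 200]
throttled = [0, 0, 300, 0, 0]

# Ratio evaluated minute by minute: one sample is far above 100%.
per_minute = [throttling_ratio([t], [u]) for t, u in zip(throttled, usage)]

# Ratio evaluated over the whole five minutes: well below 100%.
overall = throttling_ratio(throttled, usage)

print(per_minute)
print(overall)
```

So whether a spike fires the alert depends heavily on the interval over which the ratio is evaluated.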

The questions that arise for me are as follows:

1 - How do we configure the thresholds so that we are alerted only on microservices whose high CPU throttling/usage is sustained over time, rather than on isolated peaks?
2 - What recommendations do you have, or what documentation is available, to better understand how Dynatrace measures and alerts on throttling?

Thank you very much.

ChadTurner · ‎15 Jun 2023

You can create a custom threshold that defines how high the usage must be before alerting, and you can also create a custom metric event that defines the time window for alerting. For example: alert me when this threshold is breached for 6 minutes in every hour. Note that it won't need to be 6 consecutive minutes; it is the sum of the minutes in which the threshold was breached within your 1-hour window. So 1 minute of high CPU every 10 minutes will trigger the alert, even though there are 9 quiet minutes in between each spike.
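
Here is a rough Python sketch of that evaluation logic, assuming one sample per minute, a 60-minute sliding window, and 6 violating samples required. The function name `sliding_window_alerts` and its parameters are hypothetical and only model the behaviour described above; this is not Dynatrace code.

```python
from collections import deque


def sliding_window_alerts(samples, threshold, window=60, violating=6):
    """Return one bool per sample: True when at least `violating` of the
    last `window` samples were above `threshold` (not necessarily consecutive)."""
    recent = deque(maxlen=window)  # breach flags for the most recent samples
    states = []
    for value in samples:
        recent.append(value > threshold)
        states.append(sum(recent) >= violating)
    return states


# Hypothetical data: one spike every 10 minutes for an hour.
spiky = [150 if minute % 10 == 0 else 20 for minute in range(60)]
spiky_states = sliding_window_alerts(spiky, threshold=100)

# Hypothetical data: a single isolated spike in the same hour.
single = [20] * 60
single[30] = 150
single_states = sliding_window_alerts(single, threshold=100)

print(spiky_states.index(True))  # minute at which the alert first becomes active
print(any(single_states))        # whether the single spike ever alerts
```

To catch only sustained throttling, raise the number of violating samples relative to the window size so that scattered spikes cannot add up to the limit.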

-Chad