In Dynatrace, the response time, error rate, and load of services are monitored using dynamic, automatic baselines.
Infrastructure monitoring, on the other hand, is based not on those baselines but on predefined or user-defined static thresholds.
I wonder why Dynatrace doesn't use automatic baselines for infrastructure monitoring.
Are there any reasons for this?
In my experience, infrastructure metrics use both baselines and static thresholds by default. We've received reports from Dynatrace not only when a static threshold was exceeded but also when the deviation from the usual pattern was too big. Why do you assume that only static thresholds are used?
Oh, is that so?
The reason I think infrastructure monitoring uses only static thresholds is the page at the following URL.
This page seems to say that automated baselining is used only for application and service monitoring, and that infrastructure metrics aren't monitored with baselines.
Or am I misunderstanding this description?
Hmm, I'm not sure, but generally a situation like high CPU or low disk space is quite static. 95% CPU usage isn't good, and neither is 2% free storage, so such thresholds may well be static. Maybe the notifications I got actually matched the configured static thresholds and I took them for baseline violations. It's interesting.
Your understanding is exactly my understanding as well:
For service and application:
- 'Automatic' means the baseline is used; we can configure how far off from the baseline a value must be before an alert is raised.
- 'Static' means a static threshold/SLA is used, even though a baseline has been generated.
For infrastructure:
- No baseline is ever generated. Everything is a static threshold; 'Automatic' means the tool's default threshold is used.
- 'Static' means you use your own threshold.
So it seems the word 'automatic' carries a very different meaning in the context of service and application than in the context of infra.
But again, this is my observation and I might be wrong. Let's wait for some Dynatrace staff to chime in.
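To make the two meanings of 'automatic' concrete, here is a rough sketch in Python. It is not how Dynatrace computes anything: the function names, the 95% limit (borrowed from the CPU example earlier in the thread), and the simple "mean plus a few standard deviations" baseline are all assumptions made purely for illustration.

```python
from statistics import mean, stdev


def violates_static(value, limit=95.0):
    """Static threshold: compare against a fixed limit; history is ignored."""
    return value > limit


def violates_baseline(value, history, tolerance=3.0):
    """Baseline: compare against what has been normal for this metric.

    'Normal' here is mean + tolerance * standard deviation of past samples,
    a deliberately simple stand-in for a real baselining algorithm.
    """
    if len(history) < 2:
        return False  # not enough data to learn a baseline yet
    return value > mean(history) + tolerance * stdev(history)


# Hypothetical CPU usage samples (percent) for a normally quiet host.
cpu_history = [20.0, 22.0, 19.0, 21.0, 23.0, 20.0]

print(violates_static(60.0))                 # False: still below the fixed 95% limit
print(violates_baseline(60.0, cpu_history))  # True: far above this host's usual ~21%
```

The same 60% reading is fine for the static check but clearly abnormal for the baseline check, which is the difference being discussed here.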
Thanks for your additional explanation!
Yes, that matches my understanding as well.
For infrastructure thresholds, 'Automatic' just means 'the default thresholds Dynatrace suggests'.
By the way, that reminds me that I posted a question about what 'Automatic' means for infrastructure monitoring.
The post is the following:
Within infrastructure metrics you can use the 'automatic' mode as well, but there it means that Dynatrace decides on a good default threshold. You can override the automatic mode by setting your own static thresholds.
You are right that we do automatically baseline all key performance metrics, error rate and traffic in a dimensional baseline cube (time to first byte, speed index, response time, visually complete, DOM interactive) as well as the service response time and error rates.
Within infrastructure metrics, the automatic mode in many cases means using the best-practice standards that the individual vendors propose, such as thresholds introduced by AWS, VMware, etc.
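As a rough illustration of that override behaviour, the sketch below picks the threshold that applies to an infrastructure metric. Everything in it is hypothetical: the metric keys, the `effective_threshold` helper, and the default values (taken from the 95% CPU and 2% free storage examples earlier in this thread) are assumptions and not real Dynatrace settings or APIs.

```python
# Hypothetical built-in defaults; NOT actual Dynatrace values.
DEFAULT_THRESHOLDS = {
    "cpu_usage_percent": 95.0,
    "disk_free_percent": 2.0,
}


def effective_threshold(metric, mode="automatic", custom=None):
    """Return the threshold that applies to an infrastructure metric.

    'automatic' -> the tool's (or vendor's) recommended default is used
    'static'    -> the user's own value overrides that default
    """
    if mode == "automatic":
        return DEFAULT_THRESHOLDS[metric]
    if mode == "static":
        if custom is None:
            raise ValueError("static mode needs a user-defined threshold")
        return custom
    raise ValueError(f"unknown mode: {mode}")


print(effective_threshold("cpu_usage_percent"))                            # 95.0
print(effective_threshold("cpu_usage_percent", mode="static", custom=80))  # 80
```

In both modes the result is a fixed number; no baseline is learned from the metric's history.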
Judging from the blog above, it seems to look more at deviation from the norm. That is similar to a baseline, but I believe a bit more flexible. This quote is key: "The example below shows how the new AI detects root causes without triggering false-positive alerts. The root-cause analysis states that the unhealthy host shows a 75% CPU usage increase as well as an increased number of Tomcat busy threads."