Solved: Why are infrastructure baselines defined only by static thresholds?

kohei-saito · ‎16 Jan 2019

Hi,

In Dynatrace, the response times, error rates, and load of services are monitored using dynamic, automatic baselines.

Infrastructure monitoring, on the other hand, is based not on those baselines but on predefined or user-defined static thresholds.

I wonder why Dynatrace doesn't use automatic baselines for infrastructure monitoring.

Are there any reasons for this?

Kohei

skrystosik · ‎16 Jan 2019

In my experience, infrastructure metrics use both baselines and static thresholds by default. We've got reports from Dynatrace not only when a static threshold was exceeded but also when the deviation from the usual pattern was too big. Why do you assume that there are only static thresholds?

Regards, Sebastian

kohei-saito · ‎16 Jan 2019

Hi @sebastian k.

Oh, is that so?

The reason I think infrastructure monitoring uses only static thresholds is the following page:

https://www.dynatrace.com/support/help/shortlink/p...

This page seems to say that automated baselines are used only for application and service monitoring, and that infrastructure metrics aren't monitored with baselines.

Or am I misunderstanding this description?

skrystosik · ‎16 Jan 2019

Hmm, I'm not sure, but generally a situation like high CPU or low disk space is quite static. 95% CPU usage isn't good, and neither is 2% free storage, so such thresholds may well be static. Maybe I got notifications that matched the configured static thresholds and took them for baseline violations. Interesting.
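To make the idea of a static threshold concrete, here is a minimal sketch in Python. It is not how Dynatrace implements anything; the function name is hypothetical, and the limits are simply the example values from this post (95% CPU, 2% free storage), not Dynatrace's actual defaults.

```python
# Illustrative only: fixed limits taken from the example in the post,
# NOT Dynatrace's real default thresholds.
CPU_USAGE_LIMIT_PERCENT = 95.0
FREE_DISK_LIMIT_PERCENT = 2.0


def static_violations(cpu_usage_percent, free_disk_percent):
    """Return the names of the metrics that breach their fixed limits."""
    violations = []
    if cpu_usage_percent >= CPU_USAGE_LIMIT_PERCENT:
        violations.append("cpu")
    if free_disk_percent <= FREE_DISK_LIMIT_PERCENT:
        violations.append("disk")
    return violations


print(static_violations(97.0, 40.0))  # only the CPU limit is breached
print(static_violations(50.0, 1.5))   # only the disk limit is breached
```

The point is that the limits never move: the same value is "bad" no matter what the host normally looks like.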

Regards, Sebastian

waikeat_chan · ‎16 Jan 2019

Hi Kohei,

Your understanding is exactly my understanding as well:

For service and application:

- 'Automatic' means a baseline is used; we can configure how far we may deviate from the baseline before an alert is raised.

- 'Static' means a static threshold/SLA is used, even though a baseline has been generated.

For infrastructure:

- No baseline is ever generated. Everything is a static threshold; 'automatic' means the tool's default threshold is used.

- 'Static' means you use your own threshold.

So it seems the word 'automatic' carries a hugely different meaning in the context of services and applications than in the context of infrastructure.

But again, this is just my observation and I might be wrong. Let's wait for some Dynatrace staff to chime in.
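The difference between the two alerting styles described above can be sketched roughly as follows. This is only an illustration of the general idea, assuming a median over recent samples as the "baseline" and a percentage tolerance; the function names and numbers are hypothetical and this is not Dynatrace's actual baselining algorithm.

```python
from statistics import median


def baseline_alert(history_ms, current_ms, tolerance_percent=50.0):
    """Baseline style: compare against a reference learned from history."""
    baseline = median(history_ms)  # assumption: median as a simple baseline
    return current_ms > baseline * (1 + tolerance_percent / 100.0)


def static_alert(current_ms, limit_ms):
    """Static style: compare against a fixed limit, ignoring history."""
    return current_ms > limit_ms


history = [100, 110, 95, 105, 100]   # recent response times in ms
print(baseline_alert(history, 180))  # 180 ms is far above the usual ~100 ms
print(static_alert(180, 500))        # but still below a fixed 500 ms limit
```

The same measurement can therefore raise an alert in one mode and not in the other, which is why the meaning of 'automatic' matters.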

kohei-saito · ‎18 Jan 2019

Hi Wai,

thanks for your additional explanation!

Yes, that matches my understanding.

'Automatic' in infrastructure thresholds just means 'the default thresholds Dynatrace suggests'.

By the way, that reminds me that I posted a question about the meaning of 'Automatic' from the point of view of infrastructure monitoring.

The post is the following:

https://community.dynatrace.com/questions/208354/what-does-automatically-in-the-view-of-anomaly-det....

wolfgang_beer · ‎16 Jan 2019

Within infrastructure metrics you can use the 'automatic' mode as well, but there it means that Dynatrace decides on a good default threshold. You can override the automatic mode by setting your own static thresholds.

You are right that we automatically baseline all key performance metrics, error rates, and traffic in a dimensional baseline cube (time to first byte, speed index, response time, visually complete, DOM interactive), as well as service response times and error rates.

Within infrastructure metrics, the automatic mode in many cases means using the best-practice standards that the individual vendors propose, such as the thresholds introduced by AWS, VMware, etc.
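As a rough sketch of the behaviour described here: 'automatic' resolves to a built-in default, while 'static' uses whatever the user sets. Everything in this snippet is an assumption for illustration; the metric names, the default values, and the function are made up and do not reflect Dynatrace's real settings or API.

```python
# Placeholder defaults for illustration only; real defaults would follow
# the vendor best practices mentioned above and are not known here.
DEFAULT_THRESHOLDS = {
    "cpu_usage_percent": 95.0,
    "free_disk_percent": 2.0,
}


def effective_threshold(metric, mode="automatic", custom_value=None):
    """Return the threshold that applies to a metric in the given mode."""
    if mode == "automatic":
        return DEFAULT_THRESHOLDS[metric]
    if mode == "static":
        if custom_value is None:
            raise ValueError("static mode needs a user-defined threshold")
        return custom_value
    raise ValueError(f"unknown mode: {mode}")


print(effective_threshold("cpu_usage_percent"))                  # built-in default
print(effective_threshold("cpu_usage_percent", "static", 80.0))  # user override
```

Either way the result is a fixed number, so both modes are static thresholds in the sense discussed in this thread.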

skrystosik · ‎16 Jan 2019

thx for explanation 🙂

Regards, Sebastian

rohit_sharma · ‎16 Jan 2019

@Wolfgang B. Can we export the automatically configured threshold values (those proposed by the individual vendors) that Dynatrace uses for infrastructure?

dave_mauney · ‎16 Jan 2019

The static threshold technique is evolving with the new AI 2.0:

https://www.dynatrace.com/news/blog/enhanced-ai-ro...

kohei-saito · ‎18 Jan 2019

Hi Dave,

thanks for your information.

Indeed, the second-generation AI provides better analysis!

Does this enable Dynatrace to use automatically generated baselines for infrastructure monitoring?

dave_mauney · ‎18 Jan 2019

From the blog above, it appears that it looks more at deviation from the norm. That is similar to a baseline, but I believe a bit more flexible. This quote is key: "The example below shows how the new AI detects root causes without triggering false-positive alerts. The root-cause analysis states that the unhealthy host shows a 75% CPU usage increase as well as an increased number of Tomcat busy threads."
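For what "deviation from the norm" could look like numerically, here is a small sketch. It assumes the "75% CPU usage increase" in the quote is a relative increase over recent typical usage, which the blog excerpt does not actually confirm; the function name and sample values are hypothetical.

```python
from statistics import mean


def relative_increase_percent(recent_values, current_value):
    """Percentage by which the current value exceeds the recent average."""
    norm = mean(recent_values)  # assumption: simple average as "the norm"
    return (current_value - norm) / norm * 100.0


recent_cpu = [40, 40, 40]  # made-up recent CPU usage samples in percent
print(relative_increase_percent(recent_cpu, 70))  # 70% vs. a norm of 40%
```

Note that a host at 70% CPU would never trip a fixed 95% threshold, yet it is a large jump relative to its own normal behaviour.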