
Maintenance window during Dynatrace Managed Cluster update

deni
Advisor

Hi,

I have a case with one of our customers who claims that during an update of the Dynatrace Managed Cluster, they receive false-positive alarms — specifically Host monitoring unavailable.

While I still need to check their environment (and I doubt that these alarms are truly false positives), their request made me think about how Dynatrace actually handles problems and alerts during an upgrade.

When we create a Maintenance Window, it’s clear — for the given filters and time period, all alarms or problems are ignored. However, during a Managed Dynatrace Cluster update, it obviously doesn’t work the same way (which is absolutely expected).

My question is: How does Dynatrace decide which alerts or problems to ignore during the upgrade, and how does the upgrade actually work?

Thanks!

Regards, Deni

 

Dynatrace Integration Engineer at CodeAttest
6 REPLIES

Yosi_Neuman
DynaMight Guru

Hi @deni 

If the customer's cluster is running on a single node, the unavailable alerts are to be expected: the cluster's downtime is longer than 3 minutes, and once the cluster is upgraded and running again it will open those "false" alarms.

If the cluster is running on 3 or more nodes, that's odd, and you will need to look at the logs or open a ticket.

HTH

Yos 

dynatrace certificated professional - dynatrace master partner - Matrix Soft Ware Division - Israel

Hi @Yosi_Neuman 
Thanks a lot for your reply.
Actually, we already opened a ticket, but still don’t have a resolution.

This is a cluster with more than 3 nodes, and for the last 1–2 years everything was working fine. The client confirmed that this behavior only started about a month ago.

We were able to reproduce and confirm:

  • If we stop the automatic updates, the alarms also stop.

  • If we manually trigger the update, the alarms immediately start appearing.

I’ll continue digging into the logs, but would really appreciate any advice on what specifically to look for in this case.

Regards, Deni

Dynatrace Integration Engineer at CodeAttest

Ramprasath
Newcomer

Hi @deni, we are starting to notice similar alerts for Host or Monitoring Unavailable during our AKS cluster upgrade. Please check whether anything changed on the Dynatrace side. We have upgraded to the latest version of the Dynatrace Operator, 1.6.1.

We opened a ticket for this — the Dynatrace team is still investigating. What we know so far is:

  • Once the cluster update finishes successfully, the OneAgent and ActiveGate updates are triggered.
  • This update activity seems to cause the alarm.
  • In the logs, we can see they are never down for more than 2 minutes.
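For anyone who wants to run the same check on their own logs, here is a minimal sketch of how the downtime windows could be measured. It assumes you have already extracted stop/start timestamps per component from the logs; the event tuples, the function names and the sample values are all hypothetical, and the 3-minute threshold is simply the figure mentioned earlier in this thread, not a confirmed Dynatrace setting.

```python
from datetime import datetime, timedelta

def downtime_windows(events):
    """Pair each 'stop' with the next 'start'; return (stop, start, duration) tuples."""
    windows = []
    stopped_at = None
    for ts, kind in sorted(events):
        if kind == "stop" and stopped_at is None:
            stopped_at = ts
        elif kind == "start" and stopped_at is not None:
            windows.append((stopped_at, ts, ts - stopped_at))
            stopped_at = None
    return windows

def exceeding(windows, threshold=timedelta(minutes=3)):
    """Return only the windows whose duration is longer than the threshold."""
    return [w for w in windows if w[2] > threshold]

# Illustrative values only, not taken from a real environment.
events = [
    (datetime(2024, 1, 1, 10, 0, 0), "stop"),
    (datetime(2024, 1, 1, 10, 1, 30), "start"),
    (datetime(2024, 1, 1, 10, 20, 0), "stop"),
    (datetime(2024, 1, 1, 10, 24, 0), "start"),
]

windows = downtime_windows(events)
for stop, start, duration in exceeding(windows):
    print(f"down from {stop} to {start} ({duration})")
```

If nothing is printed for your data, the downtime alone probably does not explain the alerts, which matches what we see so far.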

Once we have some resolution I'll update this post.

Regards, Deni

Dynatrace Integration Engineer at CodeAttest

Babar_Qayyum
DynaMight Guru

Hello @deni 

What is the Cluster version?

Please provide us with the findings from your support team, as we have not previously encountered this type of issue.

Regards,

Babar

Hi @Babar_Qayyum,

As a DynaMight Guru, I suppose you have access to the ticket?

I'm sharing the link because there is a lot of information: https://one.dynatrace.com/hc/en-us/requests/532933

If not, I'll try to summarize what we have so far.

About the versions: I asked the customer to check when the problem started, i.e. to identify a working and a non-working version, but they couldn't tell me. It did NOT start in the last month, as they first said; problems were being opened before that too, but they only started monitoring them a month ago. Since the problem is reproducible, I have started digging into exactly when the pods receive the termination signal, when they start again, and how this maps to the times the Problems were opened.
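A minimal sketch of that mapping, assuming the pod termination/start times and the Problem open times have already been collected by hand: the function name, the 5-minute grace period and the sample timestamps are hypothetical and only illustrate the idea.

```python
from datetime import datetime, timedelta

def match_problems(problem_opens, restart_windows, grace=timedelta(minutes=5)):
    """For each Problem open time, list the restart windows it falls into.

    A Problem counts as related if it opened between the termination signal
    and the pod start time plus a grace period (an assumed value).
    """
    matches = {}
    for opened in problem_opens:
        matches[opened] = [
            (stop, start)
            for stop, start in restart_windows
            if stop <= opened <= start + grace
        ]
    return matches

# Illustrative values only, not taken from the ticket.
restart_windows = [
    (datetime(2024, 1, 1, 10, 0, 0), datetime(2024, 1, 1, 10, 1, 30)),
]
problem_opens = [
    datetime(2024, 1, 1, 10, 3, 0),
    datetime(2024, 1, 1, 11, 0, 0),
]

result = match_problems(problem_opens, restart_windows)
for opened, hits in result.items():
    status = "matches a restart" if hits else "no restart nearby"
    print(f"Problem opened at {opened}: {status}")
```

Problems that match no restart window would point to a cause other than the update itself.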

Regards, Deni

Dynatrace Integration Engineer at CodeAttest
