Solved: Re: When containers recycle itself, Managed mark it as a 'host or monitoring unavailable' problem, or doesn't raise a problem but somehow mark the con...

waikeat_chan · 21 Apr 2019

Is there any way to prevent this from being marked as a problem? (I have encountered this with two different customers: for one, we monitor their K8s cluster using the OneAgent Operator; for the other, we integrate OneAgent into their Docker image.)

Is this by design, or does it have something to do with the way we carry out the Docker instrumentation?

If this is by design, is there a workaround or an existing RFE for it? Currently I am using a maintenance window to suppress availability problems for containers, which I don't think is ideal, because it would also hide a 'real' availability problem if one happened.

I've asked the same question before (https://community.dynatrace.com/spaces/482/dynatrace-open-qa/questions/219550/how-to-make-a-containe...), but thought that this time I should be more verbose and clear about it.

Best Regards,

Wai Keat

rodrigo_alvare1 · 22 Apr 2019

Hello Wai,

When an agent is shut down gracefully, it sends a signal to the Dynatrace server notifying it that the agent is about to disconnect. In that case you will not get any alert or new problem, and the host will simply appear as offline.

The problem comes when this "goodbye" signal is not sent. This can happen for different reasons, for example if the VM gets destroyed before the agent can send the signal.
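To make the difference concrete, here is a minimal Python sketch of the idea. It is not how OneAgent is actually implemented: `MonitoringServerStub`, `Agent`, `receive_goodbye` and `classify` are all hypothetical names invented for illustration. It only shows why a process that gets to handle SIGTERM can be recorded as a clean shutdown, while one that disappears without a goodbye looks like an outage.

```python
import signal
from typing import List


class MonitoringServerStub:
    """Hypothetical stand-in for the monitoring server (not a Dynatrace API)."""

    def __init__(self) -> None:
        self.goodbyes: List[str] = []

    def receive_goodbye(self, agent_id: str) -> None:
        # Remember that this agent announced its own shutdown.
        self.goodbyes.append(agent_id)

    def classify(self, agent_id: str) -> str:
        # An agent that said goodbye is treated as an expected shutdown;
        # one that went silent without it is treated as unavailable.
        return "shutdown" if agent_id in self.goodbyes else "unavailable"


class Agent:
    """Hypothetical agent that says goodbye when it receives SIGTERM."""

    def __init__(self, agent_id: str, server: MonitoringServerStub) -> None:
        self.agent_id = agent_id
        self.server = server

    def install_handler(self) -> None:
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame) -> None:
        # Graceful path: notify the server before the process exits.
        self.server.receive_goodbye(self.agent_id)
```

If the process is killed with SIGKILL, or the whole VM is destroyed, the handler never runs, so in this sketch `classify` would report the agent as "unavailable".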

I have seen this behaviour in different scenarios (CloudFoundry, which is fixed with the latest OneAgent release, and Azure autoscaling, which I think was also solved). But I have not seen it at the container level.

Can you share details of the Dynatrace version, the platform where you are deploying K8s, the K8s version...?

Regards

wolfgang_beer · 23 Apr 2019

Yes, we recently fixed this behavior for AWS spot instances as well, where instances are reclaimed on a regular basis. In those cases OneAgent no longer alerts on an unexpectedly unavailable instance.

We are also working on a more generic fix for handling CloudFoundry scale-ups and scale-downs by introducing an info event that platforms can use to inform Dynatrace about an expected shutdown.

larry_roberts · 10 Jul 2019

We are having the same issue right now, only with GCP. Maintenance windows will not work because Google can spin things up and down at any time.

When containers recycle itself, Managed mark it as a 'host or monitoring unavailable' problem, or doesn't raise a problem but somehow mark the container as 'offline' or 'unmonitored' (instead of shutdown)