
Known Issue: Kubernetes Dynatrace Operator v0.9.0 upgrade to a newer version might get stuck

stefan_penner
Dynatrace Helper

TL;DR

When upgrading from Dynatrace Operator v0.9.0 to a newer version (for example, v0.9.1), you may need to run an additional command after applying the new manifests to force-delete the “old” Dynatrace CSI driver pods:

 

kubectl delete pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator,app.kubernetes.io/version=0.9.0 --force --grace-period=0

 

 

What is the Issue?

Dynatrace CSI driver pods are deployed as a DaemonSet on all nodes to provide writable volume storage and OneAgent binaries to pods. If a Dynatrace CSI driver pod is stuck in “Terminating”, OneAgent can no longer be injected into any pods scheduled on that node.

 

Starting with Dynatrace Operator v0.9.0, Dynatrace CSI driver pods can get stuck in “Terminating” if the CSI driver recently mounted volumes and is then restarted. This has been fixed with Dynatrace Operator v0.9.1 (#1252). Unfortunately, the issue affects the upgrade from Operator v0.9.0 itself to any newer version.
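
If you want to check whether any CSI driver pods are currently stuck, you can list them and look for pods with a STATUS of Terminating (a quick sketch, assuming the default dynatrace namespace and the standard operator labels):

kubectl get pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator -o wide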

 

Who is impacted?

Any Kubernetes/OpenShift cluster that uses Dynatrace Operator v0.9.0 in combination with the Dynatrace CSI driver and is upgraded to a newer version (for example, v0.9.1). No action is needed if your Dynakube(s) are configured for classicFullstack injection or application-only monitoring without the CSI driver.

Here’s how you can check which version you are running:

 

kubectl get pod -l name=dynatrace-operator -n dynatrace -L app.kubernetes.io/version

 


Note: If you don’t get a version/value when executing this command, you're using Dynatrace Operator < 0.6.0 and are not impacted.
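
If you are unsure whether the CSI driver is deployed at all, you can also check for the corresponding DaemonSet (a sketch, assuming the default dynatrace namespace and the standard operator labels):

kubectl get daemonset -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator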

 

How can I fix the issue?

Kubernetes allows you to force-delete pods that are stuck in termination. To fix the upgrade issue, all CSI driver pods need to be force-deleted after applying the new operator manifests:

Kubernetes:

 

kubectl delete pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator,app.kubernetes.io/version=0.9.0 --force --grace-period=0

 

 

OpenShift:

 

oc delete pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator,app.kubernetes.io/version=0.9.0 --force --grace-period=0

 

Note: Adapt the namespace name in case you don't use the default dynatrace namespace.
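
Afterwards, you can verify that the CSI driver pods were recreated and now carry the new version label (a sketch, again assuming the default dynatrace namespace):

kubectl get pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator -L app.kubernetes.io/version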

 

Can I still use Dynatrace Operator Version 0.9.0?

Yes, you can still use Dynatrace Operator v0.9.0 and there is no immediate need for an upgrade. There shouldn’t be any issues related to CSI driver pods unless you upgrade or uninstall the Dynatrace Operator.

For all customers who plan to upgrade from an older Dynatrace Operator version (<v0.9), we do not recommend upgrading to v0.9.0. Please upgrade directly to v0.9.1 instead.

 

 

We hope this community post helps raise awareness and spares you some troubleshooting during the Dynatrace Operator upgrade procedure.

4 REPLIES

Mizső
DynaMight Leader

Hi @stefan_penner,

Thank you very much for sharing this important information.

Best regards,

Mizső


haris
Dynatrace Enthusiast

Hi @stefan_penner, is it possible that this could lead to the following error:
The customresourcedefinitions "dynakubes.dynatrace.com" is invalid: metadata.resourceVersion: Invalid value: 0x0: must be specified for an update
I have checked the YAML file and it seems like the version is specified, so I'm not sure.

 

PS: I have found that in some cases this was solved by using kubectl replace instead of kubectl apply, but I'm not that familiar with it, so I have no idea if it helps. Thanks.

Hi @haris,

We have not seen, and would not expect, any error message related to the resource version of the Dynakube CRD caused by Operator v0.9.0, so no, this should not be caused by the upgrade issue mentioned above.

In your case, it sounds like an invalid version of the CRD was applied. You could either use kubectl replace or delete the Dynakube and re-apply it. kubectl apply does a three-way merge between your local file, the live Kubernetes object manifest, and the last-applied-configuration annotation.
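
For reference, here is a minimal sketch of the kubectl replace approach, where dynakube.yaml is only a placeholder for whichever manifest you originally applied:

kubectl replace -f dynakube.yaml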

Hope that helps!


DanielS
DynaMight Leader

Thanks @stefan_penner 

