Re: Cloud Native Full Stack: Workloads require manual restart after full OpenShift cluster reboot to get Deep Monitoring

deni · ‎09 Jun 2026

Hi,

I'm observing reproducible behavior in an OpenShift cluster monitored with Dynatrace Cloud Native Full Stack and would like to understand whether this is expected or whether others have seen something similar.

Environment

OpenShift 4.19
Dynatrace Operator 1.8.1
Cloud Native Full Stack enabled
Dynatrace CSI Driver running
Dynatrace Webhook running
ActiveGate running

What happens

After a full cluster shutdown/startup, the cluster comes back healthy and workloads start successfully.

However, multiple processes appear in Dynatrace as:

Restart required
Failed to enable

Examples include:

Spring Boot application workloads
kube-apiserver
kubelet
openshift-apiserver
etcd-related processes

Application workload example

For our Spring Boot application we verified the following:

Immediately after cluster startup:

Application pod is running
Service is reachable
Process appears in Dynatrace
Deep Monitoring is not fully active
Dynatrace reports Restart required

After performing a rollout restart of the deployment:

/opt/dynatrace/oneagent-paas is mounted
OneAgent libraries are loaded
Deep Monitoring becomes enabled
Services and process details appear correctly
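
For anyone who wants to repeat the mount check without opening a shell in each pod, here is a minimal sketch. It assumes the pod manifest was exported with `oc get pod <pod> -o json` and only looks for the /opt/dynatrace/oneagent-paas mount path mentioned above. The function name and the two sample manifests are hypothetical and trimmed for illustration, not taken from a real cluster.

```python
from typing import List

# Mount path as observed in this thread after a rollout restart.
ONEAGENT_MOUNT_PATH = "/opt/dynatrace/oneagent-paas"


def containers_missing_oneagent_mount(pod: dict) -> List[str]:
    """Return the names of app containers that have no OneAgent volume mount."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        mounts = container.get("volumeMounts", [])
        if not any(m.get("mountPath") == ONEAGENT_MOUNT_PATH for m in mounts):
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical, trimmed manifests (illustration only).
pod_before_restart = {
    "spec": {"containers": [
        {"name": "app", "volumeMounts": [{"name": "config", "mountPath": "/config"}]},
    ]}
}
pod_after_restart = {
    "spec": {"containers": [
        {"name": "app", "volumeMounts": [
            {"name": "config", "mountPath": "/config"},
            {"name": "oneagent-share", "mountPath": ONEAGENT_MOUNT_PATH},
        ]},
    ]}
}

if __name__ == "__main__":
    print("before:", containers_missing_oneagent_mount(pod_before_restart))
    print("after:", containers_missing_oneagent_mount(pod_after_restart))
```

With real data, the dictionary would come from `json.load` on the exported manifest. A mount in the pod spec only shows that the pod was set up for injection; it does not prove that the OneAgent libraries were actually loaded into the process.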

What makes this interesting

This is not a one-time occurrence.

We can reproduce it after every full cluster reboot:

Shut down the entire cluster.
Start the cluster again.
Workloads start successfully.
Dynatrace reports multiple processes as Restart required.
Manual pod restart fixes application workloads.

Additional observation

Some processes occasionally disappear from Host → Processes view while remaining visible and active in Process Group view. Opening the host through the process relationship sometimes makes the process visible again.

Question

Has anyone seen similar behavior with Cloud Native Full Stack after a complete OpenShift cluster restart?

Is it expected that workloads may start before the Dynatrace CSI driver/webhook are fully ready, requiring a restart to receive Deep Monitoring?

Are there any recommended practices to ensure workloads are instrumented automatically after cluster recovery without requiring manual rollout restarts?

Thanks!

Regards, Deni

Dynatrace Integration Engineer at CodeAttest

Julius_Loman · ‎09 Jun 2026

This is not a standard situation and should not happen. Dynatrace uses a priorityClass so that its components are started first.

I'd recommend either opening a support case or checking your Dynatrace component logs and the pod events for any Dynatrace startup issues. One possibility is that downloading the Dynatrace images takes too long, so the pods do not wait for it and start without Dynatrace. But this is just my assumption, and it needs to be diagnosed in your environment.
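
As a rough sketch of the pod events check, the snippet below filters Warning events out of an event list exported with `oc get events -n <namespace> -o json`. The namespace, pod names and messages in the sample are hypothetical; `type`, `reason`, `message`, `lastTimestamp` and `involvedObject` are standard Kubernetes Event fields.

```python
from typing import List, Tuple


def warning_events(event_list: dict) -> List[Tuple[str, str, str, str]]:
    """Return (timestamp, object name, reason, message) for every Warning event."""
    rows = []
    for ev in event_list.get("items", []):
        if ev.get("type") != "Warning":
            continue
        rows.append((
            ev.get("lastTimestamp") or "",
            ev.get("involvedObject", {}).get("name", ""),
            ev.get("reason", ""),
            ev.get("message", ""),
        ))
    # ISO-8601 timestamps in the same format sort correctly as strings.
    return sorted(rows)


# Hypothetical sample, shaped like the output of `oc get events -o json`.
sample_events = {
    "items": [
        {"type": "Normal", "reason": "Pulled", "lastTimestamp": "2026-06-09T08:04:00Z",
         "involvedObject": {"name": "csi-driver-pod"}, "message": "image pulled"},
        {"type": "Warning", "reason": "FailedMount", "lastTimestamp": "2026-06-09T08:03:00Z",
         "involvedObject": {"name": "demo-app-pod"}, "message": "volume mount failed"},
        {"type": "Warning", "reason": "BackOff", "lastTimestamp": "2026-06-09T08:02:00Z",
         "involvedObject": {"name": "webhook-pod"}, "message": "back-off restarting container"},
    ]
}

if __name__ == "__main__":
    for row in warning_events(sample_events):
        print(" | ".join(row))
```

Events are kept only for a limited time by default, so they would need to be collected soon after the cluster comes back up.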

Dynatrace Ambassador | Alanata a.s., Slovakia, Dynatrace Master Partner

deni · ‎09 Jun 2026

@Julius_Loman

A bit more context from my side:

This is my own bare-metal OpenShift lab cluster, which I use to learn and test Dynatrace features.

The applications are demo workloads and the traffic is synthetic/test traffic.

The entire environment was built from scratch by me, including the OpenShift setup, storage configuration, networking, Dynatrace deployment, application development and deployment, supporting services, and so on. Because of that, it is entirely possible that I have introduced a configuration issue somewhere rather than encountering an actual Dynatrace product problem.

My goal is not only to make the monitoring work, but also to better understand:

how Cloud Native Full Stack injection works,
the startup dependencies between Operator, Webhook, CSI Driver and workloads,
where to look when instrumentation does not happen as expected,
which logs and components are most useful during troubleshooting.

Given the behavior I'm seeing after a full cluster reboot, could you suggest what evidence you would collect first?

For example:

Which Dynatrace component logs would you inspect first?
Are there specific webhook, CSI or Operator messages that indicate failed or missed injection?
Is there a way to verify whether a pod started before Dynatrace injection became available?
Are there any OpenShift events or Dynatrace diagnostics that would help prove or disprove a startup ordering issue?
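
To make the startup ordering question concrete, here is a minimal sketch of one way the comparison could be done. It assumes pod JSON exported with `oc get pod <pod> -o json` and compares each workload container's `state.running.startedAt` with the time a Dynatrace component pod (for example the CSI driver) last became Ready. The function names, pod contents and timestamps are hypothetical; the status fields are standard Kubernetes ones. This gives an ordering hint only, not proof of a missed injection.

```python
from datetime import datetime, timezone
from typing import List, Optional


def parse_k8s_time(ts: str) -> datetime:
    """Parse a Kubernetes timestamp such as 2026-06-09T08:05:30Z (UTC)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def ready_since(pod: dict) -> Optional[datetime]:
    """Return when the pod last became Ready, or None if it is not Ready."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready" and cond.get("status") == "True":
            return parse_k8s_time(cond["lastTransitionTime"])
    return None


def containers_started_before(workload_pod: dict, cutoff: datetime) -> List[str]:
    """Return names of running containers that started before the cutoff."""
    early = []
    for cs in workload_pod.get("status", {}).get("containerStatuses", []):
        running = cs.get("state", {}).get("running")
        if running and parse_k8s_time(running["startedAt"]) < cutoff:
            early.append(cs["name"])
    return early


# Hypothetical, trimmed pod status documents (illustration only).
csi_pod = {"status": {"conditions": [
    {"type": "Ready", "status": "True", "lastTransitionTime": "2026-06-09T08:05:30Z"},
]}}
workload_pod = {"status": {"containerStatuses": [
    {"name": "app", "state": {"running": {"startedAt": "2026-06-09T08:03:10Z"}}},
    {"name": "sidecar", "state": {"running": {"startedAt": "2026-06-09T08:06:00Z"}}},
]}}

if __name__ == "__main__":
    cutoff = ready_since(csi_pod)
    if cutoff is not None:
        print("started before Dynatrace component was Ready:",
              containers_started_before(workload_pod, cutoff))
```

The same comparison could be repeated against the webhook and Operator pods to see which component, if any, became Ready after the workload containers had already started.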

I'm mainly trying to learn the correct troubleshooting approach and understand what "good" versus "bad" startup behavior should look like in a Cloud Native Full Stack environment.

Thanks!

Regards, Deni

Dynatrace Integration Engineer at CodeAttest