<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>article [Kubernetes Deployments] Troubleshooting CrashLoopBackOff errors in Troubleshooting</title>
    <link>https://community.dynatrace.com/t5/Troubleshooting/Kubernetes-Deployments-Troubleshooting-CrashLoopBackOff-errors/ta-p/263804</link>
    <description>This article helps with root cause analysis of failed OneAgent deployments or updates on Kubernetes/OpenShift and explains the CrashLoopBackOff state.</description>
    <pubDate>Wed, 25 Jun 2025 07:54:46 GMT</pubDate>
    <dc:creator>stefanie_pachne</dc:creator>
    <dc:date>2025-06-25T07:54:46Z</dc:date>
    <item>
      <title>[Kubernetes Deployments] Troubleshooting CrashLoopBackOff errors</title>
      <link>https://community.dynatrace.com/t5/Troubleshooting/Kubernetes-Deployments-Troubleshooting-CrashLoopBackOff-errors/ta-p/263804</link>
      <description>&lt;H1&gt;&lt;STRONG&gt;Self Service Summary&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;This article of type Full-Self-Service helps with root cause analysis of failed OneAgent deployments or updates on Kubernetes/OpenShift and explains the&amp;nbsp;CrashLoopBackOff state.&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Issue&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Solution&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Tasks&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Alternative&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;OneAgent installation/update: Pod failed to start with CrashLoopBackOff&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Check logs for root cause and address on your side&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Follow the steps outlined below&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Contact Dynatrace Customer Success and Support via chat or ticket&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;H1&gt;&lt;STRONG&gt;Introduction&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;&lt;STRONG&gt;CrashLoopBackOff&lt;/STRONG&gt;&amp;nbsp;is a&amp;nbsp;&lt;STRONG&gt;Kubernetes state&lt;/STRONG&gt;&amp;nbsp;representing a&amp;nbsp;&lt;STRONG&gt;restart loop&lt;/STRONG&gt;&amp;nbsp;in a Pod: a container in the Pod starts, but&amp;nbsp;&lt;STRONG&gt;crashes and is then restarted&lt;/STRONG&gt;, over and over again. Kubernetes waits an increasing back-off time between restarts, which gives you a chance to fix the error. As such, CrashLoopBackOff is not an error in itself, but indicates that an underlying error prevents the Pod from starting properly.&lt;/P&gt;
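&lt;P&gt;To make the back-off concrete: according to the Kubernetes documentation, restarts are delayed exponentially (10 seconds, 20 seconds, 40 seconds, and so on), capped by default at 300 seconds. The Python sketch below only illustrates that schedule; the function name and parameters are illustrative assumptions, not part of any Kubernetes or Dynatrace API.&lt;/P&gt;
&lt;PRE&gt;
```python
def backoff_delays(restarts, base_seconds=10, cap_seconds=300):
    """Return the approximate wait before each of the first `restarts` restarts."""
    # The delay doubles after every crash until it reaches the cap.
    return [min(base_seconds * 2 ** attempt, cap_seconds) for attempt in range(restarts)]


if __name__ == "__main__":
    print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```
&lt;/PRE&gt;
&lt;P&gt;A Pod that has been crashing for a while is therefore retried only about every five minutes, so the effect of a fix may take a few minutes to become visible.&lt;/P&gt;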
&lt;H1&gt;Preserve the crime scene&lt;/H1&gt;
&lt;P&gt;First, gather the logs for troubleshooting and future reference. Even if the problem disappears after a restart or re-installation, you may still want to know the root cause and what happened.&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Collect the &lt;A href="https://docs.dynatrace.com/docs/shortlink/installation-k8s-troubleshooting#support-archive" target="_self"&gt;operator support archive&lt;/A&gt;. The archive contains the logs of the Dynatrace pods, including the Operator and, if applicable, the OneAgent pods.&lt;/LI&gt;
&lt;LI&gt;Additionally, run&amp;nbsp;&lt;CODE&gt;kubectl describe pod -n dynatrace [pod-name]&lt;/CODE&gt; for the Pod in CrashLoopBackOff and check the Events section of the output for what happened.&lt;/LI&gt;
&lt;/OL&gt;
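&lt;P&gt;The data collection above can be scripted. The Python sketch below only builds the kubectl command lines and prints them; the pod name is a hypothetical placeholder, and the namespace dynatrace is assumed from the command above. The --previous flag of kubectl logs returns the log of the previous, crashed container instance, which is often the relevant one in a restart loop.&lt;/P&gt;
&lt;PRE&gt;
```python
import shlex


def evidence_commands(pod_name, namespace="dynatrace"):
    """Build kubectl commands that capture the state of a crash-looping Pod."""
    return [
        # Pod status, restart count, last container state, and events.
        ["kubectl", "describe", "pod", pod_name, "-n", namespace],
        # Log of the currently running (or most recently started) container.
        ["kubectl", "logs", pod_name, "-n", namespace],
        # Log of the container instance that crashed before the last restart.
        ["kubectl", "logs", pod_name, "-n", namespace, "--previous"],
    ]


if __name__ == "__main__":
    # "example-oneagent-pod" is a placeholder; use the name shown by kubectl get pods.
    for command in evidence_commands("example-oneagent-pod"):
        print(shlex.join(command))
```
&lt;/PRE&gt;
&lt;P&gt;Run the printed commands in a shell and save their output to files so the evidence survives a restart or re-installation.&lt;/P&gt;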
&lt;H1&gt;&lt;STRONG&gt;Root cause analysis&lt;/STRONG&gt;&lt;/H1&gt;
&lt;H2&gt;&lt;STRONG&gt;Unsupported Kubernetes distribution&lt;/STRONG&gt;&lt;/H2&gt;
&lt;P&gt;This first scenario covers unsupported &lt;STRONG&gt;Kubernetes distributions&lt;/STRONG&gt;. Verify that you are using a supported Kubernetes environment:&amp;nbsp;&lt;A href="https://docs.dynatrace.com/docs/ingest-from/technology-support#kubernetes" target="_blank" rel="noopener"&gt;https://docs.dynatrace.com/docs/ingest-from/technology-support#kubernetes&lt;/A&gt;.&lt;/P&gt;
&lt;H2&gt;&lt;STRONG&gt;Permission denied&lt;/STRONG&gt;&lt;/H2&gt;
&lt;P&gt;Support sees this error with OneAgent on Kubernetes/OpenShift when another &lt;STRONG&gt;security tool prevents&lt;/STRONG&gt; the &lt;STRONG&gt;execution of&lt;/STRONG&gt; the &lt;STRONG&gt;oneagentwatchdog&lt;/STRONG&gt; process. The OneAgent log then contains this entry:&lt;/P&gt;
&lt;PRE&gt;Failed to execute /opt/dynatrace/oneagent/agent/lib64/oneagentwatchdog: error code: 13 (Permission denied)&lt;/PRE&gt;
&lt;P&gt;Verify that oneagentwatchdog has the execute permission (&lt;CODE&gt;ls -l &amp;lt;filename&amp;gt;&lt;/CODE&gt;&amp;nbsp;or&amp;nbsp;&lt;CODE&gt;stat &amp;lt;filename&amp;gt;&lt;/CODE&gt;&amp;nbsp;shows the full permissions).&lt;/P&gt;
&lt;P&gt;If the execute bit is set, check the following logs to identify and fix the root cause:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;journalctl&lt;/LI&gt;
&lt;LI&gt;kubelet&lt;/LI&gt;
&lt;LI&gt;SELinux or AppArmor if applicable&lt;/LI&gt;
&lt;LI&gt;other security tools installed on the host.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;&lt;STRONG&gt;Unsupported downgrade&lt;/STRONG&gt;&lt;/H2&gt;
&lt;P&gt;The Pod can also fail when you attempt a downgrade &lt;STRONG&gt;from a higher to a lower OneAgent version&lt;/STRONG&gt;, which is not supported. The logs then show these errors:&lt;/P&gt;
&lt;PRE&gt;[ERROR] Downgrading OneAgent is not supported, please uninstall the old version first&lt;BR /&gt;[ERROR] Attempted downgrade from 1.&amp;lt;...&amp;gt; to 1.&amp;lt;...&amp;gt;&lt;/PRE&gt;
&lt;P&gt;To resolve this, uninstall the existing OneAgent first, then install the desired version.&lt;/P&gt;
&lt;H2&gt;Network connection errors&lt;/H2&gt;
&lt;P&gt;If a log shows&amp;nbsp;&lt;CODE&gt;Failed to pull images&lt;/CODE&gt;&amp;nbsp;with&amp;nbsp;&lt;CODE&gt;&amp;lt;IP&amp;gt;: i/o timeout&lt;/CODE&gt;, make sure that the mentioned IP address is reachable. Possible causes include network &lt;STRONG&gt;outages&lt;/STRONG&gt;, unavailable nodes, or &lt;STRONG&gt;firewall misconfigurations&lt;/STRONG&gt;.&lt;/P&gt;
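&lt;P&gt;The log messages quoted in the three sections above can be matched automatically. The following is a minimal Python sketch, assuming the pod log is available as text; the function name and the dictionary are illustrative, and the substrings are taken from the log excerpts in this article rather than from an official list of error messages.&lt;/P&gt;
&lt;PRE&gt;
```python
# Substrings from the log excerpts in this article, mapped to the section that covers them.
KNOWN_SIGNATURES = {
    "error code: 13 (Permission denied)": "Permission denied",
    "Downgrading OneAgent is not supported": "Unsupported downgrade",
    "i/o timeout": "Network connection errors",
}


def classify_log(log_text):
    """Return the article sections whose log signature appears in log_text."""
    return [
        section
        for signature, section in KNOWN_SIGNATURES.items()
        if signature in log_text
    ]


if __name__ == "__main__":
    sample = "[ERROR] Downgrading OneAgent is not supported, please uninstall the old version first"
    print(classify_log(sample))  # ['Unsupported downgrade']
```
&lt;/PRE&gt;
&lt;P&gt;An empty result means that none of these known signatures matched, so continue with the remaining sections.&lt;/P&gt;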
&lt;H2&gt;Configuration errors&lt;/H2&gt;
&lt;P&gt;Incorrect &lt;STRONG&gt;variables or settings&lt;/STRONG&gt; can also cause the &lt;STRONG&gt;failure&lt;/STRONG&gt;. Review all configuration files and environment variables, and make sure the certificates are valid. You can also try starting from the default configuration.&lt;/P&gt;
&lt;H2&gt;Resource issues&lt;/H2&gt;
&lt;P&gt;Another possibility is that the Pod runs into resource issues and &lt;STRONG&gt;requests more CPU or memory than available&lt;/STRONG&gt;, leading to crashes. Check the resource allocation with&amp;nbsp;&lt;CODE&gt;kubectl describe pod [pod-name]&lt;/CODE&gt;&amp;nbsp;and adjust the resource requests and limits as needed. A container that was terminated for exceeding its memory limit is reported with the reason &lt;CODE&gt;OOMKilled&lt;/CODE&gt;.&lt;/P&gt;
&lt;H2&gt;Application errors&lt;/H2&gt;
&lt;P&gt;Most OneAgent installation issues on Kubernetes can be resolved by customers themselves. Edge cases may exist where our code causes the application to fail. In that case, contact Dynatrace Support and provide the output of&amp;nbsp;&lt;CODE&gt;kubectl logs [pod-name]&lt;/CODE&gt;&amp;nbsp;together with the information gathered earlier. You can also try disabling OneAgent features to isolate which feature might be causing the issue.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jun 2025 07:54:46 GMT</pubDate>
      <guid>https://community.dynatrace.com/t5/Troubleshooting/Kubernetes-Deployments-Troubleshooting-CrashLoopBackOff-errors/ta-p/263804</guid>
      <dc:creator>stefanie_pachne</dc:creator>
      <dc:date>2025-06-25T07:54:46Z</dc:date>
    </item>
  </channel>
</rss>

