
Dynatrace OneAgent misses the logs of a Kubernetes pod during a crash/termination/restart

amr_1509
Visitor

Has anyone encountered issues with Kubernetes pods losing shutdown logs during crashes, restarts, or terminations?

We’ve noticed that during these events, the shutdown logs are generated but quickly deleted along with the pod's log files. As a result, the Dynatrace OneAgent is unable to capture all those logs before they’re removed.

If you’ve experienced this before and found a solution, I’d greatly appreciate your insights!

3 REPLIES

gopher
Mentor

Hi @amr_1509, this also occurs at startup, since injection isn't ready yet.

The only way I've found to combat this is to stop relying on the agent to capture the logs and instead get a feed from Fluent Bit or an equivalent. This gives you full logs, including any init container logs that might be critical to the investigation.
Stream Kubernetes logs with Fluent Bit — Dynatrace Docs
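
For anyone who wants a feel for what that setup involves, here is a rough sketch of a Fluent Bit pipeline (classic config format), assuming Fluent Bit runs as a DaemonSet on each node with `/var/log` mounted. It is not the configuration from the linked doc. The host placeholder, the `DT_API_TOKEN` variable name, and the DB path are assumptions for illustration, and the token is assumed to carry the `logs.ingest` scope.

```
# Tail the container log files that the runtime writes on the node.
# Fluent Bit reads them from outside the pod, so it does not depend on
# anything being injected or running inside the application container.
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    multiline.parser  docker, cri
    # Assumed path: stores file offsets so a Fluent Bit restart resumes where it left off.
    DB                /var/log/flb_kube.db
    Read_from_Head    true

# Enrich each record with pod, namespace, and container metadata.
[FILTER]
    Name       kubernetes
    Match      kube.*
    Merge_Log  On

# Forward the records to the Dynatrace log ingest endpoint over HTTPS.
# The host is a placeholder and DT_API_TOKEN is an assumed environment variable.
[OUTPUT]
    Name              http
    Match             kube.*
    Host              <your-environment-id>.live.dynatrace.com
    Port              443
    URI               /api/v2/logs/ingest
    Header            Authorization Api-Token ${DT_API_TOKEN}
    Format            json
    Json_date_key     timestamp
    Json_date_format  iso8601
    tls               On
    tls.verify        On
```

Because the tail input follows the files on the node, it would normally pick up init container output and the last lines a container writes before it exits. Check the linked doc for the exact settings Dynatrace recommends.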

amr_1509
Visitor

Thank you @gopher! Yes, that's absolutely right: Fluent Bit is capable of capturing all the logs. We use Fluent Bit to read the logs for all of our ECS services. We have 250+ Kubernetes services running with OneAgent scraping the logs, and it's currently not practical to switch everything over to Fluent Bit.
I was wondering if there is any workaround to fix this with oneagent itself.

Unfortunately, not that I've come across with OneAgent (maybe someone else has). This is because of where and when it loads: it exists only inside the pods and lives from injection until it receives the terminate signal as part of pod shutdown. The same applies during startup: the agent has to be connected and running before logs can be sent.

Fluent Bit can handle that number of services without issue.

