27 Nov 2024 08:16 PM
Has anyone encountered issues with Kubernetes pods losing shutdown logs during crashes, restarts, or terminations?
We’ve noticed that during these events, the shutdown logs are generated but quickly deleted along with the pod's log files. As a result, the Dynatrace OneAgent is unable to capture all those logs before they’re removed.
If you’ve experienced this before and found a solution, I’d greatly appreciate your insights!
27 Nov 2024 10:37 PM
Hi @amr_1509, this occurs at startup as well, since injection isn't ready yet.
The only way I've found to combat this is to stop relying on the agent to capture the logs and instead get a feed from Fluent Bit or an equivalent. This gives you the full logs, including any init container logs that might be critical to the investigation.
Stream Kubernetes logs with Fluent Bit — Dynatrace Docs
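For anyone curious what such a feed boils down to: a forwarder like Fluent Bit ends up POSTing JSON records to the Dynatrace log ingest endpoint (`/api/v2/logs/ingest`) with an API token. Below is a rough Python sketch of that request, not a Fluent Bit config. The environment URL, token, pod name, and namespace are placeholders I made up, and the attribute keys are only illustrative. The request is built but never sent.

```python
import json
import urllib.request


def build_ingest_request(base_url, api_token, records):
    """Build (but do not send) a POST to the Dynatrace log ingest endpoint."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/logs/ingest",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": "Api-Token " + api_token,
        },
    )


# Hypothetical shutdown log line from a terminating pod (all values made up).
records = [{
    "content": "SIGTERM received, draining connections",
    "severity": "info",
    "k8s.pod.name": "checkout-7d9f",
    "k8s.namespace.name": "shop",
}]

# Placeholder environment URL and token; replace with real values before sending.
req = build_ingest_request("https://example.invalid", "dt0c01.EXAMPLE", records)
print(req.get_method(), req.full_url)
```

Because the forwarder runs outside the application pod and ships lines as they are written, the shutdown lines do not depend on the in-pod agent still being alive when the pod is torn down.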
28 Nov 2024 01:05 AM
Thank you @gopher! Yes, that's absolutely right: Fluent Bit is capable of capturing all the logs. We already use Fluent Bit to read the logs for all of our ECS services. However, we have 250+ Kubernetes services running with OneAgent scraping the logs, and it's currently not practical to switch everything over to Fluent Bit.
I was wondering if there is any workaround to fix this with OneAgent itself.
28 Nov 2024 04:10 AM
Unfortunately, not that I've come across with OneAgent (maybe someone else has). This is because of where and when it loads: it only exists in the pods and lives from injection until it receives the terminate signal as part of pod shutdown. The same applies during startup: the agent has to be connected and running before logs can be sent.
Fluent Bit is able to handle that number of services without issue.