19 Nov 2020 03:38 PM
We have instrumented an OpenShift cluster with Dynatrace OneAgent in our environment, and one of the services running in a container hit an OOM error on Metaspace. As a result, the process was terminated and restarted automatically.
We do see events related to the process restart, but there is NO event related to the OOM error, even though we could clearly see it in the application log file. It was really difficult for us to figure out why the process was restarted, as Dynatrace was NOT providing any details.
We have added the following events to be captured in the Kubernetes settings in Dynatrace. Is there anything else that we should add to capture the OOM errors? Please advise.
involvedObject.kind=Node
type=Warning
involvedObject.kind=Pod
reason=BackOff
Thanks,
Ganesh
01 Dec 2020 02:23 PM
Sometimes this can be trial and error. I'd make sure that, within your Kubernetes settings, you have defined the Events Field Selectors, as this will guarantee you'll capture the data. Once the data is captured, you can always create a custom event for alerting as well.
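As a rough illustration of what a field selector does, here is a minimal Python sketch that checks sample events against the selectors listed in the question. It assumes each selector line is evaluated independently and that events are plain dictionaries; the helper names (`parse_selector`, `get_field`, `matches`) and the sample events are hypothetical, and this is not how Dynatrace or the Kubernetes API server implements filtering.

```python
def parse_selector(expr):
    """Parse a selector such as 'a=b,c!=d' into (field, op, value) terms."""
    terms = []
    for part in expr.split(","):
        part = part.strip()
        if "!=" in part:
            field, value = part.split("!=", 1)
            op = "!="
        elif "==" in part:
            field, value = part.split("==", 1)
            op = "="
        else:
            field, value = part.split("=", 1)
            op = "="
        terms.append((field.strip(), op, value.strip()))
    return terms


def get_field(event, path):
    """Follow a dotted path like 'involvedObject.kind' through nested dicts."""
    current = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(event, expr):
    """Return True only if every term of the selector holds for the event."""
    for field, op, value in parse_selector(expr):
        actual = get_field(event, field)
        if op == "=" and actual != value:
            return False
        if op == "!=" and actual == value:
            return False
    return True


# Selectors copied from the question (assumed to be evaluated one by one).
SELECTORS = [
    "involvedObject.kind=Node",
    "type=Warning",
    "involvedObject.kind=Pod",
    "reason=BackOff",
]

# Hypothetical sample events, made up for this sketch.
pod_backoff = {
    "type": "Warning",
    "reason": "BackOff",
    "involvedObject": {"kind": "Pod", "name": "my-service-1"},
}
pod_started = {
    "type": "Normal",
    "reason": "Started",
    "involvedObject": {"kind": "Pod", "name": "my-service-1"},
}

if __name__ == "__main__":
    for name, event in [("pod_backoff", pod_backoff), ("pod_started", pod_started)]:
        hits = [s for s in SELECTORS if matches(event, s)]
        print(name, "matched by:", hits)
```

A selector can only match events that the cluster actually emits. If the OOM error is raised inside the JVM and only written to the application log, no selector on Kubernetes events may be able to pick it up.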
05 Jan 2022 04:11 PM
Hi Chad,
Could you please elaborate on what needs to be done or configured to create custom events from the captured Kubernetes events?
Thanks. Tibebe
05 Jan 2022 07:57 PM
@tibebe_m_digafe you need to create a log event to get alerted on a particular Kubernetes event type.
06 Jan 2022 10:55 PM
Thanks @Julius_Loman