I have a question about two different but very similar built-in Kubernetes metrics.
At the moment, all the "builtin:cloud.kubernetes.*" metrics in my environment have values and I can use them. As far as monitoring the Kubernetes cluster goes, everything is there and working properly.
However, I have noticed the "builtin:kubernetes.*" metrics, which have no values; nothing has ever been written to them. I don't really understand the difference between these two sets of metrics, and I didn't find much explanation in the documentation. The Kubernetes hyperlink takes me to the "builtin:cloud.kubernetes" metrics, but the "builtin:kubernetes" ones are also listed, and I couldn't figure out why only one kind holds values. Does anyone know why?
I have observed this as well. The builtin:kubernetes metrics are also empty on my side. However, I do not have any cloud integration (AWS, Azure, GCP) in my environments, only on-prem OpenShift and OKD integrations. These builtin:kubernetes metrics are listed in the documentation, but I could not find any other information about them: Built-in metrics | Dynatrace Docs
When I checked them, I found a unique one, the first in the list: builtin:kubernetes.container.oom_kills. It does not exist under builtin:cloud.kubernetes.
Maybe I am wrong, but I once saw a perf clinic video with Henrik Rexed (at 32:10).
He used an OOM metric on his dashboard, but it could also come from a log.kubernetes event. Maybe he has more info about these mysterious kubernetes metrics.
We are introducing (and deprecating) multiple metrics in the Kubernetes area with the upcoming versions of our product. The metrics starting with builtin:kubernetes are a first sign of this. I've just published a community post with lots of details on this here.
Happy to answer any questions on that over there 🙂
Thanks for the clarification. It's clear now. Will this new metric set be the basis of the upcoming separate Kubernetes anomaly detection settings?
Hi Mizső - it's planned for later this year and has very high prio internally - so I'd hope this plan holds 🙂 There's no fixed GA version yet.
Hi @MariusNicolae - I can understand. Sorry for the inconvenience - we really tried hard to come up with some auto-migration, but with the new formats/dimensions, we could not find a reliable way of doing so.
On the positive side: Did you check out the "metric audit tool" mentioned in my other post? This hopefully helps you a lot with this task - if not, feedback is appreciated 🙂
I gave the tool a try. The dashboard part is right, but the alerting part doesn't seem to be working, at least not in my environment. I have set up a few custom events for alerting using these metrics, and they did not show up in the report.
I have a few alerts that all use the same metric but with different tags, and none of them got picked up by the tool.
Is the alerts part not about the custom alerts that are defined?
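For anyone who wants to cross-check the report by hand, here is a minimal sketch in Python of what I would expect such an audit to do: scan a list of alert definitions and flag the ones whose metric key starts with builtin:cloud.kubernetes. The field names ("name", "metricId") and the sample data are assumptions made up for illustration. This is not the metric audit tool's actual logic, and it does not call any Dynatrace API; you would need to adapt it to however you export your own alert definitions.

```python
# Prefix of the metric family discussed in this thread as being replaced.
OLD_PREFIX = "builtin:cloud.kubernetes."


def find_legacy_alerts(alerts, prefix=OLD_PREFIX):
    """Return (alert name, metric key) pairs for alerts using a metric under `prefix`.

    `alerts` is a list of dicts. The keys "name" and "metricId" are
    hypothetical field names chosen for this sketch.
    """
    hits = []
    for alert in alerts:
        metric = alert.get("metricId", "")
        if metric.startswith(prefix):
            hits.append((alert.get("name", "<unnamed>"), metric))
    return hits


# Made-up sample data: several alerts on the same metric with different tags,
# plus alerts that should not be flagged.
SAMPLE_ALERTS = [
    {"name": "Cluster not ready (prod)", "metricId": "builtin:cloud.kubernetes.cluster.readyz"},
    {"name": "Cluster not ready (test)", "metricId": "builtin:cloud.kubernetes.cluster.readyz"},
    {"name": "Node condition", "metricId": "builtin:cloud.kubernetes.node.conditions"},
    {"name": "OOM kills", "metricId": "builtin:kubernetes.container.oom_kills"},
    {"name": "Host CPU", "metricId": "builtin:host.cpu.usage"},
]

if __name__ == "__main__":
    for name, metric in find_legacy_alerts(SAMPLE_ALERTS):
        print(f"{name}: {metric}")
```

Because the check only looks at the metric key, alerts that share a metric but differ in tags are each reported separately, which is what I expected to see in the report.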
Hi @MariusNicolae ,
As you mentioned, this should work for custom alerts as shown in your screenshot. However, the builtin:cloud.kubernetes.cluster.readyz and builtin:cloud.kubernetes.node.conditions metrics will likely be updated with AG 1.249 (engineering is currently looking into that), which is why both of them are not yet mentioned here. Consequently, neither of them is part of the tool yet. Could you please try again with a custom alert based on any of the metrics mentioned here?