24 Sep 2025 10:40 AM - last edited on 25 Sep 2025 07:51 AM by MaciejNeumann
How can we get uptime for k8s cluster (backend) services?
We have set up our SLO calculation as the number of error-free requests divided by the total number of requests received.
However, when the service is completely down, we are worried that the outage will affect the actual SLA without showing up in the calculation, because no requests arrive and the ratio does not change.
Hence, we would like to know which metric to use for k8s backend service availability or uptime monitoring.
Kindly advise!
Thanks,
Nava
24 Sep 2025 12:16 PM
Hi Nava,
You are right — if you calculate SLO only as error-free requests / total requests, then a complete outage with zero traffic won’t change the ratio, even though the service is down. To cover this gap you should add a time-based availability metric from Kubernetes.
A simple approach is to use the ratio of available replicas vs desired replicas for each deployment:
( k8s.deployment.available:splitBy() / k8s.deployment.desired:splitBy() ) * 100
When all desired replicas are running, the value = 100%.
If some pods are not available, the percentage drops.
If the whole deployment is down (available = 0, desired > 0), the metric = 0%.
You can use this ratio either in a dashboard tile or directly as the numerator/denominator in a metric-based SLO. That way you combine your request-based SLO with a replica-based uptime SLO, and you report the SLA as met only when both conditions are satisfied.
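To make the gap concrete, here is a minimal Python sketch of the two calculations side by side. It is only an illustration under assumptions: the function names and the sample values are made up, and the (available, desired) pairs stand in for whatever replica metrics your environment exposes.

```python
from typing import Optional, Sequence, Tuple


def request_slo(good: int, total: int) -> Optional[float]:
    """Request-based SLO in percent; undefined (None) when there is no traffic."""
    if total == 0:
        return None  # a full outage with zero requests gives no signal here
    return 100.0 * good / total


def replica_availability(available: int, desired: int) -> Optional[float]:
    """Available replicas as a percentage of desired replicas."""
    if desired <= 0:
        return None  # nothing is expected to run, so the ratio is undefined
    return 100.0 * min(available, desired) / desired


def uptime_percent(
    samples: Sequence[Tuple[int, int]], threshold: float = 100.0
) -> Optional[float]:
    """Share of sampled intervals whose replica availability met the threshold."""
    ratios = [replica_availability(a, d) for a, d in samples]
    scored = [r for r in ratios if r is not None]
    if not scored:
        return None
    up = sum(1 for r in scored if r >= threshold)
    return 100.0 * up / len(scored)


# Hypothetical (available, desired) samples, one per interval.
samples = [(3, 3), (3, 3), (0, 3), (2, 3)]
print(request_slo(0, 0))        # no traffic during the outage
print(uptime_percent(samples))  # time-based view of the same period
```

The request-based function has nothing to report for the outage interval, while the replica-based one counts it as downtime. The threshold parameter is a design choice: lower it if a partially degraded deployment should still count as "up".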
25 Sep 2025 05:45 AM
We are on Dynatrace Managed and I am not able to see any metrics related to k8s.deployment.available or k8s.deployment.desired.
Let me know if I am missing something here!
25 Sep 2025 01:16 PM
Unfortunately, I don’t have a way to test this myself, but you can try the following approach.
Metric keys (Classic):
builtin:kubernetes.pods — count of pods (filter by phase “Running”)
builtin:kubernetes.workload.pods_desired — desired pods per workload
Metric selector (Advanced mode), grouped by cluster/namespace/workload:
(
builtin:kubernetes.pods
:filter(eq(k8s.pod.phase,"Running"))
:splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name")
:sum
)
/
(
builtin:kubernetes.workload.pods_desired
:splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name")
:avg
)
*100
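If you want to check the selector outside the UI, the sketch below builds a request for the Metrics API v2 query endpoint (/api/v2/metrics/query) in Python. Treat it as untested against a real environment: the base URL, the token value, and the function name build_query are placeholders I made up, and the selector is copied from above without verifying the dimension names.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Selector from the post above, joined into one string.
SPLIT = ':splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name")'
SELECTOR = (
    "(builtin:kubernetes.pods"
    ':filter(eq(k8s.pod.phase,"Running"))'
    + SPLIT
    + ":sum)"
    "/"
    "(builtin:kubernetes.workload.pods_desired"
    + SPLIT
    + ":avg)"
    "*100"
)


def build_query(
    base_url: str,
    token: str,
    selector: str = SELECTOR,
    time_from: str = "now-2h",
    resolution: str = "5m",
) -> Request:
    """Build (but do not send) a GET request for the metrics query endpoint."""
    params = urlencode(
        {"metricSelector": selector, "from": time_from, "resolution": resolution}
    )
    url = f"{base_url.rstrip('/')}/api/v2/metrics/query?{params}"
    headers = {"Authorization": f"Api-Token {token}", "Accept": "application/json"}
    return Request(url, headers=headers)


# Placeholder values: on Managed the base URL usually looks like
# https://<your-domain>/e/<your-environment-id> (assumption, check your setup).
req = build_query("https://dynatrace.example.com/e/ENV_ID", "dt0c01.EXAMPLE")
print(req.get_full_url())
```

To actually run the query, pass req to urllib.request.urlopen and parse the JSON body. The token needs permission to read metrics.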
28 Sep 2025 06:17 AM
Many thanks.
The above approach helped!