Solved: Re: Kubernetes service uptime

nsethurama · ‎24 Sep 2025

How can we get uptime for k8s cluster (backend) services?

We have set up our SLO calculations as the number of error-free requests divided by the total number of requests received.

However, we are worried that when the service is completely down, the outage will affect the actual SLA but will not show up in this calculation, because no requests are received during that time.

Hence, we would like to know which metric to use for k8s backend service availability or uptime monitoring.

Kindly advise!

Thanks,

Nava

t_pawlak · ‎24 Sep 2025

Hi Nava,

You are right: if you calculate the SLO only as error-free requests / total requests, then a complete outage with zero traffic won’t change the ratio, even though the service is down. To cover this gap, you should add a time-based availability metric from Kubernetes.
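
To make the gap concrete, here is a minimal Python sketch. The per-minute request counts and the function name `request_slo` are made up for illustration and are not Dynatrace output. Minutes with no traffic add nothing to either sum, so an outage leaves the ratio untouched.

```python
def request_slo(windows):
    """Percentage of error-free requests across (total, errors) windows."""
    total = sum(t for t, _ in windows)
    errors = sum(e for _, e in windows)
    if total == 0:
        return None  # no traffic at all: the ratio is undefined
    return 100.0 * (total - errors) / total


# Hypothetical data: one hour of healthy traffic, one window per minute.
healthy = [(1000, 1)] * 60
# Same hour followed by 30 minutes of full outage with zero requests.
with_outage = healthy + [(0, 0)] * 30

# Both calls print the same value: the outage is invisible to this SLO.
print(request_slo(healthy))
print(request_slo(with_outage))
```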

A simple approach is to use the ratio of available replicas vs desired replicas for each deployment:

( k8s.deployment.available:splitBy() / k8s.deployment.desired:splitBy() ) * 100

When all desired replicas are running, the value = 100%.
If some pods are not available, the percentage drops.
If the whole deployment is down (available = 0, desired > 0), the metric = 0%.
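
The three cases above can be sketched in Python as follows. The function name `replica_availability` is hypothetical. Capping at 100% and returning `None` when nothing is desired are assumptions of this sketch, not Dynatrace behaviour.

```python
def replica_availability(available, desired):
    """Available replicas as a percentage of desired replicas."""
    if desired <= 0:
        # Assumption: a deployment scaled to zero counts as "no data", not downtime.
        return None
    # Assumption: cap at 100% in case available briefly exceeds desired.
    return 100.0 * min(available, desired) / desired


print(replica_availability(3, 3))  # all desired replicas running
print(replica_availability(2, 4))  # some pods unavailable
print(replica_availability(0, 3))  # whole deployment down
```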

You can use this ratio either in a dashboard tile or directly as the numerator/denominator in a metric-based SLO. That way you combine your request-based SLO with a replica-based uptime SLO, and you can report SLA only when both conditions are satisfied.
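
As a rough sketch of that combination, the Python below turns a series of replica-ratio samples into a time-based uptime percentage and reports the SLA as met only when both SLOs reach the target. The function names, the 99.9 target, and the 100% "good sample" threshold are illustrative assumptions, not values from Dynatrace.

```python
def uptime_percent(ratio_samples, threshold=100.0):
    """Share of samples (e.g. one per minute) where the replica ratio meets the threshold."""
    if not ratio_samples:
        return None
    good = sum(1 for r in ratio_samples if r >= threshold)
    return 100.0 * good / len(ratio_samples)


def sla_met(request_slo, uptime, target=99.9):
    """True only when both the request-based and the replica-based SLO meet the target."""
    if request_slo is None or uptime is None:
        return False
    return request_slo >= target and uptime >= target


# Hypothetical samples: 60 healthy minutes, then 30 minutes with no available replicas.
samples = [100.0] * 60 + [0.0] * 30
uptime = uptime_percent(samples)

# The request-based SLO alone (99.9) would pass; the combined check does not.
print(uptime, sla_met(99.9, uptime))
```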

nsethurama · ‎25 Sep 2025

We are on Dynatrace Managed and I am not able to see any metrics related to

 k8s.deployment.available

or

k8s.deployment.desired

Let me know if I am missing something here!

t_pawlak · ‎25 Sep 2025

Unfortunately, I don’t have a way to test this myself, but you can try the following approach.

Metric keys (Classic):

builtin:kubernetes.pods — count of pods (filter by phase “Running”)
builtin:kubernetes.workload.pods_desired — desired pods per workload

Metric selector (Advanced mode), grouped by cluster/namespace/workload:

(
  builtin:kubernetes.pods
    :filter(eq(k8s.pod.phase,"Running"))
    :splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name")
    :sum
)
/
(
  builtin:kubernetes.workload.pods_desired
    :splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name")
    :avg
)
*100
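
If you want to pull the same ratio programmatically, one option is the Metrics API v2 query endpoint (`/api/v2/metrics/query`), which accepts a `metricSelector` parameter. The Python sketch below only builds the request URL and sends nothing. The base URL, environment ID, and token are placeholders, and the selector is the same untested expression as above written on one line.

```python
from urllib.parse import urlencode

# Same selector as above, collapsed to a single line (untested, as noted).
SELECTOR = (
    '(builtin:kubernetes.pods'
    ':filter(eq(k8s.pod.phase,"Running"))'
    ':splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name"):sum)'
    '/(builtin:kubernetes.workload.pods_desired'
    ':splitBy("k8s.cluster.name","k8s.namespace.name","k8s.workload.name"):avg)'
    '*100'
)


def build_query_url(base_url, selector, time_from="now-2h", resolution="1m"):
    """Build a Metrics API v2 query URL; the selector is URL-encoded."""
    params = urlencode(
        {"metricSelector": selector, "from": time_from, "resolution": resolution}
    )
    return f"{base_url.rstrip('/')}/api/v2/metrics/query?{params}"


# Placeholder Managed environment URL and token: replace with your own.
url = build_query_url("https://dynatrace.example.com/e/ENVIRONMENT_ID", SELECTOR)
headers = {"Authorization": "Api-Token <your-token>"}
print(url)
```

The resulting `url` and `headers` can be passed to any HTTP client; the returned data points could then feed a time-based uptime calculation.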

nsethurama · ‎28 Sep 2025

Many thanks.

The above approach helped!