Container platforms
Questions about Kubernetes, OpenShift, Docker, and more.

Error % and Request Count for Kubernetes workload

Vikas_g1997
Dynatrace Guide

Hi Team ,

Can anyone please guide me on how to get the error percentage and request count for a workload?
We have tried the metric below, but so far we are only able to get the error codes from it.
Any guidance on how to calculate error % and request count would be greatly appreciated.

Vikas_g1997_0-1764065438188.png

Expected:-

Vikas_g1997_1-1764065481688.png

 

5 REPLIES

dannemca
DynaMight Guru

Try this DQL and let us know

timeseries by:{k8s.workload.name}, {failure_count = sum(dt.service.request.failure_count), total_count = sum(dt.service.request.count)}
| filter isNotNull(k8s.workload.name)
| fieldsAdd total_requests = arraySum(total_count), total_failure = arraySum(failure_count)
| fieldsAdd failure_rate = 100.0 * total_failure / total_requests
| fields k8s.workload.name, total_requests, failure_rate
| sort failure_rate desc

Regards

Site Reliability Engineer @ Kyndryl

Hi @dannemca ,
Thank you for your support so far. Could you please help me understand how we can calculate failures based on 4xx errors? Additionally, is there a way to filter these errors by a specific namespace?
I really appreciate your guidance on this.

@Vikas_g1997 Try the query below:

timeseries {failure_count = sum(dt.service.request.failure_count), total_count = sum(dt.service.request.count)}, 
by: { k8s.workload.name, k8s.namespace.name }, 
filter: { http.response.status_code >= 400 AND http.response.status_code <= 499 }
| filter isNotNull(k8s.workload.name)
| fieldsAdd total_requests = arraySum(total_count), total_failure = arraySum(failure_count)
| fieldsAdd failure_rate = 100.0 * total_failure / total_requests
| fields k8s.namespace.name, k8s.workload.name, total_requests, failure_rate
Phani Devulapalli

Hi @p_devulapalli ,

Thanks for your suggestion. We are currently using the following query to get the details:

timeseries by:{k8s.workload.name, dt.entity.cloud_application_namespace}, {failure_count = sum(dt.service.request.failure_count), total_count = sum(dt.service.request.count), fiveXX_count = sum(dt.service.request.count, scalar: true, filter: {http.response.status_code >= 500 and http.response.status_code <= 599}, default: 0)}
| filter isNotNull(k8s.workload.name) AND matchesValue(entityAttr(dt.entity.cloud_application_namespace, "entity.name"), "transformer")
| fieldsAdd total_requests = arraySum(total_count), total_failure = arraySum(failure_count)
| fieldsAdd failure_rate_5xx = 100.0 * fiveXX_count / total_requests
| fields k8s.workload.name, total_requests, failure_rate_5xx
| sort failure_rate_5xx desc
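
For the 4xx question raised earlier in the thread, the same per-aggregation filter can be pointed at the 400-499 range while the total stays unfiltered, so the percentage is taken over all requests. This is only a sketch built from the constructs already used above; it assumes that k8s.namespace.name is available as a dimension on dt.service.request.count and that the namespace is literally named "transformer". The names count_4xx and rate_4xx are made up for illustration.

```dql
// Sketch only: 4xx rate per workload in one namespace.
// Assumption: k8s.namespace.name is a dimension of this metric.
timeseries {
    total_count = sum(dt.service.request.count, scalar: true),
    // Count only the requests answered with a 4xx status code.
    count_4xx = sum(dt.service.request.count, scalar: true,
        filter: { http.response.status_code >= 400 and http.response.status_code <= 499 },
        default: 0)
  },
  by: { k8s.namespace.name, k8s.workload.name }
| filter isNotNull(k8s.workload.name)
// Assumption: "transformer" is the namespace name, taken from the filter above.
| filter k8s.namespace.name == "transformer"
| fieldsAdd rate_4xx = 100.0 * count_4xx / total_count
| fields k8s.namespace.name, k8s.workload.name, total_count, rate_4xx
| sort rate_4xx desc
```

Because both aggregations use scalar: true, each one is a single number for the selected timeframe, so no arraySum step is needed before the division.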

Additionally, could you please suggest how we can retrieve requests/API details for a workload?
Currently, we can get key request endpoints, but we are looking for a way to include:

  • Workload
  • API/Request name
  • Status code
  • Request count

Any guidance on achieving this would be greatly appreciated.

Vikas_g1997_0-1764134193845.png

 

One way to achieve this is to use the field 'endpoint.name'. Note that the API/request must be marked as a Key Request; otherwise, the value will be 'NON_KEY_REQUESTS'.
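
A minimal sketch of the 'endpoint.name' approach, assuming that dt.service.request.count can be split by endpoint.name and http.response.status_code together (the thread filters on the status code but does not confirm it as a split dimension). The column name request_count is chosen only for illustration.

```dql
// Sketch only: request count per workload, endpoint and status code.
// Endpoints that are not Key Requests are expected to show up as NON_KEY_REQUESTS.
timeseries request_count = sum(dt.service.request.count, scalar: true),
  by: { k8s.workload.name, endpoint.name, http.response.status_code }
| filter isNotNull(k8s.workload.name)
| fields k8s.workload.name, endpoint.name, http.response.status_code, request_count
| sort request_count desc
```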

Another way to achieve this is to use spans (fetch spans). In my tests, however, I was not able to display the HTTP status code. Also keep in mind that span data may be sampled, so it may not cover 100% of your transactions.
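
A rough sketch of the spans approach, for anyone who wants to try it. It assumes that the spans carry k8s.workload.name, endpoint.name and http.response.status_code; as noted above, the status code did not show up in testing, so that column may come back empty. Because spans may be sampled, the counts should be read as approximate.

```dql
// Sketch only: count spans per workload, endpoint and status code.
// Assumption: these three fields are populated on the spans in your environment.
fetch spans
| filter isNotNull(k8s.workload.name) and isNotNull(endpoint.name)
| summarize request_count = count(), by: { k8s.workload.name, endpoint.name, http.response.status_code }
| sort request_count desc
```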

Site Reliability Engineer @ Kyndryl
