16 Apr 2026 04:50 PM - last edited on 17 Apr 2026 07:35 AM by MaciejNeumann
Hello,
Is there a way I can see service failure rate, but split by host? We have the same services on multiple different hosts and want to see all services on all hosts and their failure rate.
Maybe a DQL query?
For example:
| Host | Service | Failure Rate |
|---|---|---|
| Server111 | Service111 | |
| Server111 | Service222 | |
| Server222 | Service111 | |
| Server222 | Service222 | |
16 Apr 2026 05:36 PM
You can see that directly in the Services app on the Failures tab. Playground link.
You can adjust the grouping as necessary. You can also grab the DQL from there.
17 Apr 2026 09:09 AM
I've had a look at this, but when I go to find "Host" under Group by, it doesn't appear.
17 Apr 2026 08:25 AM
Hi,
try this:
fetch spans, from:now()-2h, to:now()
| filter request.is_root_span == true
| filter isNotNull(dt.entity.host) and isNotNull(dt.entity.service)
| summarize
failed = countIf(request.is_failed == true),
total = count(),
by: { dt.entity.host, dt.entity.service }
| fieldsAdd
host = entityName(dt.entity.host),
service = entityName(dt.entity.service),
failure_rate = 100.0 * failed / if(total == 0, 1, else: total)
| fields host, service, failure_rate
| sort failure_rate desc

You should receive this:
17 Apr 2026 09:20 AM
Hi,
Thank you for this. It has worked, but is there a way to convert this to a timeseries, please? I have tried but I get no records coming back.
Thanks
17 Apr 2026 10:04 AM
Yes. Because your working query starts from fetch spans, you should use makeTimeseries, not timeseries.
timeseries is the starting command for querying metrics, while makeTimeseries is the command used after fetching records such as spans. Dynatrace also notes that for spans, makeTimeseries uses the start_time field automatically.
try this:
fetch spans, from:now()-2h, to:now()
| filter request.is_root_span == true
| filter isNotNull(dt.entity.host) and isNotNull(dt.entity.service)
| makeTimeseries {
failed = countIf(request.is_failed == true),
total = count()
},
by:{dt.entity.host, dt.entity.service},
interval: 5m
| fieldsAdd host = entityName(dt.entity.host),
service = entityName(dt.entity.service),
failure_rate = if(total[] == 0, 0.0, else: 100.0 * failed[] / total[])
| fields timeframe, interval, host, service, failure_rate
17 Apr 2026 10:41 AM - edited 17 Apr 2026 10:51 AM
Hi,
Thank you for this, it has worked and I can put it on a graph. Is there a way to reduce the cost by doing this differently? At the minute it is scanning over 100 GB and will need to run quite often.
I tried this query, but it doesn't give me all services on all hosts like yours does.
timeseries {
    operand1 = sum(dt.service.request.failure_count),
    operand2 = sum(dt.service.request.count)
}, by: { dt.entity.service }, nonempty: true
| fieldsAdd failureRate_timeseries = operand1[] / operand2[] * 100
| fieldsRemove operand1, operand2
| fieldsAdd metricName = "Failure rate"
| fieldsAdd service_name = entityName(dt.entity.service)
| lookup [fetch dt.entity.host
    | fields hostid = id, hostname = entity.name, serviceid = runs[dt.entity.service]
    | expand serviceid], lookupField: serviceid, sourceField: dt.entity.service
| fieldsRemove dt.entity.service, metricName, lookup.hostid, lookup.serviceid
17 Apr 2026 10:50 AM
Hmmmm,
IMO, the best way to reduce cost is to stop querying raw spans repeatedly and instead extract a metric from the spans, then query that metric afterward.
Right now this query works, but it scans a lot of trace data each time. Dynatrace's own recommendation for this kind of use case is OpenPipeline metric extraction from spans. That gives you a metric you can later query with a cheap timeseries command instead of repeatedly scanning large amounts of span data.
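Once such a metric exists, the follow-up query becomes a plain metric read. A minimal sketch, assuming you configured OpenPipeline to extract two hypothetical metrics named span.request.failed_count and span.request.total_count with dt.entity.host and dt.entity.service kept as dimensions (the metric keys and dimensions are whatever you define in your extraction rule, not fixed names):

```
// query the pre-aggregated metrics instead of raw spans
timeseries {
    failed = sum(span.request.failed_count),
    total = sum(span.request.total_count)
  },
  by: { dt.entity.host, dt.entity.service },
  interval: 5m
| fieldsAdd host = entityName(dt.entity.host),
    service = entityName(dt.entity.service),
    failure_rate = if(total[] == 0, 0.0, else: 100.0 * failed[] / total[])
| fields timeframe, interval, host, service, failure_rate
```

Because this reads a pre-aggregated metric, the scanned volume no longer grows with your trace traffic, which is what drives the 100 GB scans today.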
17 Apr 2026 10:53 AM - edited 17 Apr 2026 10:56 AM
Perfect, thank you for your help on this. Much appreciated. Would you be able to show me how to do this via OpenPipeline, please?
17 Apr 2026 12:02 PM
Thank you for this, appreciate it.