Re: Pro-Tip: Advanced Queries for Grail Partitioning

martin_goodwell · ‎15 Apr 2026

As an extension to the Grail Partitioning Guide, this thread shows a collection of advanced analytics queries. These allow you to check if a partitioning strategy based on query behavior works for your environment.

Using the bucket filter is among the most powerful ways to optimize query performance. The following queries show, if some fields are used so frequently for filtering that you should use them for partitioning.

martin_goodwell · ‎15 Apr 2026

Step 1: Find the most-used filter fields

The following query returns the filter fields, how often they’re used, and which table they belong to. It looks complex at first, but it is generic. So, you can just copy/paste it into a notebook of your choice and run it. For those inclined, feel free to see what it does, so you can tweak it to your specific needs.

fetch dt.system.query_executions

//only apps where users write queries
| filter in(client.application_context, array("dynatrace.dashboards", "dynatrace.automations", "dynatrace.notebooks", "dynatrace.logs", "dynatrace.site.reliability.guardian"))
         OR client.internal_service_context == "dt.davis.anomaly-detector"

//filter only on spans & logs buckets
| filter in(table,array("logs"))
| fieldsAdd dt.system.table = table
| fieldsAdd query_string = lower(query_string)
| filter contains(query_string, "matchesvalue")
| parse `query_string`, """$mv = 'matchesvalue' [' ']* '(' LD:field ',';

DATA $mv"""
| parse `query_string`, """$mv = 'matchesvalue' [' ']* '(' LD:field ',';
$eq = LD:field <<([' ']{1,10}) [' ']* '==' [' ']* '"*"';
ARRAY{DATA ($mv|$eq) (PUNCT|SPACE|ALNUM)}{1,200}:arr"""

//array ops
| expand arr
| fieldsAdd filterfield = arr[field]
| filter isNotNull(filterfield)

//sanitize fieldnames
| fieldsAdd filterfield = if (matchesPattern(filterfield, """ alpha '(' ("`"|alpha) LD?"""), splitString(filterfield, "(")[1], else: filterfield)
| fieldsAdd filterfield = if(startsWith(filterfield, "`"), splitString(filterfield, "`")[1], else: filterfield)
| fieldsAdd filterfield = if(startsWith(filterfield, "("), splitString(filterfield, "(")[1], else: filterfield)

| summarize count = count(), by: {filterfield, dt.system.table}
| sort count desc

The k8s.container.name field is the most frequently used when querying logs. The next step shows how to further investigate these specific queries.

martin_goodwell · ‎15 Apr 2026

Step 2: Check the Variation of the top used fields

When investigating specific fields, we also need to account for null values. Not every log line will be related to a Kubernetes container, so we’ll consolidate these as ‘other’.

fetch logs, scanLimitGBytes:-1
| fieldsAdd k8s.container.name = if(isNull(k8s.container.name), "other", else: k8s.container.name)
| summarize {size = sum(stringLength(content)), records = count()}, by: { k8s.container.name}
| sort records desc

And you should not just check for the top result. Here’s the result of k8s.cluster.name, which was the second most used field.

fetch logs, scanLimitGBytes:-1
| fieldsAdd k8s.cluster.name = if(isNull(k8s.cluster.name), "other", else: k8s.cluster.name)
| summarize {size = sum(stringLength(content)), records = count()}, by: { k8s.cluster.name }
| sort records desc

Again, we consolidated all log queries unrelated to k8s.cluster.name under the ‘other’ label.

Of course, we’d like to know how we could further split ‘other’. Ideally, you’d use fields that are filled for as many entries as possible.

These are just indicators you can use to reason about what works best for your environments. For example, the ‘status’ field (containing the log level) is used very frequently for filtering. But in practice, it’s clear that you’d only filter by ‘status’ in combination with additional filters.

You know your environment better than anyone else. So, align the numbers with what you think makes sense and base your decision on both aspects.