cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

How to summarize multiple array entries, with the max value per array element

joostm
Frequent Guest

I have a dataset of 1000+ hosts, each with a fixed-sized array of 120+ numeric values. Most hosts have only one entry, but a few can have multlple rows. I want to summarize those, per host, with the maximum value of each element within the array. Which is the best (most economic) way to do this in DQL? 

Example extract:

host data
host_A (3,8,6)
host_A (5,0,7)
host_B (1,7,3)
host_B (4,6,2)
host_B (8,5,2)
host_C (5,3,9)
host_D (4,5,6)

 

 

Desired outcome:

host result
host_A (5,8,7)
host_B (8,7,3)
host_C (5,3,9)
host_D (4,5,6)

 

I was looking into 'dedup', summarize, and max() , takeMax, functions but didn't get any further than filtering on the entry with highest first element of the array, per host

1 REPLY 1

sinisa_zubic
Dynatrace Champion
Dynatrace Champion

Hi @joostm 

One way how you could achieve this is with summarize, please have a look: 

 

data record(content = """host_A	(3,8,6)
host_A	(5,0,7)
host_B	(1,7,3)
host_B	(4,6,2)
host_B	(8,5,2)
host_C	(5,3,9)
host_D	(4,5,6)""")
| parse content, """ARRAY{STRUCTURE{LD:host '	' LD:arr }:i (LF|EOS)}{1,}:arr
"""
| expand elem = arr 
| fields host = elem[host], arr = elem[arr]
| parse arr, """'(' ARRAY{LONG:i (','|')')}{1,}:data"""
| summarize result = arrayDistinct(arrayFlatten((collectArray(data)))), by: { host }

 

 

Best,
Sini

Featured Posts