10 Mar 2025
	
		
		10:44 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
 - last edited on 
    
	
		
		
		11 Mar 2025
	
		
		07:28 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
 by 
				
		
		
			MaciejNeumann
		
		
		
		
		
		
		
		
	
			
		
I have a dataset of 1000+ hosts, each with a fixed-sized array of 120+ numeric values. Most hosts have only one entry, but a few can have multlple rows. I want to summarize those, per host, with the maximum value of each element within the array. Which is the best (most economic) way to do this in DQL?
Example extract:
| host | data | 
| host_A | (3,8,6) | 
| host_A | (5,0,7) | 
| host_B | (1,7,3) | 
| host_B | (4,6,2) | 
| host_B | (8,5,2) | 
| host_C | (5,3,9) | 
| host_D | (4,5,6) | 
Desired outcome:
| host | result | 
| host_A | (5,8,7) | 
| host_B | (8,7,3) | 
| host_C | (5,3,9) | 
| host_D | (4,5,6) | 
I was looking into 'dedup', summarize, and max() , takeMax, functions but didn't get any further than filtering on the entry with highest first element of the array, per host
Solved! Go to Solution.
10 Mar 2025 12:11 PM
Hi @joostm
One way how you could achieve this is with summarize, please have a look:
data record(content = """host_A	(3,8,6)
host_A	(5,0,7)
host_B	(1,7,3)
host_B	(4,6,2)
host_B	(8,5,2)
host_C	(5,3,9)
host_D	(4,5,6)""")
| parse content, """ARRAY{STRUCTURE{LD:host '	' LD:arr }:i (LF|EOS)}{1,}:arr
"""
| expand elem = arr 
| fields host = elem[host], arr = elem[arr]
| parse arr, """'(' ARRAY{LONG:i (','|')')}{1,}:data"""
| summarize result = arrayDistinct(arrayFlatten((collectArray(data)))), by: { host }
Best,
Sini