Solved: Re: Metric Timeseries with Array Dimension

StrangerThing · 24 Oct 2024

I have a log metric that has a dimension of the array type. When I put this metric in a timeseries and split by this dimension, the split uses the entire array as a single value instead of each element. I tried using the expand command and then summarize to count the elements, but in doing so I lose my timeseries data. When I try to put the result back into a timeseries with the makeTimeseries command, I get an error about the time.

I know I can do what I want by querying the raw logs and using makeTimeseries, which was my workaround for this. But what I'd like to know (mostly because I'm curious) is whether there's a way to do this on the metric itself that I just couldn't figure out. Here are the queries.

Metric query that I'd like to get a timeseries of each element:

timeseries sum(log.metric), by:{dimension}
| expand dimension
| summarize count(), by:{dimension}

Log query I've used as a workaround:

fetch logs
| filter dt.system.bucket == "my_bucket"
| filter isNotNull(dimension)
| expand dimension
| makeTimeseries count(), by:{dimension}

krzysztof_hoja · 24 Oct 2024

When you do this:

| summarize count(), by:{dimension}

the only fields that survive such an operation are the ones listed in the by: parameter and those created by the expressions listed after summarize. The time information is gone at that point, so there is no way to put it back afterwards.
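To make that concrete, here is a small Python model of what a summarize with a by: clause leaves behind. This is only a sketch of the behaviour, not DQL and not how Dynatrace evaluates the query; the `rows` list and its time values are made up for illustration.

```python
from collections import Counter

# Hypothetical rows as they might look after `expand dimension`.
# Each row still carries a time field at this point.
rows = [
    {"timestamp": "10:00", "dimension": "a"},
    {"timestamp": "10:00", "dimension": "b"},
    {"timestamp": "10:01", "dimension": "b"},
]

# Model of `summarize count(), by:{dimension}`: the output contains only
# the grouping field and the aggregate, so the time field is dropped.
counts = Counter(row["dimension"] for row in rows)
summarized = [{"dimension": key, "count()": n} for key, n in sorted(counts.items())]

print(summarized)
```

None of the output records has a time field any more, which is why a makeTimeseries placed after the summarize has nothing to bucket on.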

I prepared some example data that creates a timeseries resembling what you have:

data 
  record(timestamp=now(), v=1, dimension=array("a","b")),
  record(timestamp=now()-1m, v=2, dimension=array("a","b")),
  
  record(timestamp=now(), v=3, dimension=array("b")),  
  record(timestamp=now()-1m, v=1, dimension=array("b")),
  
  record(timestamp=now(), v=4, dimension=array("c")),  
  record(timestamp=now()-1m, v=2, dimension=array("c")),
  record(timestamp=now()-2m, v=2, dimension=array("c")),

  record(timestamp=now(), v=1, dimension=array("c","b")),
  record(timestamp=now()-1m, v=2, dimension=array("c","b")),
  record(timestamp=now()-2m, v=2, dimension=array("c","b"))

| makeTimeseries {log.metric=sum(v)}, by:{dimension}, from: now()-5m, interval: 1m

As I understand it, the goal is to count over time how many times each element of the array dimension was present in the data. This query calculates that:

data 
  record(timestamp=now(), v=1, dimension=array("a","b")),
  record(timestamp=now()-1m, v=2, dimension=array("a","b")),
  
  record(timestamp=now(), v=3, dimension=array("b")),  
  record(timestamp=now()-1m, v=1, dimension=array("b")),
  
  record(timestamp=now(), v=4, dimension=array("c")),  
  record(timestamp=now()-1m, v=2, dimension=array("c")),
  record(timestamp=now()-2m, v=2, dimension=array("c")),

  record(timestamp=now(), v=1, dimension=array("c","b")),
  record(timestamp=now()-1m, v=2, dimension=array("c","b")),
  record(timestamp=now()-2m, v=2, dimension=array("c","b"))

| makeTimeseries {log.metric=sum(v), timestamp=start()}, by:{dimension}, from: now()-5m, interval: 1m

| expand dimension
| fieldsAdd d=record(timestamp=timestamp[], log.metric=log.metric[])
| expand d
| filterOut isNull(d[log.metric])

| makeTimeseries count(), by: {dimension}, time:d[timestamp], from:now()-5m, interval: 1m

To have the time available, we need the start() function, which adds an array of timestamps to the timeseries so that each datapoint can be associated with its timestamp. I assumed that you do not want to count datapoints where the metric value was null (no data present).
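For anyone who wants to see what the expand / record / expand steps do to the data, here is a rough Python model of the query above. It is not DQL and not how Dynatrace evaluates it. The `series` list is a hand-written stand-in for the output of the first makeTimeseries on the sample data, using minute offsets relative to now instead of real timestamps, and the function name is made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the first makeTimeseries result: one record per
# distinct array value, with parallel per-bucket arrays.
# Timestamps are minute offsets relative to "now"; None means no data.
series = [
    {"dimension": ["a", "b"], "timestamp": [-2, -1, 0], "log.metric": [None, 2, 1]},
    {"dimension": ["b"],      "timestamp": [-2, -1, 0], "log.metric": [None, 1, 3]},
    {"dimension": ["c"],      "timestamp": [-2, -1, 0], "log.metric": [2, 2, 4]},
    {"dimension": ["c", "b"], "timestamp": [-2, -1, 0], "log.metric": [2, 2, 1]},
]

def count_elements_per_bucket(series):
    """Count non-null datapoints per (array element, time bucket)."""
    counts = Counter()
    for rec in series:
        for element in rec["dimension"]:                   # expand dimension
            # d = record(timestamp=timestamp[], log.metric=log.metric[])
            pairs = zip(rec["timestamp"], rec["log.metric"])
            for ts, value in pairs:                        # expand d
                if value is None:                          # filterOut isNull(d[log.metric])
                    continue
                counts[(element, ts)] += 1                 # makeTimeseries count(), by:{dimension}
    return counts

counts = count_elements_per_bucket(series)
for (element, ts), n in sorted(counts.items()):
    print(element, ts, n)
```

On the sample data, element b gets a count of 3 in the newest bucket because it appears in three of the four arrays, and a count of 1 two minutes back, where only the ("c","b") series has data. Note that this counts one per series datapoint, so on a real metric it counts series with data in a bucket rather than individual log records.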

Result looks like this:

I hope it helps

Kris

StrangerThing · 25 Oct 2024

This is exactly what I wanted. Thanks so much for the help!