24 Oct 2024 02:43 PM
I have a log metric with a dimension of the array type. When I chart this metric as a timeseries and split by that dimension, each series is keyed by the entire array rather than by each element. I tried using the expand command followed by summarize to count the elements, but doing so loses the timeseries data. When I then try to turn the result back into a timeseries with the makeTimeseries command, I get an error about the time.
I know I can get what I want by querying the raw logs and using makeTimeseries, which is my current workaround. But what I'd like to know (mostly out of curiosity) is whether there's a way to do this starting from the metric that I just couldn't figure out. Here are the queries.
Metric query for which I'd like a timeseries per element:
timeseries sum(log.metric), by:{dimension}
| expand dimension
| summarize count(), by:{dimension}
Log query I've used as a workaround:
fetch logs
| filter dt.system.bucket == "my_bucket"
| filter isNotNull(dimension)
| expand dimension
| makeTimeseries count(), by:{dimension}
24 Oct 2024 09:07 PM
When you do this:
| summarize count(), by:{dimension}
the only fields that survive the operation are the ones listed in the by: parameter and the ones created by the expressions listed after summarize. The time information of the series is dropped, so there is nothing left to build a timeseries from afterwards.
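To illustrate why nothing can be put back, here is a minimal plain-Python model of a grouped count. It is not DQL, and the function name `summarize_count` and the sample field values are hypothetical; it only mimics the behaviour described above, where the output rows carry the grouping field and the aggregate and nothing else.

```python
from collections import Counter

def summarize_count(records, by):
    """Model of `summarize count(), by:{field}`: one output row per group,
    holding only the grouping field and the aggregate."""
    counts = Counter(record[by] for record in records)
    return [{by: key, "count()": n} for key, n in counts.items()]

# Hypothetical rows as they might look after `expand dimension`:
# each row still carries its time information and value array.
rows = [
    {"dimension": "a", "timeframe": "last 5m", "interval": "1m", "log.metric": [2, 1]},
    {"dimension": "b", "timeframe": "last 5m", "interval": "1m", "log.metric": [2, 1]},
    {"dimension": "b", "timeframe": "last 5m", "interval": "1m", "log.metric": [1, 3]},
]

result = summarize_count(rows, "dimension")
# Every output row has exactly two fields; the time fields are gone.
```

Because the time-related fields are absent from `result`, any later step that needs a time has no input to work with.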
I prepared some example data that creates a timeseries resembling what you have:
data
record(timestamp=now(), v=1, dimension=array("a","b")),
record(timestamp=now()-1m, v=2, dimension=array("a","b")),
record(timestamp=now(), v=3, dimension=array("b")),
record(timestamp=now()-1m, v=1, dimension=array("b")),
record(timestamp=now(), v=4, dimension=array("c")),
record(timestamp=now()-1m, v=2, dimension=array("c")),
record(timestamp=now()-2m, v=2, dimension=array("c")),
record(timestamp=now(), v=1, dimension=array("c","b")),
record(timestamp=now()-1m, v=2, dimension=array("c","b")),
record(timestamp=now()-2m, v=2, dimension=array("c","b"))
| makeTimeseries {log.metric=sum(v)}, by:{dimension}, from: now()-5m, interval: 1m
As I understand it, the goal is to count over time how many times each element of the array dimension was present in the data. This query calculates that:
data
record(timestamp=now(), v=1, dimension=array("a","b")),
record(timestamp=now()-1m, v=2, dimension=array("a","b")),
record(timestamp=now(), v=3, dimension=array("b")),
record(timestamp=now()-1m, v=1, dimension=array("b")),
record(timestamp=now(), v=4, dimension=array("c")),
record(timestamp=now()-1m, v=2, dimension=array("c")),
record(timestamp=now()-2m, v=2, dimension=array("c")),
record(timestamp=now(), v=1, dimension=array("c","b")),
record(timestamp=now()-1m, v=2, dimension=array("c","b")),
record(timestamp=now()-2m, v=2, dimension=array("c","b"))
| makeTimeseries {log.metric=sum(v), timestamp=start()}, by:{dimension}, from: now()-5m, interval: 1m
| expand dimension
| fieldsAdd d=record(timestamp=timestamp[], log.metric=log.metric[])
| expand d
| filterOut isNull(d[log.metric])
| makeTimeseries count(), by: {dimension}, time:d[timestamp], from:now()-5m, interval: 1m
To have a time for each data point, we need the start() function, which adds the timestamps to the timeseries and associates each data point with its timestamp. I assumed that you do not want to count intervals where the metric value was null (no data present).
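The steps of the query above can be sketched in plain Python to make the data flow easier to follow. This is only a model, not DQL: the integer minute offsets standing in for timestamps, the exact bucket alignment, and the function name `count_elements_over_time` are assumptions made for illustration. The series values are derived from the example records in the query.

```python
from collections import Counter

# Assumed shape of the first makeTimeseries output: one row per distinct
# array, with parallel arrays for timestamps (minute offsets) and values.
buckets = [-4, -3, -2, -1, 0]
series = [
    {"dimension": ["a", "b"], "timestamp": buckets, "log.metric": [None, None, None, 2, 1]},
    {"dimension": ["b"],      "timestamp": buckets, "log.metric": [None, None, None, 1, 3]},
    {"dimension": ["c"],      "timestamp": buckets, "log.metric": [None, None, 2, 2, 4]},
    {"dimension": ["c", "b"], "timestamp": buckets, "log.metric": [None, None, 2, 2, 1]},
]

def count_elements_over_time(rows):
    counts = Counter()
    for row in rows:
        for element in row["dimension"]:                    # expand dimension
            # pair each timestamp with its value, like the record d
            for ts, value in zip(row["timestamp"], row["log.metric"]):  # expand d
                if value is None:                           # filterOut isNull(...)
                    continue
                counts[(element, ts)] += 1                  # count() per element and bucket
    return counts

counts = count_elements_over_time(series)
```

In this model, element "b" is counted three times in bucket -1 because three of the four arrays contain it and have data in that interval.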
Result looks like this:
I hope it helps
Kris
25 Oct 2024 04:16 PM
This is exactly what I wanted. Thanks so much for the help!