17 Sep 2024 08:20 PM
Hello Folks,
Need some help converting the following query into a timeseries, as I want to use it in Davis anomaly detection.
Is it possible?
fetch logs, from:now() - 4h
| filter contains(dt.process.name, "process_name")
| filter contains(content, "Error when user:")
| parse content, "JSON:json"
| parse json[log], "LD ': ' LD:user_id ' - ' LD ': ' BOOL:userid_generated LD ': ' BOOL:userid_is_not_NULL"
| fields timestamp, log = json[log], user_id, userid_generated, userid_is_not_NULL, user_type = json[`user-type`]
| summarize requestId = collectDistinct(user_id), by:{bin(timestamp,1m), log, userid_generated, userid_is_not_NULL, user_type}
Regards,
AK
17 Sep 2024 11:28 PM
Hi AK,
I would look at using the makeTimeseries command instead of summarize as documented here https://docs.dynatrace.com/docs/shortlink/aggregation-commands#makeTimeseries. The third example on that page would likely be the best fit for you. It would probably look something like:
fetch logs, from:now() - 4h
| filter contains(dt.process.name, "process_name")
| filter contains(content, "Error when user:")
| parse content, "JSON:json"
| parse json[log], "LD ': ' LD:user_id ' - ' LD ': ' BOOL:userid_generated LD ': ' BOOL:userid_is_not_NULL"
| fields timestamp, log = json[log], user_id, userid_generated, userid_is_not_NULL, user_type = json[`user-type`]
| makeTimeseries count(), by:{log, userid_generated, userid_is_not_NULL, user_type}
I would, however, recommend first turning it into a metric on ingest so that you aren't constantly querying logs for anomaly detection. This would also let you reuse the metric on dashboards or in notebooks more easily. Docs for that can be found here: https://docs.dynatrace.com/docs/platform/openpipeline/use-cases/tutorial-log-processing-pipeline
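If you do go the metric-on-ingest route, you would then query the extracted metric with the timeseries command instead of fetch logs. The metric key and dimensions below are just placeholders for whatever you configure in the pipeline rule, so treat this as a rough sketch:
// hypothetical metric key and dimensions from an OpenPipeline log metric extraction rule
timeseries error_count = sum(log.user_id_errors), by: { userid_generated, user_type }, from: now() - 4h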
18 Sep 2024 04:14 PM
@Fin, thank you for the query, it's working fine.
Is it possible to use collectDistinct to collect distinct values of user_id under makeTimeseries?
Regards,
AK
19 Sep 2024 12:40 AM
Good to hear!
If you want user_id to be one of the dimensions, then you would include it in the by:{} section. Is that what you're looking for?
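For example, the last line of the query above would become:
| makeTimeseries count(), by:{user_id, log, userid_generated, userid_is_not_NULL, user_type}
Just keep in mind this creates one series per user_id, so the cardinality can get high.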
19 Sep 2024 07:39 AM
No. If you look at the summarize command, we are collecting the distinct user_id values and all the other fields are under the "by" section. Can we do the same with makeTimeseries?
summarize requestId = collectDistinct(user_id), by:{bin(timestamp,1m), log, userid_generated, userid_is_not_NULL, user_type}
19 Sep 2024 03:27 PM
The makeTimeseries command produces arrays of numbers (it tracks how a metric value changes over time), so it will not work with collectDistinct, which produces an array of unique user_id values.
What would make sense is a metric for the number of unique user_id values over time. For that you should use the countDistinctExact or countDistinctApprox function.
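For example, the last makeTimeseries line from the query above would become something like:
| makeTimeseries unique_users = countDistinctApprox(user_id), by:{log, userid_generated, userid_is_not_NULL, user_type}
(unique_users is just an example name; use countDistinctExact instead if you need exact counts.)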
Kris