04 Nov 2025 05:08 PM
We had a requirement:
To alert when a certain event ID in the Application Log didn't occur in a time window of 2 hours.
Since metric events are limited to a violating window of at most 60 minutes, we talked internally and came up with the following process:
1. First, create a CUSTOM_INFO event through the pipeline for that log event, with a dt.davis.timeout of 120 minutes.
2. Then, create a Davis Anomaly Detector that looks at the number of ACTIVE events at any time and checks whether it is below 1. If it is, a problem is raised.
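For context, step 1 roughly corresponds to a Davis event-extraction rule in OpenPipeline that sets properties along these lines. This is only a hedged sketch; the exact matcher expression and property names should be checked against the pipeline configuration UI:

```
// OpenPipeline Davis event extraction (sketch; matcher and field names assumed)
matcher:          matchesPhrase(content, "Event")  // the Application Log line to watch for
event.type:       CUSTOM_INFO
event.name:       Event
dt.davis.timeout: 120                              // keep the event ACTIVE for 120 minutes
```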
Problem:
We're currently having trouble writing the DQL that turns a summarize (the count of all ACTIVE events at any given time) into a timeseries.
This is the DQL we currently managed to create:
fetch events
| filter event.status == "ACTIVE"
and event.type == "CUSTOM_INFO"
and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp,1m)}
| makeTimeseries test = avg(totalCount), by:{timestamp}
Appreciate your help! Thanks in advance.
João
05 Nov 2025 02:10 PM
Hi João 👋
You’re very close — your DQL is already correct in principle.
However, it’s not necessary to explicitly specify by:{timestamp} in the makeTimeseries function.
The time dimension is implicit: makeTimeseries automatically interprets the aggregated data over time (based on the timestamp field). So you can simplify your query as follows:
fetch events
| filter event.status == "ACTIVE"
and event.type == "CUSTOM_INFO"
and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp,1m)}
| makeTimeseries test = avg(totalCount)
This will produce the same result: the by:{timestamp} is redundant because the function already treats timestamp as the time dimension by default.
Best regards,
05 Nov 2025 02:15 PM
Hi Jean,
I used your query, with the following results:
The problem is still the same: this query shows when the events happened at a certain time, while what we want is the count of total ACTIVE events at any given time, which (in this case, with a timeframe of 2 hours) should be a continuous line at value 4. Is this possible?
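One way to get "how many events are active at each point in time" rather than "when did events arrive" is the spread parameter of makeTimeseries, which distributes each record over its whole active interval instead of its single timestamp. A hedged sketch, assuming the custom events are queryable from dt.davis.events and expose event.start / event.end (please verify the spread parameter against your tenant's DQL reference):

```
fetch dt.davis.events, from: -2h
| filter event.type == "CUSTOM_INFO"
    and matchesPhrase(event.name, "Event")
// spread each event across its active interval, then count overlaps per time bucket
| makeTimeseries activeCount = count(),
    spread: timeframe(from: event.start, to: event.end)
```

If all four events stay ACTIVE for the whole 2-hour window, this should render as a flat line at 4.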
05 Nov 2025 02:37 PM
Ah, I see — I might have misunderstood your goal earlier.
If what you want is to group and count the ACTIVE events over a 2-hour period, you can try something like this:
fetch events
| filter event.status == "ACTIVE" and event.type == "CUSTOM_INFO" and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp, 2h)}
| makeTimeseries test = avg(totalCount)
This should group your active events into 2-hour windows and show a more stable count over time.
Best,
Jean
05 Nov 2025 02:44 PM
Hey Jean,
Still not what we're trying to accomplish. That query still returns datapoints based on when the events arrived. We want to create an anomaly detector based on the number of active events matching that filter at any given time, completely disregarding when they arrived. But I do understand that the anomaly detector needs a timeframe to work, and that is the issue.
05 Nov 2025 02:49 PM
Yes, the Davis Anomaly Detector only works with a timeseries metric and a timeframe from 1 to 60 minutes.
Maybe you need to create a metric tailored to your needs and then create a detector that alerts when an issue occurs.
Best regards,
05 Nov 2025 02:53 PM
We didn't create a metric precisely because of that 1-to-60-minute timeframe limit. The solution we found is explained in steps 1 and 2 of the original post, and we're stuck on step 2.
06 Nov 2025 09:48 AM - edited 06 Nov 2025 09:54 AM
@Joao1 to solve this challenge, I would not use the Davis Anomaly Detector (which requires timeseries data and only works on the last hour of data), but a Site Reliability Guardian instead. Trigger it from a simple workflow, and raise a Davis problem-opening event in an OpenPipeline processing the Site Reliability Guardian results.
CC @PedroDeodato / @PedroSantos
06 Nov 2025 06:06 PM
You are truly a legend, @Julius_Loman !!
Will definitely try it out!
Thank you very much!
06 Nov 2025 01:43 PM
Hello Julius,
So, something like this?
Regards,
Joao
07 Nov 2025 10:29 AM
You can do that on the logs directly.
On events, I would not filter on ACTIVE status. Your initial requirement was:
To alert when a certain event ID in the Application Log didn't occur in a time window of 2 hours.
So if you create events from logs, you just need to query and summarize them. Only if you want to consume it as a Davis event, you might need to filter on event.start and not only on timestamp, because for non-active events the timestamp represents the time when the event changed its status (to closed, for example).
fetch dt.davis.events, from:-2h
| filter matchesPhrase(event.name, "Event")
| filter event.category=="CUSTOM_ALERT"
| filter event.start>now()-2h
| summarize count()