04 Nov 2025 05:08 PM
We had a requirement:
To alert when a certain event ID in the Application Log didn't occur in a time window of 2 hours.
Since metric events are limited to a violating window of 60 minutes, we talked internally and came up with the following process:
1. First, create a CUSTOM_INFO event through a pipeline that looks for that event, with a dt.davis.timeout of 120 minutes.
2. Create a Davis Anomaly Detector that looks at the number of ACTIVE events at any time and checks whether it drops below 1. If it does, we create the problem.
Problem:
We're currently having trouble writing the DQL needed to build a timeseries from a summarize (the count of all ACTIVE events at any point in time).
This is the DQL we currently managed to create:
fetch events
| filter event.status == "ACTIVE"
and event.type == "CUSTOM_INFO"
and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp,1m)}
| makeTimeseries test = avg(totalCount), by:{timestamp}
Appreciate your help! Thanks in advance.
João
05 Nov 2025 02:10 PM
Hi João 👋
You’re very close — your DQL is already correct in principle.
However, it’s not necessary to explicitly specify by:{timestamp} in the makeTimeseries function.
The time dimension is implicit: makeTimeseries automatically interprets the aggregated data over time (based on the timestamp field). So you can simplify your query as follows:
fetch events
| filter event.status == "ACTIVE"
and event.type == "CUSTOM_INFO"
and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp,1m)}
| makeTimeseries test = avg(totalCount)
This will produce the same result: the by:{timestamp} is redundant because the function already treats timestamp as the time dimension by default.
Best regards,
05 Nov 2025 02:15 PM
Hi Jean,
I used your query, with the following results:
The problem is still the same: this query shows when the events happened at a certain time. What we want is the count of the total ACTIVE events at any point in time, which (in this case, with a timeframe of 2 hours) should be a continuous line at value 4. Is this possible?
05 Nov 2025 02:37 PM
Ah, I see — I might have misunderstood your goal earlier.
If what you want is to group and count the ACTIVE events over a 2-hour period, you can try something like this:
fetch events
| filter event.status == "ACTIVE" and event.type == "CUSTOM_INFO" and matchesPhrase(event.name, "Event")
| summarize totalCount = count(), by:{timestamp=bin(timestamp, 2h)}
| makeTimeseries test = avg(totalCount)
This should group your active events into 2-hour windows and show a more stable count over time.
Best,
Jean
05 Nov 2025 02:44 PM
Hey Jean,
Still not what we're trying to accomplish. That query still returns datapoints based on when the events arrived. We want to create an anomaly detector based on the number of ACTIVE events matching that filter at any time, completely disregarding when they arrived. I do understand we need a timeframe for the anomaly detector to work, and that is the issue.
05 Nov 2025 02:49 PM
Yes, the Davis Anomaly Detector only works with a timeseries metric and a timeframe of 1 to 60 minutes.
Maybe you need to create a metric based on your needs and then create a detector to alert when an issue occurs.
Best regards,
05 Nov 2025 02:53 PM
We didn't create a metric precisely because of that 1-to-60-minute timeframe limit. The solution we found is explained in steps 1 and 2 of the original post, and we're stuck on step 2.
06 Nov 2025 09:48 AM - edited 06 Nov 2025 09:54 AM
@Joao1 to solve this challenge, I would not use the Davis Anomaly Detector (which requires timeseries data and works on the last hour of data only), but a Site Reliability Guardian instead. Trigger it from a simple workflow, and raise a Davis problem-opening event in an OpenPipeline pipeline that processes the Site Reliability Guardian results.
CC @PedroDeodato / @PedroSantos
06 Nov 2025 06:06 PM
You are truly a legend, @Julius_Loman !!
Will definitely try it out!
Thank you very much!
06 Nov 2025 01:43 PM
Hello Julius,
So, something like this?
Regards,
Joao
07 Nov 2025 10:29 AM
You can do that on the logs directly.
On events, I would not filter on ACTIVE status. Your initial requirement is:
To alert when a certain event ID in the Application Log didn't occur in a time window of 2 hours.
So if you create events from logs, you just need to query and summarize them. Only if you want to grab it as a Davis event might you need to filter on event.start and not just on timestamp, because for non-active events the timestamp represents the time the event changed its status (to closed, for example).
fetch dt.davis.events, from:-2h
| filter matchesPhrase(event.name, "Event")
| filter event.category=="CUSTOM_ALERT"
| filter event.start>now()-2h
| summarize count()
19 Nov 2025 09:37 AM
So, I assume the guardian can validate over a defined timeframe, in this case 2 hours, which means I don't strictly have to use events. Instead, I can query the logs directly as you said and then configure the guardian to alert when the count is below 1 over a 2-hour timeframe, correct?
19 Nov 2025 11:37 AM
@Joao1 Correct. Trigger the validation from a simple workflow like this, for example every hour, and set the SRG timeframe to 2 hours. Every hour, the SRG will then be called for the timeframe of the last two hours:
In the SRG you need to create the objective on logs like this:
fetch logs
| filter matchesPhrase(content, "this message should occur")
| summarize count = count()
And add a condition of your choice; for this use case, for example, fail when count is below 1 (i.e., the expected log line did not occur in the window):
19 Nov 2025 09:42 AM
Julius, my question now is whether the Guardian opens the problem by itself, or whether I need a workflow that looks at the guardian and opens a problem based on its validation?
19 Nov 2025 11:30 AM - edited 19 Nov 2025 11:39 AM
@Joao1 SRG does not open any Davis problems by itself. You can do that in OpenPipeline:
First create a bizevents OpenPipeline pipeline:
Add a DQL processing step matching:
event.type == "guardian.validation.objective"
with DQL:
parse `guardian.objective`, """JSON:objective"""
| fieldsAdd objective.name=objective[name]
, objective.objectiveType=objective[objectiveType]
, objective.dqlQuery=objective[dqlQuery]
, objective.comparisonOperator=objective[comparisonOperator]
, objective.target=objective[target]
, objective.warning=objective[warning]
, objective.status=objective[status]
, objective.value=objective[value]
, objective.fieldName=objective[fieldName]
| fieldsRemove objective
This will help you preprocess the objective validation results.
In the Davis tab of the pipeline, create a Davis event when conditions match; an example is below. Adapt the objective name and guardian ID:
event.provider == "dynatrace.site.reliability.guardian"
and event.type=="guardian.validation.objective"
and guardian.id=="vu9U3hXa3q0AAAABADFhcHA6ZHluYXRyYWNlLnNpdGUucmVsaWFiaWxpdHkuZ3VhcmRpYW46Z3VhcmRpYW5zAAZ0ZW5hbnQABnRlbmFudAAkYmZiMGQ3NjktNzQyMS0zOTdmLWIyYjItMDEzZmE5ZjI1NmYxvu9U3hXa3q0"
and objective.status=="fail"
and matchesPhrase(objective.name,"My Objective")
And provide the data for the Davis event you want (don't forget to add dt.davis.timeout to set the problem duration, for example).
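As a sketch, the Davis event data could look like the following (the event.name and event.description values are hypothetical placeholders; dt.davis.timeout is in minutes, matching the 120-minute timeout from the original requirement, and CUSTOM_ALERT is one of the available Davis event types):
event.type: CUSTOM_ALERT
event.name: Expected event missing
event.description: No matching log line occurred in the validated 2-hour timeframe
dt.davis.timeout: 120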
And finally create a route for bizevents for SRG:
SRG bizevents will then be processed by the pipeline you just created. You only need to create the pipeline once; afterwards you can add Davis events for other SRG-related objective validations (different SRGs, objectives, etc.).
19 Nov 2025 11:59 AM
Hey @Julius_Loman, I created everything as you said, but the problem is not being created.
I can see that the event is entering the pipeline as follows:
As you can see in that graph, the spikes are me executing the workflow to create the bizevent based on the guardian validation.
I have the parsing you described inside the pipeline, and then I created the following in Data Extraction:
event.type == "guardian.validation.finished"
and matchesPhrase(guardian.name, "test")
and matchesPhrase(validation.status, "fail")
In the properties, event.type is set to CUSTOM_ALERT.
I have these properties though:
Can I do that, or is that the reason the problem doesn't open?
Thanks in advance,
Joao
19 Nov 2025 12:17 PM
@Joao1, please see the validation results in a notebook:
fetch bizevents
| filter event.type == "guardian.validation.finished"
and check the data you have.
19 Nov 2025 12:32 PM
I executed the DQL query as requested; I have various entries, the most recent one being:
{
"timestamp": "2025-11-19T12:26:28.541000000Z",
"validation.triggered_by.user.id": "30683445-5db3-4298-ba95-ed1beb67c869",
"validation.id": "d7c1aba1-698e-4b5c-94d6-97ee25d1e47c",
"event.id": "450b6a24-acbf-48da-bd25-be573d191701",
"guardian.description": "This guardian is used to validate how many events occured in a timeframe of two hours. Event is there is no events in that timeframe we want to alert.",
"validation.from": "2025-11-19T10:26:28.149Z",
"validation.workflow.trigger_type": "Manual",
"dt.openpipeline.pipelines": [
"bizevents:default"
],
"validation.workflow.type": "STANDARD",
"guardian.id": "vu9U3hXa3q0AAAABADFhcHA6ZHluYXRyYWNlLnNpdGUucmVsaWFiaWxpdHkuZ3VhcmRpYW46Z3VhcmRpYW5zAAZ0ZW5hbnQABnRlbmFudAAkNjhhNzdhZWUtZmQwZi0zYzliLTg4YjktMDE4MGJjMGZjMjY3vu9U3hXa3q0",
"event.provider": "dynatrace.site.reliability.guardian",
"guardian.name": "Application Name (sss) - Event 114",
"validation.summary": "{\"pass\":0,\"warning\":0,\"fail\":1,\"error\":0,\"info\":0}",
"validation.errors": "[]",
"guardian.variables": "[]",
"dt.openpipeline.source": "/api/v2/bizevents/ingest",
"event.kind": "BIZ_EVENT",
"validation.workflow.execution.id": "d4133133-864c-447e-b6b2-100b58bd4bcc",
"validation.workflow.id": "648494f6-58c1-4b33-9017-1783cf7e0f79",
"validation.to": "2025-11-19T12:26:28.149Z",
"event.type": "guardian.validation.finished",
"validation.status": "fail"
}
This is the DQL matcher i have in the data extraction:
event.type == "guardian.validation.finished"
and it's not opening the problem.
event type:
19 Nov 2025 03:46 PM
Looks like the event did not go to a custom pipeline:
"dt.openpipeline.pipelines": [
"bizevents:default"
],
I don't see the objective fields (see the DQL processor in my pipeline).
19 Nov 2025 04:40 PM
Hey @Julius_Loman , it's fixed.
My dynamic route was wrongly configured: it had event.type == "guardian.validation.objective" instead of event.type == "guardian.validation.finished".
I fixed it and the problem opened. Thanks so much 🙂
19 Nov 2025 06:09 PM
@Joao1 I'd recommend using this filter to route every SRG bizevent into that pipeline, not just objectives:
event.provider == "dynatrace.site.reliability.guardian"
You could also have a separate pipeline per guardian, but that seems overengineered.