30 Jan 2025 10:24 AM
Hello all,
Already discussed in:
https://community.dynatrace.com/t5/Alerting/Usage-of-the-new-Davis-anomaly-App/td-p/268120
https://community.dynatrace.com/t5/Open-Q-A/Why-the-30-minute-wait-in-the-Default-alerting-profile/t...
Dynatrace waits a little while to combine events, and after this time a Problem is created. The "old" alerting profile has an option that tells Dynatrace to wait before a problem becomes an alert and wakes everybody up; if the problem is resolved within this period, no alert is sent. So there is a separation between problem evaluation and alerting!
(So a problem could be in a waiting state!)
In the new situation, the workflow that replaces the profile cannot hold off: it does not know whether it should wait.
Maybe it's time to add an extra problem attribute?
"event.status": "WAITING",
"event.status_transition": "CREATED",
"dt.davis.is_frequent_event": false,
"maintenance.is_under_maintenance": false
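To make the idea concrete, here is a minimal sketch of how a consumer could use such an attribute. Note that "WAITING" is only the value proposed above and does not exist today; the `should_notify` helper and the sample records are hypothetical, and "ACTIVE" is assumed here as the status a problem would move to once the wait is over.

```python
# Hypothetical sketch: "WAITING" is the proposed value, not an existing status.
def should_notify(event: dict) -> bool:
    """Return True only when the problem has left the proposed WAITING state."""
    return event.get("event.status") != "WAITING"


# Sample records using the field names from the post (values are made up).
waiting_problem = {
    "event.status": "WAITING",
    "event.status_transition": "CREATED",
    "dt.davis.is_frequent_event": False,
    "maintenance.is_under_maintenance": False,
}
# Assumption: the status changes to "ACTIVE" once the wait period has passed.
released_problem = dict(waiting_problem, **{"event.status": "ACTIVE"})

print(should_notify(waiting_problem))   # False
print(should_notify(released_problem))  # True
```

With an attribute like this, the workflow itself would not need to wait; it would simply skip problems that are still being held back.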
Who can comment on this?
1-The wait option is used for infra problems (by my customers)
2-For now ServiceNow will be handling this (we have a modified SN)
31 Jan 2025 04:35 AM
Heya @henk_stobbe ,
I'm not sure I fully understand the requirement, but workflows are meant to be automation utilities, and as they currently work no step can run for more than 120 seconds. So if we are talking about replacing alerting profiles with them, that is definitely not a good option in the current setup. The control you have set up in ServiceNow is really ideal, and kudos for that.
However, if you really want to control it in Dynatrace itself, what I would do is use a cron trigger instead of an event trigger or a problem trigger.
Let's say your delays are set this way:
Availability - 0 mins
Error - 0 mins
Monitoring Unavailable - 0 mins
Resource - 30 mins
Slowdown - 45 mins
We can have two workflows here. The first one is for all the immediate alerts, as you do not want any wait time on them.
The second one is for the other two, which have delays set.
The delays are 30 and 45 minutes, so we have a cron workflow that runs every 15 minutes (leaving out the math of how we decided on 15).
Now every 15 minutes this workflow runs and fetches the problems that correspond to these severities.
We already have a field for problem duration, so when the problem duration goes beyond 30 or 45 minutes respectively, we call the webhook to ingest that data into ServiceNow. However, there is another issue here. If the problem is generated at 11:23 and the workflow last ran at 11:15, the next run is at 11:30, where the problem duration is 7 minutes; at the run after that it is 7+15 = 22 minutes, and at the one after that 7+15+15 = 37 minutes. So for a 30-minute delay we raise the alert either 8 minutes early or 7 minutes late. It really depends on how efficiently we handle it 😃
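The timing above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an actual workflow action: the category keys are the labels from the list above (not necessarily the values Dynatrace uses in its problem records), `poll_interval` and `is_due` are hypothetical helpers, and taking the greatest common divisor of the delays is just one possible way to arrive at 15 minutes.

```python
from datetime import datetime, timedelta
from math import gcd

# Delays in minutes, copied from the list above (keys are illustrative labels).
DELAY_MINUTES = {
    "Availability": 0,
    "Error": 0,
    "Monitoring Unavailable": 0,
    "Resource": 30,
    "Slowdown": 45,
}


def poll_interval(delays: dict) -> int:
    """Largest interval in minutes that divides every non-zero delay."""
    interval = 0
    for minutes in delays.values():
        interval = gcd(interval, minutes)  # gcd(0, n) == n, so zeros are ignored
    return interval


def is_due(category: str, started: datetime, now: datetime) -> bool:
    """True once the problem has been open at least as long as its delay."""
    return now - started >= timedelta(minutes=DELAY_MINUTES[category])


# The example from the post: problem starts at 11:23, runs at 11:30, 11:45, 12:00.
started = datetime(2025, 1, 31, 11, 23)
step = timedelta(minutes=poll_interval(DELAY_MINUTES))
first_run = datetime(2025, 1, 31, 11, 30)

for i in range(3):
    run = first_run + i * step
    age = int((run - started).total_seconds() // 60)
    print(run.strftime("%H:%M"), age, is_due("Resource", started, run))
# 11:30 7 False
# 11:45 22 False
# 12:00 37 True
```

With a strict "duration >= delay" check like this one, the alert always goes out late (here 7 minutes). Sending it at the 11:45 run instead would mean accepting an alert that is 8 minutes early.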
If this is not what you were looking for, please ignore this conversation 😅