Last on 30 minutes wait in Alerting Profile

henk_stobbe
DynaMight Leader

Hello all,

 

Already discussed in:

https://community.dynatrace.com/t5/Alerting/Usage-of-the-new-Davis-anomaly-App/td-p/268120
https://community.dynatrace.com/t5/Open-Q-A/Why-the-30-minute-wait-in-the-Default-alerting-profile/t...

 

Dynatrace waits a little while to combine events, and after this time a Problem is created. In the "old" Alerting profile there is an option to tell Dynatrace to wait before a problem becomes an Alert and wakes everybody up, so that no alert is sent if the problem gets resolved within this period. So there is a separation between problem evaluation and alerting!

(So a problem could be in a waiting state!)

In the new situation, the workflows replacing the profile cannot hold off: how would a workflow know whether it should wait?

 

Maybe it's time to add an extra problem attribute?

"event.status": "WAITING"

"event.status_transition": "CREATED",
"dt.davis.is_frequent_event": false,
"maintenance.is_under_maintenance": false

 

Who can comment on this?

 

1-The wait option is used for infra problems (by my customers)

2-For now ServiceNow will be handling this (we have a modified SN)

 

 

1 REPLY

Maheedhar_T
Mentor

Heya @henk_stobbe ,
Not sure if I get the requirement completely, but workflows are supposed to be automation utilities, and the way they currently work is that no step can run for more than 120 seconds. So if we are discussing replacing alerting profiles with them, they are definitely not a good option in the current setup. The control that you have set up in ServiceNow is really ideal, and kudos for that.

However, if you really want to control it in Dynatrace itself, what I would do is use a cron trigger instead of an event trigger or a problem trigger.
Let's say your delays are set this way:
Availability - 0 mins
Error - 0 mins
Monitoring Unavailable - 0 mins
Resource - 30 mins
Slowdown - 45 mins

We can have two workflows here. The 1st one is for all the immediate alerts, as you do not want any wait time on them.
The 2nd one is for the other two, which have delays set.
The delays are 30 mins and 45 mins, so we have a cron workflow here that runs every 15 minutes (leaving aside the math of how we decided on 15).
Now every 15 minutes this workflow runs and gets the data of the problems that correspond to these severities.
We already have a field for problem duration, so when the problem duration goes beyond 30 or 45 minutes respectively, we call the webhook to ingest that data into ServiceNow. However, there is another issue here: if the problem is generated at 11:23 and the workflow last ran at 11:15, the next run is at 11:30, where the problem duration is 7 minutes; on the following run it is 7+15 = 22 minutes, and on the one after that 7+15+15 = 37 minutes. So for the 30-minute delay we either raise the alert 8 minutes early (at 22 minutes) or 7 minutes late (at 37 minutes). Really depends on how efficiently we handle it 😃
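
To make the timing above concrete, here is a minimal Python sketch of the arithmetic only. It does not call any Dynatrace or ServiceNow API; the names `DELAYS_MIN`, `polling_interval` and `first_alerting_run` are hypothetical and introduced just for illustration. It assumes the 15-minute interval comes from the greatest common divisor of the 30- and 45-minute delays, and that the alert is sent on the first scheduled run at which the problem is at least as old as its delay (the "late" variant).

```python
from datetime import datetime, timedelta
from math import gcd

# Per-severity delays in minutes, taken from the example in this reply.
# The key names are illustrative, not official Dynatrace identifiers.
DELAYS_MIN = {"RESOURCE": 30, "SLOWDOWN": 45}


def polling_interval(delays):
    """Largest interval (in minutes) that divides every delay evenly."""
    interval = 0
    for delay in delays:
        interval = gcd(interval, delay)
    return interval


def first_alerting_run(problem_start, first_run, interval_min, delay_min):
    """First scheduled run at which the problem is at least delay_min old."""
    run = first_run
    # Step through the schedule until the problem has been open long enough.
    while run - problem_start < timedelta(minutes=delay_min):
        run += timedelta(minutes=interval_min)
    return run


interval = polling_interval(DELAYS_MIN.values())
start = datetime(2024, 1, 1, 11, 23)       # problem opened at 11:23
schedule_anchor = datetime(2024, 1, 1, 11, 15)  # a known run at 11:15
run = first_alerting_run(start, schedule_anchor, interval, DELAYS_MIN["RESOURCE"])
print(interval, run.time(), run - start)
```

With these inputs the sketch picks the 12:00 run, when the problem is 37 minutes old, which is the "7 minutes late" case. Alerting one run earlier (at 22 minutes) would be the "8 minutes early" case; a shorter polling interval narrows that gap at the cost of more workflow executions.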

If this is not what you were looking for, please ignore this conversation 😅

Maheedhar
