This product reached the end of support date on March 31, 2021.

Incident alert rules for high response time and number of violation occurrences

tbala
Contributor

Hi,

I'm trying to configure an incident alert for a business transaction, based on the max response time. The alert should trigger when the max response time threshold is breached (violated) 15 times within 1 hour.

How do I create such a rule?

thanks,

thanes

Radu
Dynatrace Pro

Hi Thanes

1. Create the Business Transaction

2. Find the measure for Response Time associated with the Business Transaction and set warning and/or severe thresholds against it

3. Create a threshold violation measure that references the threshold of the measure from step 2, and set the threshold of this new measure to 15 (i.e. 15 violations of the response time measure's threshold)

4. Create an incident based on the measure from step 3 and set the timeframe to 1 hour (i.e. 15 violations of your response time threshold within any 1 hour timeframe)
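
The logic behind steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not how the product evaluates measures internally: the function names, the 2.0 s threshold, and the sample data are all made up for the example, and a violation is assumed to mean a response time strictly above the threshold.

```python
from datetime import datetime, timedelta

def count_violations(samples, threshold_s, now, window):
    """Count samples in (now - window, now] whose response time exceeds threshold_s.

    `samples` is a list of (timestamp, response_time_seconds) tuples.
    """
    return sum(
        1 for ts, rt in samples
        if now - window < ts <= now and rt > threshold_s
    )

def incident_should_fire(samples, threshold_s, min_violations, now,
                         window=timedelta(hours=1)):
    """True when the number of violations in the window reaches min_violations."""
    return count_violations(samples, threshold_s, now, window) >= min_violations

# Hypothetical data: 20 samples, one every 3 minutes going back from `now`.
# The 15 most recent ones are slow (2.5 s), the rest are fast (0.4 s).
now = datetime(2020, 1, 1, 14, 0)
samples = [
    (now - timedelta(minutes=3 * i), 2.5 if i < 15 else 0.4)
    for i in range(20)
]

# Step 2: response time threshold (assumed 2.0 s).
# Step 3: violation count threshold of 15.
# Step 4: 1 hour evaluation timeframe.
print(incident_should_fire(samples, threshold_s=2.0, min_violations=15, now=now))
```

With this sample data all 15 slow requests fall inside the last hour, so the check reports that the incident should fire.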

Best regards,

Radu

Thank You Radu!

It worked as expected.

Regards,

Thanes

@Radu S.

Thanks for the solution, but I came across the following scenario:

* I have set up an incident that triggers an alert if the response time threshold of 1.8 secs is violated 4 times in 1 hour. However, the duration of the triggered alert was 1 hour and 17 mins.

I'm a little confused by the settings. Please see the attached screenshot of the config and the alert:

Here's the alert:

I assumed the 1 hr evaluation timeframe setting only takes into account the number of violations per hour and drops occurrences that are older than 60 mins.

Can you please clarify based on the given config?

Thanks,

Thanes

Radu
Dynatrace Pro

Hi Thanes,

You can think of the evaluation timeframe as a rolling window in which data is constantly being analysed against the measures/thresholds in the incident. The incident duration is how much time passed from when the incident was triggered until the incident was closed.

So, let's imagine we have a 1 hr timeframe, and 4 violations of the response time threshold trigger the incident. It's 14:00 now, so the timeframe for the data being analysed is 13:00 - 14:00; only 3 violations have happened, so the incident doesn't trigger. It's now 14:10, so the data analysed is 13:10 - 14:10, and now there are 4 violations, so the incident is triggered; the incident start time is 14:10. The 1 hr window keeps shifting as time passes. More violations come into the system and the old ones are pushed out of the 1 hr timeframe, but there are still at least 4 violations in the timeframe. It's now 15:20, so the window is 14:20 - 15:20, and finally there are only 3 violations in this timeframe, so the incident is closed, with the closed time being 15:20. So we have incident start at 14:10, incident end at 15:20, and an incident duration of 1 hr 10 min, even though the timeframe is 1 hr.
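
To make the timeline above concrete, here is a small Python simulation of a rolling window. It is a sketch under assumptions, not the product's actual evaluation logic: the violation timestamps, the 10 minute evaluation step, and the helper names are invented so that the numbers match the example (3 violations at 14:00, 4 at 14:10, 3 again at 15:20).

```python
from datetime import datetime, timedelta

def simulate_incidents(violations, window, min_violations, start, end, step):
    """Return (opened, closed) pairs for a rolling-window violation count.

    An incident opens at the first evaluation where the window holds at least
    min_violations, and closes at the first later evaluation where it does not.
    An incident still open at `end` is not reported.
    """
    incidents = []
    opened_at = None
    now = start
    while now <= end:
        # Count violations inside the rolling window (now - window, now].
        count = sum(1 for v in violations if now - window < v <= now)
        if opened_at is None and count >= min_violations:
            opened_at = now
        elif opened_at is not None and count < min_violations:
            incidents.append((opened_at, now))
            opened_at = None
        now += step
    return incidents

def at(hhmm):
    """Helper: build a datetime on an arbitrary fixed day from 'HH:MM'."""
    return datetime.strptime("2020-01-01 " + hhmm, "%Y-%m-%d %H:%M")

# Hypothetical violation times chosen to reproduce the example timeline.
violations = [at(t) for t in (
    "13:20", "13:40", "13:55", "14:10", "14:15", "14:30", "14:45", "15:00",
)]

incidents = simulate_incidents(
    violations,
    window=timedelta(hours=1),
    min_violations=4,
    start=at("13:00"),
    end=at("16:00"),
    step=timedelta(minutes=10),
)

for opened, closed in incidents:
    print(opened.strftime("%H:%M"), closed.strftime("%H:%M"), closed - opened)
# prints: 14:10 15:20 1:10:00
```

New violations keep arriving while the incident is open, so the count stays at 4 or more well past the first hour; the incident only closes once the window has slid far enough that fewer than 4 remain.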

Does this make it more clear?

Best regards,

Radu

Thanks for the clarification.

Regards,

Thanes