My team is setting up custom events for alerting with JMX checks on Dynatrace, and we are facing some issues. We have a Java Spring application with some MBeans that increment a counter (a numeric value) whenever an error occurs. Our desired scenario is that whenever this counter goes up, we receive a notification and the problem that was raised does not get closed automatically. So far, we have tried setting the delta attribute of the JMX plugin to both true and false, and neither gives us the behavior we want because:
1 - When delta is set to true, the problem raised gets closed automatically once the analysis window has elapsed.
2 - When delta is set to false, we can only raise one error. The problem raised with delta set to false does not get closed automatically, but once it is closed manually, we are unable to open another problem, even after resetting the counter and raising it again.
Do you know if there is any way we can achieve the desired scenario?
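For context, here is a minimal sketch of the kind of counter MBean described above. All names (ErrorCounterDemo, ErrorCounter, getErrorCount, the com.example object name) are hypothetical and only illustrate the setup; our real application exposes the bean through Spring.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class ErrorCounterDemo {

    // Standard MBean interface: the name must be the implementing class name plus "MBean".
    public interface ErrorCounterMBean {
        long getErrorCount(); // exposed as the JMX attribute "ErrorCount"
        void reset();         // exposed as a JMX operation
    }

    // Hypothetical error counter; only ever goes up until reset() is called.
    public static final class ErrorCounter implements ErrorCounterMBean {
        private final AtomicLong errors = new AtomicLong();

        public void recordError() {
            errors.incrementAndGet();
        }

        @Override
        public long getErrorCount() {
            return errors.get();
        }

        @Override
        public void reset() {
            errors.set(0);
        }
    }

    public static void main(String[] args) throws Exception {
        ErrorCounter counter = new ErrorCounter();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical object name; a JMX-based check would read "ErrorCount" from it.
        ObjectName name = new ObjectName("com.example:type=ErrorCounter");
        server.registerMBean(counter, name);

        counter.recordError();
        System.out.println(server.getAttribute(name, "ErrorCount"));
    }
}
```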
We recently increased the maximum evaluation sliding window for custom events from 20 minutes to 1 hour. So what you can do to keep the event open for a longer period of time is to set the timeframe to 1 out of 60 minutes in the configuration.
That way, if a single threshold violation occurs, the event is kept open for at least 1 hour. If another error occurs within that hour, you get another hour, and so on.
What we can't do is keep the event in an active state forever when the condition that raised it is long gone. If you want that, you have to attach an incident management system such as JIRA to Dynatrace, where tickets can stay in the 'Open' state indefinitely.
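The "1 out of 60 minutes" behavior can be pictured with a small sketch. This is not Dynatrace's implementation; it assumes one sample per minute and a hypothetical class SlidingWindowSketch that treats the event as open while at least the required number of violations lies inside the window.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SlidingWindowSketch {
    private final int requiredViolations; // e.g. 1
    private final int windowSize;         // e.g. 60 one-minute samples
    private final Deque<Boolean> samples = new ArrayDeque<>();
    private int violations;

    SlidingWindowSketch(int requiredViolations, int windowSize) {
        this.requiredViolations = requiredViolations;
        this.windowSize = windowSize;
    }

    // Record one sample (assumed: one per minute) and evict the oldest
    // sample once the window holds more than windowSize entries.
    void record(boolean violated) {
        samples.addLast(violated);
        if (violated) {
            violations++;
        }
        if (samples.size() > windowSize && samples.removeFirst()) {
            violations--;
        }
    }

    // The event counts as open while enough violations remain in the window.
    boolean isOpen() {
        return violations >= requiredViolations;
    }

    public static void main(String[] args) {
        SlidingWindowSketch event = new SlidingWindowSketch(1, 60);
        event.record(true);                 // minute 0: one violation
        for (int minute = 1; minute < 60; minute++) {
            event.record(false);            // minutes 1..59: no violation
        }
        System.out.println(event.isOpen()); // true: the violation is still in the window
        event.record(false);                // minute 60: the violation slides out
        System.out.println(event.isOpen()); // false
    }
}
```

A second violation inside the hour would stay in the window for its own 60 samples, which is why each new error extends the open period.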
I hope that helps,
Hi @Wolfgang B.,
thanks for your answer.
We also observed that if we set the delta configuration to false on the JMX plugin metric used by the custom event for alerting, the problem stays open until one of the following happens:
- it gets closed manually
- the count metric drops below the threshold (it is reset)
If a problem is open and is then closed manually without the count value being reset, no new problem is opened when the count value goes up again. Do you know why this happens? It does not seem correct, because the count value is above the threshold and yet no problem is open.
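To make the difference between the two delta settings concrete, here is a small sketch. It assumes that delta = true means the metric reports the difference between consecutive readings and delta = false means it reports the raw counter value; the class DeltaSketch and the sample readings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DeltaSketch {

    // Convert cumulative counter readings into per-interval increments.
    static List<Long> toDeltas(long[] readings) {
        List<Long> deltas = new ArrayList<>();
        for (int i = 1; i < readings.length; i++) {
            deltas.add(readings[i] - readings[i - 1]);
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Hypothetical counter readings, one per collection interval.
        long[] readings = {0, 0, 2, 2, 2, 3};
        System.out.println(toDeltas(readings)); // [0, 2, 0, 0, 1]
    }
}
```

With a threshold of "greater than 0", the raw readings stay above the threshold from the third sample onward until the counter is reset, whereas the deltas exceed it only in the intervals where an error was actually counted. That matches what we see: with delta set to true the violation disappears after one interval, and with delta set to false it persists.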