24 Apr 2026 10:46 PM - last edited on 28 Apr 2026 07:29 AM by MaciejNeumann
I'm trying to alert on a common problem with our RabbitMQ queues. In our environment, we need to alert off of BOTH of these conditions being true:
1. Count of messages > 0
2. Slope of message count == 0
That is, we need an alert when the message count is not 0 and no messages are coming in or going out. The acknowledge/confirm rates don't really cover this: in previous cases where we should have been alerted, those rates were still fluctuating, and they're 0 when the message count is 0 anyway.
So, we need a way to alert on these combined cases. Right now, the best solution we have is DQL that looks something like this:
```
timeseries
    ready = max(cloud.aws.amazonmq.message_ready_count_average),
    by:{`dt.entity.custom_device`},
    filter:`dt.entity.custom_device` == "CUSTOM_DEVICE-XYZ",
    interval:10m
| fieldsAdd
    band_min = arrayMovingMin(ready, 10),
    band_max = arrayMovingMax(ready, 10)
| fieldsAdd
    stuck_signal = iCollectArray(
        if(
            ready[] > 0 and (band_max[] - band_min[]) <= 1,
            1,
            else: 0
        )
    )
| fieldsKeep
    timeframe,
    interval,
    `dt.entity.custom_device`,
    stuck_signal
```
(This is for an all-queue max, but it would be specialized to individual queues once we have that set up.)
This has several issues. For one, the signal is always set to 1 at the beginning of whatever time window it tracks. Additionally, it's fiddly and very manual: I have to tweak a lot of the parameters to get something that *mostly* works to alert us when we need it to. And it's not very idiomatic.
Is there a better way for Dynatrace to alert us on these conditions?
25 Apr 2026 03:42 PM
Hello @idudneymitchell,
Your approach is valid, but I would not try to alert on “slope == 0” directly. Instead, I would convert the problem into a binary signal that is 1 when the queue depth is greater than zero and the spread between the min and max values over the last N samples stays within a small tolerance, and 0 otherwise.
The key improvement is to ignore the beginning of the timeframe, because the moving min/max window is not fully populated yet. For example, if you use a 10-sample window, only evaluate from index 9 onward:
```
timeseries
    ready = max(cloud.aws.amazonmq.message_ready_count_average, default: 0),
    by: { `dt.entity.custom_device` },
    filter: `dt.entity.custom_device` == "CUSTOM_DEVICE-XYZ",
    interval: 10m
| fieldsAdd
    window_min = arrayMovingMin(ready, 10),
    window_max = arrayMovingMax(ready, 10)
| fieldsAdd
    stuck_signal = iCollectArray(
        if(
            iIndex() >= 9
                and ready[] > 0
                and window_max[] - window_min[] <= 1,
            1,
            else: 0
        )
    )
| fieldsKeep timeframe, interval, `dt.entity.custom_device`, stuck_signal
```
Then create an advanced custom alert on stuck_signal > 0.
Once you have queue-level dimensions available, split by queue instead of only by the custom device. Also, I’d recommend treating the flatness threshold as a tolerance rather than exact zero, because exact slope checks tend to be fragile in real queue metrics.
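If it helps to reason about the warm-up effect outside of Dynatrace, here is a minimal Python simulation of the same logic. It is not DQL, and it rests on one assumption: that the moving min/max use a trailing window that is simply shorter at the start of the series, which would explain the spurious 1 at the beginning. The names `moving_min`, `moving_max`, `stuck_signal`, and `skip_warmup` are hypothetical helpers for this sketch only.

```python
from typing import List


def moving_min(values: List[float], window: int) -> List[float]:
    # Trailing window ending at index i; shorter than `window` near the start (assumption).
    return [min(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def moving_max(values: List[float], window: int) -> List[float]:
    return [max(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def stuck_signal(ready: List[float], window: int = 10, tolerance: float = 1,
                 skip_warmup: bool = True) -> List[int]:
    lo = moving_min(ready, window)
    hi = moving_max(ready, window)
    signal = []
    for i, depth in enumerate(ready):
        # Mirrors the iIndex() >= window - 1 guard in the query.
        warmed_up = (i >= window - 1) or not skip_warmup
        # Flatness is a tolerance check, not an exact slope == 0 test.
        flat = (hi[i] - lo[i]) <= tolerance
        signal.append(1 if warmed_up and depth > 0 and flat else 0)
    return signal


# A queue that drains steadily: it should never be reported as stuck.
draining = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0, 0]
# A queue sitting at 7 messages with nothing moving.
stuck = [7] * 12

print(stuck_signal(draining, window=3, skip_warmup=False))
# [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(stuck_signal(draining, window=3))
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(stuck_signal(stuck, window=3))
# [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Without the guard, the first sample of a healthy draining queue is flagged because a one-element window always has max equal to min. With the guard, a genuinely stuck queue is still flagged, just `window - 1` samples later.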
I hope it helps you 😄
27 Apr 2026 03:37 PM
This worked perfectly! Thank you, the most difficult thing was dropping the beginning of the timeframe, that fix was exactly what we needed.
27 Apr 2026 04:39 PM
Hi, @idudneymitchell
I'm happy to see that helped!
Could you kindly give a kudo for the answer? I'm aiming to be member of the month someday 😁