03 Oct 2023 06:49 PM
Hello,
Currently, when you create a custom metric alert, it can only be based on a single metric. For instance, you can't configure an alert to fire only when CPU usage and memory usage are both above a 95% threshold. Each alert can be tied to only one metric.
With Grail / DQL, are there any planned changes to this?
Thanks
03 Oct 2023 08:14 PM
Probably not the answer you are looking for, but you can combine metrics when you create SLOs and then set custom alerts on these SLOs.
Unfortunately, SLOs do not support dimensions, so you need one SLO for each host (and consequently one metric event per host).
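The SLO example below adds the CPU usage and memory usage of a specific host "hostname" and divides the sum by 2, which gives the average of the two. Note that an SLO value of 90% means the average is 90%, not necessarily that both metrics are at 90%: CPU at 100% and memory at 80% also average to 90%. Since neither metric can exceed 100%, a value of 90% or more does guarantee that both are at least 80%: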
(builtin:host.cpu.usage:filter(and(or(in("dt.entity.host",entitySelector("type(host),entityName.equals(~"hostname~")"))))):splitBy("dt.entity.host")+builtin:host.mem.usage:filter(and(or(in("dt.entity.host",entitySelector("type(host),entityName.equals(~"hostname~")"))))):splitBy("dt.entity.host"))/2
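To make it clearer what the averaged expression does and does not tell you, here is a small Python sketch of the arithmetic. It does not call any Dynatrace API; the function names and sample values are made up for illustration. It compares the (cpu + mem) / 2 value with a strict "both metrics above the threshold" check.

```python
# Hypothetical helpers that mirror the arithmetic of the SLO expression.
# Nothing here talks to Dynatrace; the sample values are invented.

def combined_score(cpu_pct: float, mem_pct: float) -> float:
    """Average of the two usage percentages, i.e. (cpu + mem) / 2."""
    return (cpu_pct + mem_pct) / 2


def both_above(cpu_pct: float, mem_pct: float, threshold: float) -> bool:
    """True only when each metric individually meets the threshold."""
    return min(cpu_pct, mem_pct) >= threshold


if __name__ == "__main__":
    threshold = 90.0
    samples = [(90.0, 90.0), (100.0, 80.0), (95.0, 95.0), (100.0, 70.0)]
    for cpu, mem in samples:
        score = combined_score(cpu, mem)
        print(
            f"cpu={cpu} mem={mem} average={score} "
            f"average_alert={score >= threshold} "
            f"both_above={both_above(cpu, mem, threshold)}"
        )
```

The (100, 80) sample is the case to watch: the average reaches 90, so an alert on the SLO would fire even though memory is below 90%. If that matters for your use case, a higher threshold on the average narrows the gap (an average of 95% guarantees both metrics are at least 90%).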