I'm trying to create an SLO definition based on the number of fast and slow key requests, WITHOUT the need to create a custom metric for slow requests.
My idea was to define the SLO as:
100 * (number of slow key requests / number of total requests)
nothing special, really.
I could create a custom service metric for the key request that counts the number of requests exceeding a certain response time. But could this be done without that step, using only metric queries?
I thought of this metric query:
builtin:service.keyRequest.response.time
  :filter(<filter for my key request goes here>)
  :avg
  :partition("time", value("fast", lt(300000)), value("slow", otherwise))
  :splitBy("time")
  :filter(eq("time", "fast"))
  :count
  :fold(sum)
The idea was to use partition to split the metric data points into those above and below a certain slow/fast threshold, then use count to get the number of data points in each partition and plug those counts into the SLO definition: 100 * (slow / total).
(it wouldn't be perfectly accurate due to aggregation, but it should work for the SLO)
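For what it's worth, Dynatrace's documented latency-SLO pattern expresses a similar calculation as a single metric expression: count the data points that fall into a "fast" partition and divide by the total count. The sketch below follows that pattern but is unverified; the "latency" dimension name, the 300000 µs threshold, and the key-request filter placeholder are assumptions you'd need to adapt:

```
(100) * (
  builtin:service.keyRequest.response.time
    :filter(<filter for my key request goes here>)
    :partition("latency", value("fast", lt(300000)))
    :splitBy()
    :count
    :default(0)
) / (
  builtin:service.keyRequest.response.time
    :filter(<filter for my key request goes here>)
    :splitBy()
    :count
)
```

Note that this computes 100 * (fast / total), i.e. the percentage of fast requests, which is the usual "good over total" SLO orientation; `:default(0)` is meant to keep the numerator from going empty when no fast data points exist in the evaluation window.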
The metric partitioning itself works, but I'm struggling to count the number of metric data points needed for the SLO calculation.
Any ideas how this could be achieved?
Thank you @r_weber for this post. I don't have a solution yet, but your post led me to the blogs below, which helped me understand the SLO integration much better 😉