I’ve noticed slight differences between values reported as Synthetic Availability in the latest Dynatrace and in Classic Dynatrace

GosiaMurawska
Community Team

Can you explain those?  

10 REPLIES

Cezary_Tomaszew
Dynatrace Participant

Hi Gosia, 

Thank you for asking. Availability metrics are crucial in Synthetic Monitoring, and it's important to understand how they're calculated in Grail and what differences can be expected. 

The new availability metrics in Grail are calculated by dividing the number of successful executions ('up') by the total number of executions. In contrast, the classic approach measures availability based on the duration that a monitor is considered 'up' (see the documentation). 

The new approach requires all executions to happen at the same rate. However, in a real environment, this is not always the case, as monitors' execution frequency may be changed, or additional executions may be triggered in on-demand mode. So, we added an interpolation mechanism to adjust the number of executions to a fixed, minute-level resolution. In the diagram below, the orange dots show actual executions, while the blue dots are added by the interpolation mechanism. 

In this new approach, availability is calculated as follows:  

Availability = (Number of "Up" Executions / Total Number of Executions) × 100 
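
As a rough illustration of the formula above combined with the minute-level interpolation, here is a minimal Python sketch. It assumes the interpolation carries the most recent execution result forward to each minute until the next execution arrives; the actual Grail mechanism may work differently, and the function names `interpolate_to_minutes` and `availability` are hypothetical, not part of any Dynatrace API.

```python
from datetime import datetime, timedelta


def interpolate_to_minutes(executions, start, end):
    """Return one up/down state per minute in [start, end).

    executions: list of (timestamp, is_up) tuples sorted by timestamp.
    Assumption: each minute takes the result of the latest execution at or
    before that minute; minutes before the first execution are skipped.
    """
    states = []
    idx = -1
    t = start
    while t < end:
        # Advance to the latest execution that is not later than t.
        while idx + 1 < len(executions) and executions[idx + 1][0] <= t:
            idx += 1
        if idx >= 0:
            states.append(executions[idx][1])
        t += timedelta(minutes=1)
    return states


def availability(states):
    """Count-based availability in percent: up points / total points."""
    if not states:
        return None
    return 100.0 * sum(states) / len(states)


# Hypothetical monitor: one execution every 5 minutes, a single failure at 08:40.
start = datetime(2025, 1, 1, 8, 25)
end = datetime(2025, 1, 1, 9, 0)
executions = [
    (start + timedelta(minutes=5 * i), i != 3)  # the 4th execution (08:40) fails
    for i in range(7)
]
states = interpolate_to_minutes(executions, start, end)
print(len(states), round(availability(states), 2))
```

With a fixed one-minute grid, a change in execution frequency or an extra on-demand execution changes where state transitions are detected, but not how much weight each minute carries in the ratio.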

Both the Classic and Grail approaches provide estimates of availability, and slight differences between them are expected. This is because Synthetic Monitoring operates at discrete intervals, meaning it does not capture the exact moment when downtime begins and ends. These differences simply stem from the way availability is measured, not from a change in accuracy or reliability.  

See the example below: 

Cezary_Tomaszew_0-1744021265352.png

 

Real uptime and downtime are based on assumptions. Synthetic downtime is detected with the first failed execution of the monitor, and similarly, uptime is detected with the first successful execution after an outage. 

In the given example, Synthetic Monitoring was executed every 5 minutes between 08:25 AM and 09:00 AM.  

| Time | Real | Detected |
| --- | --- | --- |
| Downtime started | 08:41:20 AM | 08:41:57 AM |
| Uptime resumed | 08:46:40 AM | 08:46:50 AM |


Real outage calculation (downtime of 5 min 20 s, so 29 min 40 s of uptime):

Availability = (Up time / total time) * 100% = (29 min 40 s / 35 min) * 100% = 84.76% 

Classic-approach calculation (detected downtime of 4 min 53 s, so 30 min 7 s of uptime):

Availability = (Up time / total time) * 100% = (30 min 7 s / 35 min) * 100% = 86.05% 

Grail-approach calculation (5 of the 35 minute-level data points are "down", so 30 are "up"):

Availability = (Up executions / total executions) * 100% = ((35 - 5) / 35) * 100% = 85.71% 
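
The three figures in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of how Dynatrace computes the metrics internally: it treats the real and Classic cases as uptime divided by the 35-minute window, and it assumes that 5 of the 35 minute-level data points fall inside the detected outage for the Grail case. The helper names are made up for illustration, and the last digit of a result can differ depending on whether values are rounded or truncated.

```python
from datetime import datetime


def t(hms):
    """Parse an HH:MM:SS string (the date part is irrelevant here)."""
    return datetime.strptime(hms, "%H:%M:%S")


# Observation window from the example: 08:25 to 09:00, i.e. 35 minutes.
window_s = (t("09:00:00") - t("08:25:00")).total_seconds()


def time_based(down_start, up_resumed):
    """Availability in percent as uptime / total time."""
    downtime_s = (t(up_resumed) - t(down_start)).total_seconds()
    return 100.0 * (window_s - downtime_s) / window_s


real = time_based("08:41:20", "08:46:40")     # real outage boundaries
classic = time_based("08:41:57", "08:46:50")  # boundaries detected by the monitor

total_points = 35  # one data point per minute after interpolation
down_points = 5    # assumption: minutes covered by the detected outage
grail = 100.0 * (total_points - down_points) / total_points

print(f"real={real:.2f}% classic={classic:.2f}% grail={grail:.2f}%")
```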

There is one more aspect impacting availability: whether maintenance windows are included in the calculations. We're about to deliver mechanisms for excluding maintenance-window periods from synthetic availability calculations on Grail. 

@Cezary_Tomaszew will publish more details about that soon.

@Cezary_Tomaszew,
What is the meaning of 5, highlighted below, in Grail? Is it 5 minutes, or 5 counts?

Availability = (Up executions / total executions) * 100% = (5 / 35) * 100% = 85.71% 

Antonio Sousa

@AntonioSousa That would be  Number of "Up" Executions (count)

Phani Devulapalli

@p_devulapalli,

Hope that's not the case, as there is potential for disaster here.

I remember vividly the discussions when transitioning from Keynote/Gomez to the Ruxit/Dynatrace synthetics...

Antonio Sousa

Hi @AntonioSousa 
Could you please explain it further? I would love to understand better the potential for disaster you're referring to here.

Best Regards,

Jacek

If availability is calculated based on counts, I just cannot imagine how many scenarios are going to be wrong. What will it mean for long time-series comparisons? Synthetic availability is a little science of its own, and now we are just going to do counts?

But it's going to be easy to check: just put the two values alongside each other. I don't have the time to dig into it now, but as I said, this seems like déjà vu from a long time ago...

Antonio Sousa

@Cezary_Tomaszew On a similar note, I noticed a difference in the way Total duration is calculated for Browser monitors. The values seem fairly different when I compare what I see in the classic app with what's in the new app (Grail).

Could you please shed some light on this?

Example below - 

p_devulapalli_1-1744092628533.png

 

p_devulapalli_2-1744092719520.png

 

 

 

Phani Devulapalli

Hi @p_devulapalli 
Thanks for raising that question.
We have also noticed that we're reporting inaccurate metrics as performance values within the new Synthetic app. We will fix this soon.
We're about to release the next significant update of the Synthetic app, focusing mainly on Browser Monitors. One element of this initiative will be an update of performance metrics (yes, plural). Expect them to be described in the documentation, but I will also provide an update about those in the community, likely even earlier.
For now, let me suggest using values reported in the classic app as a source of truth for Browser monitors. Also, the mechanism that compares performance metrics vs. defined thresholds to decide whether to raise performance problems uses that value.

Further updates soon

Best Regards, 

Jacek

Thanks for the update @Jacek_Janowicz 

Phani Devulapalli
