Hi @olegus, try the metric builtin:pgi.availability, https://www.dynatrace.com/support/help/observe-and-explore/metrics/built-in-metrics#other-process-me...
combined with the GET metric data points API: https://www.dynatrace.com/support/help/dynatrace-api/environment-api/metric-v2/get-data-points
curl --location 'https://tenant.live.dynatrace.com/api/v2/metrics/query?from=now-7d&metricSelector=builtin%3Apgi.availability%3Afilter(and(or(in(%22dt.entity.process_group_instance%22%2CentitySelector(%22type(process_group_instance)%2CentityId(~%22PROCESS_GROUP_INSTANCE-E0AD0B6FE4F5EBC8~%22)%22)))))%3AsplitBy(%22dt.entity.process_group_instance%22)%3Asort(value(auto%2Cdescending))%3Alimit(20)' \
  --header 'Authorization: Api-token dt0c01.4OHVEPBHJGFYCVRCPHAI4FU4.tokenwithrightscope'
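If you would rather build this request in a script than hand-encode the URL, here is a minimal Python sketch that assembles the same query. The function name build_metrics_query_url is my own, and the tenant URL and entity ID are simply copied from the curl example; nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_metrics_query_url(base_url, metric_selector, time_from="now-7d"):
    """Return the /api/v2/metrics/query URL with URL-encoded parameters."""
    query = urlencode({"from": time_from, "metricSelector": metric_selector})
    return f"{base_url.rstrip('/')}/api/v2/metrics/query?{query}"


# Entity ID taken from the curl example; replace it with your own.
PGI_ID = "PROCESS_GROUP_INSTANCE-E0AD0B6FE4F5EBC8"

# The same selector as in the curl call, written unencoded for readability.
selector = (
    "builtin:pgi.availability"
    ':filter(and(or(in("dt.entity.process_group_instance",'
    f'entitySelector("type(process_group_instance),entityId(~"{PGI_ID}~")")))))'
    ':splitBy("dt.entity.process_group_instance")'
    ":sort(value(auto,descending))"
    ":limit(20)"
)

url = build_metrics_query_url("https://tenant.live.dynatrace.com", selector)
print(url)
```

urlencode escapes more characters than the curl example does (parentheses, for instance), but the query decodes to the same selector. The Authorization: Api-token header still has to be sent with the request, as in the curl call.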
If you are looking for a non-supported technology or a service that is not detected, you can set up OS services monitoring, https://www.dynatrace.com/support/help/platform-modules/infrastructure-monitoring/hosts/monitoring/o... , and then use the metric builtin:host.osService.availability, https://www.dynatrace.com/support/help/observe-and-explore/metrics/built-in-metrics#os-service
Try and let us know.
That kinda works, thx!
Is it possible to use the builtin:host.osService.availability metric to get the same results?
Here is my GET request from Postman:
I can't figure out how to select entities; I'm getting this warning:
There is no entity type called os:service. When configured, the entity type for an OS service is Custom_Device, but using the entity selector with it can be tricky. I suggest using the Data Explorer to build the metric selector there, filtering by host or by a specific OS service, and then calling the API with that selector.
Try and let us know.
That does not return any results for me, but I played a bit with this metric and I believe I'm heading in the right direction:
It seems to return nice results.
My only concern so far is that it always returns 100% or null, so I'm trying to find a monitored service that was down for some time to prove that the data is correct. Null is probably fine, as we started to monitor hosts recently (resolution is set to 1w in the request above).
BTW, my goal is to present service availability per "product". Each product has multiple hosts, so I need to collect and merge service metrics from all hosts in the related management zone.
Well, I'm confused about how the availability metric is calculated.
I am playing with different options, and it looks like this short form returns what I need:
(builtin:osservice.availability:filter(contains("dt.osservice.display_name","SQL Server B")))
EXCEPT that the result values do not look like percentages to me. For instance, I found a day and time when the SQL services were down for a short period, and I am trying to get the availability metric for the "SQL Server Browser" service (see the request above) with a 1-day resolution over a 5-day period.
This request returns these values:
The metric that is used to show service availability on the Host page has a filter for a specific entity ID:
I'd like to use another filter, filter(contains("dt.osservice.display_name",...), that would in theory return availability for all services matching this filter, from all hosts in the specified management zone.
Is it feasible?
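To make the idea concrete, here is a rough Python sketch of the query parameters I have in mind. The function name osservice_query_params and the zone name "My Product" are placeholders. The mzSelector parameter with mzName("...") is an assumption on my side that should be checked against the Metrics API v2 documentation, and I have not verified that this returns one series per service.

```python
from urllib.parse import urlencode


def osservice_query_params(name_fragment, time_from="now-5d",
                           resolution="1d", mz_name=None):
    """Build query parameters for an availability query filtered by service name."""
    # Escaping of special characters inside the selector is not handled here.
    for text in (name_fragment, mz_name or ""):
        if '"' in text or "~" in text:
            raise ValueError("quotes and tildes are not handled in this sketch")

    selector = (
        "builtin:osservice.availability"
        f':filter(contains("dt.osservice.display_name","{name_fragment}"))'
        ':splitBy("dt.osservice.display_name")'
    )
    params = {
        "metricSelector": selector,
        "from": time_from,
        "resolution": resolution,
    }
    if mz_name is not None:
        # Assumption: the query endpoint accepts a management zone scope.
        params["mzSelector"] = f'mzName("{mz_name}")'
    return params


params = osservice_query_params("SQL Server B", mz_name="My Product")
print(urlencode(params))
```

The defaults mirror the request described above (5 days, 1-day resolution). Splitting by display name is there so that each matching service comes back as its own series instead of being merged into one.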
It looks like the most reliable way to get an average service availability percentage across all monitored OS services on all hosts that belong to a specific management zone is to query the metric for each specific service and then aggregate the numbers in code.
So my workflow is:
- get all hosts for a specific management zone using the Monitored Entities endpoint:
- for each host, get its monitored OS services using the same endpoint:
- for each service, get its availability metrics:
- from the metrics response, grab the values array; in my case it has just one entry, as I need this data per month and I set the resolution to 1M
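The last step, aggregating in code, can be sketched in Python roughly like this. It assumes the metrics response has the shape result -> data -> values, which is how I read the Metrics API v2 output. The function names are my own, and the sample numbers are made up purely to exercise the code; they are not real results.

```python
def non_null_values(response):
    """Collect all non-null data points from one metrics query response."""
    values = []
    for result in response.get("result", []):
        for series in result.get("data", []):
            values.extend(v for v in series.get("values", []) if v is not None)
    return values


def mean_availability(responses):
    """Unweighted mean over every non-null value, or None if there is no data."""
    values = [v for response in responses for v in non_null_values(response)]
    return sum(values) / len(values) if values else None


# Made-up sample responses, one per service, each with a single 1M data point.
sample_responses = [
    {"result": [{"metricId": "builtin:osservice.availability",
                 "data": [{"values": [100.0]}]}]},
    {"result": [{"metricId": "builtin:osservice.availability",
                 "data": [{"values": [None]}]}]},   # service with no data yet
    {"result": [{"metricId": "builtin:osservice.availability",
                 "data": [{"values": [97.0]}]}]},
]

print(mean_availability(sample_responses))
```

Null values are skipped rather than counted as 0%, since a service that was not monitored yet should not drag the average down. This is a plain average, so every service counts equally regardless of how long it has been monitored.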
One more question about this metric: does builtin:osservice.availability take maintenance windows into account?
If I have a scheduled maintenance window of, say, 3 hours weekly on Sundays, and outside of this window my service is running 100% of the time, would this metric reflect 3 hours of downtime? Would it show 100% for this week, or less?
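For reference, this is the number I would expect to see if the 3 hours were counted as downtime. It is plain arithmetic in Python and says nothing about what the metric actually does; the function name is my own.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def availability_percent(downtime_hours, period_hours=HOURS_PER_WEEK):
    """Availability in percent for a given amount of downtime within a period."""
    return 100.0 * (period_hours - downtime_hours) / period_hours


# A 3-hour weekly window counted as downtime: 165 of 168 hours available.
print(round(availability_percent(3), 2))  # 98.21
```

So a weekly value of about 98.21% would suggest the window is counted as downtime, while 100% would suggest it is excluded.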