04 Oct 2022 03:18 AM - last edited on 04 Oct 2022 04:06 AM by MaciejNeumann
This will be a fairly long post, since it covers several related issues and describes a workaround we currently use to cope with inconsistent behaviour in Dynatrace itself.
Description of the issue
First of all, the requirement arose when we activated OS service monitoring and ran into the inconsistent way problems are displayed, so I'll explain here why we needed the workaround we currently use.
Usually, a problem notification only carries the type of issue that occurred in its "title", for example "Memory saturation", "CPU saturation" or "Failure rate increase", as follows:
This is not the case for OS service monitoring, where you have an actual service and state in the problem title:
However, you actually group multiple OS service events into ONE problem, titled after just one of the failing services, which is, to say the least, very inconsistent and confusing:
The description of the issue with the actual problem notifications
We used the "Title" to decide which kinds of issues get closed automatically in the ticketing system when Dynatrace sends a RESOLVED status.
The background is this: you may want to auto-close tickets for OS service monitoring events once the service is running again, but you probably do not want to auto-close tickets for disk space when free space keeps dipping below the threshold and recovering again and again. In that case you'd like an employee to look at it.
So we implemented tagging for certain host groups to define which kinds of problems should automatically close incidents, and which ones should not.
That worked until OS service monitoring came along and used the title for actual descriptions. Since you can't specify the title for OS service monitoring problems, we needed a workaround.
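To illustrate, here is a minimal sketch of what such a title-based auto-close rule could look like. The tag names, the rule table and the function name are all hypothetical; only the RESOLVED status and the example titles come from the description above.

```python
from typing import Iterable

# Hypothetical mapping: host-group tag -> problem titles whose tickets may auto-close.
AUTO_CLOSE_RULES = {
    "autoclose-resources": {"CPU saturation", "Memory saturation"},
    "autoclose-failures": {"Failure rate increase"},
}


def should_auto_close(state: str, title: str, tags: Iterable[str]) -> bool:
    """Return True if a ticket may be closed automatically for this notification."""
    # Only a RESOLVED notification can ever close a ticket.
    if state != "RESOLVED":
        return False
    # The title must be allowed by at least one tag on the host group.
    return any(title in AUTO_CLOSE_RULES.get(tag, set()) for tag in tags)
```

A rule like this only works while the title is a stable problem type. As soon as the title contains a service name and state, an exact match can no longer be configured in advance.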
In the ProblemDetailsJSON (i.e. the Problems V1 structure), we received, for example, rankedEvents carrying "customProperties" that told us the source was OS services monitoring:
So, as a workaround for no longer being able to use the title, we used this "Event source" to check whether the event was in fact an OS services monitoring event.
Now the V1 API is being deprecated, and I have no doubt the ProblemDetailsJSON in the notifications will be deprecated at some point too, so we intended to switch to the V2 API and the ProblemDetailsJSONv2.
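A rough sketch of that check on the V1 payload, for anyone in the same situation. It assumes the parsed ProblemDetailsJSON is a dict with a "rankedEvents" list and that "customProperties" is a plain key/value object. The exact shape may differ in your environment, and the function name is made up.

```python
def is_os_service_problem_v1(problem_details: dict) -> bool:
    """Check the ranked events of a V1 problem payload for the OS services event source."""
    for event in problem_details.get("rankedEvents") or []:
        if not isinstance(event, dict):
            continue
        # Assumption: customProperties is a flat dict of property name -> value.
        props = event.get("customProperties") or {}
        if isinstance(props, dict) and props.get("Event source") == "OS services monitoring":
            return True
    return False
```

Because one problem can bundle several events, the function returns True as soon as any ranked event carries the OS services source.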
The actual V2 API does offer something similar (if undocumented) in the evidenceDetails/details/data/properties structure:
That is fine for the V2 API, but in the ProblemDetailsJSONv2 of the problem notifications it is missing (I used a different event here, since I needed one that actually triggered at the time of testing):
This way, I cannot tell that the event was an actual "OS services monitoring" event, since there is no event source, and you broke the coherence of the title by using it in a way that is inconsistent with the rest.
Furthermore, I am reluctant to rely on the evidenceDetails/details/data/properties path, since it's undocumented.
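Since the path is undocumented, any code reading it should probably be defensive. Below is a minimal sketch that walks evidenceDetails/details/data/properties and tolerates missing levels. It assumes "properties" is either a list of {"key": ..., "value": ...} objects or a plain dict, which is a guess. The property key and value to look for are passed in by the caller, because the V2 names are not stated here.

```python
from typing import Any, Iterator


def iter_evidence_properties(problem: dict) -> Iterator[tuple[str, Any]]:
    """Yield (key, value) pairs found under evidenceDetails/details/data/properties."""
    evidence = problem.get("evidenceDetails") or {}
    for detail in evidence.get("details") or []:
        if not isinstance(detail, dict):
            continue
        data = detail.get("data") or {}
        props = data.get("properties") or []
        if isinstance(props, dict):
            # Assumed shape 1: a flat key/value object.
            yield from props.items()
        else:
            # Assumed shape 2: a list of {"key": ..., "value": ...} objects.
            for prop in props:
                if isinstance(prop, dict) and "key" in prop:
                    yield prop["key"], prop.get("value")


def has_event_source(problem: dict, *, key: str, value: str) -> bool:
    """Return True if any evidence property matches the given key and value."""
    return any(k == key and v == value for k, v in iter_evidence_properties(problem))
```

If the evidence details are absent from the payload, the check simply returns False instead of raising, so the caller can fall back to handling the ticket manually.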
I request therefore:
In reply to a ticket/support call, Dynatrace support enabled a feature that adds evidence details to the ProblemDetailsJSONv2, so we can work with that now.
This is a feature only Dynatrace employees can enable. That's not really desirable, but it works for now (including evidenceDetails). I'd still prefer a general solution, since requiring Dynatrace support just to be able to identify OS service monitoring events can't be the way forward.
Hi @STiAT, happy to see that you have a workaround for now! But were you able to find a general solution to your problem maybe?
If not, let's hope this post shows up on the main page once again and someone with the answer shares it here 😉
No, we're still waiting for that to be enabled by default, and for the service monitoring to actually be in line with the rest of the monitoring.