
ProblemsV2 API, Problem notifications and OS service monitoring

STiAT
Advisor

Hi there,

 

This will be a fairly long post, since it covers several related issues and describes a workaround we currently use to cope with inconsistent behaviour in Dynatrace itself.

 

Description of the issue

First of all, the requirement arose when we activated OS service monitoring and ran into the inconsistent way problems are displayed, so I'll explain here why we needed the workaround we currently use.

 

Usually, problem notifications only carry the type of issue that occurred in the "title", for example "Memory saturation", "CPU saturation" or "Failure rate increase", as follows:

STiAT_0-1664877666732.png

 

This is not the case for OS service monitoring, where you have an actual service and state in the problem title:

STiAT_1-1664877717730.png

But you also gather multiple OS service events into ONE problem, which then carries the title of just one of the failing services. That is, to say the least, very inconsistent and very confusing:

STiAT_3-1664878024854.png

 

 

The description of the issue with the actual problem notifications

We used the "Title" to match which kinds of issues get closed automatically in the ticketing system when Dynatrace sends a RESOLVED status.

The background is this: you may want to close OS service monitoring events automatically once the service is running again, but you probably do not want to auto-close disk space tickets when free space keeps dropping below the threshold and rising above it again; you'd want an employee to look at that case.

 

So we implemented tagging for certain host groups to define which kinds of problems should automatically close incidents and which should not.
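The title-based decision described above can be sketched roughly as follows. This is only an illustration, not our actual integration: the tag convention `autoclose:<problem title>`, the function name and the "Low disk space" title are assumptions made up for the example.

```python
# Hypothetical tag convention: a host group tag "autoclose:<problem title>"
# marks a problem type whose incidents may be closed automatically.
AUTOCLOSE_PREFIX = "autoclose:"


def should_auto_close(state: str, title: str, tags: list[str]) -> bool:
    """Return True if a notification with this state and title may close its incident."""
    if state != "RESOLVED":
        return False
    # Collect the problem titles that the tags allow to be auto-closed.
    allowed_titles = {
        tag[len(AUTOCLOSE_PREFIX):]
        for tag in tags
        if tag.startswith(AUTOCLOSE_PREFIX)
    }
    return title in allowed_titles


tags = ["autoclose:CPU saturation", "autoclose:Memory saturation"]
print(should_auto_close("RESOLVED", "CPU saturation", tags))   # True
print(should_auto_close("RESOLVED", "Low disk space", tags))   # False
```

A check like this only works as long as the title is a stable problem type rather than a free-form description.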

 

That worked, until OS service monitoring came around and used the title for actual descriptions. Since you can't specify the title for the OS services monitoring, we required a workaround for that.

 

Problem Notifications

In the ProblemDetailsJSON (i.e. the ProblemsV1 structure), we received rankedEvents, for example, carrying "customProperties" which told us the source was OS services monitoring:

STiAT_7-1664878309653.png

 

So, as a workaround for no longer being able to use the title, we used this "Event source" to check whether the event was an OS services monitoring event.
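A minimal sketch of that check against the V1 payload might look like this. I'm assuming here that customProperties arrives as a plain key/value object and that the key and value are spelled exactly "Event source" and "OS services monitoring"; both constants and the function name are illustrative and should be verified against a real notification.

```python
# Assumed key and value; take the exact strings from a real payload.
EVENT_SOURCE_KEY = "Event source"
OS_SERVICES_SOURCE = "OS services monitoring"


def is_os_service_problem_v1(problem_details: dict) -> bool:
    """Check rankedEvents[*].customProperties for the OS services event source."""
    for event in problem_details.get("rankedEvents") or []:
        # Events without customProperties are simply skipped.
        props = event.get("customProperties") or {}
        if props.get(EVENT_SOURCE_KEY) == OS_SERVICES_SOURCE:
            return True
    return False


# Made-up sample payload, reduced to the fields the check looks at.
sample = {
    "rankedEvents": [
        {"customProperties": {"Event source": "OS services monitoring"}},
    ]
}
print(is_os_service_problem_v1(sample))
```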

 

Now the V1 API is being deprecated, and I have no doubt the ProblemDetailsJSON in the notifications will be deprecated at some point too, so we intended to switch to the V2 API and the ProblemDetailsJSONv2.

 

The V2 API itself does offer something similar (albeit undocumented) in the evidenceDetails/details/data/properties structure:

STiAT_8-1664878356543.png
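Walking that path could be sketched as below. Since the structure is undocumented, treat every detail as an assumption: I'm supposing that details is a list, that properties is a list of objects with "key" and "value" fields, and the key/value strings in the sample are placeholders rather than confirmed names.

```python
def iter_evidence_properties(problem: dict):
    """Yield (key, value) pairs found under evidenceDetails/details/data/properties."""
    evidence = (problem.get("evidenceDetails") or {}).get("details") or []
    for item in evidence:
        # Not every evidence entry necessarily carries event data.
        data = item.get("data") or {}
        for prop in data.get("properties") or []:
            yield prop.get("key"), prop.get("value")


def has_event_property(problem: dict, key: str, value: str) -> bool:
    """Return True if any evidence entry carries the given key/value property."""
    return any(k == key and v == value for k, v in iter_evidence_properties(problem))


# Made-up sample payload; key and value strings are placeholders.
sample = {
    "evidenceDetails": {
        "details": [
            {"data": {"properties": [
                {"key": "Event source", "value": "OS services monitoring"},
            ]}},
        ]
    }
}
print(has_event_property(sample, "Event source", "OS services monitoring"))
```

Written this way, a payload without evidenceDetails simply yields no properties instead of raising, which matters if the same code also has to handle notification payloads where that part is absent.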

 

That is fine for the V2 API, but in the ProblemDetailsJSONv2 of the problem notifications it is missing (I used a different event here, since I had to pick one which actually triggered at the time of testing):

 

STiAT_9-1664878405452.png

 

This way, I cannot tell that the event was an actual "OS services monitoring" event, since there is no event source, and you broke the coherence of the title by doing something inconsistent with the rest.

 

Furthermore, I am reluctant to rely on the evidenceDetails/details/data/properties path, since it's undocumented.

 

I therefore request:

  • Provide us with something stable in the ProblemsV2 API and the problem notification (ProblemDetailsJSONv2) which reliably tells what type of event it is.
  • Alternatively, fix OS service monitoring to be coherent with the rest of the problem management in Dynatrace. For example, use something like "OS services in undesirable state" as the title and put the rest of the information into the details, as you do with all other problems. A title like "OS services in undesirable state", without any service name or state, would also make sense when you group different services into one problem.

Best regards,

Georg

1 REPLY 1

STiAT
Advisor

As a follow-up: in response to a support ticket/call, Dynatrace support enabled a feature which adds evidence details to the ProblemDetailsJSONv2, so we can work with that now.

 

This is a feature only Dynatrace employees can enable. That's not really desirable, but with the setting they enabled specifically for us it works for now (including evidenceDetails). I'd still prefer a general solution, since requiring Dynatrace support just to be able to identify OS service monitoring events can't be the way forward.