When a Dynatrace server loses its connection to Mission Control for a longer time, this raises an event on the cluster. In general, the server can work without this connection for 2 weeks, so you should be able to see the event when you log in to the cluster. But I think such an email should be sent from the cluster itself, not from Mission Control. I'm not sure whether such an email is actually sent, because I haven't had such issues.
The cluster sends a notification to the cluster admin if connectivity to Mission Control is down for approximately 1 hour.
About the first question: if the cluster or a cluster node goes down, I'm afraid alerting is not possible at the moment, as Mission Control does not alert customers. But I've never seen a cluster node crash or go down unintentionally. Even failed upgrades restore the cluster to the previous working state.
Thanks for the input, appreciate it.
Yes, for the event of losing connectivity to Mission Control, we can indeed find out about it.
How about the event of Dynatrace Managed itself being down?
There was one time when Dynatrace Managed wasn't accessible (the web browser reported "connection refused" even though all services were running fine), and it turned out we had to restart all the Java processes related to Dynatrace Managed before the problem was resolved.
Luckily this only happened once, but the customer raised an interesting question: "Of course we can't know about it during the incident, but at least we have to know about it after the fact. Think of AppMon, for example: of course it couldn't alert while it was down, but at least after it recovered we could see from the 'incidents' dashboard when the FrontEndServer and/or Server was down."
Well, I can confirm that you get a notification for failed upgrades. So far I have not encountered a situation where a cluster node could not recover by itself. Of course, that may happen (hardware problems, OS problems), but I think you have to monitor this yourself by other means (for example, a simple shell script).
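To make the "monitor it yourself" idea a bit more concrete, here is a minimal sketch, written in Python rather than shell. It is only an illustration under assumptions: the function names, the host name, port 443, and the log file are all made up for this example and are not anything Dynatrace provides. It tries a TCP connection to the web UI and appends a line to a log file whenever the state changes, so the down and up times can be read back after the incident.

```python
import socket
import time
from datetime import datetime, timezone


def check_tcp(host, port, timeout=5.0):
    """Try to open a TCP connection; return (is_up, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "ok"
    except ConnectionRefusedError:
        # The symptom described above: the host answers but nothing listens.
        return False, "connection refused"
    except socket.timeout:
        return False, "timeout"
    except OSError as exc:
        return False, f"error: {exc}"


def record_transition(log_path, was_up, is_up, detail):
    """Append a timestamped line only when the state changed.

    Returns True if a line was written. was_up=None means "no previous
    state", so the very first check is always logged.
    """
    if was_up == is_up:
        return False
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    state = "UP" if is_up else "DOWN"
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(f"{stamp} {state} {detail}\n")
    return True


def watch(host, port, log_path, interval=60.0):
    """Poll forever; host, port and log_path are placeholders to adapt."""
    was_up = None
    while True:
        is_up, detail = check_tcp(host, port)
        record_transition(log_path, was_up, is_up, detail)
        was_up = is_up
        time.sleep(interval)
```

Something like `watch("dynatrace.example.com", 443, "dt_availability.log")` would need to run on a machine other than the cluster nodes, otherwise it goes down together with the thing it watches. A plain TCP check like this would only catch the "connection refused" kind of failure; it says nothing about whether the UI behind the port is actually healthy.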
If your customer is really worried about this situation, they should probably go for SaaS 🙂