I have a question about the following situation: on Friday, 100 hosts were gracefully shut down and, since I had configured Dynatrace to open a problem even for this kind of event, 100 problems were opened.
Today (Monday) I noticed that all the problems had been closed 12 hours after the shutdown event on Friday, even though the hosts are still down.
Is this normal? Does Dynatrace close problems for graceful shutdowns after a certain amount of time even if the hosts are still down?
Can I change this behavior somewhere in the settings?
Perhaps Dynatrace decided that this was planned behaviour. In general, if the shutdown was graceful, a better idea is to configure a maintenance window to prevent false-positive alerting.
I needed that list of open problems to remember the 100 hosts that I shut down on Friday.
Sure, I could have set up a maintenance window, but I deliberately didn't, so that today I could go through every single problem tile and start rebooting all the hosts listed there.
Can I turn off this automagic problem management so that my problems stay open even after the hosts have been shut down for more than 12 hours?
No, but that is not really an issue. If you go to the host list, apply a filter that matches the hosts you shut down (by tag, by name, etc.), and set the timeframe to 72 hours, you will see all of them, both offline and online. That way you can see which of them still need a restart or an agent installation.
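For anyone who would rather script that check than click through the host list, here is a minimal sketch of the same idea. It assumes you have already exported the hosts into a list of records with a `name` and a timezone-aware `last_seen` timestamp. That record shape, the function name `hosts_still_offline`, and the 10-minute staleness threshold are all my own assumptions for illustration, not anything Dynatrace defines.

```python
from datetime import datetime, timedelta, timezone


def hosts_still_offline(hosts, now, window_hours=72, stale_minutes=10):
    """Return the names of hosts seen inside the window but not recently.

    `hosts` is a list of dicts with hypothetical keys "name" and
    "last_seen" (a timezone-aware datetime). This is not a Dynatrace
    data format, just an assumed export shape.
    """
    window_start = now - timedelta(hours=window_hours)
    stale_before = now - timedelta(minutes=stale_minutes)
    offline = []
    for host in hosts:
        last_seen = host["last_seen"]
        # Seen within the timeframe, but silent for longer than the threshold.
        if window_start <= last_seen < stale_before:
            offline.append(host["name"])
    return sorted(offline)


if __name__ == "__main__":
    monday = datetime(2024, 1, 8, 9, 0, tzinfo=timezone.utc)
    sample = [
        {"name": "host-a", "last_seen": datetime(2024, 1, 5, 18, 0, tzinfo=timezone.utc)},
        {"name": "host-b", "last_seen": monday - timedelta(minutes=2)},
    ]
    print(hosts_still_offline(sample, monday))
```

With a 72-hour window evaluated on Monday morning, a host last seen on Friday evening is reported, while one that checked in two minutes ago is not.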
Thank you for your time, Sebastian.
Yes that's by design, as documented here:
In previous versions we kept the problems open for 7 days, but many customers were annoyed by the open problems and asked for a shorter timeout of 12 hours for closing host-unavailable problems.
Alerts are sent out anyway, so it does not make much sense to keep those problems open forever if the host is not coming up again.
As Ben also stated, this is a big process problem for us with the ServiceNow integration. The problem closing after 12 hours also closes the ServiceNow incident, even if the server is still down. This is unacceptable from the perspective of our customers, who use ServiceNow to track the status and availability of their servers. We need the ability either to change the timeout to a much longer period, like 7 days ;-), or to never close the alert in ServiceNow unless the problem has actually been corrected.
It would be nice if this were configurable. In some scenarios it is nice that the problems time out and disappear, but in other scenarios that can be an issue.
One scenario where automatic closure after twelve hours causes problems is the ServiceNow integration. The host going offline creates an incident in ServiceNow, which is then closed after 12 hours. In some cases this prevents teams returning from the weekend from investigating these issues, and hosts are then left in an offline state.
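As a possible stopgap until this is configurable, the gap could be caught on your own side by comparing the problems that have closed against the hosts that are still offline. The sketch below only shows that comparison. The record shape (a `host` key per closed problem) and the function name `incidents_to_keep_open` are hypothetical, and it does not call Dynatrace or ServiceNow at all; how you fetch the two lists and what you do with the result is up to your own integration.

```python
def incidents_to_keep_open(closed_problems, offline_hosts):
    """Return hosts whose problem was closed although the host is still offline.

    `closed_problems` is a list of dicts with a hypothetical "host" key;
    `offline_hosts` is an iterable of host names currently known to be down.
    """
    offline = set(offline_hosts)
    # A set comprehension removes duplicates when a host has several closed problems.
    return sorted({p["host"] for p in closed_problems if p["host"] in offline})


if __name__ == "__main__":
    closed = [{"host": "host-a"}, {"host": "host-b"}, {"host": "host-a"}]
    still_down = ["host-a", "host-c"]
    print(incidents_to_keep_open(closed, still_down))
```

Any host in the result is one where the incident was closed by the timeout rather than by an actual recovery, so it is a candidate for reopening or follow-up.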