I am working on an AppMon environment where, if there's a problem with the application or host, that application or host is killed and then rebuilt. This is fine from an instrumentation perspective: the agents map and start collecting data successfully.
When this happens, though, and the issue caused the process to die, an agent disconnected (unexpected) alert is raised. The problem is that when the agent comes back up, it looks like it's on a different host. Especially during testing, where servers are frequently rebuilt, this fills the infrastructure overview with a large number of offline servers and leaves alerts that never close (because the agent, even though it came back up, appears to be on a different server and is therefore treated as a different agent).
Is there a good way to handle this situation? I'd prefer not to disable the alert, because it's still valuable to know when processes have had an issue and needed to be rebuilt. I suspect the offline servers will disappear after 72 hours - is there any way to reduce this time, maybe to 24 hours?