07 Dec 2021 11:21 AM - last edited on 30 Mar 2022 12:35 AM by MaciejNeumann
I would like to create an alert for when a process has crashed.
Has anyone done something similar?
Does Process group availability monitoring and alerting work?
Yes, it does, provided you have the following set:
The associated Alert Profile will then trigger the alert integration and send out an alert as you have set it up.
In our use case we have processes which are not running all the time, so it's normal that they become "unavailable" by design.
On the Processes -> RandomProcess.exe page there is this beautiful "Events" graph, and it would be nice if we could use it for custom metrics. The same goes for the graph in the "Application & Microservices" - "Profiling and optimization" - "Crashes" section.
Availability monitoring doesn't help in a scenario where there are multiple worker processes and we want to alert when any of the workers crashes. We can see the crashes in DT, but availability does not help here. It would be nice to separate availability events caused by crashes anyway.
This can be achieved with the Events API: a script calls the Events API periodically and limits the results to crash events only. In my environment I am monitoring a large number of servers with more than 600 applications, and most of the crash events actually come from unimportant processes which are only executed on an ad hoc basis. To filter out the noise, I go one step further and check the crashed process's uptime using the time-series CPU usage metric, which lets me discard the false alarms.
Please refer to the attached picture below to understand how it works. Hope it helps.
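
For anyone who wants to try the same idea, here is a rough Python sketch of the two steps described above: query the Events API for crash events only, then drop crashes of processes that were barely running. It is not the script from my environment. The event type name "PROCESS_CRASHED", the use of the `/api/v2/events` endpoint with an `eventSelector`, the assumed layout of the response, the `min_samples` threshold and the `cpu_by_entity` mapping (CPU usage data points per process, which you would fetch separately from the metrics API) are all assumptions to verify against your own tenant.

```python
import json
import urllib.parse
import urllib.request

# Assumed event type name for process crashes; verify it in your environment.
CRASH_EVENT_TYPE = "PROCESS_CRASHED"


def build_events_url(base_url, lookback="now-5m"):
    """Build an Events API query URL that is limited to crash events."""
    query = urllib.parse.urlencode({
        "from": lookback,
        "eventSelector": f'eventType("{CRASH_EVENT_TYPE}")',
    })
    return f"{base_url.rstrip('/')}/api/v2/events?{query}"


def fetch_crash_events(base_url, api_token, lookback="now-5m"):
    """Call the Events API once and return the list of crash events."""
    request = urllib.request.Request(
        build_events_url(base_url, lookback),
        headers={"Authorization": f"Api-Token {api_token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("events", [])


def entity_id_of(event):
    """Extract the affected entity ID (assumed response layout)."""
    return event.get("entityId", {}).get("entityId", {}).get("id")


def was_long_running(cpu_samples, min_samples=30):
    """Treat a process as long-running if it reported CPU usage often enough.

    cpu_samples is a list of data points where None means "no data".
    """
    reported = [value for value in cpu_samples if value is not None]
    return len(reported) >= min_samples


def crashes_worth_alerting(events, cpu_by_entity, min_samples=30):
    """Keep only crash events of processes that had a meaningful uptime."""
    return [
        event for event in events
        if was_long_running(cpu_by_entity.get(entity_id_of(event), []), min_samples)
    ]
```

You would run `fetch_crash_events` on a schedule (cron, for example), build `cpu_by_entity` for the affected processes, and send an alert for whatever `crashes_worth_alerting` returns. The threshold of 30 data points is arbitrary, so tune it to how long your ad hoc processes normally live.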