
How to set up incident rules to detect Windows services that are not running?

cheeyean_lee
Participant

I have a Windows service running, and Dynatrace is able to capture its PurePath activity. When an error occurs in my Windows service, the service stops working, but its status still shows "Running" in Services.msc. Every time this happens, I have to restart the service to get it running again.

I'm trying to set up an incident rule in Dynatrace so that it notifies me by email when this happens. How should I configure the incident rule for this scenario?

As shown in the attachment, I notice that "Size" shows "1" after the error happens.

windows-service-stopped.png


6 REPLIES

Radu
Dynatrace Pro

Hi Chee,

You could create an incident on PurePath size in the context of your transaction so that it alerts when the size drops to only 1 node, but it would be interesting to see what that one node is in those PurePaths. Can you share a screenshot?

Regards,

Radu


cheeyean_lee
Participant

Hi Radu,

Attached is a screenshot of the node. By the way, is there anywhere I can find the steps to set up an incident for this scenario?

purepath-node-detail.png


Radu
Dynatrace Pro

Hi Chee,

You can read our documentation on setting up incidents here.

Without knowing more about your app/service/transaction, I can only recommend an incident which uses 2 metrics: the invocation of the CheckStatus method and the PurePath Node Count. Set the upper severe threshold for the invocation measure to 1 and set the lower severe threshold for the node count measure also to 1 (so you have at least 1 invocation of CheckStatus and the PurePath is 1 node long).

This should alert you if your PurePaths aren't progressing due to service unavailability.

Alternatively, if you have a web-based check for your service (e.g. a URL-based endpoint which you can poll for status), you can set up a URL Monitor. If you have a script which can run a command to get the status of the service, you can also use the Generic Execution Plugin to run that command periodically and report on the status (then you can use that in your incident).
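
As a minimal sketch of the kind of status script mentioned above, assuming Python is available on the Windows host and the service is queried with the built-in `sc query` command (English-language output). The function names and the service name `MyService` are hypothetical, and how the Generic Execution Plugin consumes the result is not shown here. Keep in mind that, as described in the original question, Windows may still report RUNNING for a hung service, so an application-specific check might be needed instead.

```python
import re
import subprocess


def parse_service_state(sc_output: str) -> str:
    """Extract the state name (e.g. RUNNING, STOPPED) from `sc query` output."""
    # `sc query` prints a line such as: "        STATE              : 4  RUNNING"
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def query_service_state(service_name: str) -> str:
    """Run `sc query <service_name>` (Windows only) and return the parsed state."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
        check=False,  # a missing service should yield UNKNOWN, not an exception
    )
    return parse_service_state(result.stdout)


def exit_code_for(state: str) -> int:
    """Map a state to a process exit code: 0 when RUNNING, 1 otherwise."""
    return 0 if state == "RUNNING" else 1
```

On the Windows host, a wrapper could call `sys.exit(exit_code_for(query_service_state("MyService")))` so that a non-zero exit code signals a problem.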

Best regards,

Radu


cheeyean_lee
Participant

Hi Radu,

I tried creating the incident, but it triggered only the first time. Subsequent occurrences did not trigger an alert.

Can you please let me know if I have missed anything?

dynatrace-purepath.png

incident-setting.png


Hi Chee,

Apologies for this, I realised a flaw in the setup I suggested: these measures work independently. The invocation of checkStatus and the PurePath size are each checked on their own, not against the same PurePath.

What you need to do is create a Business Transaction which has these 2 measures in the filter section. That way you focus on just the checkStatus PurePaths with 1 node. Then, in your incident, use the PurePath Response Time measure created from this Business Transaction with the Count aggregation and the upper severe threshold set to 1 (so if the count of these 1-node PurePaths goes over 1, it will trigger). I would recommend an incident timeframe of no longer than 1 minute, as you want this to be a fairly sensitive incident and to be alerted immediately when this happens.
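
To illustrate the difference between the two setups, here is a small sketch in Python. It is only a model of the logic, not Dynatrace code: the `PurePath` class, its fields, and both function names are hypothetical. It shows how two independently evaluated conditions can both be met by different PurePaths, while a combined filter only counts PurePaths that meet both.

```python
from dataclasses import dataclass


@dataclass
class PurePath:
    """Hypothetical stand-in for a captured PurePath; not a Dynatrace API type."""
    methods: tuple
    node_count: int


def independent_checks(purepaths):
    """Evaluate each condition on its own across all PurePaths."""
    any_check_status = any("CheckStatus" in p.methods for p in purepaths)
    any_single_node = any(p.node_count == 1 for p in purepaths)
    return any_check_status and any_single_node


def combined_filter_count(purepaths):
    """Count PurePaths that call CheckStatus AND consist of a single node."""
    return sum(
        1 for p in purepaths
        if "CheckStatus" in p.methods and p.node_count == 1
    )


# A healthy CheckStatus PurePath plus an unrelated single-node PurePath.
healthy = [
    PurePath(methods=("CheckStatus", "DoWork"), node_count=5),
    PurePath(methods=("Ping",), node_count=1),
]
# The failing case: CheckStatus is invoked but the PurePath stops after 1 node.
failing = healthy + [PurePath(methods=("CheckStatus",), node_count=1)]
```

With the `healthy` data, the independent checks are both satisfied even though no single PurePath is affected, whereas the combined count stays at zero and only rises once a PurePath like the one added in `failing` appears.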

Let me know how this works.

Regards,

Radu


Hi Radu,

Thanks! I'm able to trigger the incident now.

By the way, how should we create a rule for when there is no activity from the agent within a certain period of time (no PurePath captured)?