
Threshold values of the default incidents

ksk_natsume
Inactive

Could you tell me the threshold that is set for each item in the default incident list?
Is it described in the manual?

I want to know the reference values (what percentage, how many seconds, etc.).

Best Regards

Ryotaro Natsume

6 REPLIES

Hello Ryo,

I don't think there is a manual for choosing a timeframe or threshold (other than the documentation).

The correct practice is either to have the threshold requirements clarified by the business or the team, or to estimate them from the trend by plotting a week's data in Dynatrace. Open a new dashboard, plot the metric, and set the granularity (this is important). Look for anomalies (rises or falls) in the trend, then decide on the threshold and the timeframe duration.

For example:

1. Create a dip alert whenever the login count drops below 300 in 5 minutes.

2. Create a slowness alert when 2000+ PurePaths have a response time above 2 seconds in the last 5 minutes.
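
To make the two example rules concrete, here is a minimal Python sketch of the conditions they describe. It is not Dynatrace code or a Dynatrace API; the function and parameter names are hypothetical, and it assumes you already have the login count and the PurePath response times for the last 5-minute window.

```python
from typing import Sequence


def login_dip_alert(login_count: int, min_logins: int = 300) -> bool:
    """Rule 1: alert when the login count for the window is below min_logins."""
    return login_count < min_logins


def slowness_alert(
    response_times_s: Sequence[float],
    limit_s: float = 2.0,
    min_slow: int = 2000,
) -> bool:
    """Rule 2: alert when at least min_slow requests took longer than limit_s."""
    slow = sum(1 for t in response_times_s if t > limit_s)
    return slow >= min_slow


if __name__ == "__main__":
    # Hypothetical values for one 5-minute window.
    print(login_dip_alert(250))
    print(slowness_alert([2.5] * 2100 + [0.4] * 5000))
```

The numbers 300, 2 seconds, and 2000 are only the example values from this post; the real values should come from your own trend data or business requirements.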

Regards, Rajesh.

Cody_Kachelski
Dynatrace Helper

Hello Ryo,

Rajesh is right, and creating a custom alert is often the best option. Many of the default incidents are based on pre-defined conditions. However, the "response time degraded" and "failure rate" incidents for baselined business transactions can be adjusted by changing the violation detection settings of the individual business transaction.

ksk_natsume
Inactive

Thank you for your answer.

I just want to know the thresholds that are set for the incidents that exist by default.

For example, for "Host Memory Unhealthy": at what percentage of host memory usage is this incident raised?
Likewise:
"Host Disk Unhealthy": what percentage?
"Host Network Unhealthy": what percentage?
"Host CPU Unhealthy": what percentage?
"Response time degraded": how many seconds?
I couldn't edit the incident rules, so I couldn't see the thresholds,
and I couldn't find this information in the documentation.

Do I have to make new rules?

Best Regards

Ryotaro Natsume

Hi Ryo,

The out-of-the-box incidents are triggered by Dynatrace based on multiple detection rules, and these rules might not have explicit thresholds.

1. Host Memory Unhealthy: Triggered when a high number of page faults occurs, or when 10% or less of RAM is free, or when less than 1 GB is free.

2. Host Disk Unhealthy: The host has at least one disk with insufficient free space.

3. Host Network Unhealthy: Network traffic exceeds 90% of the Ethernet bandwidth.

4. Host CPU Unhealthy: CPU utilization is high (System Time > 15%) or the CPU load factor is high.

5. Response time degraded: The response time of a web request or business transaction (BT) is higher than the baseline. You can also set a static threshold, but it is better to leave it auto-baselined.
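
As a rough illustration only, the numeric conditions listed above could be written as the following Python sketch. This is not how Dynatrace implements these incidents; the `HostSample` type and its field names are hypothetical, and the rules without a number in this post (page faults, disk free space, CPU load factor) are left out rather than guessed.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # assumption: "1 GB" is treated as 1 GiB here


@dataclass
class HostSample:
    """Hypothetical snapshot of host metrics; not a Dynatrace data type."""
    ram_total_bytes: int
    ram_free_bytes: int
    network_bytes_per_s: float
    link_capacity_bytes_per_s: float
    cpu_system_pct: float


def memory_unhealthy(s: HostSample) -> bool:
    # 10% or less of RAM free, or less than 1 GB free.
    return s.ram_free_bytes <= 0.10 * s.ram_total_bytes or s.ram_free_bytes < GIB


def network_unhealthy(s: HostSample) -> bool:
    # Traffic exceeds 90% of the link bandwidth.
    return s.network_bytes_per_s > 0.90 * s.link_capacity_bytes_per_s


def cpu_unhealthy(s: HostSample) -> bool:
    # System time above 15%.
    return s.cpu_system_pct > 15.0


if __name__ == "__main__":
    sample = HostSample(16 * GIB, 1 * GIB, 50e6, 125e6, 20.0)
    print(memory_unhealthy(sample), network_unhealthy(sample), cpu_unhealthy(sample))
```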

Please look these up on the Incident Dashboard or in the documentation.

Also note that you can create separate custom alerts. Let me know if you need any further input.

Do I have to make new rules?

It depends on the requirements of your application environment. However, I would advise leaving the default incidents as they are. Most of the time these incidents work well.

BabarQayyum
Leader

Hello Ryotaro,

You can't edit or delete the built-in (default) incidents.

Follow the instructions from the link below to define new infrastructure thresholds:

https://community.dynatrace.com/community/display/...

Regards,

Babar