Custom alerts for log monitoring

jc__
Frequent Guest

Hi

I would like to create a custom alert for log monitoring that is raised when more than one log event occurs in a day. I tried to configure the custom event to raise an error when the metric exceeds a threshold of 1 over the whole day, which is 1440 minutes. However, the maximum value for the period is 60 minutes. Can anyone advise me on this issue?

Thank you.


dannemca
DynaMight Guru

Correct me if I am wrong, but if you get an alert within 1h, it means you got an alert that day, and you will need to be notified about it anyway.

The only problem with using 1h instead of 24h is that you may end up receiving more than one alert per day, which may indicate that the system you are monitoring is not healthy and requires attention.

You can also work with metric transformations, for example limiting the data points to the last day with :timeshift(-1d) and then combining the data into a single point with :fold.

Example:

your.custom.metric.for.log:timeshift(-1d):fold

Try and let us know.

Site Reliability Engineer @ Kyndryl

jc__
Frequent Guest

Hi dannemca ,

Thank you for responding.

I might not have made myself clear in my question.

The requirement:

Raise an alert if the metric is above the static threshold of 1 in two one-minute slots during a day (24 hrs).

The scenario:

Our client has a server that restarts once every day, so one "initialized" keyword is expected in the log each day. If the "initialized" keyword appears more than once in a day, an alert should be raised so that our client can look into the issue.
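As an illustration of the rule described above, here is a minimal plain-Python sketch. It is not Dynatrace configuration: the function names and the timestamps are hypothetical, and it only shows the intended logic of counting "initialized" events in the trailing 24 hours and alerting when the count exceeds 1.

```python
from datetime import datetime, timedelta


def restarts_in_window(event_times, now, window=timedelta(hours=24)):
    """Count events whose timestamp falls in the window (now - window, now]."""
    start = now - window
    return sum(1 for t in event_times if start < t <= now)


def should_alert(event_times, now, threshold=1):
    """Alert only when the trailing-24h count exceeds the threshold."""
    return restarts_in_window(event_times, now) > threshold


# Hypothetical timestamps of log lines containing "initialized".
now = datetime(2022, 11, 25, 12, 0)
normal_day = [datetime(2022, 11, 25, 3, 0)]
bad_day = [datetime(2022, 11, 25, 3, 0), datetime(2022, 11, 25, 9, 30)]

print(should_alert(normal_day, now))  # False: one restart is expected
print(should_alert(bad_day, now))     # True: two restarts within 24 hours
```

The point of the sketch is that the evaluation window has to cover the full 24 hours; a 60-minute window would never see both events in the second case.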

May I know if there is any way to achieve the above requirement?

Thank you.

Hi @dannemca! Do you know if the alerting requirement that jc__ described above is actually possible to achieve? Thank you in advance for your help!

Tom_Eaton
Dynatrace Advisor

You could look into a solution similar to one I have seen before:

  1. Create a log metric that finds the log messages containing 'initialized' - use matchesPhrase.
  2. Then create an SLO using a metric selector with default(0, always) for your log metric - this gives every 1-minute sample a default value of 0 when data is missing.

I have seen this used in a similar scenario: a batch job wrote a log line once a day, and the goal was to detect a day when it was not written, so someone used an SLO to look back 25 hours.

This might be a possible avenue to explore.
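To illustrate the idea behind default(0, always) combined with a 25-hour lookback, here is a minimal plain-Python sketch. It does not call any Dynatrace API; `fill_missing`, the sparse sample dictionary, and the minute indices are hypothetical and only model what "give every missing 1-minute sample a 0" means for a later sum.

```python
def fill_missing(samples, total_minutes, default=0):
    """Return one value per minute slot, using `default` where no sample exists."""
    return [samples.get(minute, default) for minute in range(total_minutes)]


LOOKBACK_MINUTES = 25 * 60  # 25-hour lookback, as in the batch-job example

# Hypothetical sparse metric: a single 'initialized' match at minute 180.
sparse = {180: 1}
series = fill_missing(sparse, LOOKBACK_MINUTES)

print(len(series))  # 1500
print(sum(series))  # 1
```

With the gaps filled, the sum over the lookback window is always defined: 0 would mean the daily log line was never written, and anything above 1 would mean more than one restart.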

@dannemca - timeshift only moves the timeframe; it shifts the window rather than widening it. So -1d will not give you the last 24 hours; it just moves the current date and time back by one day, e.g. 31st March becomes 30th March.
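The difference between shifting and widening a window can be sketched in plain Python. This is only a model of the behaviour described above, not Dynatrace code; `shift_window` and the year in the example dates are hypothetical.

```python
from datetime import datetime, timedelta


def shift_window(start, end, shift):
    """Move both ends of a timeframe by the same offset; the length is unchanged."""
    return start + shift, end + shift


# Hypothetical 1-hour timeframe on 31 March.
start = datetime(2023, 3, 31, 10, 0)
end = datetime(2023, 3, 31, 11, 0)

shifted_start, shifted_end = shift_window(start, end, timedelta(days=-1))

print(shifted_start, shifted_end)   # 2023-03-30 10:00:00 2023-03-30 11:00:00
print(shifted_end - shifted_start)  # 1:00:00
```

The shifted window is still one hour long; it simply sits on 30 March instead of 31 March.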
