cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

How to derive a field in DQL

susmita_ally
Participant

I have a problem where i need to check log content for job status. If the job status is STARTED and not ENDED yet then need to show as Running, and if the Job status is started and ended ok, then Completed, else if job status is Started and ended not ok then failed. Any help is appreciated. Thank you

12 REPLIES 12

dannemca
DynaMight Guru
DynaMight Guru

You should be able to achieve this with the Conditional IF:

https://docs.dynatrace.com/docs/platform/grail/dynatrace-query-language/functions/conditional-functi...

 

Site Reliability Engineer @ Kyndryl

I have tried below, it is only working for the first condition, but when its NOT OK, its not returning as Failed. Is it not how it should be done ?

fetch logs //, scanLimitGBytes: 500, samplingRatio: 1000
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
and matchesphrase(jobstatus, "STARTED")
|filter in(job,array($jobname))
|parse content , "TIMESTAMP('MMdd hhmmss'):Starttime"
|fields job, orderid, runno,Starttime,host.name
| join [fetch logs
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
| filter matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK")
|filter in(job,array($jobname))
|parse content , "TIMESTAMP('MMdd hhmmss'):EndTime"
|parse content , "LD 'elapsed -', LD:RunTime 'cpu'"

|fieldsadd Current_status =if(matchesphrase(jobstatus, "ENDED OK"), "Completed"),if(matchesphrase(jobstatus, "ENDED NOTOK"),"Failed",else: "Running")
|fields orderid,runno,jobstatus,Current_status,EndTime,RunTime

] , on: {orderid,runno}
|fields host.name, job, orderid, runno,right.jobstatus, right.Current_status, Starttime,right.EndTime, right.RunTime

Try this If condition:

data record(jobstatus="ENDED OK"), record(jobstatus="ENDED NOTOK"), record(jobstatus="Danne")
| fieldsadd Current_status =if((not matchesphrase(jobstatus, "ENDED OK")) and (not matchesphrase(jobstatus, "ENDED NOTOK")), "Running", else: (if(matchesphrase(jobstatus, "ENDED OK"), "Completed", else: "Failed")))

Results:

image.png

Regards.

Site Reliability Engineer @ Kyndryl

Thank you! it works as expected.

I'm trying to add an alert for the jobs with 'Running' status, to see if the job is running more than certain (for e.g 30 mins) time, then send an alert saying this job is long running. I'm not sure how can I achieve this. Any pointer would be very helpful.

First you will need to create a metric, following this procedure:

https://docs.dynatrace.com/docs/observe-and-explore/logs/lma-analysis/lma-log-metrics#create-metric-...

Then, once the metric starts to receive data points, you can create a metric event for alerting:

https://docs.dynatrace.com/docs/platform/davis-ai/anomaly-detection/metric-events

making sure to adjust the interval to 30 mins.

Site Reliability Engineer @ Kyndryl

@dannemca Thanks for the pointer, however, I need to do join to get the job run time for running jobs, and below is my query  -

| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*") and matchesphrase(jobstatus, "STARTED")
|parse content , "TIMESTAMP('MMdd hhmmss'):startts"
|fields startts,job, orderid, runno
| join [fetch logs
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
| filter matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK")
|fieldsadd Current_status =if((not matchesphrase(jobstatus, "ENDED OK")) and (not matchesphrase(jobstatus, "ENDED NOTOK")), "Running")
|fieldsadd duration= if (jobstatus == "Running", (unixSecondsFromTimestamp(now()) - unixSecondsFromTimestamp(startts)))
| fields orderid,runno,jobstatus,duration

] , on:{orderid, runno}
| fields orderid,runno,job,right.jobstatus,right.duration

 

However, I am not able to do the join condition in processor, so not able to get this duration for the log metrics. Is there another place where I can run this on a scheduled time to identify long running job and alert ? 

We are not able to create metrics directly from raw DQL (there is an idea for that).

In this case, you want to be notified if a job is still running, means jobstatus is not "ENDED OK" nor "ENDED NOTOK", right? So you can just set these conditions in the matcher, like:

 

 

matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*") and  (matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK"))

 

 

Then you can set some dimensions like timestamp, job, orderid ,runno and jobstatus.

You can use these dimensions to work on the metric selector and define the alerts rules by it.

Try and let us know.

Site Reliability Engineer @ Kyndryl

Unfortunately only matcher is not giving the intended result as we need to identify the duration from start time to now(). We are looking for a solution where we would run this dql in certain time and then get the job run time and if any of them have run time more than 30 mins (not ended yet) then send an alert. Would workflow be appropriate for this kind of scenario ?

Yep, that would be my next suggestion. You can create a workflow to run in a crontab way, that execute your DQL and then take some action based on its results.

Site Reliability Engineer @ Kyndryl

Thanks @dannemca, workflow solution is working as expected.  On a different log content, I am tryin gto parse the content like "REPLACE_WEB_SERVICE_UI=TRUE" with below parse dql : PARSE content, " DATA:propkey '=', DATA:propValue EOL;"

However, it does not seem to work. is there another way to find out the value from = to end of line ?

 

Try this:

data record(content="REPLACE_WEB_SERVICE_UI=TRUE")
| parse content, "LD:key '=' LD:value"

Result:

image (15).png

Regards.

Site Reliability Engineer @ Kyndryl

Thanks so much! works like a charm.

Featured Posts