05 Feb 2024 05:05 PM - last edited on 07 Feb 2024 10:57 AM by MaciejNeumann
I have a problem where i need to check log content for job status. If the job status is STARTED and not ENDED yet then need to show as Running, and if the Job status is started and ended ok, then Completed, else if job status is Started and ended not ok then failed. Any help is appreciated. Thank you
Solved! Go to Solution.
05 Feb 2024 05:31 PM
You should be able to achieve this with the Conditional IF:
06 Feb 2024 05:25 PM
I have tried below, it is only working for the first condition, but when its NOT OK, its not returning as Failed. Is it not how it should be done ?
fetch logs //, scanLimitGBytes: 500, samplingRatio: 1000
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
and matchesphrase(jobstatus, "STARTED")
|filter in(job,array($jobname))
|parse content , "TIMESTAMP('MMdd hhmmss'):Starttime"
|fields job, orderid, runno,Starttime,host.name
| join [fetch logs
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
| filter matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK")
|filter in(job,array($jobname))
|parse content , "TIMESTAMP('MMdd hhmmss'):EndTime"
|parse content , "LD 'elapsed -', LD:RunTime 'cpu'"
|fieldsadd Current_status =if(matchesphrase(jobstatus, "ENDED OK"), "Completed"),if(matchesphrase(jobstatus, "ENDED NOTOK"),"Failed",else: "Running")
|fields orderid,runno,jobstatus,Current_status,EndTime,RunTime
] , on: {orderid,runno}
|fields host.name, job, orderid, runno,right.jobstatus, right.Current_status, Starttime,right.EndTime, right.RunTime
06 Feb 2024 06:01 PM
Try this If condition:
data record(jobstatus="ENDED OK"), record(jobstatus="ENDED NOTOK"), record(jobstatus="Danne")
| fieldsadd Current_status =if((not matchesphrase(jobstatus, "ENDED OK")) and (not matchesphrase(jobstatus, "ENDED NOTOK")), "Running", else: (if(matchesphrase(jobstatus, "ENDED OK"), "Completed", else: "Failed")))
Results:
Regards.
06 Feb 2024 06:52 PM
Thank you! it works as expected.
I'm trying to add an alert for the jobs with 'Running' status, to see if the job is running more than certain (for e.g 30 mins) time, then send an alert saying this job is long running. I'm not sure how can I achieve this. Any pointer would be very helpful.
06 Feb 2024 06:58 PM
First you will need to create a metric, following this procedure:
Then, once the metric starts to receive data points, you can create a metric event for alerting:
https://docs.dynatrace.com/docs/platform/davis-ai/anomaly-detection/metric-events
making sure to adjust the interval to 30 mins.
06 Feb 2024 08:40 PM
@dannemca Thanks for the pointer, however, I need to do join to get the job run time for running jobs, and below is my query -
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*") and matchesphrase(jobstatus, "STARTED")
|parse content , "TIMESTAMP('MMdd hhmmss'):startts"
|fields startts,job, orderid, runno
| join [fetch logs
| filter matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*")
| filter matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK")
|fieldsadd Current_status =if((not matchesphrase(jobstatus, "ENDED OK")) and (not matchesphrase(jobstatus, "ENDED NOTOK")), "Running")
|fieldsadd duration= if (jobstatus == "Running", (unixSecondsFromTimestamp(now()) - unixSecondsFromTimestamp(startts)))
| fields orderid,runno,jobstatus,duration
] , on:{orderid, runno}
| fields orderid,runno,job,right.jobstatus,right.duration
However, I am not able to do the join condition in processor, so not able to get this duration for the log metrics. Is there another place where I can run this on a scheduled time to identify long running job and alert ?
06 Feb 2024 08:55 PM - edited 06 Feb 2024 08:56 PM
We are not able to create metrics directly from raw DQL (there is an idea for that).
In this case, you want to be notified if a job is still running, means jobstatus is not "ENDED OK" nor "ENDED NOTOK", right? So you can just set these conditions in the matcher, like:
matchesValue(log.source, "/opt/controlm/ctmagent/ctm/dailylog/daily_ctmag_*") and (matchesphrase(jobstatus, "ENDED OK") OR matchesphrase(jobstatus, "ENDED NOTOK"))
Then you can set some dimensions like timestamp, job, orderid ,runno and jobstatus.
You can use these dimensions to work on the metric selector and define the alerts rules by it.
Try and let us know.
06 Feb 2024 10:28 PM
Unfortunately only matcher is not giving the intended result as we need to identify the duration from start time to now(). We are looking for a solution where we would run this dql in certain time and then get the job run time and if any of them have run time more than 30 mins (not ended yet) then send an alert. Would workflow be appropriate for this kind of scenario ?
07 Feb 2024 12:21 PM
Yep, that would be my next suggestion. You can create a workflow to run in a crontab way, that execute your DQL and then take some action based on its results.
09 Feb 2024 05:49 PM
Thanks @dannemca, workflow solution is working as expected. On a different log content, I am tryin gto parse the content like "REPLACE_WEB_SERVICE_UI=TRUE" with below parse dql : PARSE content, " DATA:propkey '=', DATA:propValue EOL;"
However, it does not seem to work. is there another way to find out the value from = to end of line ?
09 Feb 2024 05:57 PM
Try this:
data record(content="REPLACE_WEB_SERVICE_UI=TRUE")
| parse content, "LD:key '=' LD:value"
Result:
Regards.
09 Feb 2024 06:15 PM
Thanks so much! works like a charm.