Parse log data that contains \n (newline)

andre_vdveen
DynaMight Leader

Hi,

I've been struggling with the DQL processor definition syntax and just not winning...any suggestions on how to parse the items and their respective values for all the 'Order' items in the "message" part of this log?

{
"thread": "redacted::redacted",
"level": "INFO",
"loggerName": "de.hybris.redacted",
"time": "2025-07-16 05:45:14.751",
"endOfBatch": "false",
"message ": "OrderCountSentToday: 378\nOrderCountSentLast10Minutes: 66\nOrderCountErroredToday: 0\nOrderCountErroredLast10Minutes: 0\nOrderCountNonCancelledToday: 378\nOrderCountVsSent: 0\nAverageOrderSendTimeToday: 8.634920634920634\nAverageOrderSendTimeLast10Minutes: 7.848484848484849\nAverageOrderTimeWaitingToSend: 0.0\nTaskCount: 23\n",
"mdc": {
"CronJob": "(jobname) ",
"Tenant": ""
}
}

Also, what would the matching condition be, because anything I've tried other than 'true' doesn't work either.
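To make the parsing goal concrete, here is a minimal Python sketch (purely illustrative, not DQL): split the embedded message payload on newlines, then on ": ", to get key:value pairs.

```python
# Illustrative sketch of the parsing goal (not DQL):
# the "message " payload is newline-separated "Key: Value" lines.
message = (
    "OrderCountSentToday: 378\n"
    "OrderCountSentLast10Minutes: 66\n"
    "AverageOrderSendTimeToday: 8.634920634920634\n"
)

attrs = {}
for line in message.splitlines():
    if not line.strip():
        continue  # skip the empty trailing line
    key, _, value = line.partition(": ")
    attrs[key] = value

print(attrs["OrderCountSentToday"])  # prints 378 (as a string)
```

This is exactly the shape a KVP-style matcher has to recover: an alphanumeric key, a ": " separator, a value, and a newline terminator.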


GerardJ
Mentor

Hi @andre_vdveen 
In such case, I would use DPL with KVP, like this :
| parse content, """DATA "message \": \"" LD:message "\"""""
| parse message, """KVP{ALNUM:key ':' LD:value '\\n'}:attr"""

For matching condition, it really depends how this log is collected and if you can rely on "log.source" or other added attributes
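As an illustration only (the field name here is an assumption based on the sample log, not something confirmed for this environment), a matching condition could key off an attribute that is stable for these logs, for example:

```
matchesValue(loggerName, "de.hybris.*")
```

If log.source or another ingest-time attribute is reliably populated, matching on that is usually safer than matching on content.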

Gerard

Hi @GerardJ 
Thank you for the suggestion; I tested it but it doesn't parse the content of what's contained in "message " into key:value pairs as I expected.

Did it work for you when you tested it?

It seemed to work, I got this result (raw response):

[screenshot: raw response showing the parsed key:value pairs]

Gerard

Thanks for sharing the result - is that in a Notebook?
I tried it in OpenPipeline; it doesn't seem to work there.

 

[screenshot: OpenPipeline attempt with no parsed output]

 

Yes, I did it in a notebook and took the whole log payload as content, so the behavior is not the same. I also probably made a bad copy/paste, because the JSON parsing didn't work...
To do it in a pipeline, I had to use a workaround with an extra field, because of the space at the end of the message field name:

[screenshot: pipeline workaround using an extra field]

 

Gerard

Ah thank you so much, @GerardJ, that works! 😄
Here's my complete processor definition, hopefully it helps someone else in future too.

fieldsAdd message2=`message `
| parse message2, """KVP{ALNUM:key ':' LD:value EOL}:attr"""
| fieldsFlatten attr

 

andre_vdveen
DynaMight Leader

Either I'm just too dumb to understand the DQL syntax, or there is a discrepancy in how the pipeline processes the log data. 

When I use the DQL provided, both the 'message2' and 'attr' fields contain nothing when I check in a Notebook.

[screenshot: notebook query with empty message2 and attr fields]

I assume that's the reason my value metric extraction rules in the pipeline don't work either, even though the preview in the pipeline shows the 'message2' and 'attr' fields containing the data I'm looking for, and flattening 'attr' returns the fields and values I'm after.
ID: pipeline_redacted_8335

[screenshot: metric extraction rule in the pipeline]

 

It wouldn't be the first time there's been a difference between "run sample data" and what's actually available at the end of processing, but for me it's quite rare.
Just to make sure I understand, are the attr.* fields that result from fieldsFlatten also absent or null?
Have you tried testing the same processing as the pipeline in a notebook on these logs to see if it works?

Gerard

I did try, but it doesn't seem to do the same in Notebooks as in the pipeline; I guess the attr.* fields are not showing up because attr doesn't contain anything?

[screenshot: notebook result without attr.* fields]

 

Yes, you're right about the attr.* fields not showing up if attr is null.
Did you check whether, in the log content, the message field still has the same name with a space at the end?

Gerard

Just sanity checking, thanks for confirming I'm not losing my marbles completely, haha! 😉
Yes, the space behind "message " is still there in the actual logs.

What would be the best option to get to the bottom of this: open a chat or a support ticket? It's getting a little urgent; the client is looking for the values in log metrics and I've been struggling with this for too long now 😞 

From your example in the notebook, I think the log record is different from what is used as a sample in the pipeline. In the notebook, the `message ` field is inside the content field, but in the pipeline we parse it as a field of the log record.
So you probably need to add these two steps to extract the message:

| parse content, """JSON:parse"""
| fields message2=parse[`message `]

before parsing message2.
Also, use the raw response view in the notebook to copy/paste a complete log record and add it as a sample in OpenPipeline.

Gerard

Your help is very much appreciated, @GerardJ! I would be dead in the water without it. :thankyou:

I finally managed to get it to work, i.e. create the metrics - the DQL you gave helped, but I had to make one small change, which ended up giving me the result I wanted: metrics!

The change I made, based on a colleague's observation and suggestion, was to use fieldsAdd and not just fields when creating the field message2. I also removed the | fields message2 line, as that also seemingly interfered with the metric creation.

Here's the final DQL processor definition:

parse content, """JSON:parse"""
| fieldsAdd message2=parse[`message `]
| parse message2, """KVP{ALNUM:key ': ' FLOAT:value EOL}:attr"""
| fieldsAdd attr
| fieldsFlatten attr
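Outside of DQL, the same flow can be sketched in Python (an illustrative analogy, not the pipeline itself): JSON-parse the content, read the awkwardly named "message " field (trailing space included), then split it into numeric key:value pairs.

```python
import json

# Illustrative Python analogy of the final DQL processor:
# 1) JSON-parse the raw content,
# 2) read the field literally named "message " (note the trailing space),
# 3) extract key: value pairs, converting values to floats (FLOAT:value).
content = (
    '{"level": "INFO", '
    '"message ": "OrderCountSentToday: 378\\nTaskCount: 23\\n"}'
)

record = json.loads(content)
message2 = record["message "]  # trailing space is part of the field name

attr = {}
for line in message2.splitlines():
    key, _, value = line.partition(": ")
    if key and value:
        attr[key] = float(value)

print(attr)  # {'OrderCountSentToday': 378.0, 'TaskCount': 23.0}
```

The trailing space is the whole trap here: a plain `message` lookup silently misses the field, which is why the DQL needs the backtick-quoted `` `message ` `` accessor.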

 

Glad it works! And you're absolutely right, it was an arbitrary and bizarre choice on my part to use the fields command 🤔 especially since I never use it in pipelines 😅
It's better to use fieldsAdd in pipeline processing, as it adds fields while preserving the existing ones, which is important for maintaining log context or using those fields as metric dimensions.
You could also remove the extracted fields parse and message2, as after parsing you probably won't need them anymore:

| fieldsRemove parse, message2

 

Gerard
