Re: How to use regex with DQL?

jegron · ‎12 Nov 2024

Hi,

I'd like to build a complex regex in DQL. I can't use the same syntax as OneAgent's masking rule, so I tested the parse command:

fetch logs
| parse content, "DATA ([:space:]|[:punct:])([12][0-9]{2}[0][1-9][0-9]{2}[0-9]{3}[0-9]{3}[0-9]{2}):myfield1([:space:]|[:punct:])"
| parse content, "DATA ([:space:]|[:punct:])([12][0-9]{2}[1][0-2][0-9]{2}[0-9]{3}[0-9]{3}[0-9]{2}):myfield2([:space:]|[:punct:])"
| parse content, "DATA ([:space:]|[:punct:])([12][0-9]{2}[0][1-9][2][A-B][0-9]{3}[0-9]{3}[0-9]{2}):myfield3([:space:]|[:punct:])"
| parse content, "DATA ([:space:]|[:punct:])([12][0-9]{2}[1][0-2][2][A-B][0-9]{3}[0-9]{3}[0-9]{2}):myfield4([:space:]|[:punct:])"
| fields myfield=coalesce(myfield1, myfield2, myfield3, myfield4), content, log.source, k8s.namespace.name
| filterOut isNull(myfield)

or

fetch logs
| parse content, "DATA ([:space:]|[:punct:])(([12][0-9]{2}[0][1-9][0-9]{2}[0-9]{3}[0-9]{3}[0-9]{2}):myfield|([12][0-9]{2}[1][0-2][0-9]{2}[0-9]{3}[0-9]{3}[0-9]{2}):myfield|([12][0-9]{2}[0][1-9][2][A-B][0-9]{3}[0-9]{3}[0-9]{2}):myfield|([12][0-9]{2}[1][0-2][2][A-B][0-9]{3}[0-9]{3}[0-9]{2}):myfield)([:space:]|[:punct:])"
| fields myfield, content, log.source, k8s.namespace.name
| filterOut isNull(myfield)

But this costs "four DQL queries" (one for each alternative), which is not acceptable to the customer because of the high volume.
What's the best practice for complex regexes? Is there a roadmap for improving the use of regexes with DQL? Or to harmonize the syntax used in the platform?

Thanks for your help 🙂

Observability Engineer at Phenisys - Dynatrace Professional

krzysztof_hoja · ‎13 Nov 2024

Applying multiple parsing rules to the same data does not increase query cost. The cost depends only on how much data you need to access.
Of course, heavy parsing will affect query performance, but using a regex will not improve anything here.
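
For readers comparing the patterns in the question: the four alternatives differ only in the month part (01-09 or 10-12) and in the two characters that follow it (two digits, or 2A/2B), so they describe one pattern with two small alternations. The sketch below shows that combined pattern in Python's re syntax. It is not DPL and is not meant to run inside DQL; the delimiter class built from string.punctuation and the name extract_id are assumptions made for illustration only.

```python
import re
import string

# One whitespace or ASCII punctuation character. This is an assumed stand-in
# for the ([:space:]|[:punct:]) groups used in the question's patterns.
DELIM = r"[\s" + re.escape(string.punctuation) + r"]"

# Single pattern covering all four alternatives from the question:
#   [12]               first character
#   [0-9]{2}           two digits
#   0[1-9] | 1[0-2]    month part, 01-09 or 10-12
#   [0-9]{2} | 2[AB]   two digits, or 2A / 2B
#   [0-9]{3}[0-9]{3}[0-9]{2}   remaining eight digits
ID_RE = re.compile(
    DELIM
    + r"([12][0-9]{2}(?:0[1-9]|1[0-2])(?:[0-9]{2}|2[AB])[0-9]{3}[0-9]{3}[0-9]{2})"
    + DELIM
)


def extract_id(content):
    """Return the first delimited identifier found in content, or None."""
    match = ID_RE.search(content)
    return match.group(1) if match else None
```

As in the original patterns, a delimiter is required on both sides, so an identifier at the very start or end of the content is not matched by this sketch.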