12 Jun 2024 09:36 AM
I'm trying to make a lookup as efficient as possible, so I need to pass a dynamic filter to the "fetch" command within the lookup.
Take this code:
fetch spans
| filter k8s.namespace.name == "webshop-frontend"
| filter trace.id == toUid("2ece9047533a31d3808c57165e511ad3")
// get service name
| fieldsAdd service.name = entityName(dt.entity.service)
// get the service name of the parent span service
| lookup [fetch spans
| filter trace.id == toUid("2ece9047533a31d3808c57165e511ad3")
],
sourceField:span.parent_id,
lookupField:span.id,
prefix: "parent."
The lookup's purpose is to get the service name of the parent span of a given span.id. (I'm basically trying to build a "service flow" with timing information between services.)
Now, as you can see in the code, I can do this for individual traces by providing the same trace ID in the query and in the lookup's fetch. However, this will not work without the single-trace context.
What I need is to limit the "fetch" within the lookup so that it only returns a small set of spans, namely the ones it is currently operating on.
Something like this (hypothetical syntax):
| lookup [fetch spans
| filter span.id == valueOf(sourceField)
],
sourceField:span.parent_id,
lookupField:span.id,
prefix: "parent."
Isn't that quite a common use case? Having access to an "outer" variable within the fetch?
kr
12 Jun 2024 07:46 PM
Generally, the lookup command's parameter executionOrder:leftFirst does what you need. From a user's perspective it can be described this way: it executes the left part of the query until the subquery results are needed, then injects the set of values of the sourceField: field as an in() condition on lookupField: in the subquery.
However, limits still apply, and when joining spans with spans this may not be enough.
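To make the description above concrete, here is a rough conceptual model of it in Python. It is not how Grail actually implements the command, just a sketch of the behaviour as described: the function lookup_left_first, the fetch_spans helper and the sample spans are all made up for illustration, and taking only the first matching record per key is an assumption.

```python
from typing import Callable, Dict, List, Optional

Span = Dict[str, Optional[str]]

# Hypothetical sample data standing in for the span store.
ALL_SPANS: List[Span] = [
    {"span.id": "a", "span.parent_id": None, "service": "frontend"},
    {"span.id": "b", "span.parent_id": "a", "service": "cart"},
    {"span.id": "c", "span.parent_id": "b", "service": "db"},
    {"span.id": "x", "span.parent_id": None, "service": "unrelated"},
]

fetched: List[Span] = []  # records what the subquery actually returned


def fetch_spans(predicate: Callable[[Span], bool]) -> List[Span]:
    """Stand-in for the subquery `fetch spans` with an injected filter."""
    rows = [s for s in ALL_SPANS if predicate(s)]
    fetched.extend(rows)
    return rows


def lookup_left_first(left: List[Span], source_field: str,
                      lookup_field: str, prefix: str) -> List[Span]:
    # 1. The left side has already been evaluated; collect its key values.
    wanted = {r[source_field] for r in left if r.get(source_field) is not None}
    # 2. Inject them as an in()-style condition on the subquery.
    right = fetch_spans(lambda s: s.get(lookup_field) in wanted)
    # 3. Index the subquery result (assumption: first match per key wins).
    index: Dict[Optional[str], Span] = {}
    for s in right:
        index.setdefault(s[lookup_field], s)
    # 4. Add the matched fields to each left record under the prefix.
    out = []
    for r in left:
        merged = dict(r)
        match = index.get(r.get(source_field))
        if match is not None:
            merged.update({prefix + k: v for k, v in match.items()})
        out.append(merged)
    return out


left_side = [s for s in ALL_SPANS if s["service"] == "cart"]
result = lookup_left_first(left_side, "span.parent_id", "span.id", "parent.")
print(result[0]["parent.service"], len(fetched))
```

In this toy model the subquery only returns the one parent span that the left side refers to, instead of all four spans. The real engine still has to scan the data in the timeframe to find those records, which is why the limits mentioned above remain relevant.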
I tested in my environment (1 min timeframe) and went from a query that was failing to a query scanning 50 GB and returning the first 1000 (default limit) spans.
My query for reference:
fetch spans
| filter k8s.namespace.name == "seg-index"
| lookup [ fetch spans ],
sourceField:span.parent_id,
lookupField:span.id,
executionOrder:leftFirst,
prefix: "parent."
13 Jun 2024 08:15 AM
Thanks @krzysztof_hoja, after a bit more reading I figured out the leftFirst parameter, and it improved things just as you described.
However, it is still limited to a very short timeframe (1-5 min in my case), using the join equivalent of the lookup:
fetch spans
| filter dt.entity.service == "SERVICE-8ACA58863A49B15E"
| fields trace.id, span.id, span.parent_id, dt.entity.service, duration
| fieldsAdd service.name = entityName(dt.entity.service, type:"dt.entity.service")
// get all caller spans
| join [fetch spans ],
kind: leftOuter,
executionOrder: leftFirst,
on: { left[span.parent_id] == right[span.id] },
prefix: "caller."
I'm wondering how else this use case could be achieved with DQL:
I want to know the response times between services, similar to what service flow shows. Not the response time of a service itself, but the response time for its caller services.