We have a few services instrumented with both the Dynatrace agent and manual OTel instrumentation. These services run mainly in Fargate containers and Lambdas; the runtime is Node.js.
Telemetry from the Dynatrace agent is sent to the Dynatrace UI, whereas telemetry from the manual OTel instrumentation of the same services is sent to a different visualization layer (Lightstep).
We have noticed that enabling the Dynatrace feature 'Forward Tag 4 trace context extension' overrides the OTel trace ID and replaces it with the Dynatrace trace ID. According to support, this feature is mandatory for Dynatrace to connect AWS services. This is a big red flag for running Dynatrace and OTel in parallel. Because of this, our leadership team is looking to remove the Dynatrace agents, so the issue has escalated. Has anyone encountered the same issue, and were you able to resolve it?
Here is our architecture:
Below is the documentation we followed to instrument AWS API Gateway non-proxy integration:
Below is the dynatrace feature we have enabled.
If two monitoring tools are active at the same time in the same process but work independently of each other, it is mostly impossible to avoid a single operation resulting in two traces.
For example, when a span is created via the OpenTelemetry API, it does not consider the state of OneAgent at that moment and creates a new traceId, even if OneAgent has already started a trace.
To continue a trace in another process, both monitoring tools inject and extract their own view of the trace via propagation headers, e.g. W3C tracecontext or B3. If both tools use the same propagation headers, conflicts may arise, even though OneAgent tries to avoid overwriting headers set by OpenTelemetry.
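To make the conflict concrete, here is a minimal sketch in plain TypeScript with no tracing library involved. It assumes the simplified worst case where two tools each hold their own trace ID and both write the W3C `traceparent` header of the same outgoing request. The IDs and the helpers `traceparent` and `traceIdOf` are hypothetical and only illustrate the header format; this is not how OneAgent or the OpenTelemetry SDK is implemented.

```typescript
// Sketch only: two independent tracers inject into the same outgoing headers.
type Headers = Record<string, string>;

// Build a W3C traceparent value: version-traceid-parentid-flags.
function traceparent(traceId: string, spanId: string, sampled: boolean): string {
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

// Read the trace ID back out of a traceparent value (undefined if malformed).
function traceIdOf(value: string | undefined): string | undefined {
  const m = /^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$/.exec(value ?? "");
  return m ? m[1] : undefined;
}

// Hypothetical IDs: each tool started its own trace for the same operation.
const otelTraceId = "0af7651916cd43dd8448eb211c80319c";
const agentTraceId = "4bf92f3577b34da6a3ce929d0e0e4736";

const outgoing: Headers = {};
// Tool A injects first ...
outgoing["traceparent"] = traceparent(otelTraceId, "b7ad6b7169203331", true);
// ... then tool B writes the same header key, replacing tool A's value.
outgoing["traceparent"] = traceparent(agentTraceId, "00f067aa0ba902b7", true);

// The downstream process can only see one trace ID: the last one written.
const seenDownstream = traceIdOf(outgoing["traceparent"]);
console.log(seenDownstream === otelTraceId ? "OTel trace continues" : "OTel trace ID was replaced");
```

Because a header key holds a single value, whichever tool writes last decides which trace the next process continues.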
The OneAgent feature 'Forward Tag 4 trace context extension' is needed in this setup to ensure that a trace started by a OneAgent code module is continued by the Dynatrace AWS Lambda layer. If the feature is not enabled, OneAgent does not set traceId/spanId on the propagation headers, which results in a broken OneAgent trace.
The OneAgent layer for AWS lambda has some limitations compared to other OneAgent code modules:
As a result, you end up having to decide whether the OneAgent or the LightStep trace "wins" when a trace is continued.
There might be a workaround: if OpenTelemetry is configured to *not* use W3C tracecontext headers for propagation (e.g. it uses B3 instead) and DT_OPEN_TELEMETRY_ENABLE_INTEGRATION is kept disabled, OneAgent might keep the OneAgent trace intact while LightStep keeps the LightStep trace intact.
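The idea behind the workaround can be sketched as follows. This is a dependency-free illustration, assuming the B3 multi-header format (`x-b3-traceid`, `x-b3-spanid`, `x-b3-sampled`); the helpers `injectB3` and `extractB3` and all IDs are hypothetical. In a real Node.js service the OpenTelemetry SDK's B3 propagator (package `@opentelemetry/propagator-b3`, or the `OTEL_PROPAGATORS` environment variable) would typically do this work rather than hand-written code.

```typescript
// Sketch only: OTel context travels in B3 headers, leaving traceparent untouched.
type Headers = Record<string, string>;

interface SpanIds {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// Write the context into B3 multi-headers (lowercase, as Node.js reports them).
function injectB3(ids: SpanIds, headers: Headers): void {
  headers["x-b3-traceid"] = ids.traceId;
  headers["x-b3-spanid"] = ids.spanId;
  headers["x-b3-sampled"] = ids.sampled ? "1" : "0";
}

// Read the context back from B3 multi-headers; ignores traceparent entirely.
function extractB3(headers: Headers): SpanIds | undefined {
  const traceId = headers["x-b3-traceid"];
  const spanId = headers["x-b3-spanid"];
  if (!traceId || !spanId) return undefined;
  return { traceId, spanId, sampled: headers["x-b3-sampled"] === "1" };
}

// Hypothetical request that already carries another tool's traceparent.
const agentTraceparent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
const request: Headers = { traceparent: agentTraceparent };

// The OTel side adds its own context under different header names.
const otelIds: SpanIds = {
  traceId: "0af7651916cd43dd8448eb211c80319c",
  spanId: "b7ad6b7169203331",
  sampled: true,
};
injectB3(otelIds, request);

// Downstream, each tool reads only its own headers.
const continued = extractB3(request);
console.log(continued?.traceId, request["traceparent"]);
```

Since the two contexts live under different header names, neither tool's injection can replace the other's trace ID, which is the decoupling the workaround relies on.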
Thanks for your response. Just to confirm: are you recommending that we use B3 headers for the OTel instrumentation to avoid the trace ID being overridden, so that it no longer interacts with the Dynatrace trace ID?
Yes, using propagation headers that OneAgent does not use and disabling the OpenTelemetry integration should decouple the two tools.
OneAgent does not use B3 headers as of now. OneAgent for AWS Lambda currently has no option to disable the use of W3C TraceContext (other code modules do have this option).
I know this recommendation sounds a bit unusual, because we usually recommend using W3C TraceContext.
Your request is special because you want OneAgent *not* to interact with OpenTelemetry. Usually the request is that both tools interact (e.g. OneAgent continues a trace from a process monitored by OTel).
But your request is different that it
In your reply you mentioned: "OneAgent is not using B3 headers as of now. OneAgent for AWS lambda has currently no option to disable use of W3C TraceContext (other code modules have this possibility)."
Is it possible to disable the x-dynatrace header for containers and use traceparent instead, so that the entire transaction uses only traceparent and not the x-dynatrace header?
There is currently no option to disable the x-dynatrace header only for containers. It can be disabled for non-AWS-Lambda code modules for protocols where OneAgent supports W3C tracecontext (gRPC and HTTP as of now).
Please note that using only traceparent for OneAgent will result in limitations such as missing connections in the service flow or inconsistent sampling decisions/extrapolation.
OneAgent needs either x-dynatrace or traceparent + tracestate to utilize the full feature set.
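The rule above can be expressed as a small check, for instance to log which incoming requests carry enough context. This is only a sketch of the stated condition: `hasFullAgentContext` is a hypothetical helper, not a Dynatrace API, and the header values are placeholders rather than real tag formats.

```typescript
// Sketch only: does a request carry x-dynatrace, or traceparent plus tracestate?
type IncomingHeaders = Record<string, string | undefined>;

function hasFullAgentContext(h: IncomingHeaders): boolean {
  // A header counts as present only if it has a non-empty value.
  const has = (name: string): boolean => (h[name] ?? "").length > 0;
  return has("x-dynatrace") || (has("traceparent") && has("tracestate"));
}

// Placeholder values; real header contents are vendor-defined.
const tp = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";

const cases: Array<[string, IncomingHeaders]> = [
  ["x-dynatrace only", { "x-dynatrace": "placeholder-tag" }],
  ["traceparent + tracestate", { traceparent: tp, tracestate: "vendor=opaque" }],
  ["traceparent only", { traceparent: tp }],
  ["no tracing headers", {}],
];

for (const [label, headers] of cases) {
  console.log(`${label}: ${hasFullAgentContext(headers) ? "full" : "limited"}`);
}
```

The "traceparent only" case is the one described as limited above: the trace can be continued, but service flow connections and sampling decisions may be inconsistent.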