25 Feb 2025 09:48 AM
I was looking at options to reduce the noise of unwanted spans/traces in my customer's environment.
I saw lots of traces with no actual value on various services (or even extra services being created) for things like k8s health checks (/alive/healtz), so I was wondering what the best approach would be to exclude these from any service.
There are a few options to "remove" them from services:
The URL exclusion rules (1.) work on a global basis and exclude traces based on very simple URL/request pattern matching. This is not very powerful and often can't be applied.
Muted requests (2.) are just that: muted requests. The traces are still recorded but not considered on a service. Very difficult to maintain.
But then I thought, hey, OpenPipeline could also be leveraged for that, right? And it should be really powerful, since we can use all kinds of matching to send "unwanted traces" into the void (no storage). So I created a OneAgent Span rule that matches any request from the "kube-probe" user agent:
matchesValue(`http.request.header.user-agent`, "kube-probe/1.30")
and a Pipeline to send these requests to no storage.
This should effectively remove all "probe" requests for the whole environment, right?
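One caveat with the matcher above: it pins the probe version to 1.30, so probes from clusters on other Kubernetes versions would probably slip through. A minimal sketch of a version-independent variant, assuming matchesValue accepts a trailing * wildcard in the OpenPipeline matcher context (worth verifying in your own environment before relying on it):

```
matchesValue(`http.request.header.user-agent`, "kube-probe/*")
```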
Has anyone tried this before or have you got a similar requirement?
Result: no more kube-probe traces.
28 Feb 2025 09:33 AM
An interesting aspect of this (likely relevant if you are on traces on Grail):
In the old services UI you will still be able to see those traces, while on the new UIs (DQL queries, traces app) they will be gone.
This is really confusing for end users. Maybe someone from the DT PMs can comment on that. Are traces still stored twice? Which data is then really used for problem detection, the new traces or the old ones? Obviously there are still two trace stores in use. This also explains why, if you take a trace ID from an old UI (service/PurePath app) and search for it in the new spans, you sometimes won't find it...
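To check from the Grail side whether any probe spans still arrive, a DQL query along these lines might help. It is only a sketch: it assumes spans are queryable via fetch spans in your environment, that the attribute name is the same as in the matcher from the first post, and that matchesValue accepts the trailing * wildcard; the one-hour timeframe is arbitrary.

```
fetch spans, from: now() - 1h
| filter matchesValue(`http.request.header.user-agent`, "kube-probe/*")
| summarize count()
```

If the count stays at zero once the pipeline rule is active, that would suggest newly ingested probe spans are being dropped, regardless of what the classic screens still show.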
28 Feb 2025 02:08 PM
Your observations are right, traces are currently stored twice. Still, it should not happen that you see traces in the classic screen that are not visible in the new distributed tracing app; this might require deeper analysis. Having two apps for the same use case is not optimal, and we are working on the full 3rd gen experience to reduce confusion.
Your noise reduction effort via OpenPipeline can still be useful since, for example, it reduces the amount of retained bytes. That means the cost of potential extra retention, as well as of queries, can be reduced.
28 Feb 2025 08:13 PM
URL-based sampling should also help out with this: https://docs.dynatrace.com/docs/shortlink/url-sampling
28 Feb 2025 08:25 PM
I'd recommend using URL-based sampling first (it replaces the URL filtering rules in deep monitoring settings), since such traces are not even ingested.
OpenPipeline filtering is also needed for exactly the use cases you've mentioned: filtering on user-agent, and also filtering spans of other service types such as RMI calls or messaging.
We really need such filtering capabilities for Managed customers too.