11 Nov 2024 12:37 AM - last edited on 12 Nov 2024 01:47 PM by MaciejNeumann
Hello everyone, I am sharing a problem we are facing with Dynatrace in our technology stack while we pursue a ticket with Dynatrace support. If anyone has already run into this situation and found a solution, it would be very welcome.
Scenario: Java Spring applications that use Spring Cloud OpenFeign or Feign from Netflix OSS; both setups use Hystrix as a circuit breaker.
Dynatrace SaaS with a DPS license and a trace volume limit (Peak trace volume) based on the GiB/min metric.
Problem:
When the trace volume exceeds the Peak trace volume, OneAgent enters adaptive traffic management and reduces trace capture for the most frequently called requests. The traces then break, and a large number of new traces start in the services named Requests in background threads*.
Analysis:
This happens because, when Hystrix is in use, the call to the dependency (as a client request) is made through a separate thread pool managed by Hystrix, different from the one serving the user request on the frontend (normally the HTTP thread pool of the Tomcat, Jetty or Undertow listener).
When Dynatrace has 100% of the spans available, it can correlate the spans across threads into a single trace. When it starts to discard the frontend calls, however, the dependency spans become orphaned: a new entry point is created with the first dependency's span as the root span, and the new trace is associated with the Requests executed in background threads* service.
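To make the thread hand-off described above concrete, here is a minimal sketch. It does not use Hystrix itself: a plain `ExecutorService` stands in for the Hystrix-managed pool, and the class name, method name and thread name are made up for illustration. It only shows that the dependency call runs on a different thread from the one handling the request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in: a plain executor plays the role of the
// Hystrix-managed thread pool. This is NOT the Hystrix API.
class ThreadHandOffSketch {

    // Runs a simulated dependency call on the given pool and returns
    // the name of the thread that actually executed it.
    static String callDependencyOnIsolatedPool(ExecutorService pool) {
        Callable<String> dependencyCall = () -> Thread.currentThread().getName();
        try {
            return pool.submit(dependencyCall).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Separate pool, analogous to the one Hystrix manages.
        ExecutorService isolatedPool = Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, "hystrix-like-pool");
            t.setDaemon(true);
            return t;
        });
        try {
            // In a real service this would be the Tomcat/Jetty/Undertow HTTP thread.
            String requestThread = Thread.currentThread().getName();
            String dependencyThread = callDependencyOnIsolatedPool(isolatedPool);
            System.out.println("request handled on:   " + requestThread);
            System.out.println("dependency called on: " + dependencyThread);
        } finally {
            isolatedPool.shutdown();
        }
    }
}
```

The two printed thread names differ, which is the boundary a tracer has to bridge in order to keep the frontend span and the dependency span in the same trace.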
This causes two inconveniences:
First, the trace is broken and disassociated from the original frontend request, making analysis difficult.
Second, the relationship between the original request in the frontend service and its dependencies is 1:N. As a consequence, a large number of new traces are created, putting even more pressure on the Peak trace volume and lowering our OneAgent capture rate even further.
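The 1:N effect can be illustrated with a rough back-of-the-envelope model. This is only a sketch under assumptions: it supposes that every dependency span of a dropped frontend request is still captured and becomes its own root trace, and the numbers used (1000 requests, 3 dependency calls each, 50% of frontend spans captured) are hypothetical, not measurements from our environment or Dynatrace's actual accounting.

```java
// Hypothetical model of trace amplification; names and numbers are illustrative.
class OrphanTraceEstimate {

    // frontendRequests: number of incoming user requests
    // dependencyCallsPerRequest: N in the 1:N relationship
    // capturedFraction: share of frontend root spans still captured (0.0 to 1.0)
    static long estimateTraces(long frontendRequests,
                               int dependencyCallsPerRequest,
                               double capturedFraction) {
        long captured = Math.round(frontendRequests * capturedFraction);
        long dropped = frontendRequests - captured;
        // Captured requests keep one trace each; every dropped request
        // leaves N orphaned dependency spans, each starting a new trace.
        return captured + dropped * dependencyCallsPerRequest;
    }

    public static void main(String[] args) {
        // 500 intact traces + 500 * 3 orphaned traces
        System.out.println(estimateTraces(1000, 3, 0.5)); // prints 2000
    }
}
```

Under these assumptions, dropping half of the frontend spans doubles the number of traces instead of halving it, which matches the feedback loop described above.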
I will update this post as I progress towards the solution (or not).
Your comments will be appreciated.
Regards.
11 Nov 2024 01:02 AM
About Hystrix -> https://github.com/netflix/hystrix/wiki
How does Hystrix do this?
How it appears in Dynatrace:
Code level:
11 Nov 2024 01:09 AM - edited 11 Nov 2024 01:44 AM
These four services below do not have background services (they only expose web services - REST APIs), but when adaptive traffic management starts ...