Solved: Re: Client-Side Response Time

alekz_silva · ‎19 Nov 2021

Hello,

I'm monitoring the environment and I've noticed that several requests show a high "Client-Side Response Time" (the light blue bar in the PurePath view).

The API runs on Windows servers and IIS.
I would like some help identifying why only a few requests have this high processing time. Apparently, processing occurs but the API takes a long time to respond to the client, causing a timeout in each of these transactions.

AntonioSousa · ‎19 Nov 2021

@alekz_silva,

This can happen for several reasons. The ones that quickly come to mind:

- Network latency between the server originating the request and the server receiving it. Since you mention that this is only happening in a few requests, this is most likely not the case.
- Latency introduced by some type of load balancing. This can lead to the client experiencing more delay than what is seen from the server side.
- Possible retransmissions occurring during the request.
- Resource shortage at the thread/socket level. Check these values on the services that are being invoked.
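To illustrate the last point, here is a minimal Python sketch (not Dynatrace or IIS code) of how a saturated worker pool can make the time seen by the caller much larger than the time the service spends processing. The function names, pool size, and durations are all made up for the illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(processing_s):
    """Simulated service handler: returns when processing started and finished."""
    started_at = time.perf_counter()
    time.sleep(processing_s)  # stands in for the real work
    return started_at, time.perf_counter()


def measure(pool_size, n_requests, processing_s):
    """Return a list of (client_side_s, server_side_s), one pair per request."""
    timings = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        pending = []
        for _ in range(n_requests):
            sent_at = time.perf_counter()  # the caller's clock starts here
            pending.append((sent_at, pool.submit(handle_request, processing_s)))
        for sent_at, future in pending:
            started_at, finished_at = future.result()
            # Client side includes the wait for a free worker; server side does not.
            timings.append((finished_at - sent_at, finished_at - started_at))
    return timings


if __name__ == "__main__":
    for client_s, server_s in measure(pool_size=2, n_requests=6, processing_s=0.1):
        print(f"client-side {client_s:.2f}s  server-side {server_s:.2f}s  "
              f"waiting {client_s - server_s:.2f}s")
```

Every request does the same amount of work, yet the ones that had to wait for a free worker look slow only from the caller's side. That is the kind of pattern a thread or socket shortage would produce.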

Antonio Sousa

Julius_Loman · ‎22 Nov 2021

Most likely, as @AntonioSousa writes, it is a shortage at the socket/thread/pool level, whatever is used to establish the communication. The other reasons mentioned would affect most of the requests.

I'd also suggest checking the Code-level tab in your PurePath (at the calling service node); you may find some additional information there.

Dynatrace Ambassador | Alanata a.s., Slovakia, Dynatrace Master Partner

alekz_silva · ‎22 Nov 2021

I compared two requests (one successful and the other with a high client-side response time) and found that both perform exactly the same activities. In other words, the system processes the request, but for some reason there is an excessive delay in the response to the client. In this case, could it still be something related to threads and pools? Unfortunately, at code level there is no information about the light blue PurePath bar.

AntonioSousa · ‎22 Nov 2021

@alekz_silva

Yes, all possibilities still apply. Your problems are certainly occurring before your code starts executing. You may be able to get more information by looking at the following points:

- Response Time Hotspots: check whether there are IIS modules with a large percentage of code execution.
- Retransmissions at the host level: if the client and server are different hosts, check both sides.
- Connectivity issues at the process level.

To understand the problem better, let us know whether these requests always happen within the same host or between distinct servers. If it's the latter, as I suspect, please consider investigating what happens in between (load balancers, firewalls, etc.).
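One way to answer that question is to export the timings of a batch of requests and check whether the slow ones cluster on a particular caller or server. The Python sketch below shows the idea only. The host names, numbers, record layout, and the 5-second gap threshold are all hypothetical and do not come from this environment.

```python
from collections import defaultdict

# Hypothetical sample: (calling_host, serving_host, client_side_ms, server_side_ms)
SAMPLE = [
    ("caller-a", "api-1", 61000, 900),
    ("caller-a", "api-2", 950, 880),
    ("caller-b", "api-1", 1020, 940),
    ("caller-b", "api-3", 60500, 910),
    ("caller-a", "api-3", 990, 900),
]


def slow_share_by(records, key_index, gap_threshold_ms=5000):
    """Fraction of requests per host whose client/server time gap exceeds the threshold.

    key_index 0 groups by calling host, 1 groups by serving host.
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for record in records:
        key = record[key_index]
        client_ms, server_ms = record[2], record[3]
        totals[key] += 1
        if client_ms - server_ms > gap_threshold_ms:
            slow[key] += 1
    return {key: slow[key] / totals[key] for key in totals}


if __name__ == "__main__":
    print("by serving host:", slow_share_by(SAMPLE, 1))
    print("by calling host:", slow_share_by(SAMPLE, 0))
```

If the slow share is roughly even across all hosts, a single bad server is unlikely, and whatever sits between caller and server becomes the more interesting suspect.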

Antonio Sousa

alekz_silva · ‎22 Nov 2021

The APIs are distributed across 6 identical servers and traffic is controlled by the load balancer. The problem occurs on some transactions, but on all servers.

AntonioSousa · ‎22 Nov 2021

So, I believe there are 6 servers that are receiving the calls you mentioned above.

Can you tell us a little more about who is making the requests? If you have client-side timings, it means that the servers making the calls are also being monitored by OneAgent. What can you tell us about them?

Antonio Sousa

alekz_silva · ‎14 Dec 2021

Thanks again for your willingness to help. Calls are made from servers in the AWS cloud. I will attach images from today's request.

AntonioSousa · ‎15 Dec 2021

@alekz_silva

As a matter of caution, I have removed the attachment that you put here, not only because it is classified as confidential, but also because it really may contain sensitive information. I'm in the process of analyzing the information you provided, and will be back in a few minutes.

Antonio Sousa

AntonioSousa · ‎15 Dec 2021

@alekz_silva

I have annotated the PurePath below. It does seem that times are quite high when invoking those services. They even continue working, despite the original client request having timed out at 1 minute.

From what I have seen, you should concentrate your efforts on two possible explanations:

- The .NET service is taking too long to be invoked. There might be thread issues or other related factors that stall invocation. Take a look at Method Hotspots here and, if you have it activated, at IIS Modules in the "Response time hotspots" of that service.
- Since you mention the AWS cloud, I'm not sure whether these invocations are all inside the cloud or whether they invoke on-prem servers. If it is the latter, please check network latencies, traffic & retransmissions. The values are very high, so I believe that should not be the case, but please check it out.
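The behaviour noted above, where the called service keeps working after the caller has already timed out, can be reproduced in miniature. This is only a Python sketch with invented names, and the 1-minute timeout is scaled down to fractions of a second. It says nothing about how IIS or .NET handle abandoned requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def slow_service(duration_s, log):
    """Simulated downstream service that is unaware the caller may have given up."""
    time.sleep(duration_s)
    log.append("service finished")
    return "done"


def call_with_timeout(duration_s, timeout_s):
    """Call the simulated service with a client-side timeout; return the event log."""
    log = []
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_service, duration_s, log)
    try:
        future.result(timeout=timeout_s)
        log.append("client got response")
    except FutureTimeout:
        log.append("client timed out")
    # Wait for the worker so the log shows that the service kept running.
    pool.shutdown(wait=True)
    return log


if __name__ == "__main__":
    print(call_with_timeout(duration_s=0.3, timeout_s=0.1))
```

The timeout only ends the caller's wait. The work on the service side still runs to completion and still consumes a worker, which can make a thread shortage worse for the requests that follow.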

Antonio Sousa

julien_duhamel1 · ‎05 Dec 2021

In this case, my first move is to try to evaluate the arguments. Your description seems to point the finger at the front-to-back calls. In 90% of cases the issue comes from code execution, sometimes related to a "batch" job that uses resources. If I cannot get access to the code level, a comparison with the user sessions' waterfall may be a good starting point (compare resource load by browser type, time series, resources, XHR or Load, etc.).