20 Sep 2024 03:00 PM
Hi all, our website is extremely slow. Opening an item such as the iPhone 16, as I will show below, takes between 20 seconds and more than 2 minutes.
here i click
xxx seconds later
Here is the distributed trace (DT), but I am having trouble identifying the problem.
I see that it takes 51.2 seconds to open the page. There was a call to the API /device/apple-iphone-16-pro with a lot of DB requests. All those DB requests together take about 14 seconds, but what happens after that? Why does it take roughly 40 more seconds to display the result?
The performance issue is crazy
Thanks for any help
20 Sep 2024 03:17 PM
Activities that are not instrumented do not show up on the trace overview screen. Have a look at the timing and code level tabs in each particular span (row) for details, starting with the PHP span. You can also use method hotspots (on the span level); maybe you have enough captured data for the requests. If you are investigating just one trace, there may not be enough snapshots to prove anything.
20 Sep 2024 04:33 PM
These are already great solutions!
I would also suggest checking "View response time hotspots" and/or "View service flow" on the last screenshot in the original post; both are also reachable directly from the service details page.
That way, you let Dynatrace help investigate the response times as a whole, instead of navigating from trace to trace and making assumptions based only on the sample of traces you visited 😉
This method tries to avoid basing the analysis on the specific behaviour of certain traces!
Beware: even with these tools, you may have difficulty seeing what is happening inside the gaps you see in the traces! If something is not instrumented, for example, these other tools will not solve the issue... My main focus here is to understand whether that gap is present in all traces, or whether in some of them a huge response from a dependency can be identified (and later picked up by Service Flow or Response Time Hotspots).
Hope this helps!
20 Sep 2024 04:51 PM - edited 20 Sep 2024 05:23 PM
Thanks, yes, I think I have already checked all the possible menus.
Normally, on the other applications we monitor, I don't have much trouble identifying what causes slowness, but here ... 😁
I can indeed see that code execution takes 46 s, and the API call accounts for only 16 s of those 46 s.
And yes, I see this behavior on almost all pages, not only phone selection but also mobile plans, internet plans, etc.
20 Sep 2024 04:39 PM
Hi Julius, thanks for the response.
(screen from another trace)
On the PHP level, database calls take 'only' 20 seconds.
I can see that on the code level too (I see there are 93 errors, but this is 'normal': PHP warnings, etc.).
I don't see what justifies the 20 additional seconds after those 20 seconds of queries, and then the additional xx seconds after the PHP span has finished.
Could it be a network-related issue or something similar? Is there a way to check that?
From what I can see in this trace, the user action looks like this:
User clicks on the device -> F5 VIP (public URL of the website) -> Web-next server (Node.js) (/_next/data/C6ghw) -> proxy server (to call api-sylius) -> F5 VIP for API-Sylius -> Web-Sylius server (Apache/PHP) (/api/v2/shop/devices/apple-iphone-16-pro) -> queries to sylius-db server
20 Sep 2024 08:30 PM
Hi @cando ,
the first thing I see is that the number of DB calls (4.47k!) is far higher than it should be 🙂 A single request makes almost 4,500 DB calls. This is definitely something that must be improved.
For the remaining time, you need to use method hotspots on the PHP span. It takes the stack frames collected during the request execution and reveals the hotspots. What do method hotspots show in your case? Method hotspots are also accessible from the response time hotspots analysis you have in your screenshots.
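As a rough illustration of the idea behind method hotspots (this is not Dynatrace's actual implementation, and the stack samples and function names below are invented), sampled stack frames can be aggregated by counting how often each function was the one executing when a sample was taken:

```python
from collections import Counter


def hotspots(samples):
    """Return (function, share of samples) pairs, most frequent first.

    Each sample is a call stack listed outermost frame first, so the last
    entry is the function that was executing when the sample was taken.
    """
    counts = Counter(stack[-1] for stack in samples if stack)
    total = sum(counts.values())
    return [(name, count / total) for name, count in counts.most_common()]


# Hypothetical samples collected while one slow request was running.
samples = [
    ["index", "handle_request", "send_response"],
    ["index", "handle_request", "send_response"],
    ["index", "handle_request", "load_device", "run_query"],
    ["index", "handle_request", "send_response"],
]

for name, share in hotspots(samples):
    print(f"{name}: {share:.0%} of samples")
```

With only a handful of samples the shares are noisy, which is why a single trace may not be enough to prove anything.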
21 Sep 2024 10:23 AM
Hi Julius, thanks again for your help. Indeed, I see the thousands of DB queries (present in almost all DTs), but as those queries take between a few ms and a few seconds, I thought they would not be the cause of the 20-30-40 seconds of additional waiting time. I am not managing the site; I think it is an out-of-the-box e-commerce platform (Sylius), so I am not sure whether the people managing the site can do something about it, but I will see.
Please see the attached screenshots and some comments.
22 Sep 2024 11:37 AM
Hi @cando ,
based on the method hotspots, it looks like significant time is being spent in the function_exists call in Symfony. See the source code: it checks whether the function exists and sends the HTTP response according to the PHP deployment (FastCGI, LiteSpeed, ...).
I don't consider myself an expert, but I'd assume something is broken in the PHP installation, as these functions should respond fast, and in your case it seems they do not.
To confirm, I'd execute the following script in the same environment to measure it - if it's possible:
<?php
// Time function_exists() for the functions that show up in the hotspot.
$functions = ["fastcgi_finish_request", "litespeed_finish_request"];
foreach ($functions as $function) {
    $startTime = microtime(true);
    function_exists($function);
    $endTime = microtime(true);
    // microtime(true) returns seconds as a float; convert to microseconds.
    $elapsedTime = ($endTime - $startTime) * 1000000;
    echo "function_exists(" . $function . ") took: " . $elapsedTime . " microseconds <br>";
}
?>
Nevertheless, the number of calls points to a typical N+1 query problem, which must be resolved too.
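For anyone unfamiliar with the N+1 pattern, here is a minimal sketch in Python with an in-memory SQLite database. The `device` and `variant` tables are hypothetical and have nothing to do with the real Sylius schema; the point is only to show how loading a list and then querying once per item multiplies the number of DB calls, while a single JOIN returns the same data in one call.

```python
import sqlite3


class CountingDb:
    """Thin wrapper that counts every query sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def run(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(devices=100):
    """Build a hypothetical in-memory shop: each device has two variants."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE device (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE variant (id INTEGER PRIMARY KEY, device_id INTEGER, color TEXT);
        """
    )
    ids = range(1, devices + 1)
    conn.executemany("INSERT INTO device VALUES (?, ?)", [(i, f"device-{i}") for i in ids])
    conn.executemany(
        "INSERT INTO variant (device_id, color) VALUES (?, ?)",
        [(i, color) for i in ids for color in ("black", "white")],
    )
    return CountingDb(conn)


def load_n_plus_one(db):
    """One query for the list, then one more query per device (N+1)."""
    devices = db.run("SELECT id, name FROM device ORDER BY id")
    return {
        name: [color for (color,) in db.run(
            "SELECT color FROM variant WHERE device_id = ? ORDER BY id", (dev_id,))]
        for dev_id, name in devices
    }


def load_joined(db):
    """The same data fetched with a single JOIN."""
    rows = db.run(
        "SELECT d.name, v.color FROM device d "
        "JOIN variant v ON v.device_id = d.id ORDER BY d.id, v.id"
    )
    result = {}
    for name, color in rows:
        result.setdefault(name, []).append(color)
    return result


slow_db, fast_db = make_db(), make_db()
slow, fast = load_n_plus_one(slow_db), load_joined(fast_db)
print("same data:", slow == fast)
print("N+1 queries:", slow_db.count)    # 1 list query + 100 per-device queries
print("JOIN queries:", fast_db.count)   # 1
```

In an ORM-based application the per-item queries are usually triggered implicitly by lazy-loaded relations, so the usual fix is eager loading or an explicit join rather than hand-written SQL.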
23 Sep 2024 11:32 AM
Thanks a lot, Julius. I will share this information with the devs and let you know if there are any results.
22 Sep 2024 04:36 PM
@cando I would check the size of the HTTP replies. They might explain some of the long waits you are seeing, including "and what happened here?" and the big differences between client and server timings. To get that information you will probably need to capture the HTTP headers related to the size of the reply, or you can alternatively check it in the browser's Developer Tools...
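If Developer Tools are not convenient, the reply size can also be checked with a few lines of script. The sketch below uses only the Python standard library; `measure_response` is a helper name made up for this example, and you would pass it the URL of the slow page or API call yourself.

```python
import time
import urllib.request


def measure_response(url, timeout=120):
    """Download url and return (body_bytes, declared_length, seconds).

    declared_length is the raw Content-Length header (or None if absent).
    urllib does not request compression by default, so body_bytes is
    normally the uncompressed payload size.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
        declared = response.headers.get("Content-Length")
    elapsed = time.perf_counter() - start
    return len(body), declared, elapsed


def describe(url):
    """Format one measurement as a readable line."""
    size, declared, elapsed = measure_response(url)
    return (f"{size / 1_000_000:.2f} MB in {elapsed:.2f} s "
            f"(Content-Length: {declared})")


# Example call (replace the placeholder with the page or API URL to test):
# print(describe("https://<your-shop-host>/<slow-page>"))
```

Comparing this client-side time with the server-side time in the trace gives a first hint of how much of the wait is spent transferring the reply.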
23 Sep 2024 11:38 AM - edited 23 Sep 2024 11:39 AM
Hi Antonio, thanks. Indeed, I ran some tests with PowerShell and got these (bad) results:
I raised the issue with the devs to see what could cause this very high response size.
23 Sep 2024 07:24 PM
A response size of 60 MB or even 100 MB is certainly over a healthy limit for a web page.
I can see developers arguing "It works well on my machine". 😁
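For a sense of scale, here is a back-of-the-envelope estimate of how long such replies take just to travel over the wire. The bandwidth values are assumptions picked for illustration, and the calculation ignores latency, compression and protocol overhead, so real numbers will differ.

```python
def transfer_seconds(size_mb, bandwidth_mbit_s):
    """Ideal transfer time: megabytes * 8 bits per byte / megabits per second."""
    return size_mb * 8 / bandwidth_mbit_s


# Reply sizes mentioned above, against a few assumed link speeds.
for size_mb in (60, 100):
    for bandwidth in (10, 50, 100):
        seconds = transfer_seconds(size_mb, bandwidth)
        print(f"{size_mb} MB at {bandwidth} Mbit/s: {seconds:.1f} s")
```

Even on a fast link this adds several seconds per request, and on a slow one it is in the same range as the unexplained gaps in the traces.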
23 Sep 2024 10:11 PM
Not sure why I'm seeing this more and more. The other day, it was a 90 MB minified JSON. It was loaded in almost every single page of the site. It was also so big that Chrome couldn't even show it in Developer Tools 🤣
23 Sep 2024 10:08 PM
One of the hints that I always follow is the difference between client and server time, highlighted below in red. When you see this, it is very often caused by excessive content in the reply. In this case, besides the web page, you should also see large sizes in the web-service replies: