I am analyzing an application for a customer where we noticed high response times on their portal yesterday. The portal is composed of 3 Apache servers for load balancing, with 4 Liferay application servers behind them.
We experienced around 20 minutes of degradation until the customer (based on past experience) restarted the app server and everything went back to normal. During my analysis I was able to isolate the problem to the Web Server tier.
Drilling down into the Response Time Hotspots, I can see that this is all due to a single WebRequest: /web/logada/portlet-peca-o-seu
From there I get to the PurePaths, and am able to see response times of up to 113349 ms:
I can see that the PurePath runs fine; however, the WebRequest takes a lot of time. What further investigation can be done here? The web servers' resources look fine: CPU, memory and disk usage are very low. The only point of attention is that the Apaches reach their maximum thread count of 800. This number only goes down when the apps are restarted.
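One way to see what the 800 Apache threads are actually doing is to look at the worker scoreboard that mod_status exposes. The following is only a minimal sketch, assuming mod_status is enabled on the web servers and that the machine-readable status text (typically served at /server-status?auto, depending on configuration) has already been fetched. The function name `summarize_scoreboard` and the `sample` text are made up for illustration and are not data from this environment.

```python
from collections import Counter

# Meaning of each scoreboard character, as documented for Apache mod_status.
SCOREBOARD_KEYS = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def summarize_scoreboard(status_text):
    """Count worker states from the text of a mod_status '?auto' response."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            counts = Counter(board)
            # Translate each scoreboard character into its documented meaning.
            return {
                SCOREBOARD_KEYS.get(ch, "unknown (%s)" % ch): n
                for ch, n in counts.items()
            }
    raise ValueError("no Scoreboard line found")


# Hypothetical sample text, NOT taken from the customer's servers.
sample = "BusyWorkers: 6\nIdleWorkers: 2\nScoreboard: WWWWKR__.."
print(summarize_scoreboard(sample))
```

If most workers turn out to be in the "sending reply" state during the degradation, that would suggest the Apache threads are blocked waiting on the backend rather than doing work themselves.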
Servers running the Liferay portal are also very low in resource usage.
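Low resource usage combined with a full thread pool is consistent with threads that are waiting rather than working. As a rough back-of-the-envelope check, Little's law (concurrency = arrival rate × time in system) shows how little traffic is needed to fill 800 threads once requests take this long. The sketch below uses only the 800-thread limit and the 113349 ms response time from this post; the helper names are hypothetical and no request rate was measured here.

```python
def concurrency(arrival_rate_per_s, response_time_s):
    """Little's law: average number of requests in flight."""
    return arrival_rate_per_s * response_time_s


def saturating_rate(max_threads, response_time_s):
    """Request rate at which a pool of max_threads becomes fully occupied."""
    return max_threads / response_time_s


MAX_THREADS = 800            # Apache thread limit mentioned in the post
SLOW_RESPONSE_S = 113.349    # slowest observed response time (113349 ms)

rate = saturating_rate(MAX_THREADS, SLOW_RESPONSE_S)
print("Requests/s that would fill the pool at this response time: %.2f" % rate)
```

In other words, if every request to the slow URL hung for that long, only a handful of such requests per second would be enough to occupy all threads, which would then also delay unrelated requests.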
What else could I investigate here? Please see attached session in case you need more information: producao-portal-pr-liferay-28-03-10-30-community.dts
Dynatrace is running on version 6.1, but I am analyzing this session on a 6.5 client.
Thanks a lot,
I am not an expert in the analysis area, but I tried to understand the situation, and below are my findings.
When I looked into the Transaction Flow for the PurePaths, at first glance it seems that something is wrong with the Web Server.
Then I checked the visit/user actions for that specific transaction and found that the page load took only 3 seconds.
Drilling further down into the user action's PurePaths, I found that almost all of the 'Elapsed Time', nearly 120 seconds, was spent on the 'Synchronous Path' to the agent 'PORTAL_APP_LIFERAY', which is installed on the application.
The PurePath breakdown shows 100% I/O time.
A high I/O time doesn't necessarily mean that the application spends its time in I/O operations, because the I/O time also contains kernel time.