Analyzing my application, I can see very long Wait and WaitOne times. Following the analysis suggested in another post in this community, I isolated the problem: the waits taking too much time are being called by System.Web.Hosting.ISAPIRuntime.ProcessRequest(System.IntPtr, int).
That's all I can see, along with high web request times on the same PurePath. Is there a way to investigate this further? I know the waits are used to sync threads, and maybe things are not instrumented properly according to this topic: https://answers.dynatrace.com/questions/93220/hom-.... Some help on this would be much appreciated. I will try to attach a session to this post to facilitate the investigation. PurePaths that represent this condition contain the text SalvarProposta; filtering by this will help the analysis.
During the war room, the DBAs were able to see some table locks that made database calls take very long. These calls were inside .NET remoting calls. Comparing with the application baseline, what I can assume is that the app already had a .NET remoting performance problem, which was made worse by this DB issue.
Is there anything else I can extract here? I would like to better pinpoint the cause of these long wait times.
Thanks for including a session to help with the analysis. Here are some things I'm noticing:
- SQL exceptions indicating a timeout at the database, likely due to the DB locks confirmed by the DBAs:
- There are some threading issues as you suspected. We know this because of the following symptoms:
When you see sync time on framework-level methods like WaitOneNative, the problem is usually not in that method itself (the framework) but in the app that's using the framework. It's possible that the DB locks led to a backup of threads in the app.
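To illustrate why the time shows up in the wait primitive rather than in the code that caused it, here is a minimal sketch in Python (not .NET, and not taken from the session). It assumes a single lock standing in for the locked table and a hypothetical slow_db_call that holds it; the names simulate_backlog, slow_db_call and handle_request are made up for this example.

```python
import threading
import time


def simulate_backlog(num_requests=5, lock_hold_seconds=0.5):
    """Start request threads while a slow 'DB call' holds a shared lock.

    Returns (threads blocked right after start, per-request wait times).
    """
    table_lock = threading.Lock()    # stands in for the locked table (assumption)
    holder_ready = threading.Event()
    waits = []
    waits_guard = threading.Lock()   # protects the shared result list

    def slow_db_call():
        with table_lock:
            holder_ready.set()
            time.sleep(lock_hold_seconds)  # the long-running locked statement

    def handle_request():
        start = time.monotonic()
        with table_lock:             # the elapsed time is spent here, in the wait
            waited = time.monotonic() - start
        with waits_guard:
            waits.append(waited)

    holder = threading.Thread(target=slow_db_call)
    holder.start()
    holder_ready.wait()              # make sure the lock is held before requests arrive

    workers = [threading.Thread(target=handle_request) for _ in range(num_requests)]
    for worker in workers:
        worker.start()
    blocked_now = sum(worker.is_alive() for worker in workers)

    for worker in workers:
        worker.join()
    holder.join()
    return blocked_now, waits


if __name__ == "__main__":
    blocked, waits = simulate_backlog()
    print("threads blocked while the lock was held:", blocked)
    print("longest wait in seconds:", round(max(waits), 2))
```

Every request thread reports its time inside the lock acquisition, even though the delay comes from the one thread holding the lock. A profiler looking at this program would attribute the time to the wait call in the same way.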
I would chart the .NET thread count measure to look for a spike in thread counts around the time of the DB locks to see if the sync issue could be related to that - I was only able to chart 2 minutes of data points, not enough to spot any trends.
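If the thread count measure can be exported as (minute, count) pairs, a rough check like the one below can help tell a short spike from steady growth. This is only a sketch in Python: the function names, the thresholds and the sample numbers are assumptions made up for illustration, not values from the session or from any Dynatrace API.

```python
def slope_per_minute(samples):
    """Least-squares slope of thread count over time for (minute, count) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    return cov / var_t


def classify_trend(samples, min_slope=0.5, recovery_ratio=0.8):
    """Label a series as 'steady growth', 'spike then recovery' or 'flat'.

    min_slope and recovery_ratio are arbitrary illustrative thresholds.
    """
    counts = [c for _, c in samples]
    peak = max(counts)
    ends_near_peak = counts[-1] >= recovery_ratio * peak
    if slope_per_minute(samples) > min_slope and ends_near_peak:
        return "steady growth"
    if not ends_near_peak:
        return "spike then recovery"
    return "flat"


if __name__ == "__main__":
    # Made-up sample data, one reading per minute.
    growing = [(0, 40), (1, 42), (2, 44), (3, 46)]
    spiky = [(0, 40), (1, 40), (2, 120), (3, 41), (4, 40)]
    print(classify_trend(growing))
    print(classify_trend(spiky))
```

A count that keeps climbing and never comes back down suggests threads that stay blocked, while a spike that recovers points to a temporary backlog. A few minutes of data is too short for either label to mean much.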
If you can recreate this condition in a lower environment, it will help to capture a thread dump while sync time is high to see what the threads are waiting on (from the call stack), and to capture a CPU sample to get an idea of the call order that leads to the condition.
Thanks Andrew. The customer informed me that they experienced the DB locks during the whole day. I have charted the thread count, and it is easy to see that the value only increases throughout the day. I was informed that they made several changes to the DB to prevent the locks, but the application only got back to its normal behavior after an IIS reset at the end of the day.
Unfortunately my session only has data from the timeframe prior to the IIS reset. I can see that this is a point for improvement for the customer, as the application normally has high sync times. In crisis moments this just makes things worse.
Thanks a lot for your answer.