In our production environment, we have a cluster of 4 Java web application servers (IBM WebSphere), all with the same instrumentation, the same sizing, and so on.
Still, one of the 4 always shows higher CPU usage.
Until now, we have not been able to detect why there is any difference between the servers. Looking at the agents, I checked all available columns, and I see only one difference: the instrumentation cache hits and instrumentation cache misses are a lot higher than on the other servers.
But when I look in the documentation, I cannot find any information on instrumentation cache hits or misses.
Can anyone tell me what these values mean? Could there be any relation to CPU usage overhead? Probably not, but as this is the only difference I can find, I think it is worth asking.
The instrumentation cache is an internal thing to the collector, designed to help speed up startup time of the app. Basically, in addition to the "class cache" (where the collector will store the original byte code), the instrumentation cache stores the instrumented version. So all of that is a long way of saying that I don't believe this is the cause of your CPU consumption difference, unless of course the difference is only happening at JVM startup time.
Suggestion: are you able to use AppMon to run a CPU sample, to see if you can catch the extra work that's going on?
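If an AppMon CPU sample is not convenient, a rough JVM-side cross-check is possible with the standard `java.lang.management.ThreadMXBean` API. This is a swapped-in technique, not the AppMon feature: it reports cumulative CPU time per thread since thread start for the JVM it runs in, so it would have to run inside the application (or be adapted to a remote JMX connection) on each of the four servers. The class and method names `ThreadCpuSnapshot` and `busiestThreads` are made up for this sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of AppMon or WebSphere.
class ThreadCpuSnapshot {

    // Returns up to `limit` lines of "threadName cpuMillis ms", busiest thread first.
    // CPU time is cumulative since each thread started, not a sampled interval.
    static List<String> busiestThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<long[]> rows = new ArrayList<>(); // each row: {threadId, cpuNanos}

        // Some JVMs do not support or enable per-thread CPU timing.
        if (mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled()) {
            for (long id : mx.getAllThreadIds()) {
                long nanos = mx.getThreadCpuTime(id); // -1 if the thread is no longer alive
                if (nanos >= 0) {
                    rows.add(new long[] {id, nanos});
                }
            }
        }

        rows.sort(Comparator.comparingLong((long[] r) -> r[1]).reversed());

        List<String> out = new ArrayList<>();
        for (long[] r : rows.subList(0, Math.min(limit, rows.size()))) {
            ThreadInfo info = mx.getThreadInfo(r[0]); // null if the thread died meanwhile
            if (info != null) {
                out.add(info.getThreadName() + " " + (r[1] / 1_000_000) + " ms");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : busiestThreads(10)) {
            System.out.println(line);
        }
    }
}
```

Comparing the top entries between the busy server and one of the quiet ones may hint at which thread pool is doing the extra work.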
Actually, now you have me wondering... are all of these agents connecting to the same collector? If these are really a cluster of the same functionality (meaning the JVMs are the same, just clustered for capacity or fault-tolerance), then I'm wondering why the instrumentation cache hit/miss data is different. The point is that agents that load the same code will benefit from the work done for another agent. Are you sure that these four JVMs are doing the same thing?
Could it be that, coincidentally, the one with the higher CPU was the first one to connect? In that case, it would have more misses because the collector hadn't yet built the class cache. The others would then all show lower values because they connected after the class cache was created for the first agent.
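To make the "first agent pays the misses" idea concrete, here is a toy model in Java. It is only a sketch of the behaviour described above, not the real collector implementation: the names `InstrumentationCacheSketch`, `Collector`, `AgentStats`, and `instrument` are invented, the cache is assumed to be keyed by class name, and "instrumenting" is simulated by copying the bytes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT how the real collector is implemented.
class InstrumentationCacheSketch {

    // Hit/miss counters kept per agent (assumed structure).
    static final class AgentStats {
        int hits;
        int misses;
    }

    // One cache shared by all agents connected to the same collector.
    static final class Collector {
        private final Map<String, byte[]> instrumentedByClass = new HashMap<>();
        private final Map<String, AgentStats> statsByAgent = new HashMap<>();

        byte[] instrument(String agent, String className, byte[] original) {
            AgentStats stats = statsFor(agent);
            byte[] cached = instrumentedByClass.get(className);
            if (cached != null) {
                stats.hits++; // another agent already paid for this class
                return cached;
            }
            stats.misses++; // this agent triggers the instrumentation work
            byte[] instrumented = original.clone(); // stand-in for real bytecode instrumentation
            instrumentedByClass.put(className, instrumented);
            return instrumented;
        }

        AgentStats statsFor(String agent) {
            return statsByAgent.computeIfAbsent(agent, a -> new AgentStats());
        }
    }

    public static void main(String[] args) {
        Collector collector = new Collector();
        List<String> classes = List.of("com.example.A", "com.example.B", "com.example.C");

        // server1 connects first, then server2 loads the same classes.
        for (String agent : List.of("server1", "server2")) {
            for (String name : classes) {
                collector.instrument(agent, name, new byte[] {1, 2, 3});
            }
        }
        for (String agent : List.of("server1", "server2")) {
            AgentStats s = collector.statsFor(agent);
            System.out.println(agent + ": hits=" + s.hits + ", misses=" + s.misses);
        }
        // prints:
        // server1: hits=0, misses=3
        // server2: hits=3, misses=0
    }
}
```

In this model, identical JVMs on the same collector differ only in who connected first. If one server shows both more hits and more misses, as in the question, that would instead suggest it is loading more classes overall than its peers.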