We have a vendor-provided, Tomcat-based Java web application that is monitored by Dynatrace. Lately this application has been having memory issues, and we're trying to rule out Dynatrace as a potential cause. It does not help that the log contains the following message: "Found high autosensor overhead, might hint to time drift problem, tsCounter"
I checked the auto-sensor configuration for this agent group and it's all set to default. There are 2 custom sensors to draw out the PurePaths, and I don't see any PurePath explosion that would make them a cause for concern. Both sensors are placed on sufficiently coarse-grained business methods, not on low-level methods that could cause issues.
To allay the application team's fears that Dynatrace is a potential cause of the issue, they want to temporarily disable monitoring and see whether it makes any difference.
Short of removing the command-line argument, I am not aware of any way to switch the Java agent (and its overhead) off and on. I know that with .NET agents we can uncheck the agent configuration using the configuration console.
Is there something that I can set at the agent group level to disable monitoring on this app?
Any assistance is appreciated.
As far as I know, there is no way to disable the Java agents without commenting out the command-line argument.
You could however try disabling all the relevant sensors on the particular agent groups and see how that behaves. You're not going to be disabling the actual agent but you will also not capture any PurePath data.
For measuring how much overhead the AppMon agents are actually adding (which seems to be the primary concern here), I would recommend taking a CPU sample from the client and seeing how much overhead is added by AppMon. You could also potentially take a thread dump and sort the results by CPU Time (ms) to see if any of the Dynatrace threads are actually using up a lot of CPU.
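If you don't have the client handy, a rough equivalent of "sort the thread dump by CPU time" can be sketched against a plain-text dump. This is only an illustration and rests on assumptions: it expects thread header lines that carry a `cpu=...ms` field, as the `jstack` output of newer JDKs (11 and later) does, and the `"agent"` name fragment used to pick out agent threads in the usage example is a hypothetical placeholder, not the real thread naming of the AppMon agent.

```python
import re

# Header line of one thread in a jstack-style dump (assumed JDK 11+ layout), e.g.
# "http-nio-8080-exec-1" #25 daemon prio=5 os_prio=0 cpu=1234.56ms elapsed=...
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)".*?\bcpu=(?P<cpu>[\d.]+)ms')


def threads_by_cpu(dump_text):
    """Return (thread_name, cpu_ms) pairs sorted by CPU time, highest first."""
    rows = []
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            rows.append((match.group("name"), float(match.group("cpu"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def cpu_share(rows, name_fragment):
    """Fraction of the summed thread CPU time used by threads whose name
    contains name_fragment (case-insensitive)."""
    total = sum(cpu for _, cpu in rows)
    if total == 0:
        return 0.0
    fragment = name_fragment.lower()
    matched = sum(cpu for name, cpu in rows if fragment in name.lower())
    return matched / total
```

Usage would be something like `rows = threads_by_cpu(open("dump.txt").read())` followed by `cpu_share(rows, "agent")`, with the fragment replaced by whatever the agent threads are actually called in your dump.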
If you do end up finding that the agents are adding more overhead than what is acceptable, here is a link to a good doc which highlights multiple ways you can reduce overhead -> https://community.dynatrace.com/community/display/...
Quick question about the application: is the application server virtualized, by any chance?
To 'passivate' the agent you can de-select the option "Capture Events" in the Agent Mapping configuration. However, the agent will remain loaded and sensors placed (unless you manually un-place them and restart the application)
We see that same message every now and then on virtualized systems (VMware in our case) and support gave us the following feedback:
Generally, you do not need to worry about that message unless it occurs often or the "tsCounter" is quite high.
The message usually means there are timer issues on the host (or JVM). By timer issues I mean it is related to CPU frequency and the "jumping" around of time calculations.
The result of timer problems can cause other issues, but if you are not experiencing them then I would not worry.
It is also possible the message is related to the Hypervisor since you are running on a VM and that message can occur quite often on VM hosts. In that case, it is probably nothing to worry about unless you are losing data, or "timings" from measurements or Purepaths (Purepath nodes) do not make much sense.
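For anyone who wants to check the host itself for the kind of "jumping" time described above, here is a minimal sketch. It is not what the agent does internally (that is not documented in this thread); it simply compares elapsed wall-clock time with elapsed monotonic time over short intervals. The function name, sample count and interval are arbitrary choices for illustration.

```python
import time


def measure_drift(samples=5, interval_s=0.2):
    """Return, per interval, the difference in seconds between elapsed
    wall-clock time and elapsed monotonic time.

    Values close to zero mean the two clocks agree; a large positive or
    negative value means the wall clock was stepped during that interval.
    """
    drifts = []
    wall_prev, mono_prev = time.time(), time.monotonic()
    for _ in range(samples):
        time.sleep(interval_s)
        wall_now, mono_now = time.time(), time.monotonic()
        drifts.append((wall_now - wall_prev) - (mono_now - mono_prev))
        wall_prev, mono_prev = wall_now, mono_now
    return drifts
```

Run on the VM in question over a longer period, occasional large values would be something concrete to hand to the server team; consistently tiny values suggest the clocks are behaving, at least as seen from this process.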
First off, thanks a lot for responding so quickly!
To Ari's question - yes, it is a virtual machine. Thanks for the document on reducing overhead. I'll run through those and see how many of them are applicable and what steps we could take to tune some of the settings.
Thanks Shane, I'll disable the sensors and see what improvements (if any) we see.
I'm not sure how much is too much and how often is bad, as this is the first time that the app team is seeing these errors.
I have plotted the date/time against the tsCounter value that gets appended at the end of the message. Here's what the graph looks like:
So, if I read your message correctly, the error indicates that CPU timer/frequency drift was occurring at that time, and that the cause may be high sensor overhead associated with the instrumentation. I pulled the error logs going back to Dec 1, and the only instances of this error that I see are from 1/14/2017.
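In case it helps anyone reproduce that kind of plot, this is roughly how the data points can be pulled out of the log. It is a sketch under assumptions: the timestamp prefix (`YYYY-MM-DD HH:MM:SS`) and the way the number follows `tsCounter` are guesses at the layout, so the pattern will need adjusting to the actual agent log format.

```python
import re
from datetime import datetime

# Assumed layout: "<YYYY-MM-DD HH:MM:SS> ... high autosensor overhead ... tsCounter<sep><number>"
LINE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*'
    r'high autosensor overhead.*tsCounter\D*(?P<value>\d+(?:\.\d+)?)'
)


def extract_ts_counters(lines):
    """Return (timestamp, tsCounter) pairs for every matching log line."""
    points = []
    for line in lines:
        match = LINE.search(line)
        if match:
            stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            points.append((stamp, float(match.group("value"))))
    return points
```

The resulting list can be fed to any plotting tool, or lined up against the timestamps of the application errors to see whether the two coincide.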
The application team has also been seeing errors daily, but we're unable to correlate this error with the application issues.
Could you elaborate on the comment about this message being seen on hypervisors? I can take these findings to the server team to see whether they need to tweak something on the server to alleviate some of the problems we're facing.
I appreciate you taking the time to answer this.
To be honest we stopped paying much attention to the tsCounter error messages as they never significantly affected any monitored application so far.
But I also asked the support team more or less the same question about the values for tsCounter, and this was their final answer:
I don't know what high value you should look for, but if you start to notice strange "time related" values in the Purepaths from this agent it is "possible" this could be contributing.
AFAIK, we sample the application and if this process takes too much time the "tsCounter" will be high (We also do some other algorithms). But, (AFAIK) keep in mind if the CPU time from the start of the sample is not accurate the tsCounter could be high. In your case, I think the VMware might be part of the reason.
If timings ever become an issue you can always modify the CPU timer
(system profile -> Agent Group -> Agent Mapping -> Advanced).
Usually the "Low Res System Clock" resolves such timing issues (on the
We never changed the CPU timer, so I can't comment on whether that would have helped get rid of the messages.
For your reference: Our ticket was SUPDT-14741.
Hello @Jonathan R.
That seems like a good idea, because we don't need any of the additional tasks, e.g. un-placing the sensors, touching the command-line arguments, or even un-checking Capture Events; we would only have to restart the web/app to release the agent.
Before restarting the web/app, we can enable the license assignment for the system profile and agent group.
The license assignment is documented here, but it says nothing about the restart:
But I can confirm that this is not needed.
As you said, nothing is written about it in the document.
Have you personally tried this feature successfully?
I ask because I heard from someone that not having to restart the web/app would only become available in version 7.
I tried this in our test environment and it worked perfectly. The benefit was that we could use the agents whenever we needed them, without bothering the application support team or owners to restart or recycle anything.