Agents connecting without instrumentation? (Possible timeout)

JamesKitson
Dynatrace Leader

We have an environment where 2 agents seem to have a problem where they are connected to the collector/server but without instrumentation. I believe the relevant part of the log file is below (it is similar for both agents):

2015-10-29 03:36:20 [991f0221] info [native] Trying to connect to Server/Collector for up to 19 seconds

2015-10-29 03:36:41 [991f0221] severe [native] Exception while connecting to collector, info:<connect()/apr_socket_connect(), 70007, Connection timed out>

2015-10-29 03:36:41 [991f0221] warning [native] Unable to register with Server/Collector <server>:9998, CONTINUING WITHOUT INSTRUMENTATION.

2015-10-29 03:36:41 [a954f221] info [native] Instrumentation channel connected successfully

2015-10-29 03:36:41 [a954f221] info [native] Connected to Server/Collector <server>:9998

Does this look like an issue that could be solved by increasing a timeout setting? And does anyone have any experience with what could be causing this?

This problem just popped up so it may be resolved at least temporarily when the app is recycled but I would like to solve the cause to prevent it from occurring again.

Thanks!

7 REPLIES

david_alonso
Dynatrace Pro

I imagine you have already checked that port 9998 is open all the way from your agent to the server, right? If it is open and the connection still takes more than 19 seconds: how many miles or kilometers are there between your agent and your collector? Is your collector very heavily loaded?

Joe_Hoffman
Dynatrace Champion

Note that the "Unable to register" message is happening ~19 seconds after the "Trying to connect". So a simple timeout appears to be happening.
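To check that gap on other agents' logs, a minimal sketch along these lines could help. It assumes the log lines start with a `YYYY-MM-DD HH:MM:SS` timestamp as in the excerpt posted above; the function names are made up for illustration. For the posted excerpt the timestamps are 21 seconds apart, which is close to the "up to 19 seconds" budget given the one-second resolution of the log.

```python
from datetime import datetime

# Lines copied from the log excerpt in the original post.
LOG = """\
2015-10-29 03:36:20 [991f0221] info [native] Trying to connect to Server/Collector for up to 19 seconds
2015-10-29 03:36:41 [991f0221] severe [native] Exception while connecting to collector, info:<connect()/apr_socket_connect(), 70007, Connection timed out>
2015-10-29 03:36:41 [991f0221] warning [native] Unable to register with Server/Collector <server>:9998, CONTINUING WITHOUT INSTRUMENTATION.
"""


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def connect_gap_seconds(log_text):
    """Seconds between 'Trying to connect' and the next 'Unable to register'.

    Returns None if either message is missing.
    """
    start = end = None
    for line in log_text.splitlines():
        if start is None and "Trying to connect" in line:
            start = timestamp(line)
        elif start is not None and "Unable to register" in line:
            end = timestamp(line)
            break
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


print(connect_gap_seconds(LOG))  # 21.0
```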

Increasing the timeout is an option, but I'm concerned about the distance issue mentioned by David. As a test, you could add wait=30 to the -agentpath parameter. It's just another comma-delimited token on the end; don't forget the comma. But even if this solves it, I still suspect your collector is too far away.
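To make the "comma-delimited token" point concrete, here is a small sketch that appends wait=30 to an agent string. The library path, agent name, and collector host below are placeholders, not values from this thread, and the helper `with_wait` is hypothetical; check your own startup string for the options actually in use.

```python
def with_wait(agentpath, seconds):
    """Return the agent string with wait=<seconds> as the last option.

    Any existing wait= token is replaced. Assumes the string has the shape
    '-agentpath:<library>=<opt1>,<opt2>,...' and that the library path
    itself contains no '=' character.
    """
    library, _, options = agentpath.partition("=")
    tokens = [t for t in options.split(",") if t and not t.startswith("wait=")]
    tokens.append(f"wait={seconds}")
    return f"{library}={','.join(tokens)}"


# Placeholder values for illustration only.
original = "-agentpath:/opt/dynatrace/agent/lib64/libdtagent.so=name=MyApp,server=collector.example.com:9998"
print(with_wait(original, 30))
# -agentpath:/opt/dynatrace/agent/lib64/libdtagent.so=name=MyApp,server=collector.example.com:9998,wait=30
```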

What is the default wait time?

From https://community.dynatrace.com/community/display/DOCDT65/Java+Agent+Configuration it looks like the default is now 20 seconds.

James

Thanks! I guessed as much while checking many agents' log files.

ahmed_el_jafouf
Dynatrace Pro

As @David and @Joseph mentioned, you need to verify that the latency between the Agent and Collector is in fact low. The link between these two components has to be a low-latency one. This generally boils down to having the agent and collector on the same LAN in the same data center.
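One rough way to check this from the agent host is to time a few plain TCP connects to the collector port. This is only a sketch: it measures the TCP handshake, not the agent registration itself, and the host name in the usage comment is a placeholder. The function names are made up for illustration.

```python
import socket
import time


def connect_time_ms(host, port, timeout=5.0):
    """Time one TCP connect to host:port.

    Returns the elapsed time in milliseconds, or None if the connection
    fails (refused, timed out, or the name cannot be resolved).
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def sample(host, port, attempts=5, timeout=5.0):
    """Collect several connect timings so one slow handshake does not mislead."""
    return [connect_time_ms(host, port, timeout) for _ in range(attempts)]


# Usage, run on the agent host (placeholder collector name):
#   print(sample("collector.example.com", 9998))
```

Values consistently in the low single-digit milliseconds would suggest a local link; results of None, or timings in the hundreds of milliseconds, would point toward the distance or firewall questions raised above.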

Potentially related?
What does <server> stand for in the log you posted? Is this the hostname/IP address of the Dynatrace Server or Collector? Even though Dynatrace settings such as the agent string refer to it as "Server" this must be pointing to the Dynatrace Collector. (The "Server" naming is a legacy artifact from the past when there was no such thing called Collector.) You can prevent unintended Agent-to-Server connections by disabling the Embedded Collector in Settings > dynaTrace Server > Services > General.

Yes, the address is pointing to the collector. This environment has been up for a while and I'm relatively new to it. Based on what I'm hearing, the distance between the collector and agent is the most likely culprit. It's a production environment, so there's not much room for trying out solutions. I think the agents have been reconnecting nightly for some time, but I don't see any missing data in the past, so I wanted to check whether there could be any other explanation for why they didn't connect in time at startup. Thank you for your help; I'll keep watching.