
Dynatrace on VCS (Veritas Cluster)

sjoerd1
Advisor

Hi,

since you can't add attachments to existing topics, I have created this new topic in answer to Chris' question.

I'm currently working on a VCS integration. Today I ran a fail-over test and ran into some issues.

Situation (simplified)

(ip ranges just for the picture, no real addresses)

  •  Dynatrace is installed on shared filesystems, which are mounted by both Node1 and Node2 (at the same time).
  •  Clients and agents address Dynatrace via the server address of the Veritas Cluster; Veritas proxies all requests to the active node.
  •  When Veritas Cluster needs to switch Active<->Passive node, it takes the following actions:
    •  Node1: Call init.d script dynatracecollector stop
    •  Node1: Call init.d script dynatraceserver stop
    •  VCS: Switch IP address routing to new active node
    •  Node2: Call dynatraceserver start
    •  Node2: Call dynatracecollector start
  •  Because of the shared filesystem, the EXACT same copy of Dynatrace is started on a different host

 

Issues

  •  “dynatracecollector stop” and “dynatraceserver stop” do NOT wait until the processes have actually been killed (so the new node is activated before the old node has terminated)
  •  The init.d stop scripts do NOT stop the processes (sending kill -2 again or re-running the script doesn’t work either)
  •  When Node2 starts up, it complains “License locked to different machine”

 

Questions / remarks

  1.  Is it a known bug that the stop scripts do not stop the processes on Red Hat Linux? (I waited a long time; nothing appeared in any log file and the processes never terminated.)
  2.  Is it possible to add a -wait flag to the start/stop scripts so that they wait until startup or shutdown has completed?
  3.  Is there a way to get the license in “temporary grant mode” so that we have 2 days to get an emergency license?
  4.  Is there a way to share the license between cluster nodes? I tried updating server.xml to use dynatrace.server.com with its IP address (see the example above) as the server address, but this is immediately overwritten with node1 and its address.

 

 

 

 

Attachment: Questions Dynatrace in combination with VCS.png

 

5 REPLIES

jeffery_yarbrou
Dynatrace Organizer
  1. I do not know of any issues that would cause the init.d scripts to fail with the <stop> option.
    What version of dynaTrace are you running? 
    If you execute the commands manually, do they work as expected?  
    What directory are you executing the scripts from?
    What user account is executing the scripts?  
    Does the user have the proper permissions?
  2. The server processes do need time to initialize fully. When you execute the init.d script ./dynaTraceServer start, both the front-end server and the back-end server start up. Start times vary with CPU speed, memory, and system load, to name a few factors.
    It is feasible to write a wrapper shell script that calls the init.d scripts with wait times.  You can then execute ./dynaTraceServer status and look for the following:
    dynaTrace Server daemon is running:
    and
    dynaTrace FrontendServer daemon is running:
    The dynaTrace front-end server and dynaTrace back-end server also need time to shut down cleanly and complete running tasks; this may be what you are encountering.
    In the end you may have to write a custom shell script to encapsulate the wait times and/or checks to verify the server processes have started or stopped completely. 
  3. The license is tied to the machine; you would need a full license for the second machine.
  4. This is not possible; you would need a full license for the second node. As long as you are only running one node at a time, this should be within the license agreement.
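
The wrapper idea from point 2 could look roughly like the sketch below. It polls the status command until both "is running" lines quoted above appear, or gives up after a timeout. The script path in DT_SERVER_INIT, the function name wait_for_server_start, and the 300 s / 5 s defaults are assumptions for illustration; only the two status strings come from this thread.

```shell
#!/bin/sh
# Hypothetical location of the init.d script; adjust to the local install.
DT_SERVER_INIT="${DT_SERVER_INIT:-/etc/init.d/dynaTraceServer}"

# wait_for_server_start [timeout_seconds] [poll_interval_seconds]
# Returns 0 once the status output contains both "is running" lines,
# or 1 if that has not happened within the timeout.
# The poll interval must be at least 1, otherwise the loop never times out.
wait_for_server_start() {
    timeout="${1:-300}"
    interval="${2:-5}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        status_output=$("$DT_SERVER_INIT" status 2>&1)
        if printf '%s\n' "$status_output" | grep -q 'dynaTrace Server daemon is running' &&
           printf '%s\n' "$status_output" | grep -q 'dynaTrace FrontendServer daemon is running'; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}
```

From a VCS online script this might be called as `"$DT_SERVER_INIT" start && wait_for_server_start 300 5`, treating a non-zero return as a failed start.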

The license files would be distinct for each machine/node, so you would not share these across the two nodes.  In practical terms, there are only a set of relevant directories and files that you should share between the two nodes.  

For the dynaTrace Collector, is this running as a plugin collector or is this collector receiving data from agents?  It may be a better design to have the collector on a separate host.  This would limit the single point of failure and provide more flexibility with the collectors.  We would need a better understanding of your specific setup to make further recommendations.

Please let me know if this helps or if you have additional questions.

Thanks,

sjoerd1
Advisor

Hi Jeff,

Thanks for answering!

Some answers to your points:

  1. dynaTrace 6.1 latest fixpack
    Running the script manually, or manually sending kill -2 to the startup process, does not work either.
    The scripts are executed from the init scripts, with full directory names, after su'ing to the correct user; the VCS log files show the scripts reporting back "Terminating ... pid 12345".
    A dedicated user, dynatrace, was created to run the processes and the scripts.
    The user has the correct permissions.
  2. The startup process isn't really a concern; the stop script is more critical. Just sending kill -2 to a process doesn't guarantee that it terminates... I can script something that waits for, say, 2 minutes and then does a hard kill if the process hasn't stopped by then...
  3. ok
  4. ok
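
The "wait, then hard kill" idea from point 2 could be sketched as follows. This is only an illustration: the function name stop_with_timeout and the way the PID is obtained are assumptions (the thread does not say where the dynaTrace processes record their PIDs), and the 120-second default simply mirrors the "say 2 minutes" above.

```shell
#!/bin/sh
# stop_with_timeout PID [grace_seconds]
# Sends SIGINT (kill -2, the signal mentioned in this thread), polls once per
# second until the process is gone, and falls back to SIGKILL (kill -9) when
# the grace period runs out.
# Returns 0 if the process exited on its own, 1 if a hard kill was needed.
# Run it as root or as the user that owns the process; otherwise kill -0
# fails and the process is wrongly treated as already gone.
stop_with_timeout() {
    pid="$1"
    grace="${2:-120}"
    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0
    kill -2 "$pid" 2>/dev/null
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -9 "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Because the function only returns after the process has exited or been killed, VCS would not move on to activating the other node while the old processes are still running.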

In practice it is very hard to run multiple licenses, because we are using UEM, which needs to keep working across a switch as well.

This is a relatively simple Dynatrace setup with a limited number of agents, on a very heavy VCS cluster. The external collector process runs on the cluster node deliberately, as the node has more than enough processing power and memory, and everything is in the same network segment.

The only issue I cannot solve myself is the licensing story, where I need two server licenses sharing UEM visits...

My current idea is to save the license files for both setups and restore the matching ones on the node that becomes active, just before the startup sequence. This way I can keep my setup with completely shared directories (done to keep it simple and to be sure that all patches remain active when switching nodes).

If the scripts are not working when you run them manually, I would suggest opening a support case.  I'm positive the lab would want to know.   We should be able to get the scripts working as needed.  

I think I understand the license issue a little better now. Are you saying that you are using the same shared files on both servers, to keep it simple as you state?

You could do some clever things with symbolic links that would allow you to share the directories, as you are doing, while storing the license files in a directory that is not shared. The script that activates the new node can then create the symbolic link to the proper license file for that node. Either way, you would need to come up with a way for each node to read in its own license files.

I do not think there is any way around the license being tied to the machine with the licenses you are currently using. We do now offer a usage-based model; perhaps this may be an option.

Thanks,


sjoerd1
Advisor

I am now preparing a shell script to link the correct license file, based on the active node. Is it sufficient to only save and restore dtactivation.txt and dtlicense.key?

I cannot test at this moment, as I haven't yet received the additional license for the second, inactive node.

Thanks!

I believe so, but you may also want to link the following:

dtlicense_tmp.key
dtlicense.key
dt_keystore.bks
dt_keystore.salt
 

Also, you may need to make sure the cmdb.config.xml file is unique for each node. This file maintains the server configuration with hostnames.
You could test having this as a shared file, but I have only tested with it being unique.
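
A per-node linking step along these lines could be sketched as below. The file names are the ones listed in this thread; the function name link_node_license and both directory arguments are hypothetical, since the thread does not state where dynaTrace keeps these files, and whether the server accepts symbolic links for them is untested here.

```shell
#!/bin/sh
# link_node_license NODE_DIR TARGET_DIR
# NODE_DIR:   non-shared directory holding this node's own license files
# TARGET_DIR: shared directory where the server expects to find them
# Both paths are assumptions; pass whatever matches the local install.
# Note: ln -sf replaces whatever already exists at the target, so copy the
# current files into NODE_DIR before the first run.
link_node_license() {
    node_dir="$1"
    target_dir="$2"
    rc=0
    for f in dtactivation.txt dtlicense.key dtlicense_tmp.key \
             dt_keystore.bks dt_keystore.salt; do
        if [ -e "$node_dir/$f" ]; then
            # Point the shared location at this node's copy.
            ln -sf "$node_dir/$f" "$target_dir/$f" || rc=1
        elif [ -L "$target_dir/$f" ]; then
            # Remove a stale link left behind by the other node.
            rm -f "$target_dir/$f" || rc=1
        fi
    done
    return "$rc"
}
```

It would be called on the node that is becoming active, just before the server start, for example `link_node_license "/var/local/dt-license" "/path/to/shared/dir"` with both paths replaced by real ones. cmdb.config.xml is left out of the list on purpose, since it is only described above as possibly needing to be unique.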

Thanks,