Moving from Dynatrace Managed single node cluster to 2 nodes and addressing ALR

runatyr
Organizer

I am looking for assistance and clarification regarding two things.

I want adaptive load reduction (ALR) to stop occurring.

I am trying to figure out what traffic volume or trigger is causing ALR, so that I can increase the necessary resources on the Managed cluster node.

 

I currently have a single Managed cluster node and want to add a second one to distribute the traffic load and provide a layer of redundancy.

 

My concern is that the single node I have today is located in one data center.

Is it harmful or less desirable to use a network practice known as OTV (Overlay Transport Virtualization) between the cluster nodes? With OTV, both nodes reside in the same VLAN, but the VLAN is stretched and routed over IP, usually to a different physical location, so some latency occurs.

I am curious what latency tolerances apply between the member nodes of a Managed cluster.

Are there any commands or statistics that a Managed cluster can provide regarding performance between the nodes?

 

Please advise and thanks!


Radoslaw_Szulgo
Inactive

You have to ensure near-zero latency (< 10 ms) between cluster nodes. The nodes also have to keep their clocks synchronized with NTP. See also: https://www.dynatrace.com/support/help/shortlink/managed-requirements#multi-node-installations

You can rely on typical network tools to check metrics/network reliability between nodes. You can also use Cassandra nodetool to check network histograms: https://cassandra.apache.org/doc/latest/troubleshooting/use_nodetool.html


nodetool is in /utils/cassandra-nodetool.sh
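As a rough illustration of the "typical network tools" approach, here is a minimal Python sketch that times TCP handshakes from one node to another and compares the worst sample with the 10 ms figure mentioned above. The peer address `10.0.0.12` and port `443` are placeholders I made up, not values from Dynatrace; point it at a real peer node and a port you know is open between the nodes. A TCP handshake time is only an approximation of network round-trip latency.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout_s: float = 2.0) -> float:
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout_s):
        pass  # connection is closed again right away
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms: list[float], limit_ms: float = 10.0) -> dict:
    """Summarize samples and flag whether the worst one stays under the limit."""
    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
        "within_limit": max(samples_ms) < limit_ms,
    }


if __name__ == "__main__":
    # Hypothetical peer node and port; replace with your own values.
    peer, port = "10.0.0.12", 443
    samples = [tcp_connect_ms(peer, port) for _ in range(20)]
    print(summarize(samples))
```

Run it from each node toward the others, ideally at several times of day, since a stretched VLAN can look fine when idle and degrade under load.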

Senior Product Manager,
Dynatrace Managed expert

dave_mauney
Dynatrace Champion

Two nodes should be avoided. I consider a single node superior to a two-node cluster because it avoids exposure to the "split brain" problem that a two-node cluster entails.

I would look at scaling vertically on the single node, and then going to a three-node cluster if that is not sufficient.
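To make the two-node point concrete, here is a small Python sketch of generic majority-quorum arithmetic. It assumes a simple "more than half of the nodes must agree" model, which is my simplification for illustration and not a statement of how Dynatrace Managed decides internally; the function names are made up.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority still remains."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} node(s): quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under that model, a 2-node cluster tolerates zero node failures, the same as a single node, while 3 nodes tolerate one. If the link between two nodes drops, neither side holds a majority, which is the split-brain exposure described above.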

Hi Dave,

Thank you for the valuable input. Would it perhaps make sense to document the fact that we should avoid 2-node installations, for example here:

https://www.dynatrace.com/support/help/setup-and-configuration/dynatrace-managed/installation/dynatr...

And here:

https://www.dynatrace.com/support/help/setup-and-configuration/dynatrace-managed/basic-concepts/dyna...

In my opinion, this is currently not expressed by Dynatrace's documentation.

Hi Kallie,

I agree we need to update our docs and have requested this of the documentation team.

Thanks,

dave

Thanks Dave! I have one follow-up question, since you appear to have some expertise in the subject matter. If we do end up in the split-brain scenario, are there any feasible methods to recover from it (while keeping the data), or is it best to simply restore a previous Managed backup?

I believe the recovery mainly involves some commands you have to issue against Cassandra. Support can help you recover in the event it happens. @Radoslaw S. might also have some instructions. I have never recovered from one personally...

Thanks again Dave, appreciate it! Good stuff to know 🙂

You have to run cassandra-nodetool.sh repair after the nodes have been disconnected for at least 3 hours. This is because Cassandra has a mechanism called "hinted handoff", which sets aside all chunks of data that have not yet been synchronized. It can store at most 3 hours of data.

Upgrades usually don't take that long, so a 2-node cluster should be safe in that respect.
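The rule of thumb above can be written down as a tiny Python check. This is only a sketch of the decision described in this post: the 3-hour window is taken from the text, and the name `needs_repair` is made up for illustration. It does not call nodetool or inspect Cassandra in any way.

```python
from datetime import timedelta

# Hint window as stated in this thread; treat it as an assumption for your setup.
HINT_WINDOW = timedelta(hours=3)


def needs_repair(outage: timedelta, hint_window: timedelta = HINT_WINDOW) -> bool:
    """Return True when the outage lasted at least as long as the hint window,
    i.e. hinted handoff alone can no longer be expected to resynchronize."""
    return outage >= hint_window


if __name__ == "__main__":
    print(needs_repair(timedelta(minutes=45)))  # short upgrade window
    print(needs_repair(timedelta(hours=5)))     # long network partition
```

In practice you would feed it the measured disconnection time from your monitoring or change log, and run the repair manually when it returns True.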

Senior Product Manager,
Dynatrace Managed expert

Sounds good, thanks for the information Radoslaw!

AntonioSousa
DynaMight Guru

Dynatrace recommends that you use 3 nodes and not 2:

https://www.dynatrace.com/support/help/setup-and-configuration/dynatrace-managed/installation/dynatr...

I too agree that there should be an indication of which factors trigger ALR, as that is affecting one of my clients too.

Antonio Sousa

runatyr
Organizer

Thank you all for the help on this. We will be going to a 3-node solution. The good news is that the servers will be in the same VLAN, and OTV will not be used to stretch the VLANs across different physical or logical locations.

ChadTurner
DynaMight Legend

Ideally, a 3-node setup is what you would want.

-Chad
