Solved: Re: When Managed Nodes each capture different data for same period of time, what would happen?

waikeat_chan · ‎13 Jul 2020

A multi-node Managed cluster is deployed at a banking customer site, with some nodes in the Production data centre and some in the DR data centre.

This customer also uses the same IPs and hostnames for the banking servers in both the DR and the Production data centres.

Currently, this customer performs a DR simulation by cutting off the connection between the Prod and DR data centres, then spinning up all server instances in the DR data centre and verifying that the servers and services there run fine.

Thus, unlike a real DR, where either only Prod or only DR captures data, in this DR simulation both sides capture data at the same time.

So before the cut-off, data is captured and analyzed fine. During the cut-off, the Prod nodes of the Managed cluster capture data on their own, and so do the DR nodes. The question arises once the cut-off ends: "When the DR simulation finishes and the Prod and DR nodes of the Managed cluster can talk to each other again, they will find that both sides captured different data for the same period of time. What will the data look like for that period?"

1. Would it be empty and no-data?

2. If not empty, would it use the data captured by the Prod nodes or by the DR nodes?

3. If none of the above, how does Managed decide which data to use?

Best Regards,

Wai Keat

Radoslaw_Szulgo · ‎13 Jul 2020

When a node is unavailable, Cassandra stores data "on the side" using so-called "hinted handoffs" (https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff...)

It can store up to 3 hours of unsynchronized data. After that, the data will be overwritten and lost.

After nodes are up again, data from hinted handoffs will be synchronized to missing nodes.

If the cut-off is longer than 3 hours, additionally cassandra would require a repair to be run, to make the data consistent back again.

If the outage lasts longer than 3 days, the nodes have to be rebuilt, as Cassandra will remove unavailable nodes.
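
To make the thresholds above concrete, here is a minimal Python sketch that maps an outage duration to the recovery path described in this post. The function name `recovery_action` and the constant names are hypothetical. The 3-hour and 3-day values are simply the figures quoted above, not values read from any Cassandra or Dynatrace configuration, and treating each boundary as inclusive is an assumption.

```python
from datetime import timedelta

# Figures quoted in this post (assumed here, not read from any configuration).
HINT_WINDOW = timedelta(hours=3)       # hinted handoffs cover up to 3 hours
REBUILD_THRESHOLD = timedelta(days=3)  # beyond 3 days, nodes must be rebuilt


def recovery_action(outage: timedelta) -> str:
    """Return the recovery path for a given cut-off duration (hypothetical helper)."""
    if outage <= HINT_WINDOW:
        return "hinted handoff"  # data is synchronized automatically
    if outage <= REBUILD_THRESHOLD:
        return "repair"          # a Cassandra repair is needed
    return "rebuild"             # nodes have to be rebuilt


if __name__ == "__main__":
    for outage in (timedelta(minutes=90), timedelta(hours=12), timedelta(days=5)):
        print(outage, "->", recovery_action(outage))
```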

See also failover mechanism: Managed failover mechanism

Another risk: I assume they have equally balanced Prod and DR DCs, for example 3 nodes in Prod and 3 nodes in DR. If that's the case, neither side has a majority of nodes up once the cut-off is done, and data cannot be stored or read properly. To address that scenario, look at the Premium HA offering here: Premium HA for multi-data centers
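
The majority argument can be checked with a few lines of Python. This is only a sketch of the arithmetic: `has_majority` is a hypothetical helper, and "majority" is taken to mean strictly more than half of the cluster nodes, which is an assumption about how the rule is applied.

```python
def has_majority(nodes_up: int, cluster_size: int) -> bool:
    """True if strictly more than half of the cluster nodes are reachable."""
    return nodes_up > cluster_size // 2


if __name__ == "__main__":
    # 3 + 3 split: each side sees only 3 of 6 nodes, so neither has a majority.
    print(has_majority(3, 6))  # False
    # An unbalanced 4 + 2 split would leave the larger side with a majority.
    print(has_majority(4, 6))  # True
```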

Senior Product Manager,
Dynatrace Managed expert

Yosi_Neuman · ‎14 Jul 2020

Hi @Radoslaw S.

First, a few facts about our case:

The customer has 2 DCs connected by a dedicated communication tunnel, with latency that can sometimes reach 15 ms.
Both DCs run the same application, but traffic is sent to only one DC at a time.
The plan is to install 3 DTMCs in each DC (i.e. 6 DTMCs in total).
The customer will use the failover mechanism (not the High Availability one).

Now the questions are:

1. Is there a way to estimate the load that cluster communication will put on the tunnel between the 2 DCs? Can it be calculated with the Managed sizing Excel?
2. We know the limitation for failover is 10 ms latency. What will happen if the latency reaches 15 ms or more?
3. Is data sync guaranteed after a short cut-off (less than 3 hours) between the 2 DCs?
4. What steps will the customer need to take after a medium cut-off (more than 3 hours)?
5. What steps will the customer need to take after a long cut-off (more than 3 days)?

Thanks in advance

Yos

dynatrace certificated professional - dynatrace master partner - Matrix Soft Ware Division - Israel

Radoslaw_Szulgo · ‎15 Jul 2020

1. That will be hard. I'd check how much traffic you have between nodes within a DC; it should be similar.

2. 15 ms should be fine. The issues that arise with higher latency are data consistency and performance.

3. Yes, that's built into Cassandra.

4. Usually a Cassandra DB repair (nodetool repair) is needed to synchronize data between nodes. The visible symptom might be gaps in the data. We will work on running that automatically in the near future.

5. A Cassandra DB rebuild (nodetool rebuild) might be needed to synchronize the data from scratch on the node that was lost.

Moreover, you need to be aware that if you cut off 3 nodes out of 6, you can lose some data, because data is stored with a replication factor of 3 (there's a chance you cut off nodes that contain all replicas of a given piece of data). What could help here is a rack-aware setup, so that Cassandra and Elasticsearch are aware of node locations and do not store all copies of data in a single location. That will be supported in the near future.
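
As a rough illustration of that risk, the sketch below enumerates every way to place 3 replicas on 6 nodes and counts the placements that fall entirely inside the cut-off half. It assumes replicas land on any 3 distinct nodes with equal likelihood, which is a simplification: real Cassandra placement follows the token ring and the replication strategy. The node names and the helper `fraction_fully_cut_off` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def fraction_fully_cut_off(nodes, cut_off, replication_factor):
    """Fraction of replica placements whose nodes are all in the cut-off set.

    Assumes every combination of `replication_factor` distinct nodes is
    equally likely (a simplification of real Cassandra placement).
    """
    cut = set(cut_off)
    placements = list(combinations(nodes, replication_factor))
    isolated = [p for p in placements if set(p) <= cut]
    return Fraction(len(isolated), len(placements))


if __name__ == "__main__":
    nodes = ["prod1", "prod2", "prod3", "dr1", "dr2", "dr3"]
    dr_side = ["dr1", "dr2", "dr3"]
    # Only 1 of the 20 possible placements puts all 3 replicas on the DR side.
    print(fraction_fully_cut_off(nodes, dr_side, 3))  # 1/20
```

Under this simplified model a small share of the data has no replica on the other side of the cut, which is the situation a rack-aware setup is meant to prevent.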


Yosi_Neuman · ‎15 Jul 2020

Thanks @Radoslaw S. for the detailed answer.

We will wait for the update on supporting rack-aware.

All the best and stay safe.

Yos


Radoslaw_Szulgo · ‎15 Jul 2020

Rack configuration will be possible if you can set up 3 racks. With only 2 racks, the setup is vulnerable to split brain (just like running only 2 nodes); in that case the only turnkey solution providing fault tolerance is Premium HA.


waikeat_chan · ‎21 Jul 2020

Hi Radoslaw,

When you mentioned "If the cut-off is longer than 3 hours, additionally cassandra would require a repair to be run, to make the data consistent back again. "

How exactly do I run the repair?

Thanks.

Best Regards,

Wai Keat

Radoslaw_Szulgo · ‎21 Jul 2020

/utils/cassandra-nodetool.sh repair -full -dc

However, I'd rather you didn't run that on your own, unless your metric storage is small. This operation can take a very long time and is resource-intensive. I'd recommend contacting Dynatrace ONE to open a support case so we can assist you.

In the near future we will automate that process so this will not be required.


thomas_steinma1 · ‎23 Jul 2020

@Radoslaw S.,

The above nodetool command won't work at all, because the "repair" option is missing.
Even with the "repair" option, "-pr" alone will likely be rejected by our nodetool wrapper script.
As you know, we had some interesting findings in the Cassandra repair area while building our HA solution. It can even be dangerous for the Cassandra cluster if we communicate wrong usage of nodetool repair to the outside world.

Radoslaw_Szulgo · ‎23 Jul 2020

You are right. I've updated my comment with the proper "-full -dc" option, which is the only way.

As I mentioned in my comment as well, I don't recommend running that without Dynatrace assistance.

