
Running Cassandra Nodetool cleanup when adding nodes

GilesDay
Advisor

Has anyone else had storage issues even after adding nodes to your cluster? We did and this is what we found out.

Note: There doesn't seem to be any Dynatrace documentation that covers this (either internally via support or in the Managed docs).

We had just done a significant deployment of OneAgents a week or so earlier.

We found ourselves running up on 2TB of Cassandra data on each of our 5 nodes, so we added a 6th, and found it quickly went to 2TB. So we added a 7th and 8th node, which also quickly (within a week) went to 2TB. According to the DT docs, 2TB is the maximum recommended for Cassandra, so we kept adding nodes as we were storage constrained (CPU and memory are fine).

 

We asked support why we were being hammered so much. We expected some growth since we had just done a big deployment, but it wasn't adding up. So we did some research on Cassandra and found this: Adding, replacing, moving and removing nodes | Apache Cassandra Documentation.

"As a safety measure, Cassandra does not automatically remove data from nodes that "lose" part of their token range due to a range movement operation (bootstrap, move, replace). Run nodetool cleanup on the nodes that lost ranges to the joining node when you are satisfied the new node is up and working. If you do not do this the old data will still be counted against the load on that node."
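As a rough sanity check on how much stale data that quote implies, here is a minimal sketch in Python. It assumes token ranges are evenly balanced, the replication factor is unchanged, and no new data arrives while scaling out, none of which is guaranteed on a real cluster. The function names are made up for illustration.

```python
def stale_fraction(old_nodes: int, new_nodes: int) -> float:
    """Fraction of an original node's data it no longer owns after scale-out.

    Assumption: ranges are evenly balanced, so each node's share of the
    data shrinks from 1/old_nodes to 1/new_nodes of the total.
    """
    if not 0 < old_nodes <= new_nodes:
        raise ValueError("expected 0 < old_nodes <= new_nodes")
    return 1 - old_nodes / new_nodes


def reclaimable_tb(data_tb: float, old_nodes: int, new_nodes: int) -> float:
    """Rough estimate of space (in TB) that cleanup could free on one original node."""
    return data_tb * stale_fraction(old_nodes, new_nodes)


if __name__ == "__main__":
    # Numbers from this post: about 2TB per node, growing from 5 to 8 nodes.
    print(reclaimable_tb(2.0, 5, 8))
```

This is only an upper-level estimate. Nodes added in the middle of the expansion give up less, and the last node added gives up nothing.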

So we ran nodetool cleanup on each node, starting with the second-to-last node added, then ran 2 at a time, including the last node we added. We were able to recover over 0.5TB of space on each node except the last. Each node took roughly 12-13 hours to complete the cleanup, except the last node, which took less than a second (which is what we expected, as it wouldn't have had any data ranges taken from it). Now, weeks later, we have finally stabilized (it seems). We haven't seen any ill effects after doing this. We're not planning on removing any nodes as we still expect to grow.
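For anyone wanting to script this, the following is a minimal sketch of running the cleanup one node at a time. It is not the exact procedure used above (which ran up to 2 nodes at once), and it makes several assumptions: `nodetool` is on the PATH, remote JMX access via `-h` is allowed in your setup (otherwise run `nodetool cleanup` locally on each node), and the host names are placeholders. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List, Sequence


def cleanup_command(host: str, nodetool: str = "nodetool") -> List[str]:
    """Build the nodetool invocation for one node (host name is a placeholder)."""
    return [nodetool, "-h", host, "cleanup"]


def run_cleanup(hosts: Sequence[str], dry_run: bool = True,
                nodetool: str = "nodetool") -> List[List[str]]:
    """Run cleanup on each host in turn and return the commands used.

    With dry_run=True nothing is executed; the commands are only printed.
    With dry_run=False each cleanup blocks until it finishes, and a
    non-zero exit code raises CalledProcessError and stops the loop.
    """
    commands = [cleanup_command(h, nodetool) for h in hosts]
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical host names; list only nodes that lost ranges to new nodes.
    run_cleanup(["cassandra-node-1", "cassandra-node-2"])
```

Given that each cleanup took 12-13 hours here, running it sequentially keeps the extra disk and I/O load to one node at a time, at the cost of a longer total run.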

 

Have you done this before? Is this something Dynatrace can look into and, after testing, add to the documentation as a step to perform after adding a node?

 

 

Why do App Developers have high insurance rates? (gnihsarc peek yehT)

jason_gs
Dynatrace Contributor

I'd raise an RFE for documentation on managing the cluster when scaling out, and provide the links you found as references for this use case. It's more likely to be picked up for inclusion in the documentation via that route.
