
Problem with restore cluster from backup

radek_jasinski
DynaMight Guru

Hi,

I have a problem when trying to restore a cluster from a backup. The machines are identical in terms of software and hardware. I'm following the step-by-step instructions: https://www.dynatrace.com/support/help/shortlink/managed-cluster-restore#restore-from-backup

In step 3, I get an error when starting Cassandra:

ERROR [main] 2023-07-11 11:13:53,823 UTC CassandraDaemon.java:803 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any peers
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1603)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:628)
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:888)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:745)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:694)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:395)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:633)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:786)
INFO  [StorageServiceShutdownHook] 2023-07-11 11:13:53,877 UTC HintsService.java:210 - Paused hints dispatch
WARN  [StorageServiceShutdownHook] 2023-07-11 11:13:53,877 UTC Gossiper.java:1728 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
 
I'm betting it's a network connection problem, but I need to confirm it.
Has anyone encountered this before?
 
Radek
Have a nice day!
6 REPLIES

islam_zidan
Champion

Hi Radek,

How many nodes do you have? Are they on the same subnet? If not, what is the latency between the subnets?

 

BR,

Islam

Dynatrace Certified Professional - Dynatrace Partner - Yourcompass.ca

Radoslaw_Szulgo
Inactive

I can confirm this is a network issue. Gossip is the protocol Cassandra uses to negotiate connection details between nodes. Make sure there is a working network connection to the nodes you configured in step 2 with --cluster-nodes.
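
One way to confirm or rule out a network problem is to test whether the Cassandra inter-node port is reachable from the machine being restored. The following is a minimal Python sketch, not part of the Dynatrace tooling. It assumes port 7000, which is Cassandra's default inter-node (storage) port; your deployment may use a different one. The function names can_connect and check_nodes are made up for this example.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nodes(hosts, port: int = 7000, timeout: float = 3.0) -> dict:
    """Map each host to whether its inter-node port accepted a connection.

    The default port 7000 is an assumption (Cassandra's default storage port).
    """
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling check_nodes with the addresses you passed to --cluster-nodes returns a dictionary of host to True/False. A False entry only shows that the TCP connection failed; it does not say whether a firewall, routing, or a stopped Cassandra process is the cause.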

Senior Product Manager,
Dynatrace Managed expert

Hi Radek,

Yes, I have verified all the requirements. For now, this environment has only one node because it's a test environment and we are testing the backup procedure before running it on the production environment.

Have a nice day!

Hm.. so if that's only a single node... there should be no networking issue as there's no one to gossip with 😉

Senior Product Manager,
Dynatrace Managed expert

The development environment at the client looks like this:

1. The old DT server from which the backup was created
2. The network share on which the backup is stored (mounted on two hosts).
3. The new DT server to which we make a restore from the backup.

Server 1 is disabled when the new server is started after the restore is done. Everything seems to be correct, hence I'm surprised by this error. I will try again tomorrow and let you know! 😊

Have a nice day!

radek_jasinski
DynaMight Guru

Ok, the matter turned out to be simpler than I thought 😁
What was missing was the --seed-ip parameter when running the restore command.
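
A small pre-flight check can catch this kind of omission before the installer is run. The sketch below is only an illustration: the helper missing_options is hypothetical, the only option names it knows are the two mentioned in this thread (--cluster-nodes and --seed-ip), and the argument value is a placeholder. The actual restore command may require further options, so check the linked instructions.

```python
# Options mentioned in this thread; the real restore command may need more.
REQUIRED_OPTIONS = ("--cluster-nodes", "--seed-ip")


def missing_options(args, required=REQUIRED_OPTIONS):
    """Return the required options that do not appear in args.

    Accepts both "--opt value" and "--opt=value" forms.
    """
    present = {arg.split("=", 1)[0] for arg in args if arg.startswith("--")}
    return [opt for opt in required if opt not in present]


# Hypothetical argument list with a placeholder value and no --seed-ip.
restore_args = ["--cluster-nodes", "<node-list>"]
print(missing_options(restore_args))
```

For the argument list above, the check reports --seed-ip as missing, which is the situation that led to the "Unable to gossip with any peers" error here.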

By the way, the Cassandra error message is highly specific in this case.

Radek

Have a nice day!
