In this troubleshooting article, you'll find the most common issues and their resolutions related to the Managed Cluster backup.
Note: Since Managed Cluster v1.276, instead of creating a test file called "testfile" on every node, a random file name is generated, e.g. [timestamp].test
For the PHA (Premium High Availability) Managed Cluster deployment, the test/backup is only performed in one DC.
Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: Failed to find a test file.
This warning message will appear only for Managed Cluster deployments with more than one node.
Backup NFS storage is mounted only on one node out of multiple (depending on the size of the Managed Cluster deployment)
NFS storage issues on a single node (indicated by the error message). The storage might be mounted at the OS level, but it is not responding or is timing out
Local storage is being used instead of NFS storage (a backup directory with the same name is created locally on each node), so nodes can't see the "test file" on a non-shared disk – a quick mount check is sketched below
The NFS cache option is used for the NFS storage (the file is created/stored in the cache instead of on the disk and is not visible to other nodes)
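A quick way to confirm that the backup path really is a shared NFS mount on every node is to compare the mount source on each of them. This is only a sketch; the path below is the example path used later in this article, so replace it with your own backup directory:
# Run on EVERY node; all nodes should report the same NFS source (server:/export)
# and a filesystem type of nfs/nfs4. A local type such as ext4/xfs here means the
# directory lives on a local disk, not on the shared NFS storage.
findmnt -T /home/patryk/share -o TARGET,SOURCE,FSTYPE,OPTIONS
df -hT /home/patryk/share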
Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: No permissions to write to the directory.
A directory is missing on the storage compared to the path provided during the backup configuration
The directory is not owned by the "dynatrace" user (the user may differ, depending on customer customization during Managed Cluster installation)
The "dynatrace" user doesn't have sufficient permissions to write to or read from the NFS storage
The "dynatrace" user has a different UID and GID across Managed Cluster nodes (see the ownership check sketched below)
Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: Node communication problem.
In most cases, one or more nodes are not communicating directly with the rest of the Managed Cluster nodes. This can be caused by firewall issues, network issues, network overload, etc.
There might be a situation where one of the Managed Cluster nodes is offline.
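A basic reachability check between nodes can help narrow this down. The IP address below is the one from the example error message, and the port is only a placeholder assumption; use the ports your cluster nodes are actually configured to communicate on:
# From another cluster node, check that the affected node responds at all
ping -c 3 172.18.150.29
# Check that a specific TCP port is reachable (8443 is only a placeholder,
# substitute the cluster communication ports used in your environment)
nc -zv 172.18.150.29 8443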
Check that your directory exists on EVERY node.
Check that your directory is owned by the dynatrace user on EVERY node.
Check that your directory has read/write permissions on EVERY node.
ls -ltra   # shows ownership and permissions by user/group name
ls -ltna   # shows numeric UID/GID, useful for spotting UID/GID mismatches between nodes
Create a test file in your directory on EVERY node.
sudo -u dynatrace -g dynatrace touch /home/patryk/share/test_file_node1.txt
(this example creates the test_file_node1.txt file as user/group dynatrace; if a different username is used to run Dynatrace processes, change it accordingly)
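If the cluster has, for example, three nodes, the same command can be repeated on each node with a file name that identifies that node. The node names (node1/node2/node3) and the path are placeholders taken from the example above:
# On node1
sudo -u dynatrace -g dynatrace touch /home/patryk/share/test_file_node1.txt
# On node2
sudo -u dynatrace -g dynatrace touch /home/patryk/share/test_file_node2.txt
# On node3
sudo -u dynatrace -g dynatrace touch /home/patryk/share/test_file_node3.txt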
For example, open test_file_node1.txt from node2, add some text to it, and save it.
Check that test_file_node1.txt has the correct content on both node1 and node3.
Do the same with the other files in a similar manner.
From node3, delete, for example, test_file_node1.txt.
From node1, delete, for example, test_file_node2.txt.
From node2, delete, for example, test_file_node3.txt.
(A consolidated sketch of these cross-node checks follows below.)
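As a minimal sketch of the whole cross-node check, assuming the three test files created above and the example path, the steps could look like this (run each command on the node indicated in the comment):
# On node2: write to the file created by node1
echo "written from node2" | sudo -u dynatrace -g dynatrace tee -a /home/patryk/share/test_file_node1.txt
# On node1 and node3: confirm the new content is visible
cat /home/patryk/share/test_file_node1.txt
# Cross-delete the files to confirm delete works from every node
# On node3:
sudo -u dynatrace -g dynatrace rm /home/patryk/share/test_file_node1.txt
# On node1:
sudo -u dynatrace -g dynatrace rm /home/patryk/share/test_file_node2.txt
# On node2:
sudo -u dynatrace -g dynatrace rm /home/patryk/share/test_file_node3.txt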
In most cases, these troubleshooting steps resolve the misconfiguration or other related issues that prevent the Dynatrace backup from being configured.
If the above scenarios are working fine, other options to try are:
Disable NFS cache
Ensure that NFS latency is lower than 10 ms (a rough caching and latency check is sketched below)
Check that the NFS server is correctly configured
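To see whether client-side NFS caching is in play and to get a rough feel for latency, the following can be checked. The mount options shown depend on your NFS client, and the dd command is only a crude write-latency probe, not a precise measurement:
# Show the mount options of all NFS mounts; attribute caching is controlled by
# client-side options such as ac/noac and actimeo
findmnt -t nfs,nfs4 -o TARGET,SOURCE,OPTIONS
# Crude synchronous write probe on the backup share (path is the article's example path)
time dd if=/dev/zero of=/home/patryk/share/latency_probe bs=4k count=1 oflag=sync
rm /home/patryk/share/latency_probe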
If this article did not help, please open a support ticket, mention that this article was used, and provide the following in the ticket: