PatrykAdamowicz
Dynatrace Guide

Abstract

In this troubleshooting article, you'll find the most common issues and their resolutions related to the Managed Cluster backup.

 

How does the backup path test work? (scenario for Managed Cluster with 3 nodes)

  • A test file is created on the NFS storage from node1 and each node checks that the file can be seen. If successful, it's removed by node1.
  • A test file is created on the NFS storage from node2 and each node checks that the file can be seen. If successful, it's removed by node2.
  • A test file is created on the NFS storage from node3 and each node checks that the file can be seen. If successful, it's removed by node3.

Note: Since Managed Cluster version 1.276, the test no longer creates a file called "testfile" on every node. Instead, it generates a random file name in the form [timestamp].test
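
One round of this test can be approximated by hand. The sketch below is not Dynatrace's implementation; it only mimics the idea on a single node. The function name path_test_round and the exact file-name format are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: mimics one round of the backup path test on this node.
# Usage: path_test_round /path/to/backup/dir
path_test_round() {
    dir=$1
    # Name modelled on the [timestamp].test pattern (the exact format is assumed).
    file="$dir/$(date +%s).test"

    # Create the test file; this fails if the directory is missing or not writable.
    touch "$file" || { echo "cannot create $file" >&2; return 1; }

    # On a real cluster, every other node would now check that it can see the file.
    if [ -e "$file" ]; then
        echo "test file visible: $file"
    else
        echo "test file not found: $file" >&2
        return 1
    fi

    # The node that created the file removes it again.
    rm -f -- "$file"
}
```

Running the function on one node and listing the directory from the other nodes before the file is removed gives a rough picture of what each round verifies.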

For the PHA (Premium High Availability) Managed Cluster deployment, the test/backup is only performed in one DC.

 

Problem 1 - "(...) Host 172.18.150.29: Failed to find a test file."

 
Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: Failed to find a test file.

This warning message will appear only for Managed Cluster deployments with more than one node.

What does this error mean?

  • It means that during the backup path test, the "test file" on the cluster node mentioned in the error description was not found.

What can cause this error to appear?

  • The backup NFS storage is mounted on only one of the nodes (the total number of nodes depends on the size of the Managed Cluster deployment)

  • NFS storage issues on a single node (the one named in the error message). The storage might be mounted at the OS level but not responding or timing out

  • Local storage is being used instead of NFS storage (a directory with the same name exists on each node, but each one is local), so nodes can't see the "test file" on a non-shared disk

  • The NFS cache option is used for the NFS storage (the file is created/stored in the cache instead of being created/stored on the disk and is not visible to other nodes)
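
A quick way to rule out the local-storage case is to look at the file system type behind the backup path on each node. The following is a minimal sketch, assuming GNU coreutils stat; the helper names fs_type_of and is_nfs_fstype and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for checking that the backup path is an NFS mount.

# Print the file system type of the given path (GNU coreutils stat assumed).
fs_type_of() {
    stat -f -c %T -- "$1"
}

# Succeed if the given type name is an NFS type.
is_nfs_fstype() {
    case $1 in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example (the path is a placeholder):
#   is_nfs_fstype "$(fs_type_of /path/to/backup)" || echo "not an NFS mount"
# To review the mount source and options on a node:
#   findmnt -T /path/to/backup -o TARGET,SOURCE,FSTYPE,OPTIONS
```

The check has to be repeated on every node, because a path can be an NFS mount on one node and a plain local directory on another.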

 

Problem 2 - "(...) Host 172.18.150.29: No permissions to write to the directory." or "No permissions to read/write to the directory."

Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: No permissions to write to the directory.

What do these errors mean?

  • They mean that during the backup path test, a test file couldn't be created on the NFS storage due to missing or insufficient permissions for the "dynatrace" user.

What can cause this error to appear?

  • A directory in the path provided during backup configuration is missing on the storage

  • The directory is not owned by the "dynatrace" user (the user name can differ if it was customized during Managed Cluster installation)

  • The "dynatrace" user doesn't have sufficient permissions to read from or write to the NFS storage

  • The uid and gid of the "dynatrace" user differ across Managed Cluster nodes
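
The last cause is easy to overlook, because the user name looks identical on every node while the numeric IDs differ. Below is a small sketch for collecting and comparing the IDs; the helper names and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# Print "uid:gid" (primary group) of a user on this node.
user_ids() {
    printf '%s:%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Read one "uid:gid" value per line on stdin; succeed only if all are identical.
all_ids_match() {
    [ "$(sort -u | wc -l)" -eq 1 ]
}

# Example: run `user_ids dynatrace` on each node, then compare the collected
# values (the numbers below are invented):
#   printf '%s\n' 998:998 998:998 1001:1001 | all_ids_match || echo "uid/gid mismatch"
```

NFS typically maps ownership by numeric ID, which is why a mismatch can surface as a permission error even though the directory appears to belong to the right user.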

 

Problem 3 - "(...) Host 172.18.150.29: Node communication problem."

Network path verification failed on some nodes. Error messages for each node are listed below. Host 172.18.150.29: Node communication problem.

What does this error mean?

  • It means that during the backup path test, communication issues were detected between Managed Cluster nodes

What can cause this error to appear?

  • In most cases, one or more nodes are not communicating directly with the rest of the Managed Cluster nodes. Firewall issues, network issues, network overload, etc., can cause this.

  • There might be a situation where one of the Managed Cluster nodes is offline.

 

Troubleshooting steps

  • Check that the backup directory exists on EVERY node.

  • Check that the directory is owned by the dynatrace user on EVERY node.

  • Check that the directory has read/write permissions on EVERY node.

ls -ltra

  • Check that the directory is owned by the uid and gid of the dynatrace user (the -n option lists numeric IDs). The primary group must be correct, because access through a secondary group does not always work.
ls -ltna
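
The checks above can be combined into one function and run on every node. This is a minimal sketch, assuming GNU stat; the function name check_backup_dir is hypothetical, and the expected uid and gid have to be supplied by whoever runs it.

```shell
#!/bin/sh
# Hypothetical helper: check one directory against an expected numeric uid and gid.
# Usage: check_backup_dir /path/to/backup EXPECTED_UID EXPECTED_GID
check_backup_dir() {
    dir=$1
    want="$2:$3"

    # The directory must exist on this node.
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }

    # Numeric owner and group, as ls -ltna shows for the "." entry (GNU stat assumed).
    have=$(stat -c '%u:%g' -- "$dir")
    [ "$have" = "$want" ] || { echo "owner is $have, expected $want" >&2; return 1; }

    # Read/write access for whoever runs this function.
    [ -r "$dir" ] && [ -w "$dir" ] || { echo "no read/write access: $dir" >&2; return 1; }

    echo "ok: $dir"
}
```

The read/write check applies to the user running the function, so the script should be run as the dynatrace user (or whichever user runs the Dynatrace processes) for the result to be meaningful.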

 

Run a manual test:

  1. Create a test file in your directory on EVERY node.

    sudo -u dynatrace -g dynatrace touch /home/patryk/share/test_file_node1.txt

    (This example creates the test_file_node1.txt file as user/group dynatrace. If a different user runs the Dynatrace processes, change the user and group accordingly.)

  2.  Edit every test file from a different node.

    For example, open test_file_node1.txt from node2, add some text, and save it.
    Then check that test_file_node1.txt shows the same content on both node1 and node3.

    Repeat this for the other test files.

  3.  From EVERY node, remove one test file that was created on a different node. For example:

    from node3, delete test_file_node1.txt

    from node1, delete test_file_node2.txt

    from node2, delete test_file_node3.txt
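
The three steps can also be scripted so that they run the same way on every node. The helpers below are a sketch only; their names are hypothetical, DIR stands for the shared backup directory, and NODE is a label such as node1.

```shell
#!/bin/sh
# Hypothetical helpers for the manual test.

# Step 1: create this node's test file.
create_test_file() {   # create_test_file DIR NODE
    touch "$1/test_file_$2.txt"
}

# Step 2a: from another node, append a line to the file created by NODE.
edit_test_file() {     # edit_test_file DIR NODE EDITOR_NODE
    echo "edited from $3" >> "$1/test_file_$2.txt"
}

# Step 2b: on any node, confirm that the edit made from EDITOR_NODE is visible.
edit_is_visible() {    # edit_is_visible DIR NODE EDITOR_NODE
    grep -qxF "edited from $3" "$1/test_file_$2.txt"
}

# Step 3: delete a file that was created on a different node.
remove_test_file() {   # remove_test_file DIR NODE
    rm -- "$1/test_file_$2.txt"
}
```

As with the touch command in step 1, these should be run as the user that runs the Dynatrace processes, for example by saving them in a script and starting it with sudo -u dynatrace -g dynatrace.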

 

Resolution

In most cases, the troubleshooting steps help resolve the misconfiguration or other related issues that prevent the Dynatrace backup from being configured.

If the above scenarios are working fine, other options to try are:

  • Disable NFS cache

  • Ensure that NFS latency is lower than 10 ms

  • Check that the NFS server is correctly configured

    • for example, that the node IPs have permission to read from and write to the directories
  • Remount NFS

 

What's next

If this article did not help, please open a support ticket, mention that this article was used, and provide the following in the ticket:

Version history
Last update: 14 Feb 2025 02:53 PM
Updated by: