Solved: A cluster node can't receive OneAgent traffic

Malaik · 16 May 2023

Dear All,

Since yesterday, a new message has been appearing in the CMC events.

After that, the platform was down for more than 20 hours.

Has anyone faced the same issue?

The support team is not helping on this.

BRs,

Sharing Knowledge

Radoslaw_Szulgo · 16 May 2023

Hi @Malaik, this is a fairly new detection that we introduced in version 1.258. See here:

Dynatrace Managed release notes version 1.258

New cluster event in case of a cluster node's inability to receive OneAgent traffic

To better help you in the event of an unsuccessful and incomplete start of all Dynatrace Managed services on a cluster node, we've added additional alerting mechanisms. If you're alerted, please try to carry out the suggested action before reaching out to support. This should generally reduce problem resolution time. At first, when a cluster node can't receive OneAgent traffic, the affected node is highlighted with a red tile on the cluster deployment overview page and in the corresponding row in the cluster node deployment page. Additionally, a cluster event message is generated with the following content:

Summary: "A cluster node can't receive OneAgent traffic"

Description: "The cluster node id can’t receive OneAgent traffic. Try to restart the cluster node. If this doesn’t fix the problem, generate the support archive and provide it to the Support team."

----

Have you tried restarting the node? Did all services come up, especially the ActiveGate?

Senior Product Manager,
Dynatrace Managed expert

Malaik · 16 May 2023

Thanks a lot, @Radoslaw_Szulgo.

After restarting the nodes, the whole platform was down for more than 20 hours, and the services (Cassandra and Elastic) had great difficulty coming up.

Radoslaw_Szulgo · 16 May 2023

The node was probably already malfunctioning and not writing data anyway. The event made you aware that something was wrong. Have you checked whether you have enough disk space and RAM for the services to start?
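For anyone who wants to script the disk space and RAM check, here is a minimal Python sketch, assuming a Linux host. The helper names `free_disk_gib` and `parse_mem_available_kib` are made up for illustration and are not part of Dynatrace Managed; pass the directory your installation actually uses for its data instead of `/`.

```python
import shutil
from typing import Optional


def free_disk_gib(path: str) -> float:
    """Return the free space, in GiB, of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def parse_mem_available_kib(meminfo_text: str) -> Optional[int]:
    """Extract the MemAvailable value (in KiB) from /proc/meminfo-style text.

    Returns None if the field is missing (for example on very old kernels).
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # "/" is only a placeholder; use your own data directory here.
    print(f"Free disk on /: {free_disk_gib('/'):.1f} GiB")
    try:
        with open("/proc/meminfo", encoding="ascii") as fh:
            available = parse_mem_available_kib(fh.read())
    except OSError:
        available = None  # not a Linux host, or /proc is not mounted
    if available is None:
        print("MemAvailable could not be determined")
    else:
        print(f"Available memory: {available / 2**20:.1f} GiB")
```

The numbers only tell you what is free right now; compare them against the sizing requirements documented for your deployment.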

Is there already a ticket?

Malaik · 16 May 2023

Thanks again,

Everything was checked and now all nodes are working.

Yes, we have an open ticket.

Radoslaw_Szulgo · 17 May 2023

Can you share the root cause with the community?

Malaik · 17 May 2023

We are not sure yet, but we suspect the storage (NFS).

The storage was unavailable at that time, so Elastic and Cassandra were unavailable as well.
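A storage path that is missing or hanging can be probed with a timeout, so that the check itself does not block on a stalled NFS mount. The following is a rough Python sketch only: `path_responds` and the 5-second default are assumptions for illustration, and this is not how Dynatrace itself detects the condition.

```python
import os
import threading


def path_responds(path: str, timeout_s: float = 5.0) -> bool:
    """Return True if listing `path` succeeds within `timeout_s` seconds.

    The probe runs in a daemon thread because a call against a stalled
    NFS mount can block indefinitely; a daemon thread does not keep the
    interpreter alive if it never returns.
    """
    result = {}

    def probe() -> None:
        try:
            os.listdir(path)
            result["ok"] = True
        except OSError:
            # Missing path, permission problem, stale file handle, etc.
            result["ok"] = False

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    # No entry in `result` means the probe is still blocked: treat as failure.
    return result.get("ok", False)


if __name__ == "__main__":
    # "." is only a placeholder; use the mount point of your storage here.
    print("storage reachable:", path_responds("."))
```

Running something like this periodically against the mount point could give an earlier warning than waiting for Cassandra or Elastic to fail.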

BRs,