
Dynatrace Nodes going offline

Hello,

We are seeing Dynatrace Managed nodes going offline. We have a 4-node cluster, and all of the nodes are going offline. Can someone tell us how to check whether the number of PurePaths or user sessions is spiking?

Is there any likely reason these warning and severe messages are being thrown?

After going through the logs, we found the following warning and severe messages.

Warning:

Line 1462: 2020-01-23 07:04:17 UTC {"eventType":"CREATE","tenantId":"unknown","userId":"Cluster event service","userIdType":"SERVICE_NAME","userOrigination":"Cluster event (Internal)","sessionId":null,"identity":null,"identityCategory":"CLUSTER_EVENT","success":true,"timestamp":1579763057428,"message":"{\"clusterEvent\":{\"clusterEventType\":\"SERVER_LIFECYCLE\",\"timestamp\":1579763057428,\"summary\":\"Server 17 activated Adaptive Load Reduction.\",\"description\":\"\",\"userId\":\"\",\"notification\":\"NONE\",\"clusterEventSeverity\":\"WARNING\"}}"}

Severe:

Line 5706: 2020-01-24 13:39:51 UTC {"eventType":"CREATE","tenantId":"unknown","userId":"Cluster event service","userIdType":"SERVICE_NAME","userOrigination":"Cluster event (Internal)","sessionId":null,"identity":null,"identityCategory":"CLUSTER_EVENT","success":true,"timestamp":1579873191990,"message":"{\"clusterEvent\":{\"clusterEventType\":\"SERVER_LIFECYCLE\",\"timestamp\":1579873191990,\"summary\":\"Heap memory: Server 17 started memory emergency mode.\",\"description\":\"\",\"userId\":\"\",\"notification\":\"NONE\",\"clusterEventSeverity\":\"SEVERE\"}}"}
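To scan a larger log for events like these, a small script can help. The following is only a minimal sketch, assuming each line has the same shape as the two entries above: a UTC time, then a JSON object whose "message" field is itself a JSON string containing "clusterEvent". The function name `parse_cluster_event` and the `SAMPLE` constant are made up for illustration, and `SAMPLE` is a trimmed copy of the severe entry above.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the SEVERE entry above (some outer fields omitted for brevity).
SAMPLE = r'''2020-01-24 13:39:51 UTC {"eventType":"CREATE","success":true,"timestamp":1579873191990,"message":"{\"clusterEvent\":{\"clusterEventType\":\"SERVER_LIFECYCLE\",\"timestamp\":1579873191990,\"summary\":\"Heap memory: Server 17 started memory emergency mode.\",\"clusterEventSeverity\":\"SEVERE\"}}"}'''


def parse_cluster_event(line):
    """Return (utc_time, severity, summary) for a cluster-event line, or None.

    Assumes the JSON payload starts at the first '{' on the line and that
    the nested event lives under message -> clusterEvent, as in the entries above.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        outer = json.loads(line[start:])
        # The "message" field is a JSON document encoded as a string.
        event = json.loads(outer["message"])["clusterEvent"]
        # Timestamps in these entries are milliseconds since the epoch.
        when = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
        return when, event["clusterEventSeverity"], event["summary"]
    except (ValueError, KeyError, TypeError):
        # Not a cluster-event line in the assumed format.
        return None


if __name__ == "__main__":
    parsed = parse_cluster_event(SAMPLE)
    if parsed is not None:
        when, severity, summary = parsed
        print(when.isoformat(), severity, summary)
```

Running this over every line of the log and keeping only the WARNING and SEVERE results gives a timeline of when each server entered Adaptive Load Reduction or memory emergency mode.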


1 REPLY

ADPaoloni
Mentor

Hey there. First, I would advise getting in touch with support or opening a DT chat to have this checked.

Second, "Adaptive Load Reduction" and "started memory emergency mode" are likely symptoms of higher-than-normal traffic or a spike in transaction load, as you suspected.

To check that, create a custom chart for Services -> Request count over the last few hours or days and see whether the load is higher than normal. If it is, the chart should let you pinpoint the service or database responsible.
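If you would rather check the numbers than eyeball the chart, the same comparison can be sketched in a few lines of Python. This is only an illustration, assuming you have copied the request counts per interval out of the chart by hand. The function name `find_load_spikes`, the threshold factor of 2, and the sample numbers are all made up and are not taken from Dynatrace or from this cluster.

```python
from statistics import median


def find_load_spikes(counts, factor=2.0):
    """Return the indices of intervals whose request count exceeds
    `factor` times the median count across all intervals.

    The median is used as the "normal" baseline because a few large
    spikes barely move it, unlike the mean.
    """
    if not counts:
        return []
    baseline = median(counts)
    return [i for i, count in enumerate(counts) if count > factor * baseline]


if __name__ == "__main__":
    # Made-up request counts per interval, for illustration only.
    request_counts = [100, 110, 95, 105, 400, 102]
    print(find_load_spikes(request_counts))
```

Any interval it flags is a candidate to line up against the times of the Adaptive Load Reduction and memory emergency mode events in the log.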


Good luck.