
Node-name based Kubernetes node ID calculation

Dynatrace Participant

With release 1.286 we introduced a new way to calculate the ID for Kubernetes node monitored entities. This article explains the reasons behind the new calculation, minimum version requirements, visible effects, as well as some background information.
Update: The release had to be postponed to 1.286 (instead of 1.284), as there were still a few minor issues in the migration process that we wanted to clarify first. We apologize for the inconvenience.

Why is there the new node-name based ID calculation?

Up to release 1.285, we used the nodes' systemUUID to generate the ID for Kubernetes node entities in Dynatrace. We discovered that there are situations (e.g. OpenShift clusters running on the PPC64LE architecture) where the systemUUID is not unique. As a result, not all nodes were visible in the user interface. Moreover, if the same systemUUID existed on different clusters, it was not predictable to which cluster such a node would be assigned. Changing the calculation also makes us future-proof when it comes to integrating other data sources such as Open Ingest.

Therefore we decided to implement an ID calculation based on the Kubernetes node name and the cluster ID. As the name identifies a node within one Kubernetes cluster, two nodes cannot have the same name at the same time [see Nodes | Kubernetes]. Thus the combination of node name and cluster ID guarantees the uniqueness of the calculated IDs.
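To illustrate the idea (not Dynatrace's actual internal hashing scheme), a deterministic ID can be derived by hashing the cluster ID together with the node name. The function name, separator, and sample values below are made up for illustration only:

```shell
# Hypothetical sketch: derive a stable ID from (cluster ID, node name).
# Same inputs always produce the same ID; changing either input changes it.
node_id() {
  printf '%s|%s' "$1" "$2" | sha256sum | awk '{ print substr($1, 1, 16) }'
}

node_id "my-cluster-id" "worker-node-1"   # placeholder values
```

Because the node name is unique within a cluster and the cluster ID is unique across clusters, the pair (and hence the derived ID) cannot collide.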

Was a Kubernetes cluster affected?

If fewer Kubernetes node entities are visible in Dynatrace than actually exist in the Kubernetes cluster, non-unique systemUUIDs are likely the problem.

List the systemUUIDs of the Kubernetes cluster:

kubectl get nodes -o json | jq -r '.items[].status.nodeInfo.systemUUID' | sort

Get the number of nodes:

kubectl get nodes -o json | jq -r '.items[].status.nodeInfo.systemUUID' | sort | wc -l

Get the number of unique systemUUIDs:

kubectl get nodes -o json | jq -r '.items[].status.nodeInfo.systemUUID' | sort | uniq | wc -l

If the number of nodes and the number of unique systemUUIDs do not match, the cluster was affected by the incorrect node ID calculation.
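The comparison above can be condensed into a single check. The UUID list below is a made-up sample standing in for the output of the kubectl commands; the duplicate value simulates the PPC64LE issue described earlier:

```shell
# Hypothetical sample of systemUUIDs, as the kubectl/jq pipeline would print them.
# The first two entries are deliberately identical to simulate the duplicate case.
uuids='11111111-2222-3333-4444-555555555555
11111111-2222-3333-4444-555555555555
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'

total=$(printf '%s\n' "$uuids" | wc -l)
unique=$(printf '%s\n' "$uuids" | sort | uniq | wc -l)

if [ "$total" -ne "$unique" ]; then
  echo "affected: $((total - unique)) duplicate systemUUID(s)"
else
  echo "not affected"
fi
```

With a real cluster, replace the sample list with the output of the kubectl commands above.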

What are the impacts of the new ID calculation?

1. New KUBERNETES_NODE entities

During the transition of the calculation method, you will see twice as many Kubernetes nodes. This is expected behavior. As the names remain the same, you will see a pair of Kubernetes node entities for each node name: one ceases to be monitored when the transition takes place, while the second becomes visible and serves as the reference for metrics, events, log lines, etc.


With the following DQL query you can see those duplicate nodes in Grail™:

fetch dt.entity.kubernetes_node, from: now() - 24h
| fieldsAdd clustered_by[dt.entity.kubernetes_cluster], lifetime
| filter clustered_by[dt.entity.kubernetes_cluster] == "<a KUBERNETES_CLUSTER id>"
| sort lifetime asc


2. Monitoring gap

Changing the node ID calculation method requires renegotiating which ActiveGate is responsible for monitoring the Kubernetes cluster. This gap can last from 1 to 3 minutes and is expected behavior.

In the following example you can see the gap, and you will notice that there are two entities with the same node name. The slightly different colors in the chart indicate that two different Kubernetes node entities (with the same name) are stored for the metric.


3. Metric events etc. based on the Kubernetes node entity ID

If you have any metric events based on the entity ID, those events will no longer work. Instead of using the entity ID, we recommend using the node name in combination with the cluster ID. To avoid gaps in those metric events, change them as soon as possible. You can do that in advance and don't have to wait for release 1.286. (Learn more about metric events here.)
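One way to find metric events that still reference a hard-coded node entity ID is to export their configurations (for example via the Dynatrace Settings 2.0 API) and scan the export for KUBERNETES_NODE IDs. The JSON snippet and entity ID below are made up for illustration:

```shell
# Hypothetical abbreviated export of a metric-event configuration (made-up ID).
config='{"value":{"queryDefinition":{"entityFilter":{"conditions":[
  {"type":"ENTITY_ID","operator":"EQUALS","value":"KUBERNETES_NODE-ABCDEF0123456789"}
]}}}}'

# List any hard-coded KUBERNETES_NODE entity IDs; configurations containing
# such IDs should be migrated to node-name + cluster-ID filters.
printf '%s' "$config" | grep -o 'KUBERNETES_NODE-[A-F0-9]\{16\}' | sort -u
```

Any IDs this prints belong to events that will stop matching once the new calculation is active.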

Minimum version requirements

  • ActiveGate >= 1.273
  • OneAgent >= 1.272

What happens if the requirements are not met?

  • Some problems might point to a non-existent KUBERNETES_NODE entity if the ActiveGate is too old.
  • No relationship between HOST entities and KUBERNETES_NODE entities is stored if the OneAgent is too old or a no longer supported Dynatrace Operator version is used.


Problems related to the node-name based ID calculation include, for example:

  • broken links to node entities
  • no node information available where the information is expected

Before reaching out to support, you might want to verify whether the minimum version requirements are met. Helpful information for the Support team would then be:

  • which cluster(s) were affected?
  • when did the transition happen (see the "inactive for ..." text)?
  • if available, some before/after screenshots
  • a link to this article
