11 Dec 2024 11:37 AM - edited 11 Dec 2024 11:38 AM
ActiveGate (AG) has a counter metric cache that converts monotonic counters to delta counters, for cluster scalability reasons. By default, this cache currently allows 100,000 entries, and AG rejects some counter metrics once the limit has been reached.
You have enabled monitoring of Prometheus metrics and some metrics of type counter are missing. In the AG logs, you see the following warning message:
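To make the mechanism concrete, here is a minimal conceptual sketch of a monotonic-to-delta cache with an entry limit. It is not ActiveGate's actual implementation: the class name `CounterDeltaCache`, the reset handling, and the choice to reject new series when the cache is full are all assumptions made for illustration.

```python
from typing import Dict, Optional


class CounterDeltaCache:
    """Toy model (hypothetical): keeps the last value per series and emits deltas."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._last: Dict[str, float] = {}
        self.rejected = 0  # data points dropped because the cache was full

    def to_delta(self, series_key: str, value: float) -> Optional[float]:
        if series_key in self._last:
            previous = self._last[series_key]
            self._last[series_key] = value
            # Assumption: a lower value means the counter was reset,
            # so the new value itself is reported as the delta.
            return value - previous if value >= previous else value
        if len(self._last) >= self.max_entries:
            # Assumption: a series that does not fit in the cache is rejected.
            self.rejected += 1
            return None
        # First sample of a series: remember it, no delta available yet.
        self._last[series_key] = value
        return None


cache = CounterDeltaCache(max_entries=2)
cache.to_delta("requests{pod=a}", 10)         # first sample, no delta
print(cache.to_delta("requests{pod=a}", 25))  # 15
```

In this model every distinct series (metric name plus label set) occupies one entry, which is why a large number of high-cardinality counters can exhaust the limit.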
<timestamp> UTC WARNING [<environment-id>] [CounterCacheImpl] Counter metric cache limit exceeded: 100000. Data point evicted. Reason: SIZE [Suppressing further identical messages for 1 hour]
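If you want to check logs for this warning programmatically, a small sketch like the following could help. The regular expression is derived only from the message shown above. The function name `parse_cache_warning` is hypothetical, and where your AG log files live is not covered here.

```python
import re
from typing import Dict, Optional, Union

# Pattern built from the warning text quoted above; adjust it if your
# ActiveGate version words the message differently.
CACHE_WARNING = re.compile(
    r"\[CounterCacheImpl\] Counter metric cache limit exceeded: (\d+)\. "
    r"Data point evicted\. Reason: (\w+)"
)


def parse_cache_warning(line: str) -> Optional[Dict[str, Union[int, str]]]:
    """Return the reported limit and reason, or None if the line does not match."""
    match = CACHE_WARNING.search(line)
    if match is None:
        return None
    return {"limit": int(match.group(1)), "reason": match.group(2)}
```

Because the message is suppressed for an hour after it is first logged, a single match may stand for many rejected data points.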
There are two possible options to overcome the problem.
1. Filter ingested metrics and exclude unnecessary ones
2. Extend the cache limit parameters in the custom.properties file
Example:
[kubernetes_monitoring]
mint_metric_counter_metric_cache_size=300000
mint_metric_counter_metric_cache_expiry_duration_minutes=2
For containerized AGs deployed via the Dynatrace Operator, apply the same configuration in the DynaKube.
activeGate:
  ...
  customProperties:
    value: |
      [kubernetes_monitoring]
      mint_metric_counter_metric_cache_size=300000
      mint_metric_counter_metric_cache_expiry_duration_minutes=2
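To decide between filtering (option 1) and a larger cache (option 2), it can help to estimate how many counter series an exporter actually exposes. The sketch below counts them in a scrape saved as text. It assumes the classic Prometheus text exposition format, where the name on the `# TYPE` line matches the sample name, and it assumes that one series roughly corresponds to one cache entry, which the source does not confirm. The function name `count_counter_series` is hypothetical.

```python
def count_counter_series(exposition_text: str) -> int:
    """Count distinct series (name + labels) whose declared type is counter."""
    counter_names = set()
    series = set()
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            parts = line.split()
            # Expected shape: "# TYPE <metric_name> <type>"
            if len(parts) >= 4 and parts[3] == "counter":
                counter_names.add(parts[2])
            continue
        if line.startswith("#"):
            continue  # HELP lines and other comments
        # Series identity is everything before the sample value.
        if "}" in line:
            series_id = line[: line.rindex("}") + 1]
        else:
            series_id = line.split()[0]
        name = series_id.split("{", 1)[0]
        if name in counter_names:
            series.add(series_id)
    return len(series)
```

Summing this number across all scraped endpoints gives a rough lower bound to compare against the cache size you configure.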
Something frustrating about this is that I only find it documented in this post; at the time of writing, there doesn't seem to be any kind of documentation around these custom property fields.
I was also looking to see whether Dynatrace does something similar with the other types of metrics it scrapes, but no luck.