shahinm · ‎11 Dec 2024

ActiveGate (AG) keeps a counter metric cache that converts monotonic counters to delta counters, for cluster scalability reasons. By default, this cache currently holds up to 100,000 entries, and the AG rejects some counter metrics once the limit has been reached.
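To illustrate why a size limit leads to missing data, here is a minimal sketch of a bounded counter-to-delta cache. It is not ActiveGate's actual implementation: the class name `CounterDeltaCache`, the method `to_delta`, and the choice to reject new series when the cache is full are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional


class CounterDeltaCache:
    """Toy bounded cache that turns monotonic counter readings into deltas."""

    def __init__(self, max_entries: int = 100_000) -> None:
        self.max_entries = max_entries
        self._last: Dict[Hashable, float] = {}  # last seen value per series

    def to_delta(self, series: Hashable, value: float) -> Optional[float]:
        """Return the increase since the previous reading, or None if unknown."""
        if series in self._last:
            previous = self._last[series]
            self._last[series] = value
            # A drop means the counter was reset; report the new value as the delta.
            return value - previous if value >= previous else value
        if len(self._last) >= self.max_entries:
            # Assumption: a full cache rejects new series instead of tracking them.
            return None
        self._last[series] = value  # first reading only establishes a baseline
        return None


cache = CounterDeltaCache(max_entries=2)
cache.to_delta("requests_total{pod=a}", 10)            # baseline, no delta yet
delta_a = cache.to_delta("requests_total{pod=a}", 15)  # 15 - 10
cache.to_delta("requests_total{pod=b}", 1)             # second entry, cache is now full
cache.to_delta("requests_total{pod=c}", 1)             # rejected: no free entry
delta_c = cache.to_delta("requests_total{pod=c}", 4)   # still rejected
```

In this sketch, a series that arrives after the cache is full never gets a baseline, so no delta can be computed for it. That matches the symptom of counter metrics going missing while other metric types are unaffected.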

Symptoms

You have enabled monitoring of Prometheus metrics, but some metrics of type counter are missing. In the AG logs, you see the following warning message:

<timestamp> UTC WARNING [<environment-id>] [CounterCacheImpl] Counter metric cache limit exceeded: 100000. Data point evicted. Reason: SIZE [Suppressing further identical messages for 1 hour]
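A quick way to check a log file for this warning is a small script like the one below. It is only a sketch: the function name `find_cache_limits` is hypothetical, and the timestamp and environment ID in the sample line are made up to fill the placeholders shown above.

```python
import re
from typing import Iterable, List

# Matches the warning text quoted above and captures the reported limit.
CACHE_WARNING = re.compile(
    r"\[CounterCacheImpl\] Counter metric cache limit exceeded: (?P<limit>\d+)\."
)


def find_cache_limits(lines: Iterable[str]) -> List[int]:
    """Return the cache limit reported by each matching warning line."""
    limits = []
    for line in lines:
        match = CACHE_WARNING.search(line)
        if match:
            limits.append(int(match.group("limit")))
    return limits


sample_log = [
    # Made-up timestamp and environment ID, for illustration only.
    "2024-12-11 10:00:00 UTC WARNING [abc12345] [CounterCacheImpl] Counter metric "
    "cache limit exceeded: 100000. Data point evicted. Reason: SIZE "
    "[Suppressing further identical messages for 1 hour]",
    "2024-12-11 10:00:01 UTC INFO [abc12345] some unrelated message",
]
limits = find_cache_limits(sample_log)
```

Because identical messages are suppressed for an hour, the number of matching lines says little about how many data points were actually dropped. Treat any match as a sign that the limit was hit.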

Resolution

There are two ways to resolve the problem:

1. Filter the ingested metrics and exclude unnecessary ones.

2. Increase the cache limit parameters in the custom.properties file.

Example:

[kubernetes_monitoring]
mint_metric_counter_metric_cache_size=300000
mint_metric_counter_metric_cache_expiry_duration_minutes=2

For containerized AGs deployed via the Dynatrace Operator, apply the same configuration in the DynaKube.

activeGate:
  ...
  customProperties:
    value: |
     [kubernetes_monitoring]
     mint_metric_counter_metric_cache_size=300000
     mint_metric_counter_metric_cache_expiry_duration_minutes=2
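When picking a value for mint_metric_counter_metric_cache_size, it may help to know roughly how many counter series your endpoints expose. The sketch below counts unique counter series in scrape output in the classic Prometheus text exposition format. It assumes that each unique series (metric name plus label set) takes one cache entry, which this post does not confirm, and the function name `count_counter_series` is hypothetical.

```python
from typing import Iterable, Set


def count_counter_series(exposition_lines: Iterable[str]) -> int:
    """Count unique series whose metric was declared with '# TYPE <name> counter'."""
    counter_names: Set[str] = set()
    series: Set[str] = set()
    for raw in exposition_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # "# TYPE <metric_name> <type>" declares the metric type.
            if len(parts) == 4 and parts[1] == "TYPE" and parts[3] == "counter":
                counter_names.add(parts[2])
            continue
        if "{" in line:
            name = line[: line.index("{")]
            series_id = line[: line.rindex("}") + 1]  # name plus full label set
        else:
            name = line.split()[0]
            series_id = name
        if name in counter_names:
            series.add(series_id)
    return len(series)


sample_scrape = [
    "# HELP http_requests_total Total requests.",
    "# TYPE http_requests_total counter",
    'http_requests_total{method="get",code="200"} 1027',
    'http_requests_total{method="post",code="200"} 3',
    "# TYPE queue_depth gauge",
    "queue_depth 7",
]
counter_series = count_counter_series(sample_scrape)
```

Summing this count over all scraped endpoints gives a rough lower bound for the cache size under the assumption above. The same output also shows which counters contribute the most series, which can guide the filtering option.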

paum · ‎14 Jan 2025

What is frustrating is that I only found this documented in this post. At the time of writing, there doesn't seem to be any official documentation for these custom property fields.

I was also looking to see whether Dynatrace does something similar with the other types of metrics it scrapes, but no luck.

the_real_anil · ‎16 Apr 2025

Agreed. There is no documentation for mint_metric_counter_metric_cache_size. What does this property mean? Is it a cache? If so, how can it be reset, and how is its value calculated? Restarting the ActiveGate does not reset it; the only option is to increase the value. 😞 No luck so far finding additional details.
