There is the following reference in the documentation:
You can define up to 20,000 devices for a single monitoring configuration. Configurations are split into buckets, with a default size of 100 devices per bucket. Each bucket of devices is polled independently as a separate process on one of the ActiveGates in a group. This feature is automatically enabled for WMI, Prometheus, SNMP, and SQL extensions, while for other types of extensions, its activation depends on the specific extension.
Now, for WMI, I have a configuration of almost 100 devices. Only one AG is being used, while I have two. I'm losing datapoints, as not all queries complete within a minute.
Is the default size of 100 configurable in any way, preferably at only the WMI extension level? For only WMI, I would like to configure 50 as the size of the bucket.
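To make the question concrete, here is a minimal sketch of the bucketing arithmetic described in the documentation quote above (the `bucket_count` helper is just an illustrative name, not a Dynatrace API): with the default bucket size of 100, a configuration of ~100 devices fits in a single bucket, so it is polled by a single process on a single ActiveGate.

```python
import math

def bucket_count(num_devices: int, bucket_size: int = 100) -> int:
    """Number of independent polling processes (buckets) for one
    monitoring configuration, per the documented default of 100
    devices per bucket."""
    return math.ceil(num_devices / bucket_size)

# ~100 devices with the default bucket size -> 1 bucket,
# i.e. one polling process on one ActiveGate:
print(bucket_count(100))      # 1
# A bucket size of 50 would split the same devices into 2 buckets,
# which could then be spread across two ActiveGates:
print(bucket_count(100, 50))  # 2
```

This is why a configurable bucket size of 50 would matter here: it would double the number of independent polling processes for the same device count.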
Hi @AntonioSousa !
I worked a lot with support on data gaps with remote WMI and large configurations.
We found that the issue appears when we try to poll the WMI API while the WMI module is occupied with error handling.
In theory, Microsoft claims that WMI is multithreaded, but it seems unable to respond to queries while it is handling errors. So when we ask for data, the WMI module does not respond for long periods while it is handling connectivity errors. We suspect the difference in behavior between one large configuration and multiple configurations comes down to how Microsoft handles WMI requests coming from different processes (as in the multiple-configuration scenario) versus a single process (as in the one-large-configuration scenario).
From my experience, my recommendation is to reduce the number of WMI devices per configuration as much as possible, or to use the local WMI extension with OneAgent if possible.
Thanks for your insight. I also have several monitoring configurations, and the ones with fewer devices behave better. I tracked that down with the following Data Explorer query, as it shows an amusing graph of which hosts get more metrics collected:
dsfm:datasource.wmi.query.count:splitBy(host):sort(value(auto,descending))
As I'm seeing measurements being made from only one AG, I was expecting that splitting the load between the two might improve things. But given what you said, that might not be the case...
I have 5 monitoring configurations, with a variable number of devices, and a total of almost 100 devices.
The issue here is that, with 2 AGs in the group, only one of them is being used to do the measurements.