There is a global measure called "Page Faults per Second" which, based on tests conducted on Linux hosts, I assume counts every kind of page fault, i.e. both soft and hard page faults, per monitored host. I haven't found any conclusive documentation on the subject: can anybody confirm that this is the case?
Interestingly, the built-in incident "Host Memory Unhealthy" looks at the average rate (1/s) of hard page faults, with a default upper threshold of 20 averaged over the course of 15 minutes.
However, I have been unable to correlate the built-in "Page Faults per Second" measure with the triggering of this incident, which suggests that the incident is based on a different (internal?) measure. Again, any definitive confirmation would be helpful.
The main reason I'm asking is this: soft page faults are usually unproblematic, so I'm much more interested in charting and visualizing the rate of hard page faults and/or swapping in custom dashboards. So far I've been unable to do this using agent-based measures. (I'm aware that a "Unix System Monitor" instance would give me values for Pages I/O, but this is not possible on our hosts due to security restrictions.)
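To make clear which numbers I'm after, here is a minimal sketch of how the split can be computed by hand on a Linux host from the kernel counters in `/proc/vmstat`. It assumes that `pgfault` counts all page faults (soft and hard), that `pgmajfault` counts hard faults only, and that `pswpin`/`pswpout` count pages swapped in and out. The function names are my own, and this is only meant for cross-checking; I don't know whether the agent or the incident uses these counters.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat content ('name value' per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def fault_rates(before, after, seconds):
    """Per-second rates between two counter snapshots taken `seconds` apart."""

    def rate(key):
        # Counters missing on a given kernel are treated as 0 (assumption).
        return (after.get(key, 0) - before.get(key, 0)) / seconds

    total = rate("pgfault")      # assumed: soft + hard faults
    hard = rate("pgmajfault")    # assumed: hard (major) faults only
    return {
        "total_faults_per_s": total,
        "hard_faults_per_s": hard,
        "soft_faults_per_s": total - hard,
        "swap_in_pages_per_s": rate("pswpin"),
        "swap_out_pages_per_s": rate("pswpout"),
    }


def sample_rates(interval=5.0, path="/proc/vmstat"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        first = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_vmstat(f.read())
    return fault_rates(first, second, interval)
```

Calling `sample_rates()` on a host returns the rates over a 5-second window. If "Page Faults per Second" really includes both kinds of faults, I would expect it to track `total_faults_per_s` rather than `hard_faults_per_s`.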