08 May 2024 04:25 PM - edited 08 May 2024 04:26 PM
09 May 2024 03:11 AM
Disk corrupt: Several symptoms can point to a corrupted disk, for example read errors, write errors, or slow operation.
For this you have useful metrics such as:
- Disk read time (builtin:host.disk.readTime)
- Disk read operations per second (builtin:host.disk.readOps)
- Disk read bytes per second (builtin:host.disk.bytesRead)
- Disk throughput read (builtin:host.disk.throughput.read)
- Disk write time (builtin:host.disk.writeTime)
- Disk write operations per second (builtin:host.disk.writeOps)
- Disk write bytes per second (builtin:host.disk.bytesWritten)
- Disk throughput write (builtin:host.disk.throughput.write)
Disk available: I suggest measuring this with percentage-based metrics.
For this you have useful metrics such as:
- Disk available % (builtin:host.disk.free)
- Inodes available % (builtin:host.disk.inodesAvail)
To apply these rules at the host group level:
1.- I think you can first build the query for the rule in Data Explorer, filtered to the host group.
2.- Then create a Metric Event from it, using the metric selector code shown in Advanced mode.
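As a rough idea of what the Advanced mode code might look like for one host group, here is a small Python sketch that only builds the metric selector string. The helper name `host_group_selector` and the group name `prod-db` are made up for illustration, and the host group filter (`fromRelationships.isInstanceOf(type(host_group), ...)`) is my assumption about how hosts relate to their group, so please compare it with what Data Explorer actually generates in your environment before using it in a Metric Event.

```python
def host_group_selector(metric_key: str, host_group: str) -> str:
    """Build a metric selector limited to the hosts of one host group.

    Hypothetical helper: the entity selector below is an assumption and
    should be checked against the code Data Explorer shows in Advanced mode.
    """
    # Simplification: refuse names that would need escaping inside the
    # nested quoted strings instead of trying to escape them here.
    if '"' in host_group or "~" in host_group:
        raise ValueError("host group names with quotes or '~' are not handled")

    # Inside the quoted entitySelector("...") argument, inner quotes are
    # written as ~" (tilde is the escape character in metric selectors).
    entity_selector = (
        "type(host),fromRelationships.isInstanceOf("
        f'type(host_group),entityName.equals(~"{host_group}~"))'
    )
    return (
        f"{metric_key}"
        f':filter(in("dt.entity.host",entitySelector("{entity_selector}")))'
        ':splitBy("dt.entity.host","dt.entity.disk")'
        ":avg"
    )


if __name__ == "__main__":
    # Example: disk available % for every disk on hosts in "prod-db".
    print(host_group_selector("builtin:host.disk.free", "prod-db"))
```

The printed string can be pasted into Advanced mode to check that it returns the expected hosts and disks, and the same selector can then be reused for the other metric keys listed above.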
I hope it's helpful 💪