Feature request: publish detailed health metrics #5

pavilalopes · 2021-07-20T20:40:32Z

Currently, only the overall zpool health status is published. It would be useful if the individual vdev and disk status were also published.

For example, this status

$ zpool status
...
config:
        NAME                        STATE     READ WRITE CKSUM
        zpool1                      ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            wwn-0x5000c500b3b2f8c0  ONLINE       0     0     0
            wwn-0x5000c500b3b53463  ONLINE       0     0     0
            wwn-0x5000c500b3b33354  ONLINE       0     0     0

could lead to these metrics being published:

zfs_pool_health{pool="zpool1"} 0
zfs_pool_vdev_health{pool="zpool1", vdev="raidz1-0"} 0
zfs_pool_disk_health{pool="zpool1", vdev="raidz1-0", disk="wwn-0x5000c500b3b2f8c0"} 0
zfs_pool_disk_health{pool="zpool1", vdev="raidz1-0", disk="wwn-0x5000c500b3b53463"} 0
zfs_pool_disk_health{pool="zpool1", vdev="raidz1-0", disk="wwn-0x5000c500b3b33354"} 0

This would make it possible to build more informative dashboards. With only the pool health status, the operator still has to log into the server to find out which/how many devices are faulted.


pdf · 2021-07-20T23:11:31Z

Unfortunately, the ZFS library we use does not provide access to this information, most likely because the zpool command does not provide a flag to enable machine-parseable output. I'd consider exporting this data if it were available upstream, but I don't expect to contribute this functionality myself in the foreseeable future.

HubbeKing · 2021-09-20T13:37:24Z

OpenZFS has now added an "influxdb" command to provide machine-parseable output, which may be useful for this.
openzfs/zfs#10786

pdf · 2021-11-16T10:51:08Z

We no longer rely on an upstream ZFS library. However, I'm not certain that we can rely on the influxdb output, as that is ZFS version-dependent and I'd like to maintain good host version compatibility. The alternative is ugly text parsing, which is certainly doable, but I'm not eager to tackle it any time soon.

pdf added the enhancement and wontfix labels Jul 20, 2021

pdf added the help wanted label and removed the wontfix label Oct 29, 2021

pdf mentioned this issue Feb 21, 2022

Export scrub/resilver status #20

Closed

pdf mentioned this issue Sep 19, 2024

[BUG] exporter not noticing write errors #44

Closed
