
Improve resilver ETAs #14410
Merged (1 commit) Jan 25, 2023

Conversation

@behlendorf (Contributor) commented Jan 20, 2023

Motivation and Context

When resilvering, the estimated time remaining is calculated using the average issue rate over the current pass, where the current pass begins when a scan is started, or is restarted if the pool is exported/imported.

For dRAID pools in particular this can result in wildly optimistic estimates, since the issue rate will be very high while scanning non-degraded regions of the pool. Once repair I/O starts being issued, performance drops to a realistic number, but the estimate remains significantly skewed.

Description

To address this, we redefine a pass such that it starts after a scanning phase completes, so the issue rate is more reflective of recent performance. This has the advantage of being backwards compatible with previous versions of the zpool binary. Additionally, the zfs_scan_report_txgs module option can be set to reset the pass statistics more often, as shown in the example below.
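
For illustration, ZFS module options on Linux can generally be adjusted at runtime through sysfs; a minimal sketch, where the value of 100 TXGs is only an example and not a recommendation from this change:

  # Example only: reset the resilver pass statistics every 100 TXGs
  # using the zfs_scan_report_txgs option introduced by this change.
  echo 100 > /sys/module/zfs/parameters/zfs_scan_report_txgs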

How Has This Been Tested?

Locally rebuilding a draid2:11d:94c:2s pool with approximately 33 TB of data. In this configuration, when a single drive fails, roughly 86% of the pool is still fully intact. Furthermore, the node has sufficient memory to fully scan the pool before starting to issue I/O. This means that with the unpatched code the zpool status percent complete quickly jumps to 86% during the scan phase, and an optimistic estimated resilver time of less than a single minute is then reported. In reality, we know the failed disk contains about 350 GB of data to rebuild, which at 200 MB/s will take at best about 30 minutes. With this change, no estimate is reported while the first scan phase is in progress. After transitioning to the issue phase, the estimated resilver time is roughly 31 minutes, which is in line with the expected hardware performance.
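
As a rough sanity check of that expectation (example arithmetic only, treating the figures as decimal GB and MB):

  # ~350 GB to rebuild at ~200 MB/s
  $ echo "350000 / 200" | bc    # ≈ 1750 seconds, i.e. roughly 30 minutes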

Before:

  scan: resilver in progress since Fri Jan 20 15:09:03 2023
        33.1T scanned at 112G/s, 28.5T issued at 96.4G/s, 33.1T total
        1.22G resilvered, 86.18% done, 00:00:48 to go

After:

  scan: resilver in progress since Fri Jan 20 15:09:03 2023
        33.1T scanned at 0B/s, 29.0T issued at 2.26G/s, 33.1T total
        37.8G resilvered, 87.57% done, 00:31:01 to go
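
For reference, both reported ETAs are consistent with the remaining bytes to issue divided by the pass issue rate; a rough back-of-the-envelope check, treating the reported T and G figures as binary units:

  # Before: (33.1T - 28.5T) remaining at the inflated 96.4G/s pass rate
  $ echo "(33.1 - 28.5) * 1024 / 96.4" | bc -l        # ≈ 49 seconds
  # After: (33.1T - 29.0T) remaining at the realistic 2.26G/s pass rate
  $ echo "(33.1 - 29.0) * 1024 / 2.26 / 60" | bc -l   # ≈ 31 minutes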

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jan 20, 2023
@behlendorf behlendorf requested a review from tonyhutter January 23, 2023 21:23
@behlendorf (Contributor, Author) commented:

@akashb-22 you may be interested in reviewing this.

Review comments on man/man4/zfs.4 and cmd/zpool/zpool_main.c (now outdated) were marked resolved.
When resilvering the estimated time remaining is calculated using
the average issue rate over the current pass, where the current
pass starts when a scan is started, or restarted if the pool
was exported/imported.

For dRAID pools in particular this can result in wildly optimistic
estimates since the issue rate will be very high while
non-degraded regions of the pool are scanned.  Once repair
I/O starts being issued performance drops to a realistic number
but the estimate is still significantly skewed.

To address this we redefine a pass such that it starts after a
scanning phase completes so the issue rate is more reflective of
recent performance.  Additionally, the zfs_scan_report_txgs
module option can be set to reset the pass statistics more often.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jan 25, 2023
@behlendorf behlendorf merged commit c85ac73 into openzfs:master Jan 25, 2023
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
behlendorf added a commit to behlendorf/zfs that referenced this pull request Apr 21, 2023
behlendorf added a commit that referenced this pull request Apr 24, 2023
ofaaland pushed a commit to LLNL/zfs that referenced this pull request Jun 16, 2023