
Improve resilver ETAs #14410
Merged (1 commit) Jan 25, 2023

Conversation

@behlendorf (Contributor) commented Jan 20, 2023

Motivation and Context

When resilvering, the estimated time remaining is calculated using the average issue rate over the current pass, where the current pass begins when a scan is started, or is restarted if the pool is exported/imported.

For dRAID pools in particular this can result in wildly optimistic estimates, since the issue rate will be very high while scanning non-degraded regions of the pool. Once repair I/O starts being issued, performance drops to a realistic number, but the estimate remains significantly skewed.

Description

To address this, we redefine a pass such that it starts after a scanning phase completes, so the issue rate is more reflective of recent performance. This has the advantage of being backwards compatible with previous versions of the zpool binary. Additionally, the zfs_scan_report_txgs module option can be set to reset the pass statistics more often, as shown in the example below.
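
For illustration, ZFS module options on Linux can generally be adjusted at runtime through sysfs; a minimal sketch, where the value of 100 TXGs is only an example and not a recommendation from this change:

  # Example only: reset the resilver pass statistics every 100 TXGs
  # using the zfs_scan_report_txgs option introduced by this change.
  echo 100 > /sys/module/zfs/parameters/zfs_scan_report_txgs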

How Has This Been Tested?

Locally rebuilding a draid2:11d:94c:2s pool with approximately 33 TB of data. In this configuration, when a single drive fails, roughly 86% of the pool is still fully intact. Furthermore, the node has sufficient memory to fully scan the pool before starting to issue I/O. This means that with the unpatched code the zpool status percent complete quickly jumps to 86% during the scan phase, and an optimistic estimated resilver time of less than a single minute is then reported. In reality, we know the failed disk contains about 350 GB of data to rebuild, which at 200 MB/s will take at best about 30 minutes. With this change, no estimate is reported while the first scan phase is in progress. After transitioning to the issue phase, the estimated resilver time is roughly 31 minutes, which is in line with the expected hardware performance.
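
As a rough sanity check of that expectation (example arithmetic only, treating the figures as decimal GB and MB):

  # ~350 GB to rebuild at ~200 MB/s
  $ echo "350000 / 200" | bc    # ≈ 1750 seconds, i.e. roughly 30 minutes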

Before:

  scan: resilver in progress since Fri Jan 20 15:09:03 2023
        33.1T scanned at 112G/s, 28.5T issued at 96.4G/s, 33.1T total
        1.22G resilvered, 86.18% done, 00:00:48 to go

After:

  scan: resilver in progress since Fri Jan 20 15:09:03 2023
        33.1T scanned at 0B/s, 29.0T issued at 2.26G/s, 33.1T total
        37.8G resilvered, 87.57% done, 00:31:01 to go
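
For reference, both reported ETAs are consistent with the remaining bytes to issue divided by the pass issue rate; a rough back-of-the-envelope check, treating the reported T and G figures as binary units:

  # Before: (33.1T - 28.5T) remaining at the inflated 96.4G/s pass rate
  $ echo "(33.1 - 28.5) * 1024 / 96.4" | bc -l        # ≈ 49 seconds
  # After: (33.1T - 29.0T) remaining at the realistic 2.26G/s pass rate
  $ echo "(33.1 - 29.0) * 1024 / 2.26 / 60" | bc -l   # ≈ 31 minutes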

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jan 20, 2023
@behlendorf behlendorf requested a review from tonyhutter January 23, 2023 21:23
@behlendorf (Contributor, Author) commented:

@akashb-22 you may be interested in reviewing this.

Review comments on man/man4/zfs.4 and cmd/zpool/zpool_main.c (now outdated) were marked resolved.
When resilvering the estimated time remaining is calculated using
the average issue rate over the current pass, where the current
pass starts when a scan is started, or restarted if the pool
was exported/imported.

For dRAID pools in particular this can result in wildly optimistic
estimates since the issue rate will be very high while
non-degraded regions of the pool are scanned.  Once repair
I/O starts being issued performance drops to a realistic number
but the estimate is still significantly skewed.

To address this we redefine a pass such that it starts after a
scanning phase completes so the issue rate is more reflective of
recent performance.  Additionally, the zfs_scan_report_txgs
module option can be set to reset the pass statistics more often.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jan 25, 2023
@behlendorf behlendorf merged commit c85ac73 into openzfs:master Jan 25, 2023
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
behlendorf added a commit to behlendorf/zfs that referenced this pull request Apr 21, 2023
behlendorf added a commit that referenced this pull request Apr 24, 2023
ofaaland pushed a commit to LLNL/zfs that referenced this pull request Jun 16, 2023