
ZFS resilver can be very slow if there are other heavy disk IO requests, can the resilver priority be adjusted? #11777

Closed
wxiaoguang opened this issue Mar 21, 2021 · 6 comments
Labels
Type: Feature Feature request or new feature

Comments

@wxiaoguang

wxiaoguang commented Mar 21, 2021

Describe the feature you would like to see added to OpenZFS

Can the resilver IO priority be adjusted? It would give users a chance to decide how to allocate IO resources.

Old ZFS versions had module parameters like zfs_resilver_delay, but these parameters have been removed in the latest versions.

How will this feature improve OpenZFS?

ZFS resilver can be very slow if there are other heavy disk IO requests; it may starve and nearly stop working, so the resilver does not complete even after several days.

@wxiaoguang wxiaoguang added the Type: Feature Feature request or new feature label Mar 21, 2021
@justinianpopa

justinianpopa commented Mar 21, 2021

Have a look at zfs_resilver_min_time_ms and possibly zfs_scan_min_time_ms via https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html?highlight=resilver#zfs-resilver-min-time-ms

EDIT: urlfix
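
On Linux both parameters can be read and changed at runtime through sysfs. A minimal sketch, assuming the zfs kernel module is loaded (the values below are only placeholders, not recommendations):

```sh
# Current values, in milliseconds (defaults are typically 3000 for resilver, 1000 for scrub)
cat /sys/module/zfs/parameters/zfs_resilver_min_time_ms
cat /sys/module/zfs/parameters/zfs_scan_min_time_ms

# Give the resilver more time per txg; takes effect immediately
echo 5000 > /sys/module/zfs/parameters/zfs_resilver_min_time_ms

# Persist across reboots (file name/path may differ per distribution)
echo "options zfs zfs_resilver_min_time_ms=5000" >> /etc/modprobe.d/zfs.conf
```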

@amotin
Member

amotin commented Mar 21, 2021

The latest change in this area was mine: #11166 . While the idea there was actually the opposite -- to throttle the resilver better so it does not affect payload latency -- there are a number of tunables to allow adjustment if needed.

@wxiaoguang
Author

wxiaoguang commented Mar 22, 2021

> Have a look at zfs_resilver_min_time_ms and possibly zfs_scan_min_time_ms via https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html?highlight=resilver#zfs-resilver-min-time-ms

Thank you very much, justinianpopa ~

The document says "ZFS spends at least zfs_resilver_min_time_ms time working on a resilver between txg commits" and "ZFS spends at least zfs_scan_min_time_ms time working on a scrub between txg commits".

To my understanding, txg commits only affect "writes".

In my case, heavy read IO requests still slow down the ZFS resilver.

e.g.: while a disk is being resilvered, if an rsync is copying files out of the raidz, the resilver becomes very slow. If the rsync is killed, the resilver speeds up again.

I am not sure if my understanding is correct.

And I have tested the zfs_resilver_min_time_ms parameter (the commands used to observe this are sketched after the list):

  1. disk sdb in a raidz3 (data1) is being resilvered, while some (light) read/write IO requests are running.
  2. iostat shows that %util of sdb is about 100% (the disk is being written by the resilver)
  3. the otime of data1/txg is around 6s
  4. run rsync to copy large files from data1 to another raidz3 (data2)
  5. the otime of data1/txg grows to around 10-20s
  6. iostat shows that %util of sdb frequently drops to 0-10%, even though I set zfs_resilver_min_time_ms to 10s
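
For reference, a sketch of how the numbers above were observed (assuming a Linux host, the pool name data1 and the disk sdb from the steps):

```sh
# Per-disk utilization every second; watch the %util column for sdb
iostat -x 1 sdb

# Per-txg statistics for the pool; otime is how long each txg stayed open
tail -n 5 /proc/spl/kstat/zfs/data1/txgs

# Resilver progress and estimated completion time
zpool status data1
```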

About #11166: thank you very much, amotin. I am not sure whether the problem I am seeing is caused by 4K random reads (somehow related?). I am glad to continue investigating and to help.

@justinianpopa

justinianpopa commented Mar 22, 2021

While I'm unsure whether scrub/resilver IO is considered async or sync IO by the ZFS scheduler, you may also try (as mentioned in the issue above) tuning zfs_vdev_async_read_[min,max]_active / zfs_vdev_sync_read_[min,max]_active to lower values to reduce queue depth.

There may not be a tunable combination that exactly prioritises resilver reads, but you might be able to tune the scheduling of IO requests so that normal pool activity and resilvers are at least treated in a balanced manner for your specific drives. There is also zfs_vdev_scrub_[min,max]_active, which refers to "reads and scan IOs", but I think that only applies to scrubs and not resilvers. zfs_vdev_aggregation_limit might also help you gain a few more IOPS, assuming scrub/resilver requests get treated and scheduled alongside normal ones within the aggregation limit.

In my experience with ZFS on Linux performance tuning, you could also try setting the read_ahead_kb kernel tunable to 0 (for each block device in the pool, in /sys/block/*/queue/read_ahead_kb) to gain a few more useful IOPS, as data from readahead is rarely useful.

It's worth trying a few combinations.
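
A sketch of the knobs mentioned above (the device names and values are placeholders, not recommendations; defaults differ between releases):

```sh
# Lower the async read queue depth so payload reads crowd out the resilver less
echo 1 > /sys/module/zfs/parameters/zfs_vdev_async_read_min_active
echo 2 > /sys/module/zfs/parameters/zfs_vdev_async_read_max_active

# Allow more concurrent scrub/resilver IOs per vdev
echo 1 > /sys/module/zfs/parameters/zfs_vdev_scrub_min_active
echo 4 > /sys/module/zfs/parameters/zfs_vdev_scrub_max_active

# Disable kernel readahead on each pool member disk
for dev in sdb sdc; do
    echo 0 > /sys/block/$dev/queue/read_ahead_kb
done
```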

@rincebrain
Contributor

The man page for zfs-module-parameters explicitly cites zfs_vdev_async_[...] as affecting resilver performance.

It also explicitly suggests that tuning zfs_vdev_scrub_max_active "will cause the scrub or resilver to complete more quickly", so it should affect resilvers too.
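
To see what the running module actually exposes, and the current values, something like this should work (a sketch, assuming zfs is loaded as a kernel module):

```sh
# Parameter descriptions compiled into the module
modinfo -p zfs | grep -E 'scrub|resilver'

# Current values of the scrub/resilver queue tunables
grep -H . /sys/module/zfs/parameters/zfs_vdev_scrub_*_active
```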

@wxiaoguang
Author

Thanks rincebrain.

I will read the documentation on zfs_vdev_scrub_max_active and try it later.

It would help many users if the ZFS documentation had a topic about resilver performance.
