-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce overhead in zvol_write/zvol_read via existing hold #4316
Conversation
633e7d7
to
785d8a0
Compare
I probably should elaborate on |
I noticed that |
|
||
error = dmu_read_uio_dbuf(zv->zv_dbuf, ZVOL_OBJ, &uio, bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need ZVOL_OBJ as per function:
dmu_read_uio_dbuf(dmu_buf_t *zdb, uio_t *uio, uint64_t size)
error = dmu_read_uio_dbuf(zv->zv_dbuf, &uio, bytes);
That should be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I corrected that typo and repushed. :)
@behlendorf I would like to squash this PR into 1 or 2 patches and rewrite the commit message(s) before any of this is merged. I say 1 or 2 because I am undecided on whether I should break out the |
7be5e80
to
4f47945
Compare
I doubt this will make any significant impact to performance and the bugs addressed aren't harmful but it's a useful bit of cleanup. It nicely builds on the work done by @tuxoko which allowed bio_vec's to be safely passed in a uio_t. @tuxoko this LGTM to be but I'd appreciate it if you could please review this too. |
@ryao let's keep the patches separate as you've done. |
@behlendorf At this point, I think further improvements in performance will involve micro-optimizations of this nature. There really is no obvious bottleneck for latency or source of CPU usage, although it would be nice to make further reductions in both categories. |
Since I was using this patch while working on #4265, I figured I'd post here that this patch worked perfectly fine for me and seems to be a nice bit of cleanup. I never did measure its performance impact, but as has been pointed out, it's likely to be fairly small. |
In illumos-gate, `zvol_read` and `zvol_write` are both passed uio_t rather than bio_t. Since we are translating from bio to uio for both, we might as well unify the logic and have code more similar to its illumos counterpart. At the same time, we can fix some regressions that occurred versus the original code from illumos-gate. We refactor zvol_write to take uio and also correct the following problems: 1. We did `dnode_hold()` on each IO when we already had a hold. 2. We would attempt to send writes that exceeded `DMU_MAX_ACCESS` to the DMU. 3. We could call `zil_commit()` twice. In this case, this is because Linux uses the `->write` function to send flushes and can aggregate the flush with a write. If a synchronous write occurred with the flush, we effectively flushed twice when there is no need to do that. zvol_read also suffers from the first two problems. Other platforms suffer from the first, so we leave that for a second patch so that there is a discrete patch for them to cherry-pick. Signed-off-by: Richard Yao <ryao@gentoo.org>
The difference between `dmu_read_uio()` and `dmu_read_uio_dbuf()` is that the former takes a hold while the latter uses an existing hold. `zfs_read()` in the ZPL will use `dmu_read_uio_dbuf()` while our analogous `zvol_write()` will use `dmu_write_uio_dbuf()`, but for no apparent reason, we inherited a `zvol_read()` function from OpenSolaris that does `dmu_read_uio()`. illumos-gate also still uses `dmu_read_uio()` to this day. Lets switch to `dmu_read_uio_dbuf()`, which is more performant. Signed-off-by: Richard Yao <ryao@gentoo.org>
@tuxoko I have updated this to address your comments on the !! operator that you made on b7a908a. As I said on the comment, your suggestion on breaking at a block boundary is an improvement that I would prefer to leave for another patch because the first one is partly intended to allow improvements that apply to both ZoL and illumos-gate to be easily cherry-picked and your suggestion applies to code specific to illumos-gate. Also, I am not sure how much that matters in practice because we are breaking up IO on a 32MB boundary and IOs of such size are very rare. It is probably an improvement that we can do separately when one of us has some time to spend working it out. We also do not need to break on the block boundary as we only need to break on the record boundary. After doing that, 32MB chunks will fill the buffers every time without any need to perform reads. |
@ryao |
@behlendorf |
The difference between `dmu_read_uio()` and `dmu_read_uio_dbuf()` is that the former takes a hold while the latter uses an existing hold. `zfs_read()` in the ZPL will use `dmu_read_uio_dbuf()` while our analogous `zvol_write()` will use `dmu_write_uio_dbuf()`, but for no apparent reason, we inherited a `zvol_read()` function from OpenSolaris that does `dmu_read_uio()`. illumos-gate also still uses `dmu_read_uio()` to this day. Lets switch to `dmu_read_uio_dbuf()`, which is more performant. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4316
Our
zvol_write
function had severely diverged with illumos-gate in amanner that caused multiple regressions:
dnode_hold()
on each IO when we already had a hold.DMU_MAX_ACCESS
to theDMU.
zil_commit()
twice. In this case, this is becauseLinux uses the
->write
function to send flushes and can aggregate theflush with a write. If a synchronous write occurred with the flush, we
effectively flushed twice when there is no need to do that.
Signed-off-by: Richard Yao ryao@gentoo.org