Implementation changes to BufWriter #78551
Changes to some BufWriter Write methods, with a focus on reducing total device write calls by fully filling the buffer in some cases
(rust_highfive has picked a reviewer for you, use r? to override)
I'm slightly worried about splitting things from The documentation here says:
Strictly speaking it doesn't specify that it will call
The documentation says:
So I'd say yes, this
- BufWriter now makes use of vectored writes when flushing; it attempts to write both buffered data and incoming data in a single operation when it makes sense to do so.
- LineWriterShim takes advantage of BufWriter's new "vectored flush" operation in a few places.
- Fixed a failing test. No new tests yet.

- Added BufWriter::available; used it to simplify various checks.
- Refactored write to make it simpler; extensively re-commented it.

- Fixed bugs in write implementation; decompressed it to make the flow clearer.
- Replaced several uses of .capacity() with .available(); in these cases they're identical (because they always occur after a completed flush), but available makes more sense conceptually.
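As a rough illustration of the `available` idea mentioned in these commits, here is a minimal sketch over a plain `Vec<u8>`. The free function below is an assumption made for illustration; the actual `BufWriter::available` added by the PR is not shown in this conversation.

```rust
/// Hypothetical free-function version of the `available` check:
/// how many more bytes fit in `buf` without growing its allocation.
fn available(buf: &Vec<u8>) -> usize {
    buf.capacity() - buf.len()
}
```

Right after a completed flush the buffer is empty, so `available` and `capacity` return the same number, which matches the commit note that the two were interchangeable at those call sites.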
Yeah, I went back and forth on this. The rationale I settled on is that, while
Done!
I added a section to the PR description, but I wanted to call it out in a new comment: I revised the PR so that
if let Some(buf) = only_one(bufs, |b| !b.is_empty()) {
    // If there's exactly 1 incoming buffer, `Self::write` can make
    // use of self.inner.write_vectored to attempt to combine flushing
    // the existing buffer with writing the new one.
    self.write(buf)
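The excerpt above calls an `only_one` helper whose body is not shown in this thread. A minimal sketch of what such a helper could look like follows; the signature and implementation are assumptions, not the PR's actual code.

```rust
/// Hypothetical helper: returns the single element of `items` that matches
/// `pred`, or `None` if zero elements or more than one element match.
fn only_one<'a, T>(items: &'a [T], mut pred: impl FnMut(&T) -> bool) -> Option<&'a T> {
    let mut matches = items.iter().filter(|item| pred(*item));
    // No match at all: nothing to return.
    let first = matches.next()?;
    match matches.next() {
        // Exactly one match.
        None => Some(first),
        // A second match exists, so the "only one" condition fails.
        Some(_) => None,
    }
}
```

The scan stops as soon as a second match is found, so in the common case of several non-empty slices it only looks at the first two of them.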
I'm not sure this special case is worth optimizing for. `write_vectored` is used when there are multiple slices to write, and all the other paths handle occasionally empty slices correctly. The added branching here adds overhead to the most typical case of multiple non-empty slices.
Hmm. I'll see if I can whip up a benchmark; my instinct is that branch prediction means that this overhead will be completely negligible (since, as you said, most of the time there'll be several inputs).
The major point of differentiation between `write` and `write_vectored` here doesn't have anything to do with empty slices; it's that `write` gets to use `flush_buf_vectored`.
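To make the `flush_buf_vectored` idea concrete, here is a minimal sketch based only on how the PR description characterizes it: flush the buffered bytes together with the incoming bytes via vectored writes, and stop as soon as the buffered bytes are gone. The function name, signature, and error handling are assumptions; retrying on `Interrupted` and panic guarding are left out.

```rust
use std::io::{self, IoSlice, Write};

/// Hypothetical stand-in for the PR's internal `flush_buf_vectored`.
/// Drains `buffered` into `inner`, offering `incoming` in the same vectored
/// call. Returns how many bytes of `incoming` were written along the way.
fn flush_vectored_sketch<W: Write>(
    inner: &mut W,
    buffered: &mut Vec<u8>,
    incoming: &[u8],
) -> io::Result<usize> {
    while !buffered.is_empty() {
        let n = {
            let slices = [IoSlice::new(&buffered[..]), IoSlice::new(incoming)];
            inner.write_vectored(&slices)?
        };
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::WriteZero,
                "failed to write the buffered data",
            ));
        }
        if n >= buffered.len() {
            // The buffer is fully drained: stop here, even if none of the
            // incoming bytes were accepted, so no extra calls are issued.
            let extra = n - buffered.len();
            buffered.clear();
            return Ok(extra);
        }
        // Partial write: drop the bytes that went out and try again.
        buffered.drain(..n);
    }
    // Nothing was buffered, so no write call is made at all.
    Ok(0)
}
```

With an empty buffer the sketch performs no writes, which mirrors the description's claim that the vectored flush never adds system calls.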
I went back and forth a lot on this, but I'm coming around to getting rid of it since, yeah, callers will likely end up calling the correct method. I've reviewed my own use of `write_vectored` in this PR (`flush_buf_vectored` etc.) and confirmed that this usually ends up being the case. I'd love to get another opinion though (@m-ou-se?)
@Lucretiel any updates on this? (I know it is waiting on review, but if you can address the last review before we get this reviewed by @joshtriplett it would be better.)
Yeah, my apologies, I've been busy with other stuff and honestly the election kind of took over all of my free cognitive space. I can follow up with this at least by this weekend.
The main reason I'm willing to break them up in this case is that Additionally, by far the most common caller of
Found a corner-case bug in this implementation; please don't merge until I update and write a regression test.
- Found and fixed a bug where write_vectored could, in some circumstances, forward a write directly to the inner writer (skipping the buffer) without first flushing the buffer.
- Added a regression test for this bug.
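The class of bug described in this commit can be shown with a small sketch. The function below is purely hypothetical and is not the PR's code: it forwards data straight to the inner writer, and the `flush_first` flag toggles between the correct ordering and the buggy one.

```rust
use std::io::{self, Write};

/// Hypothetical illustration: forward `data` directly to `inner`, bypassing
/// `buf`. With `flush_first == false`, bytes already sitting in `buf` reach
/// the inner writer only later, after the newer `data`, which reorders output.
fn forward_direct<W: Write>(
    inner: &mut W,
    buf: &mut Vec<u8>,
    data: &[u8],
    flush_first: bool,
) -> io::Result<()> {
    if flush_first && !buf.is_empty() {
        // Older bytes must go out before anything that bypasses the buffer.
        inner.write_all(&buf[..])?;
        buf.clear();
    }
    inner.write_all(data)
}
```

A regression test for this kind of bug would buffer a few bytes, issue a write large enough to bypass the buffer, and then assert that the inner writer received the bytes in the original order.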
Fixed
☔ The latest upstream changes (presumably #78768) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
I'm working on some changes that may significantly improve
@Lucretiel any updates?
Thanks for taking the time to contribute. I have to close this due to inactivity. If you wish and you have the time, you can open a new PR with these changes and we'll take it from there. Thanks.
This PR contains some proposed implementation updates to `BufWriter`, with a focus on reducing total device `write` calls in the average case. These updates are based on lessons learned from the design of `LineWriterShim`, as well as some discussions with @workingjubilee on this topic.

There are three main changes to how `BufWriter` works:

- `write_all` now fully fills the buffer before flushing. Previously, it would fill the buffer as much as possible without splitting any incoming writes. However, we assume that a caller of `write_all` is foremost interested in minimizing write calls to the underlying device, which we can best achieve by maximizing the size of buffers we flush. Of course, incoming data that exceeds the total size of the buffer is still forwarded directly to the underlying device.
  - By contrast, we assume that a caller of `write` would prefer to have unbroken writes if possible (that is, would prefer `Ok(n)` where `n == buf.len()`). The implementation of `write` is therefore left unchanged with regard to buffer filling.
- `write_vectored` is unchanged when `inner.is_write_vectored()`. However, in the case where the underlying device doesn't offer any specialization, we now take care to buffer together all the incoming sub-bufs, even if the total size exceeds our buffer, and only fall back to directly forwarded writes in the case that an individual slice exceeds our buffer size. As a result, `BufWriter` now also unconditionally returns `true` for `is_write_vectored`.
- `flush_buf_vectored()`, a new internal method that is similar to `flush_buf`, but additionally attempts to use vectored operations to send the new incoming data along with the existing buffered data in a single operation. I took care to ensure that this never results in more system calls than before; in particular, it immediately stops forwarding as soon as the existing buffer is fully forwarded, even if 0 new bytes have been sent.
  - `write` and `write_all` were refactored to take advantage of it, and `write_vectored` was slightly modified to forward to `write` if exactly 1 buffer is being written.
- `self.panicked` guard fences: `BufWriter::panicked` is about preventing duplicate writes of buffered data, so it isn't necessary to fence writes to the underlying device when the buffer is known to be empty.

Follow up items

Tasks

Open questions

Stuff I want to call out and ensure is addressed in review:

- The `BufWriter` changes make sense.
- Is it appropriate for `BufWriter` to unconditionally return `true` for `is_write_vectored`, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

This was split off to a separate PR from #78515
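The `write_all` change described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the explicit `cap` parameter, and the use of a plain `Vec<u8>` as the buffer are hypothetical, and the PR's real implementation (which also uses the vectored flush) is not reproduced here.

```rust
use std::io::{self, Write};

/// Hypothetical sketch of a `write_all` that tops the buffer up completely
/// before flushing, splitting the incoming data when necessary.
/// Assumes `buf.len() <= cap` on entry.
fn write_all_filling<W: Write>(
    inner: &mut W,
    buf: &mut Vec<u8>,
    cap: usize,
    data: &[u8],
) -> io::Result<()> {
    if data.len() >= cap {
        // Oversized input: flush what is buffered, then bypass the buffer.
        inner.write_all(&buf[..])?;
        buf.clear();
        return inner.write_all(data);
    }
    let room = cap.saturating_sub(buf.len());
    if data.len() <= room {
        // Everything fits; no device write is needed yet.
        buf.extend_from_slice(data);
        return Ok(());
    }
    // Split the input: fill the buffer to capacity, flush one full buffer,
    // and keep the remainder buffered for later.
    let (head, tail) = data.split_at(room);
    buf.extend_from_slice(head);
    inner.write_all(&buf[..])?;
    buf.clear();
    buf.extend_from_slice(tail);
    Ok(())
}
```

With an 8-byte buffer and three 5-byte writes, this sketch flushes once with a full 8 bytes and keeps 7 bytes buffered, whereas a strategy that never splits an incoming write would flush 5 bytes at a time.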