-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specialize io::copy for BufRead sources #71091
Conversation
Allocating an additional buffer and then copying through it is completely unnecessary if the source we're copying from is BufRead (i.e. already maintains a buffer internally). Using specialization, we can detect this and handle it accordingly. Note that this also makes it possible to specify a custom buffer size for copy operations by wrapping the source in a BufReader::with_capacity.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
??? |
r? @matthewjasper for specialization review; it looks like BufRead is not currently within the grouping. I think we should try and work on some documentation here :) |
Is this some special ruleset when compiling std? Because this exact same specialization worked fine when I tested it outside of std. |
Yeah, due to unsoundness bugs with specialization, std is limited to just |
r? @dtolnay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am uncomfortable accepting this because it results in behavior differences that are observable outside of std (different method calls on the caller's trait impls) and can only be preserved by continuing to rely on specialization, with this not being in the sound min_specialization subset.
The optimization is good but we should defer this until specialization is in better shape.
Given that this optimization is not doable without specialization, how do you propose to move forward with the underlying issue? Document it and provide an alternative like |
Quite a lot of optimizations will be made possible by specialization. It's being worked on. In the meantime a copy_bufread can be provided on crates.io for the use cases where it makes a significant difference; I don't think we need to stabilize something new for that in std in the short term. |
How about adding a |
If an approach which only requires |
specialize io::copy to use copy_file_range, splice or sendfile Fixes rust-lang#74426. Also covers rust-lang#60689 but only as an optimization instead of an official API. The specialization only covers std-owned structs so it should avoid the problems with rust-lang#71091 Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar. There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors. The test case executes the following: ``` [pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0 [pid 103776] write(4, "wxyz", 4) = 4 [pid 103776] write(4, "iklmn", 5) = 5 [pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5 ``` 0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted 𝖬 `write` to drain the `BufReader`/`BufWriter` wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes to `BufWriter`. 𝖭 `copy_file_range`/`splice`/`sendfile` until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing. ## Benchmarks ``` OLD test io::tests::bench_file_to_file_copy ... bench: 21,002 ns/iter (+/- 750) = 6240 MB/s [ext4] test io::tests::bench_file_to_file_copy ... bench: 35,704 ns/iter (+/- 1,108) = 3671 MB/s [btrfs] test io::tests::bench_file_to_socket_copy ... bench: 57,002 ns/iter (+/- 4,205) = 2299 MB/s test io::tests::bench_socket_pipe_socket_copy ... bench: 142,640 ns/iter (+/- 77,851) = 918 MB/s NEW test io::tests::bench_file_to_file_copy ... bench: 14,745 ns/iter (+/- 519) = 8889 MB/s [ext4] test io::tests::bench_file_to_file_copy ... bench: 6,128 ns/iter (+/- 227) = 21389 MB/s [btrfs] test io::tests::bench_file_to_socket_copy ... bench: 13,767 ns/iter (+/- 3,767) = 9520 MB/s test io::tests::bench_socket_pipe_socket_copy ... bench: 26,471 ns/iter (+/- 6,412) = 4951 MB/s ```
Allocating an additional buffer and then copying through it is completely unnecessary if the source we're copying from is
BufRead
(i.e. already maintains a buffer internally). Using specialization, we can detect this and handle it accordingly.Note that this also makes it possible to specify a custom buffer size for copy operations by wrapping the source in a
BufReader::with_capacity
. This might fix #49921.Here's why this is better:
I know this benchmark is unfair, but that's basically the point.