Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io::copy does not use copy_file_range consistently #114341

Closed
xstaticxgpx opened this issue Aug 1, 2023 · 9 comments
Closed

io::copy does not use copy_file_range consistently #114341

xstaticxgpx opened this issue Aug 1, 2023 · 9 comments
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-enhancement Category: An issue proposing an enhancement or a PR with one. O-linux Operating system: Linux T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@xstaticxgpx
Copy link
Contributor

xstaticxgpx commented Aug 1, 2023

I tried this code:

test.rs:

use std::fs::File;
use std::io;
use std::os::fd::FromRawFd;

fn main() -> io::Result<()> {
    let file = "./test";
    let mut stdout = unsafe { File::from_raw_fd(1) };
    for _ in 0..3 {
        let mut _file = File::open(&file)?;
        let _ = io::copy(&mut _file, &mut stdout);
    }
    Ok(())
}

Testing:

rustc test.rs
./test >test.out # Reads itself and redirects stdout to regular file (now truncated)
strace ./test >test.out # See sendfile is used first before `copy_file_range_candidate` is hit
...
sendfile(1, 3, NULL, 2147479552)        = 10065520
sendfile(1, 3, NULL, 2147479552)        = 0       
...
copy_file_range(-1, NULL, -1, NULL, 1, 0) = -1 EBADF (Bad file descriptor)                                                                                                                                                                                                                                                   
copy_file_range(3, NULL, 1, NULL, 1073741824, 0) = 10065520                                                                                                                                                                                                                                                                  
copy_file_range(3, NULL, 1, NULL, 1073741824, 0) = 0                                                                                                                                                                                                                                                                         
...
copy_file_range(3, NULL, 1, NULL, 1073741824, 0) = 10065520
copy_file_range(3, NULL, 1, NULL, 1073741824, 0) = 0
...

I expected to see this happen:

io::copy initially detects that copy_file_range is available for the copy from regular file to regular file (truncated by the shell redirection)

Instead, this happened:

io::copy initially uses sendfile on the first iteration and then determines copy_file_range is a candidate
ie. when stdout redirected to a regular file which is now meta.len() > 0

The meta.len() > 0 logic does not make sense for outputs AFAICT - only inputs (ie. /proc filesystems, as documented)

FdMeta::Metadata(meta) if meta.is_file() && meta.len() > 0 => true,

copy_file_range can populate the initially empty (regular) output file without issue, that's how coreutils cat works today.

Meta

rustc --version --verbose:

rustc 1.71.0 (8ede3aae2 2023-07-12) (Arch Linux rust 1:1.71.0-1)
binary: rustc
commit-hash: 8ede3aae28fe6e4d52b38157d7bfe0d3bceef225
commit-date: 2023-07-12
host: x86_64-unknown-linux-gnu
release: 1.71.0
LLVM version: 15.0.7
@xstaticxgpx xstaticxgpx added the C-bug Category: This is a bug. label Aug 1, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 1, 2023
@xstaticxgpx
Copy link
Contributor Author

xstaticxgpx commented Aug 1, 2023

FWIW I'm interested in proposing a code change for this if there's agreement given this behavior is non-ideal, or at least, unexpected. Possibly adding a meta.is_input boolean to the candidate function? I'm curious if there's any "prior art" around something like this in std.

@the8472
Copy link
Member

the8472 commented Aug 2, 2023

Yes, this should be fine. Adding a copy_file_range_candidate(&self, for_read: bool) might be simpler than adding the flag to all the FdMeta construction sites.

let mut stdout = unsafe { File::from_raw_fd(1) };

You shouldn't be doing that since it would prematurely close stdout once it's dropped. Acquire a StdoutLock and the copy specialization will apply to that too.
I suppose we could also specialize Stdout but there might be some tricky edge-cases due to locking, at least when sockets or pipes are involved. But for regular files it should be possible since they shouldn't depend on the other side becoming ready.

@the8472 the8472 added O-linux Operating system: Linux C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs Relevant to the library team, which will review and decide on the PR/issue. A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 2, 2023
@xstaticxgpx
Copy link
Contributor Author

You shouldn't be doing that since it would prematurely close stdout once it's dropped. Acquire a StdoutLock and the copy specialization will apply to that too.

In this example, wouldn't that be when main() terminates (which is OK?) I don't see close(1) happen until the program termination as expected.

The reason I did it like that, for context, is that I found the implicit LineWriter wrapping of Stdout to be non-ideal for my actual use case (I wanted full r/w buffers) - so I ended up using BufWriter around the stdout file descriptor directly. For this bug report I simplified it down. I know there's another issue around that specifically that I've been following (rust-lang/libs-team#148)

Yes, this should be fine. Adding a copy_file_range_candidate(&self, for_read: bool) might be simpler than adding the flag to all the FdMeta construction sites.

I agree, that didn't make much sense after some consideration - passing the bool directly seems reasonable 👍

I'll work on a PR for this.

@the8472
Copy link
Member

the8472 commented Aug 2, 2023

The reason I did it like that, for context, is that I found the implicit LineWriter wrapping of Stdout to be non-ideal for my actual use case

Using io::copy + StdoutLock should bypass the linewriter.

@xstaticxgpx
Copy link
Contributor Author

The reason I did it like that, for context, is that I found the implicit LineWriter wrapping of Stdout to be non-ideal for my actual use case

Using io::copy + StdoutLock should bypass the linewriter.

AH yes, of course - originally I was working with arbitrary streaming data (via pipes) so I missed this nuance when I got to implementing io::copy

Would you say it's sufficient to just lock these "at the top" like I'm doing here so you can ultimately choose whether you pass the File.from_raw_fd (ie. to BufWriter for full control) or not depending on various conditions? I appreciate your insight! I'm still figuring out these nuances in rust.

https://github.com/xstaticxgpx/ratiscat/blob/3043ae8cc2b6aa90aeb51419cbc2eda54a6ed3b3/src/bin/rat.rs#L154-L159

@the8472
Copy link
Member

the8472 commented Aug 2, 2023

let input = io::stdin().lock();
let output: StdoutLock = io::stdout().lock();
let buffered_out = BufWriter::new(output);

// ...

io::copy(&mut input, &mut buffered_out)

no from_raw_fd needed. This should already bypass the linebuffer. The bufwriter will also cause large writes should bypass the linewriter I think. And it would only be used for... socket-socket copies I guess. Or old kernels. Or non-linux systems.

@xstaticxgpx
Copy link
Contributor Author

xstaticxgpx commented Aug 2, 2023

Aha! I think I narrowed down some more of confusion here - indeed, when using io::copy as you specified, under certain (modern linux) circumstances it will typically fall back to using splice() in a logical manner. Where it seems to break down is here:

use std::io;

fn main() -> io::Result<()> {
    let mut stdin = io::stdin().lock();
    let mut stdout = io::BufWriter::new(io::stdout().lock());
    let _ = io::copy(&mut stdin, &mut stdout)?;
    Ok(())
}
$ rustc test.rs
$ strace --syscall-limit 100 ./test </dev/urandom | cat >/dev/null
..
read(0, "A\313\260\233\276A\353\235\214\237\302\216\231\355$\256Q;\235\316Ma'\247\34*\304\347\254\215\231\330"..., 8192) = 8192
write(1, "A\313\260\233\276A\353\235\214\237\302\216\231\355$\256Q;\235\316Ma'\247\34*\304\347\254\215\231\330"..., 8162) = 8162
read(0, ">D%\260\31\315:\214V\352\325\370<?\325\35\251(\316\314\27\361\204\366\335R\261\17n\246L\207"..., 8192) = 8192
write(1, "\207[d\30\333s\337\204?\214\337\10\32\27\215\2513\25\360W*J\346\231>2\336\265\202\321", 30) = 30
write(1, ">D%\260\31\315:\214V\352\325\370<?\325\35\251(\316\314\27\361\204\366\335R\261\17n\246L\207"..., 7702) = 7702
..

It's still line buffering when it falls back completely to read() and write(), those calls are split on wherever a 0x0a byte char is seen (ie. see strace with -x -s 8192). Read calls are all at 8192 as expected given that's hardcoded currently.

@the8472
Copy link
Member

the8472 commented Aug 2, 2023

I see. urandom is a char device, we don't have special path for that in kernel_copy. It would require io_uring, a pipe and splicing to get sendfile-like behavior. Yeah in that case line-writer will try to write out large chunks up to the last newline.

I don't think this is worth special-casing in the standard library for the moment. Instead #60673 needs to be solved.

As a workaround you can do this:

let locked = stdout().locked();
let writer = BufWriter::new(File::from(locked.as_fd().try_clone_to_owned().unwrap()));

@xstaticxgpx
Copy link
Contributor Author

As a workaround you can do this:

This is super helpful - thanks! I'm glad to remove anything unsafe from my code as much as possible.

For posterity here is working code which also sets the buffer size > than the default (8192) using .with_capacity() which results in the properly sized syscalls (ie. 64K ideal for pipes)

use std::fs::File;
use std::io;
use std::os::fd::AsFd;

fn main() -> io::Result<()> {
    let mut stdin = io::stdin().lock();
    let stdout = io::stdout().lock();
    let mut stdout = io::BufWriter::with_capacity(65536, File::from(stdout.as_fd().try_clone_to_owned().unwrap()));
    let _ = io::copy(&mut stdin, &mut stdout)?;
    Ok(())
}

$ strace --syscall-limit 100 -x -s 0 ./test </dev/urandom | cat >/dev/null
..
read(0, ""..., 65536)                   = 65536
write(3, ""..., 65536)                  = 65536
..

This wreaks of premature optimization but ultimately I was just shooting for 1:1 low-level behavior/functionality with cat for a rust learning project (cat currently writes 128K to pipes for some reason which lowers throughput). With the exception of the splice usage between pipes which cat doesn't implement today, io::copy essentially functions the same which is really nice 👍

I'd be happy to see the Stdout* write behavior become more consistent as described in that other issue you linked too 👍

Sorry for derailing the original issue discussion (continued in the PR #114373) but this has been very enlightening.

xstaticxgpx added a commit to xstaticxgpx/rust that referenced this issue Aug 3, 2023
This is for rust-lang#114341

The `meta.len() > 0` condition here is intended for inputs only,
ie. when input is in the `/proc` filesystem as documented.

That inaccurately included empty output files which are then shunted to
the sendfile() routine leading to higher than nescessary IO util in some
cases, specifically with CoW filesystems like btrfs.

Further, `NoneObtained` is not relevant in this context, so remove it.

Simply, determine what is input or output given the passed enum Unit.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 4, 2023
unix/kernel_copy.rs: copy_file_range_candidate allows empty output files

This is for rust-lang#114341

The `meta.len() > 0` condition here is intended for inputs only, ie. when input is in the `/proc` filesystem as documented.

That inaccurately included empty output files which are then shunted to the sendfile() routine leading to higher than nescessary IO util in some cases, specifically with CoW filesystems like btrfs.

Simply, determine what is input or output given the passed boolean.
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Oct 17, 2023
This is for rust-lang/rust#114341

The `meta.len() > 0` condition here is intended for inputs only,
ie. when input is in the `/proc` filesystem as documented.

That inaccurately included empty output files which are then shunted to
the sendfile() routine leading to higher than nescessary IO util in some
cases, specifically with CoW filesystems like btrfs.

Further, `NoneObtained` is not relevant in this context, so remove it.

Simply, determine what is input or output given the passed enum Unit.
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Oct 17, 2023
unix/kernel_copy.rs: copy_file_range_candidate allows empty output files

This is for rust-lang/rust#114341

The `meta.len() > 0` condition here is intended for inputs only, ie. when input is in the `/proc` filesystem as documented.

That inaccurately included empty output files which are then shunted to the sendfile() routine leading to higher than nescessary IO util in some cases, specifically with CoW filesystems like btrfs.

Simply, determine what is input or output given the passed boolean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-enhancement Category: An issue proposing an enhancement or a PR with one. O-linux Operating system: Linux T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants