process::Command data loss when using Stdio::inherit for stdin and stdin has been read from #97855
Comments
There's #78515 for switchable buffering for stdout. A similar API to switch to unbuffered IO could be added for stdin.
That would provide a more elegant workaround for sure, but as a resolution it would be incomplete, since the default situation would still lead to data loss. I would argue the default should be unbuffered, and a buffered stdin should be wrapped in BufReader by the user, that being a type that would be unsuitable for conversion to Stdio for a Command. That would go a long way towards preventing the user from doing something that is going to result in data loss.
The data isn't lost, it's in the buffer. And this is documented behavior
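To illustrate what "in the buffer" means here, the following is a minimal sketch that uses a BufReader over an in-memory Cursor as a stand-in for stdin. That stand-in is an assumption for illustration only: the real Stdin type manages its own internal buffer, and the 8192-byte capacity is chosen to mirror the 8K read seen in strace. The helper name read_prefix_buffered is hypothetical.

```rust
use std::io::{self, BufReader, Cursor, Read};

/// Reads `n` bytes through a buffered reader and reports
/// (bytes still sitting in the buffer, bytes consumed from the underlying source).
/// The Cursor plays the role of the file descriptor (assumption for illustration).
fn read_prefix_buffered(data: Vec<u8>, n: usize) -> io::Result<(usize, u64)> {
    let mut reader = BufReader::with_capacity(8192, Cursor::new(data));
    let mut prefix = vec![0u8; n];
    // The caller only asks for `n` bytes, but the reader refills its whole buffer.
    reader.read_exact(&mut prefix)?;
    Ok((reader.buffer().len(), reader.get_ref().position()))
}
```

For a small n and a source larger than the buffer, the underlying source advances by a full buffer's worth rather than by n bytes. Anything that later reads the source directly, as a child process inheriting the descriptor would, starts after the buffered bytes.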
I understand that, but being in the buffer is tantamount to being lost, because the buffer is not present in the input to the child process. I can't think of any use case where I would want to:
stdin is just another fd on Unix; I find it very odd that it is treated so differently (behind a mutex and all that) in Rust. But it's true I don't have to care about that so long as it actually works properly, which I don't believe it does in this case.
Currently you can access the underlying file directly this way:
let inp = stdin().lock();
// ManuallyDrop keeps the temporary File from closing the stdin fd when it goes away.
let mut inp = ManuallyDrop::new(unsafe { File::from_raw_fd(inp.as_raw_fd()) });
inp.read_exact(&mut buf)?;
drop(inp);
// spawn command here.
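The snippet above can be wrapped in a small helper. This is only a sketch under stated assumptions: it is Unix-only, the name read_exact_unbuffered is hypothetical and not a std API, and it assumes nothing has been read through the buffered handle beforehand (otherwise bytes already buffered would be skipped).

```rust
use std::fs::File;
use std::io::{self, Read};
use std::mem::ManuallyDrop;
use std::os::unix::io::{AsRawFd, FromRawFd};

/// Reads exactly `buf.len()` bytes straight from the descriptor behind `source`,
/// bypassing any userspace buffering. Hypothetical helper for illustration.
fn read_exact_unbuffered<F: AsRawFd>(source: &F, buf: &mut [u8]) -> io::Result<()> {
    // SAFETY: `source` is borrowed for the whole call, so the descriptor stays valid.
    // ManuallyDrop ensures File's destructor never runs, so the descriptor is not
    // closed here, on either the success or the error path.
    let mut file = ManuallyDrop::new(unsafe { File::from_raw_fd(source.as_raw_fd()) });
    file.read_exact(buf)
}
```

A call such as read_exact_unbuffered(&std::io::stdin().lock(), &mut buf) would then leave the rest of the input untouched on the descriptor for a child spawned with Stdio::inherit().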
The underlying syscalls are, sure. But most languages provide buffering and possibly thread-safety around stdio to avoid line-tearing and to speed up small reads/writes. Even libc has buffering: https://man7.org/linux/man-pages/man3/setbuf.3.html
You are correct that this issue does exist in other languages in some cases; for instance, this from the popen(3) manpage on Linux:
However, there are several differences to note:
It may be too late for this wrt backwards compatibility, but if I were designing this, I wouldn't make stdin/out/err be special cases; BufReader and BufWriter could be used for them just like anything else if people want. But we are where we are. So what to do about it?
Fundamentally, people shouldn't have to resort to using strace, as I did, to discover that read_exact didn't do what it said on the tin.
I have a question about your ManuallyDrop use in the example above. I assume it is there because dropping the File created by File::from_raw_fd closes the underlying fd, which may not be desired. In that case, the explicit drop should probably occur after the spawn? (And I believe that if there was an error from read_exact, there would be a memory leak there, right?) Perhaps an alternative would be to explicitly std::mem::drop the File after the spawn?
I tried this code:
In this case, I had already read a few bytes from stdin using stdin() and read_exact. strace showed that it read 8K from stdin, and that the command that was started missed the remainder of the 8K. #58326 discusses this in a roundabout way, but fundamentally it should be safe to read from stdin and then subsequently use it as input to a Command. The workaround was to use .stdin(Stdio::piped()) and copy the data across the pipe. Not ideal.
Meta
rustc --version --verbose:
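The Stdio::piped() workaround mentioned in the report can be sketched roughly as follows. This is not the reporter's code (which is not shown); the function name spawn_forwarding_input is hypothetical, and the sketch assumes the caller passes the same buffered handle it has already been reading from, so that bytes held in the buffer are forwarded too.

```rust
use std::io::{self, Read};
use std::process::{Child, Command, Stdio};

/// Spawns `cmd` with a piped stdin and copies everything still readable from
/// `input` into it. Hypothetical helper for illustration.
fn spawn_forwarding_input<R: Read>(cmd: &mut Command, input: &mut R) -> io::Result<Child> {
    let mut child = cmd.stdin(Stdio::piped()).spawn()?;
    // Take ownership of the pipe's write end so it can be closed explicitly.
    let mut child_stdin = child.stdin.take().expect("stdin was configured as piped");
    // Blocks until `input` reaches EOF. If the child's output is also piped and
    // large, this copy would need to run on a separate thread to avoid a deadlock.
    io::copy(input, &mut child_stdin)?;
    // Dropping the write end signals EOF to the child.
    drop(child_stdin);
    Ok(child)
}
```

Passing &mut std::io::stdin().lock() as input forwards whatever the parent has not yet consumed, including bytes already sitting in the stdin buffer, at the cost of an extra copy through the parent.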