You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
jgoerzen opened this issue
Jun 17, 2022
· 4 comments
Labels
A-docsArea: Documentation for any part of the project, including the compiler, standard library, and toolsA-processArea: `std::process` and `std::env`C-bugCategory: This is a bug.T-libsRelevant to the library team, which will review and decide on the PR/issue.
I am debugging a problem where I use process::Command to establish pipes between various commands. I have been experiencing occasional (roughly 0.1% of invocations) deadlocks while those deadlocks are not present when the same pipeline is created from a bash script.
In examining the behavior with strace, I have noticed at least one suspicious behavior: when piping data in to a child process's stdin, I see this pattern. Here PID 230999 is the parent and PID 231002 is the child. I am trying to include only relevant bits of the strace log:
The suspicious thing here is that the parent didn't close the reading side of the pipe (fd 5) until after it has waited for the child process to exit. This is an issue; see this from pipe(7) on Linux:
If all file descriptors referring to the write end of a pipe have been
closed, then an attempt to read(2) from the pipe will see end-of-file
(read(2) will return 0). If all file descriptors referring to the read
end of a pipe have been closed, then a write(2) will cause a SIGPIPE
signal to be generated for the calling process. If the calling process
is ignoring this signal, then write(2) fails with the error EPIPE. An
application that uses pipe(2) and fork(2) should use suitable close(2)
calls to close unnecessary duplicate file descriptors; this ensures that
end-of-file and SIGPIPE/EPIPE are delivered when appropriate.
The circumstances around this suggest to me that may not be my precise issue, though I am having difficulty figuring out what else may be.
An any case, the result of this is that when piping from one process to another, the first process will hang indefinitely until the Rust program waits on the second. Here fd 5 should have been closed on the parent side immediately after the fork/clone, before returning anything back to the caller, since it is no longer needed there.
Presumably the unused fds on the child side (post-dup2) were closed due to the use of O_CLOEXEC, but I would argue it is somewhat bad form to just not even bother to do it.
Elsewhere, it gets things a bit better. Here fd 4 is a File that had been opened in the parent process, and is now the stdin for the child, and fds 5 and 6 are used to obtain the stdout FROM the child. This also happens to be the place where the first command is spawned (whose stdout will be the stdin for the second command listed above); the pipe2 call shown here is the same one shown above.
Here it closes the unnecessary fds (4 and 6) on the parent before returning to the caller. That's what it should do. Why it does that for a File piped to stdin and not for a ChildStdout piped to stdin, I don't know. (Recall that the fd 5 here later becomes stdin for the second spawned command as shown above, and is not closed by the parent until after the second child exits)
Command::{stdin,stdout,stderr} methods transfer ownership of the file descriptor to the command. Those file descriptors will be dropped when command itself is dropped.
Consequently, if you would like to close a configured file descriptor in the parent before waiting on the child, you need to drop the command before calling Child::wait.
I don't know why the previous approach I took didn't do this, as I'd have expected the Drop to run when the let rebinding occurred (with the spawn).
But in any case, shouldn't the spawn cause the drop of those things itself? That is, it shouldn't be possible to get into this predicament, I'd think? That is, since spawn() doesn't consume the Command, shouldn't spawn() at least consume (or explicitly drop) the fds after the fork?
Hmm. I suppose there isn't much that could be done about the multiple spawns, despite the dubiousness of spawning twice with the same fd hooked to stdin (particularly without waiting for the first spawn to exit). Perhaps the solution is to document this then?
ChrisDenton
added
A-docs
Area: Documentation for any part of the project, including the compiler, standard library, and tools
T-libs
Relevant to the library team, which will review and decide on the PR/issue.
and removed
needs-triage-legacy
labels
Nov 6, 2023
A-docsArea: Documentation for any part of the project, including the compiler, standard library, and toolsA-processArea: `std::process` and `std::env`C-bugCategory: This is a bug.T-libsRelevant to the library team, which will review and decide on the PR/issue.
I am debugging a problem where I use process::Command to establish pipes between various commands. I have been experiencing occasional (roughly 0.1% of invocations) deadlocks while those deadlocks are not present when the same pipeline is created from a bash script.
In examining the behavior with strace, I have noticed at least one suspicious behavior: when piping data in to a child process's stdin, I see this pattern. Here PID 230999 is the parent and PID 231002 is the child. I am trying to include only relevant bits of the strace log:
The suspicious thing here is that the parent didn't close the reading side of the pipe (fd 5) until after it has waited for the child process to exit. This is an issue; see this from pipe(7) on Linux:
The circumstances around this suggest to me that may not be my precise issue, though I am having difficulty figuring out what else may be.
An any case, the result of this is that when piping from one process to another, the first process will hang indefinitely until the Rust program waits on the second. Here fd 5 should have been closed on the parent side immediately after the fork/clone, before returning anything back to the caller, since it is no longer needed there.
Presumably the unused fds on the child side (post-dup2) were closed due to the use of O_CLOEXEC, but I would argue it is somewhat bad form to just not even bother to do it.
The relevant code is at and around https://salsa.debian.org/jgoerzen/filespooler/-/blob/v1.1.0/src/exec.rs#L47
In this case, payload is of type
ChildStdout
.Elsewhere, it gets things a bit better. Here fd 4 is a File that had been opened in the parent process, and is now the stdin for the child, and fds 5 and 6 are used to obtain the stdout FROM the child. This also happens to be the place where the first command is spawned (whose stdout will be the stdin for the second command listed above); the pipe2 call shown here is the same one shown above.
Here it closes the unnecessary fds (4 and 6) on the parent before returning to the caller. That's what it should do. Why it does that for a File piped to stdin and not for a ChildStdout piped to stdin, I don't know. (Recall that the fd 5 here later becomes stdin for the second spawned command as shown above, and is not closed by the parent until after the second child exits)
This trace corresponds to the code at https://salsa.debian.org/jgoerzen/filespooler/-/blob/v1.1.0/src/jobqueue.rs#L258 and here input is a File.
Meta
rustc --version --verbose
:The text was updated successfully, but these errors were encountered: