-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/exec: race between Wait and ctx.Done #16028
Comments
Would it be better to remove CommandContext for this release if we cannot ship a fixed version? |
Here is a complex way to fix the problem. Add the following internal API to the runtime package.
With those functions, the code in
|
@cespare I'm inclined to leave it in. It's a very unlikely race. Not only do the process and the context have to end at exactly the same time quantum, but the process ID has to be reused within that same time quantum as well. Happy to hear other opinions, though. |
I think I see how to do this on systems that support the POSIX.1-2008 |
CL https://golang.org/cl/23967 mentions this issue. |
On systems that support the POSIX.1-2008 waitid function, we can use it to block until a wait will succeed. This avoids a possible race condition: if a program calls p.Kill/p.Signal and p.Wait from two different goroutines, then it is possible for the wait to complete just before the signal is sent. In that case, it is possible that the system will start a new process using the same PID between the wait and the signal, causing the signal to be sent to the wrong process. The Process.isdone field attempts to avoid that race, but there is a small gap of time between when wait returns and isdone is set when the race can occur. This CL avoids that race by using waitid to wait until the process has exited without actually collecting the PID. Then it sets isdone, then waits for any active signals to complete, and only then collects the PID. No test because any plausible test would require starting enough processes to recycle all the process IDs. Update #13987. Update #16028. Change-Id: Id2939431991d3b355dfb22f08793585fc0568ce8 Reviewed-on: https://go-review.googlesource.com/23967 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
This is now fixed on GNU/Linux. Since the fix was to the os package, not the os/exec package, I'm now declaring this issue a dup of #13987. |
CL https://golang.org/cl/24021 mentions this issue. |
CL https://golang.org/cl/24020 mentions this issue. |
This change is a followup to https://go-review.googlesource.com/23967 for Darwin. Updates #13987. Updates #16028. Change-Id: Ib1fb9f957fafd0f91da6fceea56620e29ad82b00 Reviewed-on: https://go-review.googlesource.com/24020 Reviewed-by: Ian Lance Taylor <iant@golang.org>
This change is a followup to https://go-review.googlesource.com/23967 for FreeBSD. Updates #13987. Updates #16028. Change-Id: I0f0737372fce6df89d090fe9847305749b79eb4c Reviewed-on: https://go-review.googlesource.com/24021 Reviewed-by: Ian Lance Taylor <iant@golang.org>
There is an unlikely race condition when using
os/exec.CommandContext
. InCmd.Wait
, one goroutine sleeps in a select onctx.Done
andwaitDone
, while another goroutine sleeps inc.Process.Wait
. If the process terminates at the same time as the context, then it is possible for the first goroutine to wake up and receive fromctx.Done
and callc.Process.Kill
just as the code inos.Process.Wait
returns fromsyscall.Wait4
. At that point the process ID is available for reuse, and a new process may be started with the same process ID. The call toc.Process.Kill
will callc.Process.Signal
. BecauseProcess.wait
has not yet had a chance to callProcess.setDone
,c.Process.Signal
will callsyscall.Kill
with the old process ID. This could potentially kill an unrelated process.This is similar to the problem described in #13987; the race described there has now been brought into the standard library.
Marking as 1.8 because I don't see a simple way to fix this.
The text was updated successfully, but these errors were encountered: