Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

Open
virtuald opened this issue Sep 5, 2018 · 9 comments
Open

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

virtuald opened this issue Sep 5, 2018 · 9 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Documentation help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@virtuald
Copy link

virtuald commented Sep 5, 2018

Currently, the documentation says:

// Signal that the process will get when its parent dies (Linux only)

However, according to the prctl man page:

Warning: the "parent" in this case is considered to be the thread that created this process. In other words, the signal will be sent when that thread terminates (via, for example, pthread_exit(3)), rather than after all of the threads in the parent process terminate.

I got bit by this in a python program -- started a new program on one thread, and tried to wait for it on another thread, and the child process kept dying and it took awhile to figure out what was going on. While I haven't ran into it in go yet, because in go threads and goroutines aren't one to one, I imagine if one ran into this sort of bug it would only occur intermittently.

Thinking about it, it seems like a user might want to call runtime.LockOSThread when using this?

It's not clear to me whether the docs should have a larger warning in them -- but I think at the minimum the documentation should be updated to say 'parent thread' or 'parent goroutine' instead of just 'parent'.

@dominikh
Copy link
Member

dominikh commented Sep 5, 2018

If I am not mistaken, the current implementation of Go never kills threads, which is why you wouldn't be able to run into this bug in Go currently. At the same time, if Go ever does start killing threads, this might hit a lot of people at once.

@virtuald
Copy link
Author

virtuald commented Sep 5, 2018

Looked around a bit, looks like go 1.10+ will kill a thread if you lock it and the goroutine exits without unlocking it: #20395

Here's an example that shows that:

package main

import (
	"flag"
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

func main() {
	kill := false
	flag.BoolVar(&kill, "death", false, "Set deathsig")
	flag.Parse()

	cmd := exec.Command("sleep", "1000")
	if kill == true {
		fmt.Println("Death is coming")
		cmd.SysProcAttr = &syscall.SysProcAttr{
			Setpgid:   true,
			Pdeathsig: syscall.SIGKILL,
		}
	}

	// force other goroutines to be spawned on a new thread
	runtime.LockOSThread()

	done := make(chan struct{})

	go func() {
		runtime.LockOSThread()
		err := cmd.Start()
		if err != nil {
			fmt.Println(err)
		}
		close(done)
		fmt.Println("Exit goroutine")
	}()

	<-done

	fmt.Println("Waiting")
	err := cmd.Wait()
	fmt.Println("Done", err)
}

In that example, if -death is passed as an argument the program will immediately exit. Obviously, you have to go out of your way to trigger this behavior, but presumably someone who is messing with platform-specific behavior is going to want to know about platform-specific behaviors... thus why the docs should at least hint that this is possible.

@dominikh dominikh added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 6, 2018
@dominikh dominikh changed the title Update documentation for linux SysProcAttr.Pdeathsig syscall: misleading documentation for linux SysProcAttr.Pdeathsig Sep 6, 2018
@virtuald
Copy link
Author

Circling back to this, I did eventually figure out the right way to use Pdeathsig, and I think the documentation should include some examples or discussion about this.

Here's an example of what people usually want to do when using Pdeathsig -- ensure the child dies when the parent dies. This example ensures that golang won't accidentally kill the OS thread associated with the child process.

package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

func main() {

	done := make(chan struct{})

	cmd := exec.Command("sleep", "1000")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setpgid:   true,
		Pdeathsig: syscall.SIGKILL,
	}

	go func() {
		// On Linux, pdeathsig will kill the child process when the thread dies,
		// not when the process dies. runtime.LockOSThread ensures that as long
		// as this function is executing that OS thread will still be around
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		err := cmd.Start()
		if err != nil {
			fmt.Println(err)
		}

		fmt.Println("Child is PID", cmd.Process.Pid)
		err = cmd.Wait()
		close(done)
		fmt.Println("Child exited:", err)
	}()

	fmt.Println("Waiting for child to exit")
	<-done
	fmt.Println("Parent exiting")
}

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/412114 mentions this issue: syscall: clarify Pdeathsig documentation on Linux

@cespare
Copy link
Contributor

cespare commented Jun 14, 2022

@ianlancetaylor On the CL you wrote

and it's not clear to me why the workaround works in the general case

(The workaround being to use LockOSThread in the goroutine that creates the child process to ensure that that thread can't be killed.)

Can you say more about why it might not work?

@ianlancetaylor
Copy link
Contributor

The CL says " A workaround is to call LockOSThread just before starting the new process and UnlockOSThread after it finishes." But the thread can then later, before the child process exits, pick up another goroutine, that goroutine can call LockOSThread, and then that goroutine can exit. That will cause the thread to exit, and cause a signal to be sent to the child process, although the parent process overall is still running. Unless I misunderstand something.

@cespare
Copy link
Contributor

cespare commented Jun 15, 2022

But the thread can then later, before the child process exits, pick up another goroutine,

Isn't the point of LockOSThread that this cannot happen?

@Tasssadar
Copy link
Contributor

Tasssadar commented Jun 15, 2022

The idea is that the goroutine locks to thread, and then just waits and does nothing until the process exits. That way, the thread cannot exit until the exec'd process exits

go func() {
	runtime.LockOSThread()

	err := cmd.Start()
	if err != nil {
		...
	}

	// some channels to communicate with this goroutine or whatever are here...
        // <-finished


	err = cmd.Wait()

	runtime.UnlockOSThread()
}()

Tasssadar added a commit to Tasssadar/go that referenced this issue Jun 15, 2022
@ianlancetaylor
Copy link
Contributor

Thanks, I understand the comment now. I was reading "after it finishes" as meaning "after StartProcess completes," but it really means "after Wait completes.`"

gopherbot pushed a commit that referenced this issue Jun 15, 2022
This is a rather large footgun, so let's mention that it sends the signal on thread termination and not process termination in the documentation.

Updates #27505

Change-Id: I489cf7136e34a1a7896067ae24187b0d523d987e
GitHub-Last-Rev: c8722b2
GitHub-Pull-Request: #53365
Reviewed-on: https://go-review.googlesource.com/c/go/+/412114
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
aaronlehmann added a commit to aaronlehmann/go-runc that referenced this issue Jul 20, 2022
See golang/go#27505 for context. Pdeathsig
isn't safe to set without locking to the current OS thread, because
otherwise thread termination will send the signal, which isn't the
desired behavior.

I discovered this while troubleshooting a problem that turned out to be
unrelated, but I think it's necessary for correctness.

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
aaronlehmann added a commit to aaronlehmann/go-runc that referenced this issue Sep 7, 2022
See golang/go#27505 for context. Pdeathsig
isn't safe to set without locking to the current OS thread, because
otherwise thread termination will send the signal, which isn't the
desired behavior.

I discovered this while troubleshooting a problem that turned out to be
unrelated, but I think it's necessary for correctness.

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
aaronlehmann added a commit to aaronlehmann/go-runc that referenced this issue Sep 8, 2022
See golang/go#27505 for context. Pdeathsig
isn't safe to set without locking to the current OS thread, because
otherwise thread termination will send the signal, which isn't the
desired behavior.

I discovered this while troubleshooting a problem that turned out to be
unrelated, but I think it's necessary for correctness.

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
corhere added a commit to corhere/moby that referenced this issue Sep 28, 2022
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
golang/go#27505 for more information.

Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.

Signed-off-by: Cory Snider <csnider@mirantis.com>
corhere added a commit to corhere/moby that referenced this issue Sep 28, 2022
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
golang/go#27505 for more information.

Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.

Signed-off-by: Cory Snider <csnider@mirantis.com>
corhere added a commit to corhere/moby that referenced this issue Oct 5, 2022
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
golang/go#27505 for more information.

Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.

Signed-off-by: Cory Snider <csnider@mirantis.com>
fho added a commit to simplesurance/baur that referenced this issue Dec 12, 2022
…eebsd

When baur executes a task and the baur process gets killed, the task subprocess
continues to run. This was reproduced on Linux, on other OSes it was not tested
but they are probably also affected.

Prevent that this can happen by setting Pdeathsig for the executed process.
If the parent thread is killed, the specified signal (SIGKILL) will be sent to
the child.
Pdeathsig is sent when then parent thread dies, to prevent that thread on which
the go-routine ran that started the process dies, runtime.LockOSThread is
called[^1].

This fixes the issue only on Linux and FreeBSD.
Windows & Darwin do not have Pdeathsig in their SysProcAttrs.
To achieve the same on Windows support for job objects in Golang might be
needed[^2].

[^1]: golang/go#27505 (comment)
[^2]: golang/go#17608
fho added a commit to simplesurance/baur that referenced this issue Dec 12, 2022
When baur executes a task and the baur process gets killed, the task subprocess
continues to run. This was reproduced on Linux, on other OSes it was not tested
but they are probably also affected.

Prevent that this can happen by setting Pdeathsig for the executed process.
If the parent thread is killed, the specified signal (SIGKILL) will be sent to
the child.
Pdeathsig is sent when then parent thread dies, to prevent that thread on which
the go-routine ran that started the process dies, runtime.LockOSThread is
called[^1].

This fixes the issue only on Linux and FreeBSD.
Windows & Darwin do not have Pdeathsig in their SysProcAttrs.
To achieve the same on Windows support for job objects in Golang might be
needed[^2].

[^1]: golang/go#27505 (comment)
[^2]: golang/go#17608
fho added a commit to simplesurance/baur that referenced this issue Dec 12, 2022
When baur executes a task and the baur process gets killed, the task subprocess
continues to run. This was reproduced on Linux, on other OSes it was not tested
but they are probably also affected.

Prevent that this can happen by setting Pdeathsig for the executed process.
If the parent thread is killed, the specified signal (SIGKILL) will be sent to
the child.
Pdeathsig is sent when then parent thread dies, to prevent that thread on which
the go-routine ran that started the process dies, runtime.LockOSThread is
called[^1].

This fixes the issue only on Linux and FreeBSD.
Windows & Darwin do not have Pdeathsig in their SysProcAttrs.
To achieve the same on Windows support for job objects in Golang might be
needed[^2].

[^1]: golang/go#27505 (comment)
[^2]: golang/go#17608
sipsma added a commit to sipsma/dagger that referenced this issue Apr 3, 2023
When Pdeathsig is used, the child process will receive the signal
whenever the parent *thread* is killed. In go, an os thread can be
killed if it is locked to a goroutine and left locked when the
goroutine finishes execution. The go runtime will schedule goroutines to
OS threads as it sees fit, so this can result in random unexpected
signals being sent to child processes with pdeathsig set.

There's no currently *known* case where the shim code leaves goroutines
locked to a thread, but given this could happen now or in the future in
our code or in any dependency code, it's best to safeguard against this.

The fix is to keep the goroutine that spawns and waits on the child
process locked to its os thread. That way, we can be sure it will never
be killed by the go runtime.

See also: golang/go#27505

Signed-off-by: Erik Sipsma <erik@dagger.io>
sipsma added a commit to dagger/dagger that referenced this issue Apr 3, 2023
When Pdeathsig is used, the child process will receive the signal
whenever the parent *thread* is killed. In go, an os thread can be
killed if it is locked to a goroutine and left locked when the
goroutine finishes execution. The go runtime will schedule goroutines to
OS threads as it sees fit, so this can result in random unexpected
signals being sent to child processes with pdeathsig set.

There's no currently *known* case where the shim code leaves goroutines
locked to a thread, but given this could happen now or in the future in
our code or in any dependency code, it's best to safeguard against this.

The fix is to keep the goroutine that spawns and waits on the child
process locked to its os thread. That way, we can be sure it will never
be killed by the go runtime.

See also: golang/go#27505

Signed-off-by: Erik Sipsma <erik@dagger.io>
fsmv added a commit to fsmv/daemon that referenced this issue Jul 4, 2024
Also don't check os.ProcessState.Exited() when reporting child deaths, I
had previously misunderstood what it was for.

In linux child processes are killed via pdeathsig if the *thread* that
called fork() is ended, not the process. So this change makes sure that
we do not allow go to kill thread. However apparently in current
versions of go the threads are never killed anyway so this wouldn't have
happened. This code is more formally correct though.

See: golang/go#27505
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Documentation help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: Triage Backlog
Development

No branches or pull requests

6 participants