os: spurious SIGCHILD on running child process #71828

Closed
mohitaron opened this issue Feb 19, 2025 · 15 comments
Labels
BugReport: Issues describing a possible bug in the Go implementation.
Critical: A critical problem that affects the availability or correctness of production systems built using Go.
NeedsFix: The path to resolution is known, but the work has not been done.

mohitaron commented Feb 19, 2025

Go version

1.23.6

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/aron/.cache/go-build'
GOENV='/home/aron/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/aron/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/aron/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/aron/junk/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/aron/junk/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.6'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/aron/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1713836783=/tmp/go-build -gno-record-gcc-switches'

What did you do?

The program below reproduces the bug. It panics when run with Go 1.23.6 and runs fine with Go 1.22.5:

What did you see happen?

// This program is used to test the SIGCHLD signal.
//
// It runs a child process and waits for it to exit. When the child process
// exits, it sends a SIGCHLD signal to the parent process. The parent process
// then waits for the SIGCHLD signal and prints a message. The parent should
// receive the SIGCHLD signal only once.
//
// Go 1.23.6 seems to have a bug where it receives the SIGCHLD signal twice.

package main

import (
    "flag"
    "fmt"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
    "time"
)

var (
    isChild bool
)

func init() {
    flag.BoolVar(&isChild, "child", false, "if true, run as a child process")
}

func child() {
    fmt.Println("I am a child process")

    time.Sleep(2 * time.Second)

    fmt.Println("Child process exiting")

    os.Exit(0)
}

func main() {
    flag.Parse()

    if isChild {
        child()
        return
    }

    fmt.Println("I am a parent process")

    sigChan := make(chan os.Signal, 1)
    signal.Notify(sigChan, syscall.SIGCHLD)

    go func() {
        numSigchld := 0
        for {
            <-sigChan
            numSigchld++
            fmt.Printf("Received SIGCHLD %d\n", numSigchld)
            if numSigchld > 1 {
                panic("Received more than one SIGCHLD")
            }
        }
    }()

    // Start the child process.
    cmd := exec.Command(os.Args[0], "-child")
    err := cmd.Start()
    if err != nil {
        panic(err)
    }

    // Wait for the child process to exit.
    cmd.Wait()

    // Wait for any SIGCHLD signals.
    time.Sleep(1 * time.Second)

    os.Exit(0)
}

What did you expect to see?

The program above has only one child process, so the parent should receive exactly one SIGCHLD. Yet in Go 1.23.6, the parent process receives two SIGCHLD signals.

gabyhelp added the BugReport label Feb 19, 2025
varun-scifin commented Feb 19, 2025

@prattmic @cherrymui The regression seems to have been introduced between Go 1.23.2 and 1.23.3. It might be related to the pidfd changes in this commit, which added some pidfd feature checking (though I may be completely wrong; this is just my best guess).

@seankhliao
Member

The instructions don't seem sufficient to reproduce the issue, what's the environment it runs in? (os, distro, kernel, qemu?, etc)

seankhliao changed the title from "Spurious SIGCHLD signal received upon starting a child process in go 1.23.6" to "os: spurious SIGCHILD on running child process" Feb 19, 2025
seankhliao added the WaitingForInfo (issue is not actionable because of missing required information, which needs to be provided) and NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one) labels Feb 19, 2025
@varun-scifin

Tested on the ubuntu-latest runner (Ubuntu 24, 6.8.0-1021-azure kernel, Go 1.23.3 and above, no QEMU): https://github.com/varun-scifin/go-71828/actions/runs/13411687068/job/37463124345

You can also clone the repo to repro.

@mohitaron
Author

> The instructions don't seem sufficient to reproduce the issue, what's the environment it runs in? (os, distro, kernel, qemu?, etc)

As required by the ticket, I had posted the output of 'go env' already. Here's the output of 'uname -a' and 'lsb_release -a':

Ubuntu-iMac:sigchld>uname -a
Linux Ubuntu-iMac 6.8.0-52-generic #53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu-iMac:sigchld>lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.2 LTS
Release: 24.04
Codename: noble

@mohitaron
Author

The problem is also reproducible on go version 1.24.0.

Ubuntu-iMac:sigchld>go version
go version go1.24.0 linux/amd64

Ubuntu-iMac:sigchld>go env
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/aron/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/aron/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4247764318=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/aron/workspace/main/go.mod'
GOMODCACHE='/home/aron/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/aron/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/snap/go/10853'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/aron/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/snap/go/10853/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

Ubuntu-iMac:sigchld>go run sigchld.go
I am a parent process
Received SIGCHLD 1
Received SIGCHLD 2
panic: Received more than one SIGCHLD

goroutine 8 [running]:
main.main.func1()
	/home/aron/workspace/main/experimental/aron/sigchld/sigchld.go:64 +0xa5
created by main.main in goroutine 1
	/home/aron/workspace/main/experimental/aron/sigchld/sigchld.go:57 +0x125
exit status 2

@ianlancetaylor
Contributor

The problem is that with the new pidfd code, we run a test fork to see if pidfd works. That test fork winds up sending a SIGCHLD signal. I think the fix is straightforward.

seankhliao removed the WaitingForInfo label Feb 20, 2025
@ianlancetaylor
Contributor

@gopherbot Please open backport issues.

This bug causes a spurious SIGCHLD signal the first time a process is executed when running on Linux systems. This is a regression from past behavior and from behavior on non-Linux systems. The patch is small and safe.
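
For code that has to run on an affected release, one option is to treat SIGCHLD as a hint rather than a count, and to rely on cmd.Wait for the child's actual exit. The following is a minimal sketch under that assumption and is not part of the Go fix itself; the helper name runChild, the use of the `true` command, and the 100 ms drain delay are illustrative choices that do not come from the original report.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
	"time"
)

// runChild (hypothetical helper) starts the named command, waits for it to
// exit, and reports how many SIGCHLD notifications arrived in the meantime.
// The count is informational only: the child's exit is established by
// cmd.Wait, not by the number of signals seen.
func runChild(name string, args ...string) (int, error) {
	sigChan := make(chan os.Signal, 8)
	signal.Notify(sigChan, syscall.SIGCHLD)
	defer signal.Stop(sigChan)

	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	waitErr := cmd.Wait()

	// Give any pending notification a moment to reach the channel, then
	// drain it without blocking. The delay is an arbitrary assumption.
	time.Sleep(100 * time.Millisecond)
	n := 0
	for {
		select {
		case <-sigChan:
			n++
		default:
			return n, waitErr
		}
	}
}

func main() {
	// "true" is assumed to be on PATH; any short-lived command would do.
	n, err := runChild("true")
	if err != nil {
		fmt.Println("child failed:", err)
		os.Exit(1)
	}
	fmt.Printf("child exited cleanly; saw %d SIGCHLD notification(s)\n", n)
}
```

Standard signals such as SIGCHLD are not queued, so several child exits can also collapse into a single notification. A count can therefore be too low as well as too high, which is another reason not to panic on an unexpected number.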

@gopherbot
Contributor

Backport issue(s) opened: #71848 (for 1.23), #71849 (for 1.24).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Contributor

Change https://go.dev/cl/650835 mentions this issue: syscall: don't send child signal when testing pidfd

@gopherbot
Contributor

Change https://go.dev/cl/651035 mentions this issue: [release-branch.go1.24] syscall: don't send child signal when testing pidfd

@gopherbot
Contributor

Change https://go.dev/cl/651055 mentions this issue: [release-branch.go1.23] syscall: don't send child signal when testing pidfd

@gopherbot
Contributor

Change https://go.dev/cl/651415 mentions this issue: runtime: use WCLONE when waiting on pidfd test child

gopherbot pushed a commit that referenced this issue Feb 21, 2025
As of CL 650835, the pidfd test child no longer sends SIGCHLD on exit.
Per clone(2), "If [the child termination] signal is specified as
anything other than SIGCHLD, then the parent process must specify the
__WALL or __WCLONE options when waiting for the child with wait(2)."

Align with this requirement.

For #71828.

Change-Id: I6a6a636c739e4a59abe1533fe429a433e8588939
Reviewed-on: https://go-review.googlesource.com/c/go/+/651415
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Contributor

Change https://go.dev/cl/651495 mentions this issue: [release-branch.go1.23] runtime: use WCLONE when waiting on pidfd test child

@gopherbot
Contributor

Change https://go.dev/cl/651476 mentions this issue: [release-branch.go1.24] runtime: use WCLONE when waiting on pidfd test child

dmitshur added the NeedsFix label and removed the NeedsInvestigation label Feb 26, 2025
dmitshur added this to the Go1.25 milestone Feb 26, 2025
prattmic added the Critical label Feb 26, 2025
gopherbot pushed a commit that referenced this issue Feb 26, 2025
… pidfd

Avoid a spurious SIGCHLD the first time we start a process.

For #71828
Fixes #71848

Change-Id: I744100d21bf6aaaaafc99bc5eec9f9f807a50682
Reviewed-on: https://go-review.googlesource.com/c/go/+/651055
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
gopherbot pushed a commit that referenced this issue Feb 26, 2025
…t child

As of CL 650835, the pidfd test child no longer sends SIGCHLD on exit.
Per clone(2), "If [the child termination] signal is specified as
anything other than SIGCHLD, then the parent process must specify the
__WALL or __WCLONE options when waiting for the child with wait(2)."

Align with this requirement.

For #71848.
For #71828.

Change-Id: I6a6a636c739e4a59abe1533fe429a433e8588939
Reviewed-on: https://go-review.googlesource.com/c/go/+/651415
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
(cherry picked from commit e1e65ae)
Reviewed-on: https://go-review.googlesource.com/c/go/+/651495
gopherbot pushed a commit that referenced this issue Feb 26, 2025
… pidfd

Avoid a spurious SIGCHLD the first time we start a process.

For #71828
Fixes #71849

Change-Id: I744100d21bf6aaaaafc99bc5eec9f9f807a50682
Reviewed-on: https://go-review.googlesource.com/c/go/+/651035
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Feb 26, 2025
…t child

As of CL 650835, the pidfd test child no longer sends SIGCHLD on exit.
Per clone(2), "If [the child termination] signal is specified as
anything other than SIGCHLD, then the parent process must specify the
__WALL or __WCLONE options when waiting for the child with wait(2)."

Align with this requirement.

For #71849.
For #71828.

Change-Id: I6a6a636c739e4a59abe1533fe429a433e8588939
Reviewed-on: https://go-review.googlesource.com/c/go/+/651415
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
(cherry picked from commit e1e65ae)
Reviewed-on: https://go-review.googlesource.com/c/go/+/651476