runtime: signal race condition #14571

Closed
@Tasssadar

Description

1. What version of Go are you using (go version)?
2. What operating system and processor architecture are you using (go env)?
go version go1.6 linux/amd64

3. What did you do?
I have a summoner process with a large number of workers. When a worker is disconnected from the server, it exits its main loop and calls signal.Stop() on the channel it had previously registered for SIGINT with signal.Notify(). At the same time, the summoner attempts to kill it with proc.Signal(os.Interrupt).

Sometimes the interrupt signal gets lost: it neither arrives on the channel nor terminates the program. Below is an example that reproduces the race. Run it over and over and you will soon see some "missed" lines:

while true; do go run signaltest.go; done

signaltest.go:

package main

import (
    "fmt"
    "os"
    "os/signal"
    "time"
)

func connection(disconnected <-chan bool) bool {
    in := make(chan os.Signal, 2)
    signal.Notify(in, os.Interrupt)
    defer signal.Stop(in)

    for {
        select {
        case <-in:
            return true
        case <-disconnected:
            goto check_signal
        }
    }

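check_signal:
    // Stop relaying SIGINT to in, then check whether one was already delivered.
    signal.Stop(in)

    select {
    case <-in:
        return true
    default:
        // No signal was buffered on in.
        return false
    }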
}

func killer(disconnected chan<- bool) {
    proc, _ := os.FindProcess(os.Getpid())
    time.Sleep(100*time.Millisecond)
    disconnected <- true
    proc.Signal(os.Interrupt)
}

func main() {
    disconnected := make(chan bool)

    for i := 0; true; i++ {
        fmt.Printf("%d: start\n", i)

        go killer(disconnected)

        if connection(disconnected) {
            fmt.Printf("  %d: got signal\n", i)
        } else {
            fmt.Printf("  %d: missed\n", i)
        }
        time.Sleep(500*time.Millisecond)
    }
}

4. What did you expect to see?
I expected the signal not to get lost. The documentation only states that "When Stop returns, it is guaranteed that c will receive no more signals.", so I'm not sure whether this is the right expectation. If it isn't, the fix is fairly easy: just keep the signal channel registered the whole time.

5. What did you see instead?
Some of the signals get lost.

Labels: FrozenDueToAge, NeedsFix (the path to resolution is known, but the work has not been done)