runtime/cgo: scheduler blocked with goroutines #8636
When a cgo call is started, the thread blocks waiting for the call to complete. There is a system monitor thread that looks for threads that have been blocked in cgo for more than 20 µs. When it finds such a thread, and there is other work to do, it spins up a new thread to run new goroutines. The effect is that you can get very different profiling results depending on how long your cgo calls run. Could that be the issue here? If not, please try to provide a complete standalone example that shows the problem. Thanks.

Labels changed: added repo-main.
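For illustration (this sketch is mine, not part of the thread), the extra threads described above can be observed with the runtime's threadcreate profile: goroutines that block in a cgo call for longer than that threshold cause the runtime to start new OS threads so that other goroutines can keep running.

package main

// #include <unistd.h>
import "C"

import (
    "fmt"
    "runtime/pprof"
    "time"
)

func main() {
    before := pprof.Lookup("threadcreate").Count()

    // Park several goroutines inside a blocking C call that lasts far
    // longer than the monitor's threshold.
    for i := 0; i < 8; i++ {
        go func() {
            C.usleep(100 * 1000) // block in C for ~100 ms
        }()
    }

    time.Sleep(200 * time.Millisecond)

    after := pprof.Lookup("threadcreate").Count()
    fmt.Println("OS threads created while goroutines were blocked in cgo:", after-before)
}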
Hi, here, finally, is what I believe shows the problem:

package main

// #include <unistd.h>
import "C"

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "runtime"
    "sync"
    "time"
)

const size = 10000

func main() {
    runtime.GOMAXPROCS(0)
    started := time.Now()

    go func() {
        fmt.Println("pprof enabled on port 6060")
        err := http.ListenAndServe(":6060", nil)
        if err != nil {
            panic(err)
        }
    }()

    // prepare communication channel
    urls := make(chan string, size)
    wg := new(sync.WaitGroup)

    // fire a bunch of workers
    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            for _ = range urls {
                // time.Sleep(1 * time.Millisecond) // Go sleep, 1 ms
                C.usleep(1000) // C sleep, 1000 µs = 1 ms
            }
        }(i)
    }

    for i := 0; i < size; i++ {
        urls <- "test"
    }

    // wait
    close(urls)
    wg.Wait()

    fmt.Println("done in", time.Since(started))
}

I believe the final result should be around 1s, right? With the C call it instead takes 12.860826382s on my Mac. The Go version of the sleep correctly shows 1.621176016s. If I increase the C sleep to 10 ms (10*1000), I get the same result as the Go counterpart: 15.092162142s.

> why set GOMAXPROCS to 0?

I believe it to be the same as GOMAXPROCS=runtime.NumCPU(); I saw that it's a common pattern in Go.
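A side note on the question above (my sketch, not part of the report): runtime.GOMAXPROCS(0) does not set GOMAXPROCS to runtime.NumCPU(). An argument below 1 leaves the setting unchanged and only returns the current value, so with the Go releases of that time the program above still ran with the default GOMAXPROCS of 1, which is also what the schedtrace dump further down shows.

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // n < 1 does not change the setting; it only reports the current value.
    fmt.Println("current GOMAXPROCS:", runtime.GOMAXPROCS(0))

    // To actually use all CPUs, the value has to be set explicitly
    // (or via the GOMAXPROCS environment variable).
    prev := runtime.GOMAXPROCS(runtime.NumCPU())
    fmt.Println("was", prev, "- now", runtime.GOMAXPROCS(0))
}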
With go version devel +9cc3e5afb5d6 Thu Sep 04 11:34:01 2014 +0400 linux/amd64, on a 32-core linux/amd64 machine:

without sleep:

$ GOMAXPROCS=1 go run /tmp/sched.go
done in 914.7684ms
$ GOMAXPROCS=32 go run /tmp/sched.go
done in 337.801592ms

with sleep:

$ go run /tmp/sched.go
done in 989.857984ms
$ GOMAXPROCS=32 go run /tmp/sched.go
done in 687.407522ms

Everything seems fine to me. What Go version do you use? What OS? What hardware? What is the value of the GOMAXPROCS env var?
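To answer those questions quickly, a small helper like the following (my sketch, not part of the thread) prints the toolchain version, platform, and effective GOMAXPROCS of the running program:

package main

import (
    "fmt"
    "os"
    "runtime"
)

func main() {
    fmt.Println("go version: ", runtime.Version())
    fmt.Println("platform:   ", runtime.GOOS+"/"+runtime.GOARCH)
    fmt.Println("NumCPU:     ", runtime.NumCPU())
    fmt.Println("GOMAXPROCS env var:  ", os.Getenv("GOMAXPROCS"))
    fmt.Println("GOMAXPROCS effective:", runtime.GOMAXPROCS(0))
}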
Another update: my initial guess was that "pprof" was the cause, and removing it does give me back the 1.5s (consistently). However, if I import only "net/http", it consistently takes 12s again. Here is the code:

package main

// #include <unistd.h>
import "C"

import (
    "fmt"
    _ "net/http"
    "runtime"
    "sync"
    "time"
)

const size = 10000

func main() {
    started := time.Now()

    // prepare communication channel
    urls := make(chan string, size)
    wg := new(sync.WaitGroup)

    // fire a bunch of workers
    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            for _ = range urls {
                // time.Sleep(1 * time.Millisecond) // Go sleep, 1 ms
                C.usleep(1000) // C sleep, 1000 µs = 1 ms
            }
        }(i)
    }

    for i := 0; i < size; i++ {
        urls <- "test"
    }

    // wait
    close(urls)
    wg.Wait()

    fmt.Println("done in", time.Since(started))
}

I don't know if it's related, but there is probably some package I use that triggers this kind of issue (on a larger scale) in production.
Here is a dump with scheduler tracing enabled:

tmp $ GOTRACEBACK=crash GODEBUG=schedtrace=1000 go run test.go
SCHED 0ms: gomaxprocs=1 idleprocs=0 threads=2 idlethreads=0 runqueue=0 [1]
# command-line-arguments
SCHED 0ms: gomaxprocs=1 idleprocs=0 threads=2 idlethreads=0 runqueue=0 [1]
# command-line-arguments
SCHED 0ms: gomaxprocs=1 idleprocs=0 threads=2 idlethreads=0 runqueue=0 [1]
SCHED 0ms: gomaxprocs=1 idleprocs=0 threads=2 idlethreads=0 runqueue=0 [1]
SCHED 1006ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 1009ms: gomaxprocs=1 idleprocs=0 threads=5 idlethreads=0 runqueue=3 [5]
SCHED 2012ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 2010ms: gomaxprocs=1 idleprocs=0 threads=5 idlethreads=0 runqueue=5 [3]
SCHED 3022ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 3022ms: gomaxprocs=1 idleprocs=0 threads=5 idlethreads=0 runqueue=6 [2]
SCHED 4024ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 4030ms: gomaxprocs=1 idleprocs=0 threads=5 idlethreads=0 runqueue=8 [0]
SCHED 5032ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 5038ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=0 [7]
SCHED 6039ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 6049ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=1 [6]
SCHED 7044ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 7059ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=1 [6]
SCHED 8046ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 8060ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=2 [5]
SCHED 9058ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 9065ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=3 [4]
SCHED 10065ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 10077ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=1 runqueue=3 [4]
SCHED 11067ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 11081ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=0 runqueue=4 [3]
SCHED 12074ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
SCHED 12086ms: gomaxprocs=1 idleprocs=0 threads=6 idlethreads=0 runqueue=6 [1]
SCHED 13084ms: gomaxprocs=1 idleprocs=1 threads=6 idlethreads=2 runqueue=0 [0]
done in 12.857744792s
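For reference (my annotation, not part of the original dump), the fields of a schedtrace line can be read roughly as follows:

SCHED 1009ms:    // milliseconds since program start
  gomaxprocs=1   // number of Ps (the GOMAXPROCS setting)
  idleprocs=0    // Ps with nothing to run
  threads=5      // OS threads created by the runtime
  idlethreads=0  // OS threads currently idle
  runqueue=3     // goroutines waiting in the global run queue
  [5]            // goroutines in each P's local run queue

Note that the dump shows gomaxprocs=1, i.e. the whole run used a single P.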
Assigning to Dmitriy to see if there's anything to do here or else close. |
I don't see anything actionable here. |