Bug: Graceful shutdown issues #2968
Comments
It's not easy for me to provide a clean reproduction for this, but you could clone this repo: https://github.com/seed-hypermedia/seed and do Doing some very tedious and manual debugging I figured out that it gets stuck in the place I shared previously. |
Can you check whether setting the environment variable GODEBUG="asynctimerchan=1" fixes the issue? It's probably because of golang/go#69312. Alternatively, you can change the Go version in your go.mod to go1.22. |
I think I found a solution for the timer problem (will make pr for pubsub as well):
|
@sukunrt can you point me to the exact timer that could be causing the shutdown issues? |
Confirming that running with |
@vyzo that solution is racy for versions <= go1.22.
When timer.Stop returns false, it doesn't mean the value has been pushed to the channel. It only means that |
ok, fair enough; let's wait for the upstream fix then. |
One is in quic-go: see quic-go/quic-go#4659. I'm sure there are some others in go-libp2p and its dependencies. I'm keeping this issue open. I'll add some text in the next patch release regarding this, and close the issue. |
there is one in pubsub too |
fixed by v0.36.4 |
We recently started facing issues with graceful shutdown in our app. After receiving a termination signal, the app hangs and never exits until it is forcefully shut down.
After spending some time debugging, I've found out that this place in libp2p never returns:
https://github.com/libp2p/go-libp2p/blob/v0.36.3/config/host.go#L28
To clarify, we are using libp2p with AutoRelay, HolePunching, DHT, and other things. The node needs to run for a while before this problem occurs. I suspect that it could be AutoRelay that's causing this, because the problem starts occurring after AutoRelay starts doing periodic relay finding.
So, closableRoutedHost.Close() gets called, but the underlying fx.App's Stop method never returns.