Closing connection takes up to 15 minutes. #7314
Comments
Could you tell us which version you used before, where you were seeing a different behavior? Are you actually calling [...]? 15 minutes is the time it takes with typical Linux default settings to detect silent connection drops, unless gRPC keepalives are enabled. So the behavior you are seeing is expected if there are outstanding RPCs (more details e.g. in https://www.evanjones.ca/tcp-connection-timeouts.html). It's been like this forever, I think, so I suspect you're talking about something different?
(OK, I see in your goroutine capture that you are closing the channel.)
Thanks for your reply. We used to be on 1.62.1 and I don't think the [...]
OK, thank you for the detailed report. The problem can occur when a call to [...]. When we attempt to flush the control buffer in the call to [...]. One option to give loopy some time to return, without waiting until the operating system fails the socket, would be to set a short deadline on the transport connection via [...]
Quoting from https://www.rfc-editor.org/rfc/rfc9113.html#name-connection-error-handling:
"An endpoint can end a connection at any time. In particular, an endpoint MAY choose to treat a stream error as a connection error. Endpoints SHOULD send a GOAWAY frame when ending a connection, providing that circumstances permit it."
Maybe our GOAWAY sending on close should be best effort, but I'm not sure if there is a way to do that without setting a write deadline on the underlying conn.
What version of gRPC are you using?
1.64.0
What version of Go are you using (go version)?
go1.22.3
What operating system (Linux, Windows, …) and version?
Linux
What did you do?
After upgrading to gRPC 1.64, closing connections started to take a very long time, specifically around 15 minutes. This happens when the server side abruptly goes away and the TCP connection breaks on one end.
I captured a goroutine profile while the application was trying to close one such connection. What stands out are the following two stacks:
and
I believe the (potential) bug was introduced on these lines where the client tries to send a GOAWAY packet to the server before closing the connection. In case the connection is half-closed, the call hangs for 15 minutes, which is the default timeout for net.(*conn).Write.
What did you expect to see?
I would expect there to be a more reasonable timeout for closing a connection, or perhaps a way to control the timeout by the client.
What did you see instead?
Long time to close connections.