-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change graceful_shutdown function behavior. #3729
Conversation
Why is this needed? If handshaking is taking a long time, it should still get closed, in my opinion. Regarding the greater motivation, I believe it should be possible to implement a max age timer without any change to hyper. If any changes are needed, it's best to open an issue to discuss and propose a design, before coding happens. |
The problem is that the requests sent during handshake are lost. This can be tested using ghz tool that I used here as a client:
Summary: Response time histogram: Latency distribution: Status code distribution: Error distribution: only 23736 OK responses, instead of having 100000 OK responses like here: PS: i'm open to any design proposal discussion. Thank you! |
The unavailable error is because of graceful shutdown was triggered during the handshake? I'm struggling to make the connection, but perhaps there's a detail I'm not noticing. |
The unavailable error is not a problem. So, If I have sent 100k requests (-n 100000), I expect to receive 100k OK responses like here: Error distribution: We received all 100k responses but some of them (184) got Unavailable error. Error distribution: So, yes I think this is the problem the graceful shutdown was triggered during the handshake. In this case the client will receive a broken pipe error:
when trying to write to that stream and both request and response will be lost. Anyway, I think the best option is to fix somehow also the case when the client of tonic that use this library sets max_connection_age_ms to a small value like 5ms and in that case tonic should not miss any OK responses. Or maybe to not permit to set it below 30s or same minimum default value. |
But this change fixes the problem? It seems this just cancels graceful shutdown if triggered while a connection is in the process of starting. |
Yes, this is what I'm trying to do, to not shutdown the connections that are in handshaking state and to try the shutdown again later, after the handshake completes. That's because, in this case the responses will be lost. Is there any external method to know that the http2 connection is in handshaking state? I'm thinking to change my tonic PR to something like this: if connection.state != HANDSHAKING { |
That doesn't seem to me like what most people would want from this method. Consider: as implemented in this PR, if graceful shutdown is triggered a couple milliseconds after the connection starts, and thus gets ignored because of handshaking, then this connection will stay in an "active" state indefinitely, and then you're much more likely to have active streams that get abruptly interrupted when the final shutdown step just kills everything that hasn't stopped. That can be even worse. I could imagine a solution here: instead of just closing immediately, a flag could be set that as soon as the handshake is done, then it starts the HTTP/2 graceful shutdown. So, literally all requests will get rejected, but as least the client will know from the |
I tried to implement your suggested solution. |
This would be useful to us as we noticed we're closing connections abruptly if the h2 handshake is in progress and we're trying to gracefully shut down the server. |
Hello! Any news on that? Would be very useful while re-deploying services on kubernetes |
Needed for implementing tonic max_connection_age_ms .
Related PRs: