-
Notifications
You must be signed in to change notification settings - Fork 844
HTTP2 drain #1710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP2 drain #1710
Conversation
proxy/http2/Http2ConnectionState.cc
Outdated
| ua_session->destroy(); | ||
| ua_session = nullptr; | ||
| } else if (total_client_streams_count == 0 && http2_drain && ua_session && stream) { | ||
| send_goaway_frame(stream->get_id(), Http2ErrorCode::HTTP2_ERROR_NO_ERROR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the GOAWAY frame to notify Graceful Shutdown to client? If so the stream id should be 2^31 - 1.
RFC says below in section 6.8.
A server that is attempting to gracefully shut down a connection SHOULD send an initial GOAWAY frame with the last stream identifier set to 231-1 and a NO_ERROR code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be 2 GOWAY frames sent during graceful shutdown. @zizhong it seems you are only doing the 2nd one.
- stream id - 2^31 - 1 - This signals the client that we are shutting down so that it should stop sending us more streams
- After allowing time for any in-flight stream creation (at least one round-trip time), the server can send another GOAWAY frame with an updated last stream identifier. This ensures that a connection can be cleanly shut down without losing requests.
|
Thanks for the comments. @masaori335 @sidhuagarwal I have a more detailed plan as follows: ShutDownState {
|
484f2af to
3630d47
Compare
|
@masaori335 @sidhuagarwal Updated the rb according to the design above. Test result with chrome: Issues not addressed in this commit includes:
|
f7d56a3 to
40c97ff
Compare
|
@masaori335 could you review this? Do you agree we should have different behaviors of the first GOAWAY and the second GOAWAY? |
|
@zizhong Yes, I agree with sending 2 GOAWAY frames to client. I'll take a look. |
|
[approve ci] |
|
clang format successful! https://ci.trafficserver.apache.org/job/clang-format-github/288/ |
|
RAT check successful! https://ci.trafficserver.apache.org/job/RAT-github/302/ |
|
AU check failed! https://ci.trafficserver.apache.org/job/autest-github/284/ |
|
FreeBSD11 build successful! https://ci.trafficserver.apache.org/job/freebsd-github/1983/ |
|
Intel CC build successful! https://ci.trafficserver.apache.org/job/icc-github/414/ |
|
Linux build successful! https://ci.trafficserver.apache.org/job/linux-github/1875/ |
|
clang-analyzer build failed! https://ci.trafficserver.apache.org/job/clang-analyzer-github/547/ |
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the configuration name should be renamed at least because it's a sort of interface. I don't want to change it frequently.
As for implementation, I'm OK with it. It can be refactored later.
mgmt/RecordsConfig.cc
Outdated
| // ########### | ||
| {RECT_CONFIG, "proxy.config.http.connect_ports", RECD_STRING, "443", RECU_DYNAMIC, RR_NULL, RECC_STR, "^(\\*|[[:digit:][:space:]]+)$", RECA_NULL} | ||
| , | ||
| {RECT_CONFIG, "proxy.config.http.http2_drain_timeout", RECD_INT, "0", RECU_RESTART_TS, RR_NULL, RECC_STR, "^[0-9]+$", RECA_NULL} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be RECU_DYNAMIC?
Also, could you make it more general name or move it under "proxy.config.http2"?
proxy/Main.cc
Outdated
| signal_crash_handler(signo, info, ctx); | ||
| } | ||
|
|
||
| if (http2_drain_timeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fun of adding http2_something into Main.cc. I think graceful shutdown is not HTTP2 specific and there should be something we can do on other protocols. Could you make it more general?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making it more general is a better idea. Thanks!
40c97ff to
f2b25e6
Compare
|
@maskit I updated the PR and docs. Can you take a look again? |
| :reloadable: | ||
|
|
||
| This setting specifies the number of active client connections | ||
| for use by :option:`traffic_ctl server restart --drain`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be great if TS can drain traffic without restart or stop.
In our use case, we want to break it down to some phases like 1) drain traffic, 2) check stats, and 3) restart or stop server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. you mean exposing an API to initiate the shutdown? @mlibbey also gave the same idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That is helpful for us:)
|
(frequently because we want to do something between drain and restart -- like install a new version of ATS). |
f2b25e6 to
952d9df
Compare
|
@zizhong If I understand correctly, after setting the timeout, it starts graceful shutdown when ATS receive some signal, right? I'd prefer small commits. I think adding support for |
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found an issue that new sessions open during waiting shutdown timeout get unexpected data. It expects HTTP2 preface but it receive GOAWAY with stream id 2^32-1 instead.
b9dc101 to
8bf7015
Compare
8bf7015 to
0394a1f
Compare
|
@maskit when shutting down ATS, I think we need to stop accepting new connections. What is your idea? |
|
@zizhong I agree. ATS should close the listening sockets. I'm not sure it can be made easily. If it's difficult or the change would be big, then I'm fine with just sending GOAWAY frames after H2 preface (and maybe SETTINGS frames?). We can make it better later. |
|
Currently, I added a check in |
|
Ah, I didn't have that idea. Sounds good, it has a little overhead though. Also, since we used that only for ACL, it may affect some stats values. |
|
@maskit I already updated it in the PR. Can you review it again? |
|
Oh, I just found it. Sure. [approve ci] |
|
clang format successful! https://ci.trafficserver.apache.org/job/clang-format-github/405/ |
|
RAT check successful! https://ci.trafficserver.apache.org/job/RAT-github/416/ |
|
FreeBSD11 build successful! https://ci.trafficserver.apache.org/job/freebsd-github/2099/ |
|
Intel CC build successful! https://ci.trafficserver.apache.org/job/icc-github/528/ |
|
Linux build successful! https://ci.trafficserver.apache.org/job/linux-github/1993/ |
|
AU check successful! https://ci.trafficserver.apache.org/job/autest-github/401/ |
|
clang-analyzer build successful! https://ci.trafficserver.apache.org/job/clang-analyzer-github/661/ |
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!
|
Changed milestone to 8.0.0, because 7.1.0 doesn't have this. |
This is trying to fix issue #1693.
The idea is adding a timeout when we stop ATS. During the time out, we send the GOAWAY frame when the last stream on the connection has been released. It's to make sure streams with stream_id no less than
last_stream_idhas been processed. Ideally, we can send GOAWAY while other streams going on. However, currently, ATS lacks the mechanism that after GOAWAY was sent, other streams continue to be processed.This PR requires PR #1704 so that the client can receive GOAWAY correctly.
I understand this commit can be improved in many ways. Any ideas would be appreciated,