-
Notifications
You must be signed in to change notification settings - Fork 844
Fix crash when H2 client does not set End-of-data bit #6413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
[approve ci autest] |
masaori335
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need #6408 too, right?
| // Completed successfully | ||
| c->write_success = true; | ||
| c->write_success = true; | ||
| server_entry->in_tunnel = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this single line change will fix the crash. However, calling release_server_session() before setting server_session = nullptr make sense.
masaori335
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The crash looks gone with this patch, but this doesn't fire up Http2Stream to send 408.
[Feb 12 12:24:29.922] [ET_NET 1] DEBUG: <HttpTunnel.cc:1299 (consumer_handler)> (http_tunnel) [1] consumer_handler [http server post VC_EVENT_WRITE_COMPLETE/TS_EVENT_VCONN_WRITE_COMPLETE]
[Feb 12 12:24:29.922] [ET_NET 1] DEBUG: <HttpSM.cc:3600 (tunnel_handler_post_server)> (http) [1] [&HttpSM::tunnel_handler_post_server, VC_EVENT_WRITE_COMPLETE/TS_EVENT_VCONN_WRITE_COMPLETE]
[Feb 12 12:24:35.510] [ET_NET 1] DEBUG: <HttpSM.cc:2631 (main_handler)> (http) [1] [HttpSM::main_handler, VC_EVENT_ACTIVE_TIMEOUT/TS_EVENT_VCONN_ACTIVE_TIMEOUT]
[Feb 12 12:24:35.510] [ET_NET 1] DEBUG: <HttpSM.cc:2051 (state_send_server_request_header)> (http) [1] [&HttpSM::state_send_server_request_header, VC_EVENT_ACTIVE_TIMEOUT/TS_EVENT_VCONN_ACTIVE_TIMEOUT]
[Feb 12 12:24:35.510] [ET_NET 1] DEBUG: <HttpSM.cc:5496 (handle_server_setup_error)> (http) [1] [&HttpSM::handle_server_setup_error, VC_EVENT_ACTIVE_TIMEOUT/TS_EVENT_VCONN_ACTIVE_TIMEOUT]
[Feb 12 12:24:35.510] [ET_NET 1] DEBUG: <HttpSM.cc:5506 (handle_server_setup_error)> (http) [1] [handle_server_setup_error] forwarding event VC_EVENT_ACTIVE_TIMEOUT/TS_EVENT_VCONN_ACTIVE_TIMEOUT to post tunnel
|
If I revert all changes of HTTP/2, I could get 408 and no crash. Updating |
I commented on #6404, I found that ATS ignore "pure END_STREAM" (DATA frame with END_STREAM & length = 0) and VC_EVENT_READ_COMPLETE is not signaled. It's possible that |
07d2d8e to
20e0423
Compare
|
Updated to address @masaori335 comments. Tested with PR #6404 successfully. Updating issue #6313 with details. |
|
[approve ci autest] |
masaori335
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Tested by nghttp with --trailer, no crash.
|
Cherry-picked to v9.0.x branch. |
Yet another solution to issue #6313.
Unlike PR #6407 and #6401, this PR does not update the H2 logic to send a READ_COMPLETE when the end-of-data bit is set or when the read_vio.ntodo() is 0. Rather if the user agent never sends the end-of-data bit, the user agent will eventually send a EOS or get timed out and this PR adds logic in the HttpSM to cleanup the server_session without crashing.
While this PR is more spec compliant, our organization may still go with PR #6407 in the short term anyway because we do have user agents that send POST data but do not set the end-of-data bit. While running with this patch on a production box, the number of H2 POST ERR_CLIENT_ABORTS increased by an order of magnitude over a 15 minute logging interval. For us tracking down the appropriate app dev teams takes some time, and getting clients to update apps takes even longer.
You'll noticed in this PR, that I added a boolean return value to Http2Stream::update_read. The idea was that it would return false if ndone > nbytes. Then the caller could issue a stream error because the client was breaking spec by sending more data than indicated by the content-length. However, in many cases nbytes is 0 because after the request header is processed, the read_vio nbytes is set to ndone which at that point is 0. For about an hour of panicked investigation, I thought that @masaori335's post optimization had broken things if the data frame arrived before the Tunnel's do_io_read was all set up. But in fact all is well. The logic in HttpSM::do_setup_post_tunnel will copy extra data from the end of the ua_buffer (which is the _request_buffer for Http2Stream) into the tunnel's buffer, so all is well there anyway.