Issues with client termination of H2 CONNECT streams #3652
That is surprising. cc @nox, who worked on this previously.
Ah, I think I got things a little mixed up. In #3647, I sent a test and a fix.
That being said, I do still see the issue in our application despite dropping everything; I am just unable to reproduce it in a unit test.
Ok I think I figured it out and put up two PRs:
See hyperium/hyper#3652. What I have found is that the final reference to a stream being dropped after `maybe_close_connection_if_no_streams` but before `inner.poll()` completes can leave the connection dangling forever without making any forward progress. No streams/references are alive, but the connection is not complete and never wakes up again. This seems like a classic TOCTOU race condition. In this fix, I check again at the end of poll and, if this state is detected, wake up the task again.

With the test in hyperium/hyper#3655, on my machine, it fails about 5% of the time:

```
1876 runs so far, 100 failures (94.94% pass rate). 95.197349ms avg, 1.097347435s max, 5.398457ms min
```

With that PR, the test is 100% reliable:

```
64010 runs so far, 0 failures (100.00% pass rate). 44.484057ms avg, 121.454709ms max, 1.872657ms min
```

Note: we have also reproduced this using `h2` directly, outside of `hyper`, which is what gives me confidence this issue lies in `h2` and not `hyper`.
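To make the shape of the race concrete, here is a minimal, hypothetical sketch of a connection-driver future using the check/poll/re-check-and-wake ordering described above. The struct and field names are invented for illustration and are not the actual `h2` source:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll};

// Hypothetical stand-in for the h2 connection driver.
struct ConnDriver {
    // Count of live stream handles, shared with user-facing handles.
    stream_refs: Arc<AtomicUsize>,
}

impl Future for ConnDriver {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();

        // Check: if no stream handles remain, close the connection.
        if this.stream_refs.load(Ordering::SeqCst) == 0 {
            return Poll::Ready(());
        }

        // ... drive connection I/O here (`inner.poll()` in h2) ...
        // The *last* stream handle may be dropped concurrently at this
        // point, after the check above has already passed (the TOCTOU).

        // Re-check after polling: if the state changed mid-poll, self-wake
        // so the close branch above runs on the next poll, instead of the
        // task sleeping forever with no one left to wake it.
        if this.stream_refs.load(Ordering::SeqCst) == 0 {
            cx.waker().wake_by_ref();
        }
        Poll::Pending
    }
}
```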
Version
Hyper 1.3
Platform
Description
We are seeing a few issues around using H2 CONNECT. Our application uses Tokio, Rustls, and Hyper to communicate between a client and a server built on the same stack. We sometimes multiplex multiple streams over one TCP connection (though the issue occurs regardless of multiplexing).
The first issue that popped up was leaked connections. This was debugged, and an (attempted) fix was opened in #3647; more details, including why this likely went undetected before, are there.
With that fix, everything was working fine. However, a later refactor in our application broke things again: we started dropping the `SendRequest` before we were finished with IO operations on `Upgraded` (before, we dropped the `SendRequest` after). This causes the stream/connection to be closed unexpectedly. It can be reproduced easily by moving the `drop(client)` in #3647 to before the write/read operations. This is easy to work around, but it's not documented, and @seanmonstar suggested on Discord that this was not expected.
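For clarity, a hedged sketch of that ordering using hyper 1.x's `client::conn::http2` API. The address, the tunneled bytes, and the use of the `hyper-util` and `http-body-util` helper crates are assumptions for illustration, not taken from the issue or #3647:

```rust
use http_body_util::Empty;
use hyper::body::Bytes;
use hyper::client::conn::http2;
use hyper::Request;
use hyper_util::rt::{TokioExecutor, TokioIo};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn connect_tunnel() -> Result<(), Box<dyn std::error::Error>> {
    let tcp = TcpStream::connect("127.0.0.1:8080").await?;
    let (mut client, conn) =
        http2::handshake(TokioExecutor::new(), TokioIo::new(tcp)).await?;
    // Drive the connection on its own task.
    tokio::spawn(conn);

    // H2 CONNECT request; the stream becomes a byte tunnel on success.
    let req = Request::connect("example.internal:443").body(Empty::<Bytes>::new())?;
    let res = client.send_request(req).await?;
    let mut upgraded = TokioIo::new(hyper::upgrade::on(res).await?);

    // Broken ordering (what our refactor did): dropping the SendRequest
    // here closes the stream before the tunneled I/O below completes.
    // drop(client);

    upgraded.write_all(b"ping").await?;
    let mut buf = [0u8; 4];
    upgraded.read_exact(&mut buf).await?;

    // Working ordering: keep the SendRequest alive until the I/O is done.
    drop(client);
    Ok(())
}
```

Keeping the `SendRequest` alive for the lifetime of the tunnel is the easy workaround mentioned above.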