Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2 requests waiting an outstanding connecting task should not be canceled when the connecting task is canceled #3199

Open
arnauorriols opened this issue Apr 11, 2023 · 5 comments
Labels
C-bug Category: bug. Something is wrong. This is bad!

Comments

@arnauorriols
Copy link

Version
0.14.25. I guess it applies also to the pool in hyper-util, but haven't really checked.

Platform
Darwin Kernel Version 21.4.0: Mon Feb 21 20:35:58 PST 2022; root:xnu-8020.101.4~2/RELEASE_ARM64_T6000 arm64

Description

When there are a bunch of HTTP2 requests waiting for a connection from the pool and there is already a connecting task, if the connecting task is dropped before the waiters receive the new connection we drop all the waiter senders, resulting in the cancelation of the awaiting requests, arguing that the connecting attempt failed. However, this also happens if the connecting attempt is cancelled by the user (ie the ResponseFuture is dropped) and not technically failed.

I've written this test that reproduces (what I consider) the bug:

    #[tokio::test]
    async fn http2_connection_waiters_are_not_canceled_when_outstanding_connecting_task_is_canceled() {
        let _ = pretty_env_logger::try_init();

        let server = TcpListener::bind("127.0.0.1:0").unwrap();
        let addr = server.local_addr().unwrap();

        #[derive(Clone)]
        struct SlowConnector;

        impl hyper::service::Service<Uri> for SlowConnector {
            type Response = TcpStream;
            type Error = hyper::Error;
            type Future = future::Pending<Result<TcpStream, hyper::Error>>;

            fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
                Poll::Ready(Ok(()))
            }

            fn call(&mut self, _dst: Uri) -> Self::Future {
                // An slow connector...
                future::pending()
            }
        }

        let client = Client::builder().http2_only(true).build(SlowConnector);

        // This first request starts the connecting task 
        let req = Request::builder()
            .uri(&*format!("http://{}/a", addr))
            .body(Body::empty())
            .unwrap();
        let mut res1 = client.request(req).map(|_| unreachable!());

        // This second request waits for the conencting task to finish
        // instead of starting a new connecting task
        let req = Request::builder()
            .uri(&*format!("http://{}/a", addr))
            .body(Body::empty())
            .unwrap();
        let mut res2 = client
            .request(req)
            // Test fails here with
            // thread 'dispatch_impl::http2_connection_waiters_are_not_canceled_when_outstanding_connecting_task_is_canceled'
            // panicked at 'internal error: entered unreachable code: res2 should had never resolved, but resulted in
            // Err(hyper::Error(Canceled, "request has been canceled"))', tests/client.rs:1398:22
            .map(|r| unreachable!("res2 should had never resolved, but resulted in {:?}", r));

        // Prime the requests
        assert!(
            future::poll_fn(|ctx| Poll::Ready(Pin::new(&mut res1).poll(ctx).is_pending())).await
        );
        assert!(
            future::poll_fn(|ctx| Poll::Ready(Pin::new(&mut res2).poll(ctx).is_pending())).await
        );

        // The `ResponseFuture` that drives the connecting task is dropped, and the connecting task is canceled
        drop(res1);

        // The request that was waiting for the connecting task fails with Error::Canceled
        assert!(
            future::poll_fn(|ctx| Poll::Ready(Pin::new(&mut res2).poll(ctx).is_pending())).await
        );
    }

Thoughts? I'm up to work on a solution if we agree on the assessment.

@arnauorriols arnauorriols added the C-bug Category: bug. Something is wrong. This is bad! label Apr 11, 2023
@seanmonstar
Copy link
Member

Oh, thanks so much for the code, that helped me understand the subtlety you were getting at. It seems like the other pending slots should "take over" the connecting task if the first one is given up. Does that sound right?

@arnauorriols
Copy link
Author

It seems like the other pending slots should "take over" the connecting task if the first one is given up. Does that sound right?

Yes, that should do it.

@seanmonstar
Copy link
Member

Do you want to try to take that on? I imagine a unit test wouldn't be that hard to adapt from the code you included here. You could probably make use of tokio::sync::watch to be able to share a receiver, and then only resolve on it after you've pinged and dropped the first response future.

@arnauorriols
Copy link
Author

arnauorriols commented Apr 13, 2023

Sure! To release branch 0.14.x right? What about hyper-util? Should I open a PR there as well?

@seanmonstar
Copy link
Member

If you want to use it with 0.14, Yea. If you want to use in 1.0, something similar in util would be needed.

arnauorriols added a commit to arnauorriols/hyper that referenced this issue Apr 18, 2023
…is canceled

In the previous implementation the `connect_to` future was spawned to let it finish when the checkout future won the race. However, in the event of the user dropping the `ResponseFuture`, the `connect_to` future was canceled as well, and all the http2 requests waiting for the connection would error as a result. This commit prevents that
by spawning the `connect_to` future invariably (once it starts) thus
covering both cases.

Fixes hyperium#3199
arnauorriols added a commit to arnauorriols/hyper that referenced this issue Apr 18, 2023
…is canceled

In the previous implementation the `connect_to` future was spawned to let it finish when the checkout future won the race. However, in the event of the user dropping the `ResponseFuture`, the `connect_to` future was canceled as well, and all the http2 requests waiting for the connection would error as a result. This commit prevents that by spawning the `connect_to` future invariably (once it starts) thus covering both cases.

Fixes hyperium#3199
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug. Something is wrong. This is bad!
Projects
None yet
Development

No branches or pull requests

2 participants