Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

poll_capacity spuriously returns Ready(Some(0)) #628

Closed
vadim-eg opened this issue Jul 28, 2022 · 4 comments · Fixed by #661
Closed

poll_capacity spuriously returns Ready(Some(0)) #628

vadim-eg opened this issue Jul 28, 2022 · 4 comments · Fixed by #661

Comments

@vadim-eg
Copy link
Contributor

vadim-eg commented Jul 28, 2022

Sometimes, under active IO poll_capacity returns Poll(Ready(Some(0))).
In this situation there is not much caller can do to wait for the capacity becomes available.

It appears to be a concurrency/timing issue - my reconstruction of the events hints that it can happen when wakeup is called from one thread while the outer is sending data on a stream - e.g. -

  1. thread A - sender called poll_capacity, got 16K and proceeded to send_data
  2. thread B - connection got a window update and sent out some pending frames. capacity did not increase but wakeup was 3) scheduled and send_capacity_inc set. thread A - sent the data and capacity is now 0, send_capacity_inc set
  3. thread A - poll_capacity returns 0

One solution is not to wakeup from pop_frame if capacity has not changed.
It is not quite clear if this would be a robust solution and this is the only situation - because 'if send_capacity_inc is set - capacity > 0' doesn't appear to be an atomic invariant.

@vadim-eg
Copy link
Contributor Author

Hello! Is there anybody here?
Any thoughts or comments will be appreciated.

@seanmonstar
Copy link
Member

I believe the source code even has a TODO comment about this spurious return. I don't know why it occurs, but would welcome investigation!

@vadim-eg
Copy link
Contributor Author

The summary provides one real scenario how this can happen.
We have been running the code with the fix on a scale since then and we haven't seen any traces of the spurious wake ups - so far so good.
It would be great though to get it upstream to be able to stick to official releases.

vadim-eg added a commit to edgeguard-dev/h2 that referenced this issue Feb 14, 2023
vadim-eg added a commit to vadim-eg/h2 that referenced this issue Feb 17, 2023
Fixes hyperium#628

The previous fix has addressed only a part of the problem - there is
still a possibility for the `poll_capacity` to return
`Ready(Some(0))`.

The root cause - in a nutshell - is the race condition between the
application tasks that performs stream I/O and the task that serves
the underlying HTTP/2 connection. The application thread that is about
to send data calls `reserve_capacity/poll_capacity`, is provided
with some send capacity and proceeds to `send_data`.

Meanwhile the service thread may send some buffered data and/or
receive some window updates - either way the stream's effective
allocated send capacity may not change, but, since the capacity still
available, `send_capacity_inc` flag may be set.

The sending task calls `send_data` and uses the entire allocated
capacity, leaving the flag set. Next time `poll_capacity` returns
`Ready(Some(0))`.

This change sets the flag and dispatches the wakeup event only in
cases when the effective capacity reported by `poll_capacity` actually
increases.
vadim-eg added a commit to vadim-eg/h2 that referenced this issue Feb 17, 2023
Fixes hyperium#628

Sometimes `poll_capacity` returns `Ready(Some(0))` - in which case
caller will have no way to wait for the stream capacity to become available.
The previous attempt on the fix has addressed only a part of the problem.

The root cause - in a nutshell - is the race condition between the
application tasks that performs stream I/O and the task that serves
the underlying HTTP/2 connection. The application thread that is about
to send data calls `reserve_capacity/poll_capacity`, is provided
with some send capacity and proceeds to `send_data`.

Meanwhile the service thread may send some buffered data and/or
receive some window updates - either way the stream's effective
allocated send capacity may not change, but, since the capacity still
available, `send_capacity_inc` flag may be set.

The sending task calls `send_data` and uses the entire allocated
capacity, leaving the flag set. Next time `poll_capacity` returns
`Ready(Some(0))`.

This change sets the flag and dispatches the wakeup event only in
cases when the effective capacity reported by `poll_capacity` actually
increases.
@vadim-eg
Copy link
Contributor Author

I have opened a new pr with a more comprehensive change and UT that demonstrates the issue.

vadim-eg added a commit to vadim-eg/h2 that referenced this issue Feb 17, 2023
Fixes hyperium#628

Sometimes `poll_capacity` returns `Ready(Some(0))` - in which case
caller will have no way to wait for the stream capacity to become available.
The previous attempt on the fix has addressed only a part of the problem.

The root cause - in a nutshell - is the race condition between the
application tasks that performs stream I/O and the task that serves
the underlying HTTP/2 connection. The application thread that is about
to send data calls `reserve_capacity/poll_capacity`, is provided
with some send capacity and proceeds to `send_data`.

Meanwhile the service thread may send some buffered data and/or
receive some window updates - either way the stream's effective
allocated send capacity may not change, but, since the capacity still
available, `send_capacity_inc` flag may be set.

The sending task calls `send_data` and uses the entire allocated
capacity, leaving the flag set. Next time `poll_capacity` returns
`Ready(Some(0))`.

This change sets the flag and dispatches the wakeup event only in
cases when the effective capacity reported by `poll_capacity` actually
increases.
seanmonstar pushed a commit that referenced this issue Feb 20, 2023
Fixes #628

Sometimes `poll_capacity` returns `Ready(Some(0))` - in which case
caller will have no way to wait for the stream capacity to become available.
The previous attempt on the fix has addressed only a part of the problem.

The root cause - in a nutshell - is the race condition between the
application tasks that performs stream I/O and the task that serves
the underlying HTTP/2 connection. The application thread that is about
to send data calls `reserve_capacity/poll_capacity`, is provided
with some send capacity and proceeds to `send_data`.

Meanwhile the service thread may send some buffered data and/or
receive some window updates - either way the stream's effective
allocated send capacity may not change, but, since the capacity still
available, `send_capacity_inc` flag may be set.

The sending task calls `send_data` and uses the entire allocated
capacity, leaving the flag set. Next time `poll_capacity` returns
`Ready(Some(0))`.

This change sets the flag and dispatches the wakeup event only in
cases when the effective capacity reported by `poll_capacity` actually
increases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants