
[HTTP/3] Optimize reading from Quic stream #58445

Closed
CarnaViire opened this issue Aug 31, 2021 · 4 comments
Labels
area-System.Net.Http enhancement Product code improvement that does NOT require public API changes/additions tenet-performance Performance related issue
Milestone

Comments

@CarnaViire
Member

Reads don't leverage buffering, and most of the time they read one data frame (or less) at a time, even if more data is available in the transport. Reads can be optimized the way chunked encoding is implemented for HTTP/1.1; see the algorithm below.

From #58296 (comment)

The logic here should be similar to how we handle chunked encoding for HTTP/1.1. See here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ChunkedEncodingReadStream.cs#L38

I'm not sure why it was implemented so differently than chunked encoding in the first place.

In other words, we should do something like this:
(1) Read as much data out of our read buffer as we can. If we can read at least one byte, or we hit EOF, we are done and return to the caller. (This is the ReadChunksFromConnectionBuffer logic in the chunked encoding code.)
(2) At this point, we need to read more data. In certain cases it makes sense to avoid buffering the data ourselves and just read directly into the user's buffer. If so, do the read and (assuming it doesn't fail or return unexpected EOF) return.
(3) Otherwise, we need to read into the buffer. Do the read into the buffer and try to get data out of the buffer again (as in step 1, i.e. call the equivalent of ReadChunksFromConnectionBuffer). If we can read at least one byte, or we hit EOF, we are done.
(4) Otherwise, we must not have read enough data (e.g. we just got a partial frame header). Loop back to step 3 and repeat until we either get some data or hit EOF.

(Note that the chunked encoding logic actually loops back to step 2 instead of step 3. In theory that allows us to do the optimization in step 2 in a very narrow case, where we read exactly the "chunk header" in step 3 but don't read any chunk data. This seems unlikely in practice, but there's probably no harm in having the extra optimization here.)
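The four steps above can be sketched as a loop. This is an illustrative toy in Python, not the actual SocketsHttpHandler code: it uses a made-up frame format (a 1-byte length prefix plus payload) in place of real HTTP/3 DATA frames, all names are hypothetical, and the direct-read fast path of step 2 is omitted for brevity.

```python
import io

class FramedReader:
    """Toy sketch of the buffered read loop: drain the internal buffer
    first, refill from the transport only when the buffer runs dry."""

    def __init__(self, transport):
        self._t = transport
        self._buf = bytearray()   # internal read buffer
        self._remaining = 0       # payload bytes left in the current frame
        self._eof = False

    def _drain(self, out: bytearray, want: int) -> int:
        """Steps 1/3: copy already-buffered payload into the caller's
        buffer, consuming frame headers along the way."""
        copied = 0
        while copied < want:
            if self._remaining == 0:
                if not self._buf:
                    break                       # need more bytes for a header
                self._remaining = self._buf.pop(0)  # 1-byte "frame header"
                if self._remaining == 0:
                    continue                    # accept 0-length frames
            take = min(want - copied, self._remaining, len(self._buf))
            if take == 0:
                break
            out += self._buf[:take]
            del self._buf[:take]
            self._remaining -= take
            copied += take
        return copied

    def read(self, want: int) -> bytes:
        out = bytearray()
        # Step 1: if we can hand back at least one byte (or we're at EOF),
        # we're done. (Step 2, reading directly into the caller's buffer,
        # is omitted in this sketch.)
        if self._drain(out, want) or self._eof:
            return bytes(out)
        while True:
            # Step 3: refill the internal buffer from the transport.
            chunk = self._t.read(4096)
            if not chunk:
                self._eof = True
                return bytes(out)
            self._buf += chunk
            if self._drain(out, want):
                return bytes(out)
            # Step 4: we only got a partial header; loop and read more.
```

The key property mirrored from the chunked-encoding code is that a single `read` call can return data spanning several frames once it is buffered, instead of at most one frame per call.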

HTTP/3 DATA frames should actually be a bit simpler to handle than HTTP/1.1 chunked encoding, because their layout is simpler -- the frame header is just a type and a length, with the length binary-encoded instead of ASCII, and there's no trailing frame delimiter as there is in chunked encoding (i.e. the trailing "\r\n" after every chunk). Note however that for HTTP/3, I think we do need to accept 0-length DATA frames -- whereas in chunked encoding, we don't (because a 0-length chunk indicates EOF).
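For reference, the type and length in an HTTP/3 frame header are QUIC variable-length integers (RFC 9000 §16): the two most significant bits of the first byte select an encoded length of 1, 2, 4, or 8 bytes. A minimal decoder sketch in Python (the real handler's parsing code looks different; this only shows the wire format):

```python
def decode_varint(data: bytes, pos: int = 0):
    """Decode a QUIC variable-length integer (RFC 9000 §16).

    The top two bits of the first byte give the total encoded length:
    00 -> 1 byte, 01 -> 2, 10 -> 4, 11 -> 8. The remaining bits are the
    big-endian value. Returns (value, next_position)."""
    first = data[pos]
    length = 1 << (first >> 6)      # 1, 2, 4, or 8 bytes
    value = first & 0x3F            # strip the 2 length bits
    for i in range(1, length):
        value = (value << 8) | data[pos + i]
    return value, pos + length
```

A DATA frame header is then just two of these back to back (type `0x00`, then the payload length), which is why a partially received header in step 4 above can be at most 16 bytes away from completion.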

Also note the description above is similar to how we process incoming SSL frames. See

private async ValueTask<int> ReadAsyncInternal<TIOAdapter>(TIOAdapter adapter, Memory<byte> buffer)

From #58296 (comment)

Other than that, the buffer we have now is small (64 bytes) and we don't have logic for growing it (so it's only useful for envelopes). But we could use a larger initial size, at least 4KB as in HTTP/1.1, so that branching on whether to read into the user's buffer or into our own buffer would make sense (right now I believe it doesn't).

We should probably use a size like 4K here, yeah.
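The branching decision mentioned above could look something like the following. This is a hypothetical heuristic, not the shipped logic: with a 4KB internal buffer, bypassing it only pays off when the internal buffer is empty (nothing to drain first) and the caller's buffer is at least as large as what we would have buffered ourselves.

```python
# Hypothetical sketch of the "read direct vs. read into our buffer" branch.
BUFFER_SIZE = 4096  # assumed internal buffer size, matching the 4K suggested above

def should_read_direct(user_buffer_len: int, buffered_bytes: int) -> bool:
    # Bypass the internal buffer only when it holds no data (otherwise we
    # must drain it first to preserve ordering) and the caller's buffer is
    # large enough that a direct read saves a copy without shrinking reads.
    return buffered_bytes == 0 and user_buffer_len >= BUFFER_SIZE
```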

QUIC is rather different from HTTP/1.1 and HTTP/2 because it's in user mode and it actively pushes data to us. We should think through the implications of this and try to make sure we are optimizing for how QUIC works in practice. But at least for now it seems reasonable to make this work fairly similarly to HTTP/1.1, since we know that works correctly and gives pretty good performance.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Aug 31, 2021
@ghost

ghost commented Aug 31, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.


@karelz karelz added the tenet-performance Performance related issue label Aug 31, 2021
@karelz karelz added this to the 7.0.0 milestone Aug 31, 2021
@karelz karelz removed the untriaged New issue has not been triaged by the area owner label Aug 31, 2021
@karelz
Member

karelz commented Aug 31, 2021

Triage: Perf optimization, not important for 6.0.

@karelz karelz added the enhancement Product code improvement that does NOT require public API changes/additions label Nov 23, 2021
@karelz karelz modified the milestones: 7.0.0, Future Jun 14, 2022
@rzikm
Member

rzikm commented May 29, 2024

Is there anything more to be done as part of this issue? We do have read-side buffering on QuicStream.

@CarnaViire
Member Author

Closing as fixed by read-side buffering on QuicStream.

@karelz karelz modified the milestones: Future, 9.0.0 Jun 24, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 25, 2024