
[HTTP/3] Optimize reading from Quic stream #58445

Closed
CarnaViire opened this issue Aug 31, 2021 · 4 comments
Labels
area-System.Net.Http enhancement Product code improvement that does NOT require public API changes/additions tenet-performance Performance related issue
Milestone

Comments

@CarnaViire
Member

Reads don't leverage buffering, and most of the time they read one data frame (or less) at a time, even if more data is available in the transport. Reads can be optimized the way chunked encoding is implemented for HTTP/1.1; see the algorithm below.

From #58296 (comment)

The logic here should be similar to how we handle chunked encoding for HTTP/1.1. See here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ChunkedEncodingReadStream.cs#L38

I'm not sure why it was implemented so differently than chunked encoding in the first place.

In other words, we should do something like this:
(1) Read as much data out of our read buffer as we can. If we can read at least one byte, or we hit EOF, we are done and return to the caller. (This is the ReadChunksFromConnectionBuffer logic in the chunked encoding code.)
(2) At this point, we need to read more data. In certain cases it makes sense to avoid buffering the data ourselves and just read directly into the user's buffer. If so, do the read and (assuming it doesn't fail or return unexpected EOF) return.
(3) Otherwise, we need to read into the buffer. Do the read into the buffer and try to get data out of the buffer again (as in step 1, i.e. call the equivalent of ReadChunksFromConnectionBuffer). If we can read at least one byte, or we hit EOF, we are done.
(4) Otherwise, we must not have read enough data (e.g. we just got a partial frame header). Loop back to step 3 and repeat until we either get some data or hit EOF.

(Note that the chunked encoding logic actually loops back to step 2 instead of step 3. In theory that allows us to do the optimization in step 2 in a very narrow case, where we read exactly the "chunk header" in step 3 but don't read any chunk data. This seems unlikely in practice, but there's probably no harm in having the extra optimization here.)
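The four steps above can be sketched as a loop. This is an illustrative toy in Python, not the actual SocketsHttpHandler code: it uses a made-up frame format (a 1-byte length prefix plus payload) in place of real HTTP/3 DATA frames, all names are hypothetical, and the direct-read fast path of step 2 is omitted for brevity.

```python
import io

class FramedReader:
    """Toy sketch of the buffered read loop: drain the internal buffer
    first, refill from the transport only when the buffer runs dry."""

    def __init__(self, transport):
        self._t = transport
        self._buf = bytearray()   # internal read buffer
        self._remaining = 0       # payload bytes left in the current frame
        self._eof = False

    def _drain(self, out: bytearray, want: int) -> int:
        """Steps 1/3: copy already-buffered payload into the caller's
        buffer, consuming frame headers along the way."""
        copied = 0
        while copied < want:
            if self._remaining == 0:
                if not self._buf:
                    break                       # need more bytes for a header
                self._remaining = self._buf.pop(0)  # 1-byte "frame header"
                if self._remaining == 0:
                    continue                    # accept 0-length frames
            take = min(want - copied, self._remaining, len(self._buf))
            if take == 0:
                break
            out += self._buf[:take]
            del self._buf[:take]
            self._remaining -= take
            copied += take
        return copied

    def read(self, want: int) -> bytes:
        out = bytearray()
        # Step 1: if we can hand back at least one byte (or we're at EOF),
        # we're done. (Step 2, reading directly into the caller's buffer,
        # is omitted in this sketch.)
        if self._drain(out, want) or self._eof:
            return bytes(out)
        while True:
            # Step 3: refill the internal buffer from the transport.
            chunk = self._t.read(4096)
            if not chunk:
                self._eof = True
                return bytes(out)
            self._buf += chunk
            if self._drain(out, want):
                return bytes(out)
            # Step 4: we only got a partial header; loop and read more.
```

The key property mirrored from the chunked-encoding code is that a single `read` call can return data spanning several frames once it is buffered, instead of at most one frame per call.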

HTTP/3 DATA frames should actually be a bit simpler to handle than HTTP/1.1 chunked encoding, because their layout is simpler -- the frame header is just a type and a length, with the length binary-encoded instead of ASCII, and there's no trailing frame delimiter as there is in chunked encoding (i.e. the trailing "\r\n" after every chunk). Note however that for HTTP/3, I think we do need to accept 0-length DATA frames -- whereas in chunked encoding, we don't (because a 0-length chunk indicates EOF).
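For reference, the type and length in an HTTP/3 frame header are QUIC variable-length integers (RFC 9000 §16): the two most significant bits of the first byte select an encoded length of 1, 2, 4, or 8 bytes. A minimal decoder sketch in Python (the real handler's parsing code looks different; this only shows the wire format):

```python
def decode_varint(data: bytes, pos: int = 0):
    """Decode a QUIC variable-length integer (RFC 9000 §16).

    The top two bits of the first byte give the total encoded length:
    00 -> 1 byte, 01 -> 2, 10 -> 4, 11 -> 8. The remaining bits are the
    big-endian value. Returns (value, next_position)."""
    first = data[pos]
    length = 1 << (first >> 6)      # 1, 2, 4, or 8 bytes
    value = first & 0x3F            # strip the 2 length bits
    for i in range(1, length):
        value = (value << 8) | data[pos + i]
    return value, pos + length
```

A DATA frame header is then just two of these back to back (type `0x00`, then the payload length), which is why a partially received header in step 4 above can be at most 16 bytes away from completion.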

Also note the description above is similar to how we process incoming SSL frames. See

private async ValueTask<int> ReadAsyncInternal<TIOAdapter>(TIOAdapter adapter, Memory<byte> buffer)

From #58296 (comment)

Other than that, the buffer we have now is small (64 bytes) and we don't have logic for growing it (so it's only useful for envelopes). But we could use a larger initial size, at least 4KB as in HTTP/1.1, so that branching on whether to read into the user's buffer or into our own buffer would make sense (right now I believe it doesn't).

We should probably use a size like 4K here, yeah.
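The branching decision mentioned above could look something like the following. This is a hypothetical heuristic, not the shipped logic: with a 4KB internal buffer, bypassing it only pays off when the internal buffer is empty (nothing to drain first) and the caller's buffer is at least as large as what we would have buffered ourselves.

```python
# Hypothetical sketch of the "read direct vs. read into our buffer" branch.
BUFFER_SIZE = 4096  # assumed internal buffer size, matching the 4K suggested above

def should_read_direct(user_buffer_len: int, buffered_bytes: int) -> bool:
    # Bypass the internal buffer only when it holds no data (otherwise we
    # must drain it first to preserve ordering) and the caller's buffer is
    # large enough that a direct read saves a copy without shrinking reads.
    return buffered_bytes == 0 and user_buffer_len >= BUFFER_SIZE
```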

QUIC is rather different from HTTP/1.1 and HTTP/2 because it's in user mode and it actively pushes data to us. We should think through the implications of this and try to make sure we are optimizing for how QUIC works in practice. But at least for now it seems reasonable to make this work fairly similarly to HTTP/1.1, since we know that works correctly and gives pretty good performance.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Aug 31, 2021
@ghost

ghost commented Aug 31, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.


@karelz karelz added the tenet-performance Performance related issue label Aug 31, 2021
@karelz karelz added this to the 7.0.0 milestone Aug 31, 2021
@karelz karelz removed the untriaged New issue has not been triaged by the area owner label Aug 31, 2021
@karelz
Member

karelz commented Aug 31, 2021

Triage: Perf optimization, not important for 6.0.

@karelz karelz added the enhancement Product code improvement that does NOT require public API changes/additions label Nov 23, 2021
@karelz karelz modified the milestones: 7.0.0, Future Jun 14, 2022
@rzikm
Member

rzikm commented May 29, 2024

Is there anything more to be done as part of this issue? We do have read-side buffering on QuicStream.

@CarnaViire
Member Author

Closing as fixed by read-side buffering on QuicStream.

@karelz karelz modified the milestones: Future, 9.0.0 Jun 24, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 25, 2024