Refactor H2 #286

Merged on Feb 9, 2024 (46 commits)
e577d7d
Tidy up h2 handler's handle_call/info calls
mtrudel Jan 6, 2024
b717d9e
Misc tidying
mtrudel Jan 7, 2024
5207e1a
Add frame send/recv functions to simpleh2client
mtrudel Jan 8, 2024
349917d
Remove clauses deprecated by RFC9113
mtrudel Jan 8, 2024
eb2a222
Refactor SteamTask to be a GenServer
mtrudel Jan 6, 2024
06af9ac
Move tracking of content-length into stream process
mtrudel Jan 8, 2024
2e5fcbc
Move content encoding negotiation into stream process
mtrudel Jan 8, 2024
7e745ed
Remove stream ownership checks on sending calls
mtrudel Jan 12, 2024
d18f8fa
Add caller validation into HTTP2 adapter
mtrudel Jan 12, 2024
6ce09ed
Move reset handling into stream process
mtrudel Jan 13, 2024
f1e1136
Move span handling into stream process
mtrudel Jan 13, 2024
5da14a9
Simplify stream_terminated error conditions
mtrudel Jan 13, 2024
ceefe90
Move stream receive window handling into stream process
mtrudel Jan 14, 2024
dc94348
Move stream send window handling into stream process
mtrudel Jan 15, 2024
d780aae
Move req construction and stream ID validation into stream process
mtrudel Jan 18, 2024
c46c703
Split header processing out into a message instead of a startup arg
mtrudel Jan 18, 2024
6022131
First pass at StreamTransport
mtrudel Jan 19, 2024
7f1e75d
Refactor req bytes_remaining logic
mtrudel Jan 21, 2024
88a4e6b
Dialyzer tidy
mtrudel Jan 22, 2024
2cda329
Use guard based process validation
mtrudel Jan 21, 2024
b54fe30
Move header to discrete message
mtrudel Jan 22, 2024
19f4b23
Track state in stream process
mtrudel Jan 21, 2024
ea4d407
Refactor how connections maintain streams
mtrudel Jan 23, 2024
3ccd373
Rename StreamTransport to Stream
mtrudel Jan 24, 2024
97370e7
Reduce use of aliases, move a few types around
mtrudel Jan 24, 2024
f721c84
Improve coverage of end-of-request-handling cases
mtrudel Jan 24, 2024
58d3e5b
Experiment with sends vs calls
mtrudel Jan 24, 2024
715391f
Documentation update
mtrudel Jan 25, 2024
8ff36a6
factor closed stream handling into stream module
mtrudel Jan 25, 2024
1ac46d2
Misc minor stream tidying
mtrudel Jan 25, 2024
afce168
Trial of merging header/data and end_stream messages
mtrudel Jan 25, 2024
665a07e
Connection now returns structs instead of tuples where possible
mtrudel Jan 25, 2024
ab8f72e
Tweak internal return matches
mtrudel Jan 25, 2024
6cfd509
Rename shutdown_connection to close_connection
mtrudel Jan 25, 2024
e5fa21e
Tidy
mtrudel Jan 25, 2024
085d30a
Refactor frame deserialization for cleaner errors
mtrudel Jan 25, 2024
3f5d7e3
Add test for respecting client max frame size
mtrudel Jan 26, 2024
8649bed
Improve coverage of error cases & logging
mtrudel Jan 26, 2024
d474b9b
Remove untenable test
mtrudel Jan 26, 2024
22f6500
Replace header size validation in HTTP/2 with lower-level block size …
mtrudel Jan 26, 2024
697cd4e
Tweak flaky test to avoid having to care about send window updates
mtrudel Jan 26, 2024
415e1bb
Getting set up for upcoming Bandit.HTTPTransport work
mtrudel Jan 26, 2024
c260900
Pull header sending specifics into HTTP2 stream
mtrudel Jan 27, 2024
1808a25
Blindly pass Plug.Conn.Adapter opts through
mtrudel Jan 27, 2024
f2d2cc7
Factor last bits of h2 into something compatible with where we want t…
mtrudel Feb 8, 2024
6818dea
Add provisional 1.3.0 changelog
mtrudel Feb 9, 2024
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,30 @@
## 1.3.0 (TBD)

### Enhancements

* Complete refactor of HTTP/2. Improved process model is MUCH easier to
understand and yields about a 10% performance boost to HTTP/2 requests (#286)

### Changes

* **BREAKING CHANGE** The HTTP/2 header size limit options have been deprecated,
and have been replaced with a single `max_header_block_size` option. The setting
defaults to 50k bytes, and refers to the size of the compressed header block
as sent on the wire (including any continuation frames)
* We no longer log if processes that are linked to an HTTP/2 stream process
terminate unexpectedly. This has always been unspecified behaviour so is not
considered a breaking change
* Calls of `Plug.Conn` functions for an HTTP/2 connection must now come from the
stream process; any other process will raise an error. Again, this has always
been unspecified behaviour
* Reading the body of an HTTP/2 request after it has already been read will
return `{:ok, ""}` instead of raising a `Bandit.BodyAlreadyReadError` as it
previously did
* We now send RST_STREAM frames if we complete a stream and the remote end is
still open. This optimizes cases where the client may still be sending a body
that we never consumed and don't care about
* We no longer explicitly close the connection when we receive a GOAWAY frame
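
For anyone migrating from the removed header options, a minimal configuration sketch follows. It assumes Bandit is started from a supervision tree; `MyApp.Router` is a hypothetical plug module and `100_000` is an arbitrary illustrative value, not a recommendation.

```elixir
# Hypothetical supervision tree entry; MyApp.Router is a placeholder plug.
children = [
  {Bandit,
   plug: MyApp.Router,
   http_2_options: [
     # Single limit on the compressed header block, in bytes. It takes the
     # place of max_header_key_length, max_header_value_length and
     # max_header_count, which are no longer accepted HTTP/2 option keys.
     max_header_block_size: 100_000
   ]}
]

Supervisor.start_link(children, strategy: :one_for_one)
```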

## 1.2.0 (31 Jan 2024)

### Enhancements
15 changes: 5 additions & 10 deletions lib/bandit.ex
@@ -114,12 +114,9 @@ defmodule Bandit do
Options to configure the HTTP/2 stack in Bandit

* `enabled`: Whether or not to serve HTTP/2 requests. Defaults to true
* `max_header_key_length`: The maximum permitted length of any single header key
(expressed as the number of decompressed bytes) in an HTTP/2 request. Defaults to 10_000 bytes
* `max_header_value_length`: The maximum permitted length of any single header value
(expressed as the number of decompressed bytes) in an HTTP/2 request. Defaults to 10_000 bytes
* `max_header_count`: The maximum permitted number of headers in an HTTP/2 request.
Defaults to 50 headers
* `max_header_block_size`: The maximum permitted length of a field block of an HTTP/2 request
(expressed as the number of compressed bytes). Includes any concatenated block fragments from
continuation frames. Defaults to 50_000 bytes
* `max_requests`: The maximum number of requests to serve in a single
HTTP/2 connection before closing the connection. Defaults to 0 (no limit)
* `default_local_settings`: Options to override the default values for local HTTP/2
@@ -132,9 +129,7 @@ defmodule Bandit do
"""
@type http_2_options :: [
enabled: boolean(),
max_header_key_length: pos_integer(),
max_header_value_length: pos_integer(),
max_header_count: pos_integer(),
max_header_block_size: pos_integer(),
max_requests: pos_integer(),
default_local_settings: Bandit.HTTP2.Settings.t(),
compress: boolean(),
@@ -189,7 +184,7 @@ defmodule Bandit do

@top_level_keys ~w(plug scheme port ip keyfile certfile otp_app cipher_suite display_plug startup_log thousand_island_options http_1_options http_2_options websocket_options)a
@http_1_keys ~w(enabled max_request_line_length max_header_length max_header_count max_requests compress deflate_options)a
@http_2_keys ~w(enabled max_header_key_length max_header_value_length max_header_count max_requests default_local_settings compress deflate_options)a
@http_2_keys ~w(enabled max_header_block_size max_requests default_local_settings compress deflate_options)a
@websocket_keys ~w(enabled max_frame_size validate_text_frames compress)a
@thousand_island_keys ThousandIsland.ServerConfig.__struct__()
|> Map.from_struct()
68 changes: 37 additions & 31 deletions lib/bandit/http2/README.md
@@ -1,7 +1,8 @@
# HTTP/2 Handler

Included in this folder is a complete `ThousandIsland.Handler` based implementation of HTTP/2 as
defined in [RFC 9113](https://datatracker.ietf.org/doc/rfc9113).
defined in [RFC 9110](https://datatracker.ietf.org/doc/rfc9110) & [RFC
9113](https://datatracker.ietf.org/doc/rfc9113)

## Process model

@@ -10,24 +11,31 @@ Within a Bandit server, an HTTP/2 connection is modeled as a set of processes:
* 1 process per connection, a `Bandit.HTTP2.Handler` module implementing the
`ThousandIsland.Handler` behaviour, and;
* 1 process per stream (i.e.: per HTTP request) within the connection, implemented as
a `Bandit.HTTP2.StreamTask` Task
a `Bandit.HTTP2.StreamProcess` process

Each of these processes models the majority of its state via a
`Bandit.HTTP2.Connection` & `Bandit.HTTP2.Stream` struct, respectively.

The lifetimes of these processes correspond to their role; a connection process lives for as long
as a client is connected, and a stream process lives only as long as is required to process
a single stream request within a connection.
a single stream request within a connection.

Connection processes are the 'root' of each connection's process group, and are supervised by
Thousand Island in the same manner that `ThousandIsland.Handler` processes are usually supervised
(see the [project README](https://github.com/mtrudel/thousand_island) for details).

Stream processes are not supervised by design. The connection process starts new stream processes as required, and does so
once a complete header block for a new stream has been received. It starts stream processes via
a standard `start_link` call, and manages the termination of the resultant linked stream processes
by handling `{:EXIT,...}` messages as described in the Elixir documentation. This approach is
aligned with the realities of the HTTP/2 model, insofar as if a connection process terminates
there is no reason to keep its constituent stream processes around, and if a stream process dies
the connection should be able to handle this without itself terminating. It also means that our
process model is very lightweight - there is no extra supervision overhead present because no such
Stream processes are not supervised by design. The connection process starts new
stream processes as required, via a standard `start_link`
call, and manages the termination of the resultant linked stream processes by
handling `{:EXIT,...}` messages as described in the Elixir documentation. Each
stream process stays alive long enough to fully model an HTTP/2 stream,
beginning its life in the `:init` state and ending it in the `:closed` state (or
else by a stream or connection error being raised). This approach is aligned
with the realities of the HTTP/2 model, insofar as if a connection process
terminates there is no reason to keep its constituent stream processes around,
and if a stream process dies the connection should be able to handle this
without itself terminating. It also means that our process model is very
lightweight - there is no extra supervision overhead present because no such
supervision is required for the system to function in the desired way.
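
The unsupervised, link-based arrangement described above can be illustrated with a small sketch. This is not Bandit's code: `SketchConnection` and `SketchStream` are hypothetical modules, and the `:finish` message is invented purely to make a stream stop on demand. It only shows the general pattern of a process that traps exits, starts children with `start_link`, and cleans up on `{:EXIT, ...}` messages.

```elixir
defmodule SketchStream do
  # Hypothetical stand-in for a per-stream process.
  use GenServer

  def start_link(stream_id), do: GenServer.start_link(__MODULE__, stream_id)

  @impl true
  def init(stream_id), do: {:ok, %{stream_id: stream_id, state: :init}}

  @impl true
  def handle_cast(:finish, state) do
    # Move to :closed and stop normally; linked processes are notified.
    {:stop, :normal, %{state | state: :closed}}
  end
end

defmodule SketchConnection do
  # Hypothetical stand-in for a per-connection process.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)
  def open_stream(conn, stream_id), do: GenServer.call(conn, {:open_stream, stream_id})
  def active_streams(conn), do: GenServer.call(conn, :active_streams)

  @impl true
  def init(_opts) do
    # Trapping exits turns a linked stream's termination into a message
    # instead of taking this process down with it.
    Process.flag(:trap_exit, true)
    {:ok, %{streams: %{}}}
  end

  @impl true
  def handle_call({:open_stream, stream_id}, _from, state) do
    # start_link runs inside this process, so the stream is linked to it.
    {:ok, pid} = SketchStream.start_link(stream_id)
    {:reply, {:ok, pid}, put_in(state.streams[stream_id], pid)}
  end

  def handle_call(:active_streams, _from, state) do
    {:reply, Map.keys(state.streams), state}
  end

  @impl true
  def handle_info({:EXIT, pid, _reason}, state) do
    # Forget whichever stream this pid belonged to.
    streams = for {id, stream_pid} <- state.streams, stream_pid != pid, into: %{}, do: {id, stream_pid}
    {:noreply, %{state | streams: streams}}
  end
end
```

Because the link runs in both directions, stopping the connection process also brings down any streams that are still running, which matches the lifetime relationship described in the paragraph above.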

## Reading client data
@@ -40,13 +48,15 @@ looks like the following:
2. Frames are parsed from these bytes by calling the `Bandit.HTTP2.Frame.deserialize/2`
function. If successful, the parsed frame(s) are returned. We retain any unparsed bytes in
a buffer in order to attempt parsing them upon receipt of subsequent data from the client
3. Parsed frames are passed into the `Bandit.HTTP2.Connection` module along with a struct of
same module. Frames are applied against this struct in a vaguely FSM-like manner, using pattern
matching within the `Bandit.HTTP2.Connection.handle_frame/3` function. Any side-effects of
received frames are applied in these functions, and an updated connection struct is returned to
represent the updated connection state. These side-effects can take the form of starting stream
tasks, conveying data to running stream tasks, responding to the client with various frames, or
any number of other actions
3. Parsed frames are passed into the `Bandit.HTTP2.Connection` module along with a struct of
the same module. Frames are processed via the `Bandit.HTTP2.Connection.handle_frame/3` function.
Connection-level frames are handled within the `Bandit.HTTP2.Connection`
struct, and stream-level frames are passed along to the corresponding stream
process, which is wholly responsible for managing all aspects of a stream's
state (which is tracked via the `Bandit.HTTP2.Stream` struct). The one
exception to this is the handling of frames sent to streams which have
already been closed (and whose corresponding processes have thus terminated).
Any such frames are discarded without effect.
4. This process is repeated every time we receive data from the client until the
`Bandit.HTTP2.Connection` module indicates that the connection should be closed, either
normally or due to error. Note that frame deserialization may end up returning a connection
@@ -58,19 +68,15 @@ looks like the following:
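
The buffering behaviour in step 2 of the reading flow (parse what is complete, keep the remainder for later) can be sketched with binary pattern matching. This is an illustration only, not `Bandit.HTTP2.Frame.deserialize/2`: `FrameBufferSketch` is a hypothetical module, it returns plain maps rather than typed frame structs, and it does not enforce a maximum frame size. The 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit, 31-bit stream identifier) is the one defined in RFC 9113.

```elixir
defmodule FrameBufferSketch do
  # Hypothetical helper: splits every complete frame off the front of
  # `buffer` and returns them along with whatever bytes are left over.
  def parse(buffer, frames \\ [])

  def parse(
        <<length::24, type::8, flags::8, _reserved::1, stream_id::31,
          payload::binary-size(length), rest::binary>>,
        frames
      ) do
    frame = %{type: type, flags: flags, stream_id: stream_id, payload: payload}
    parse(rest, [frame | frames])
  end

  # Not enough bytes for a full header plus payload: stop and hand back
  # the remainder so it can be retried when more data arrives.
  def parse(rest, frames), do: {Enum.reverse(frames), rest}
end
```

A caller would prepend the returned remainder to the next chunk read from the socket before calling `parse/2` again.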

## Processing requests

The details of a particular stream are contained within a `Bandit.HTTP2.Stream` struct
(as well as a `Bandit.HTTP2.StreamTask` process in the case of active streams). The
`Bandit.HTTP2.StreamCollection` module manages a collection of streams, allowing for the memory
efficient management of complete & yet unborn streams alongside active ones.

Once a complete header block has been read, a `Bandit.HTTP2.StreamTask` is started to manage the
actual calling of the configured `Plug` module for this server, using the `Bandit.HTTP2.Adapter`
module as the implementation of the `Plug.Conn.Adapter` behaviour. This adapter uses a simple
`receive` pattern to listen for messages sent to it from the connection process, a pattern chosen
because it allows for easy provision of the blocking-style API required by the `Plug.Conn.Adapter`
behaviour. Functions in the `Bandit.HTTP2.Adapter` behaviour which write data to the client use
`GenServer` calls to the `Bandit.HTTP2.Handler` module in order to pass data to the connection
process.
The state of a particular stream is contained within a `Bandit.HTTP2.Stream`
struct, maintained within a `Bandit.HTTP2.StreamProcess` process. As part of the
stream's lifecycle, the server's configured Plug is called, with an instance of
the `Bandit.HTTP2.Adapter` struct being used to interface with the Plug. There
is a separation of concerns between the aspects of HTTP semantics managed by
`Bandit.HTTP2.Adapter` (roughly, those concerns laid out in
[RFC9110](https://datatracker.ietf.org/doc/html/rfc9110)) and the more
transport-specific HTTP/2 concerns managed by `Bandit.HTTP2.Stream` (roughly the
concerns specified in [RFC9113](https://datatracker.ietf.org/doc/html/rfc9113)).
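
The changelog in this PR notes that `Plug.Conn` functions for an HTTP/2 connection must now be called from the stream process, and one of the commits mentions guard-based process validation. A minimal sketch of what such a check could look like follows. It assumes the adapter struct remembers the pid that created it; `OwnerCheckSketch`, its fields, and the error message are all hypothetical and do not reflect the actual `Bandit.HTTP2.Adapter` implementation.

```elixir
defmodule OwnerCheckSketch do
  # Hypothetical adapter-like struct that records its owning process.
  defstruct [:owner, sent: []]

  def new, do: %__MODULE__{owner: self()}

  # self/0 is allowed in guards, so ownership can be checked in the
  # function head without an explicit conditional.
  def send_data(%__MODULE__{owner: owner} = adapter, data) when owner == self() do
    %{adapter | sent: [data | adapter.sent]}
  end

  # Any other caller falls through to this clause and gets an error.
  def send_data(%__MODULE__{}, _data) do
    raise ArgumentError, "adapter functions must be called from the process that owns the stream"
  end
end
```

Calling `OwnerCheckSketch.send_data/2` from the process that built the struct succeeds, while the same call made from inside a `Task` raises.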

# Testing
