-
Notifications
You must be signed in to change notification settings - Fork 151
fix(s2n-quic-dc): route secret control packets for UDP streams #2747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the routing of secret control packets for UDP streams by adding queue_id support. Previously, secret control packets didn't contain a queue_id field, preventing the socket pool from properly forwarding them to the appropriate streams, causing timeouts instead of fast failures.
- Adds an optional queue_id field to secret control packet structures
- Introduces a new bit flag in packet tags to indicate queue_id presence
- Updates packet routing logic to dispatch control packets to specific queues
- Improves error handling for streams using unknown path secrets
Reviewed Changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| dc/wireshark/src/test.rs | Updates test enum and parsing logic to handle optional queue_id fields |
| dc/wireshark/src/dissect.rs | Adds queue_id parsing in packet dissection based on tag flags |
| dc/s2n-quic-dc/src/stream/tests/deterministic.rs | Adds test demonstrating fast failure behavior with unknown path secrets |
| dc/s2n-quic-dc/src/stream/testing.rs | Exposes map access for testing state manipulation |
| dc/s2n-quic-dc/src/stream/send/worker.rs | Implements secret control packet handling with fast error propagation |
| dc/s2n-quic-dc/src/stream/send/error.rs | Defines new error types for crypto key and path secret issues |
| dc/s2n-quic-dc/src/stream/recv/shared.rs | Adds error notification mechanism and application waker storage |
| dc/s2n-quic-dc/src/stream/recv/error.rs | Adds UnknownPathSecret error variant |
| dc/s2n-quic-dc/src/stream/recv/dispatch.rs | Implements dispatch methods for routing control packets by queue_id |
| dc/s2n-quic-dc/src/stream/endpoint.rs | Updates credentials pairing to include queue_id parameter |
| dc/s2n-quic-dc/src/socket/recv/router/*.rs | Updates router implementations to support new dispatch methods |
| dc/s2n-quic-dc/src/path/secret/map/*.rs | Refactors control packet handling and adds queue_id throughout the map layer |
| dc/s2n-quic-dc/src/packet/ | Updates packet structures and encoding/decoding to support optional queue_id fields |
560919f to
634a2b0
Compare
| } | ||
| Packet::StaleKey(packet) => { | ||
| secret_control!(packet, handle_stale_key_packet, |packet| { | ||
| // make sure that this stream would be rejected before processing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth noting why we want to be extra cautious here? IIUC, this is necessary because we're not authenticating the queue ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the queue ID is authenticated. So I guess this is defense in depth against a bad implementation on the other side? But then I'm not sure it's the right thing -- we should close the stream regardless, right? The other side isn't expecting us to keep talking on this stream. If it got the stale key packet wrong then it's buggy but ignoring the packet doesn't do much good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we are authenticating the queue ID. The idea here was to ensure we didn't blindly accept a packet intended for another stream. For example, I could capture a StaleKey packet from earlier for the same path secret ID, and replay it over and over again to streams that reuse the same queue ID. This check prevents that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, the queue ID isn't one time. It seems like we may want to embed the stream ID into the packets (in addition to the queue ID) to more strongly bind them while we're changing the format? Do you think that's worth doing? It's sort of duplicative with these checks but not completely. (e.g., stale key packet check here is actually pretty dependent on the replay implementation not having any 'holes' in its knowledge for preventing what you're outlining).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked offline - we're going to punt on doing this until we bump the wireversion later, which will be negotiated as part of the initial handshake.
| ); | ||
|
|
||
| // FIXME: More actively schedule a new handshake. | ||
| // See comment on requested_handshakes for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can follow up but while we're touching a bunch of the code we're now requesting handshakes actively (request_handshake calls into a callback with any registered client). So some FIXMEs can be dropped, like this one.
| Ok(()) => {} | ||
| Err(e) => { | ||
| self.send_control_error(&state, identity, e); | ||
| control_out.resize(control::UnknownPathSecret::PACKET_SIZE, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why this resize was here before -- did it just look like a typo/stale code and so that's why you deleted it? I can't tell if it's supposed to be load bearing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this code was just straight up useless. But if you look at the pre_authentication method, it does actually return an error so this really isn't triggered.
| credentials: credentials::Id, | ||
| segment: descriptor::Filled, | ||
| ) { | ||
| tracing::warn!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just call on_unhandled_packet? I think that logs the same things? This is fine too of course, just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the on_unhandled_packet method requires a full packet whereas this only has a few extracted fields plus the raw segment (so it can be sent and reparsed by the receiver).
| } | ||
| Packet::UnknownPathSecret(packet) => { | ||
| secret_control!(packet, handle_unknown_path_secret_packet, |_packet| { | ||
| UnknownPathSecret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this checking that the UPS is for the credential this stream is using? Is that because it's handled by the auth layer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is - it's just done inside the secret_control macro
| Packet::ReplayDetected(packet) => { | ||
| secret_control!(packet, handle_replay_detected_packet, |packet| { | ||
| // make sure the rejected key id matches the credentials we're using | ||
| ensure!(packet.rejected_key_id == credentials.key_id, Ok(())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, why isn't this handled in the auth layer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the StaleKey thing i mentioned, this would allow an attacker to capture packets for the same queue ID but for previous key_ids. This check ensures it only pertains to the present stream.
| let value = Self { | ||
| wire_version, | ||
| credential_id, | ||
| wire_version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to reason through why we opted for a tag change rather than a wire version change. Can you elaborate why we can't use the wire version to add this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's a good question. My thoughts:
- We don't actually have a way to negotiate wire versions right now and that would have to be added before we could make a change
- In either case, we would likely want this field optional since not all interactions have a queue_id so we'd need a bit for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually have a way to negotiate wire versions right now and that would have to be added before we could make a change
This feels untrue -- we could just as easily add support for wire version 1 just like we're adding support for a new tag, right? I don't see much difference there.
In either case, we would likely want this field optional since not all interactions have a queue_id so we'd need a bit for that
Similarly we could and would keep supporting wire version 0 and 1 -- so that would be our "bit". (Perhaps it binds e.g. odd versions to be the ones with this field, but it's still not consuming a probably more valuable tag bit).
I don't have a super strong opinion though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see what you're saying - basically if we needed to send a queue ID, then we'd bump the wire version? That could work - I'm not just sure if we want to always send the queue ID for the flows that don't need it (datagrams, streams over TCP, etc).
Description of changes:
There were a few TODOs around handling secret control packets on UDP streams. This is because they don't actually contain a
queue_idfield and so the socket pool wouldn't know where to forward them. Ultimately, this causes issues in the UDP streams where they time out rather than failing fast.This change adds a new bit to the packet tag to indicate if the secret control packet has a
queue_id. The packet router will now use that and forward packets to the appropriate stream.Testing:
I've included a test to show the change working. Before these changes, this test failed by exceeding the time budget in the test. After the change, the test fails fast by operating on the UnknownPathSecret packet in 1RTT.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.