Skip to content

Conversation

@camshaft
Copy link
Contributor

@camshaft camshaft commented Aug 7, 2025

Description of changes:

There were a few TODOs around handling secret control packets on UDP streams. This is because they don't actually contain a queue_id field and so the socket pool wouldn't know where to forward them. Ultimately, this causes issues in the UDP streams where they time out rather than failing fast.

This change adds a new bit to the packet tag to indicate if the secret control packet has a queue_id. The packet router will now use that and forward packets to the appropriate stream.

Testing:

I've included a test to show the change working. Before these changes, this test failed by exceeding the time budget in the test. After the change, the test fails fast by operating on the UnknownPathSecret packet in 1RTT.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@camshaft camshaft requested a review from Copilot August 7, 2025 00:32
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the routing of secret control packets for UDP streams by adding queue_id support. Previously, secret control packets didn't contain a queue_id field, preventing the socket pool from properly forwarding them to the appropriate streams, causing timeouts instead of fast failures.

  • Adds an optional queue_id field to secret control packet structures
  • Introduces a new bit flag in packet tags to indicate queue_id presence
  • Updates packet routing logic to dispatch control packets to specific queues
  • Improves error handling for streams using unknown path secrets

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
dc/wireshark/src/test.rs Updates test enum and parsing logic to handle optional queue_id fields
dc/wireshark/src/dissect.rs Adds queue_id parsing in packet dissection based on tag flags
dc/s2n-quic-dc/src/stream/tests/deterministic.rs Adds test demonstrating fast failure behavior with unknown path secrets
dc/s2n-quic-dc/src/stream/testing.rs Exposes map access for testing state manipulation
dc/s2n-quic-dc/src/stream/send/worker.rs Implements secret control packet handling with fast error propagation
dc/s2n-quic-dc/src/stream/send/error.rs Defines new error types for crypto key and path secret issues
dc/s2n-quic-dc/src/stream/recv/shared.rs Adds error notification mechanism and application waker storage
dc/s2n-quic-dc/src/stream/recv/error.rs Adds UnknownPathSecret error variant
dc/s2n-quic-dc/src/stream/recv/dispatch.rs Implements dispatch methods for routing control packets by queue_id
dc/s2n-quic-dc/src/stream/endpoint.rs Updates credentials pairing to include queue_id parameter
dc/s2n-quic-dc/src/socket/recv/router/*.rs Updates router implementations to support new dispatch methods
dc/s2n-quic-dc/src/path/secret/map/*.rs Refactors control packet handling and adds queue_id throughout the map layer
dc/s2n-quic-dc/src/packet/ Updates packet structures and encoding/decoding to support optional queue_id fields

@camshaft camshaft force-pushed the camshaft/udp-secret-control-pkts branch from 560919f to 634a2b0 Compare August 7, 2025 00:42
@camshaft camshaft marked this pull request as ready for review August 7, 2025 00:44
}
Packet::StaleKey(packet) => {
secret_control!(packet, handle_stale_key_packet, |packet| {
// make sure that this stream would be rejected before processing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth noting why we want to be extra cautious here? IIUC, this is necessary because we're not authenticating the queue ID?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the queue ID is authenticated. So I guess this is defense in depth against a bad implementation on the other side? But then I'm not sure it's the right thing -- we should close the stream regardless, right? The other side isn't expecting us to keep talking on this stream. If it got the stale key packet wrong then it's buggy but ignoring the packet doesn't do much good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we are authenticating the queue ID. The idea here was to ensure we didn't blindly accept a packet intended for another stream. For example, I could capture a StaleKey packet from earlier for the same path secret ID, and replay it over and over again to streams that reuse the same queue ID. This check prevents that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, the queue ID isn't one time. It seems like we may want to embed the stream ID into the packets (in addition to the queue ID) to more strongly bind them while we're changing the format? Do you think that's worth doing? It's sort of duplicative with these checks but not completely. (e.g., stale key packet check here is actually pretty dependent on the replay implementation not having any 'holes' in its knowledge for preventing what you're outlining).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline - we're going to punt on doing this until we bump the wireversion later, which will be negotiated as part of the initial handshake.

);

// FIXME: More actively schedule a new handshake.
// See comment on requested_handshakes for details.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can follow up but while we're touching a bunch of the code we're now requesting handshakes actively (request_handshake calls into a callback with any registered client). So some FIXMEs can be dropped, like this one.

Ok(()) => {}
Err(e) => {
self.send_control_error(&state, identity, e);
control_out.resize(control::UnknownPathSecret::PACKET_SIZE, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why this resize was here before -- did it just look like a typo/stale code and so that's why you deleted it? I can't tell if it's supposed to be load bearing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this code was just straight up useless. But if you look at the pre_authentication method, it does actually return an error so this really isn't triggered.

credentials: credentials::Id,
segment: descriptor::Filled,
) {
tracing::warn!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call on_unhandled_packet? I think that logs the same things? This is fine too of course, just curious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the on_unhandled_packet method requires a full packet whereas this only has a few extracted fields plus the raw segment (so it can be sent and reparsed by the receiver).

}
Packet::UnknownPathSecret(packet) => {
secret_control!(packet, handle_unknown_path_secret_packet, |_packet| {
UnknownPathSecret
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this checking that the UPS is for the credential this stream is using? Is that because it's handled by the auth layer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is - it's just done inside the secret_control macro

Packet::ReplayDetected(packet) => {
secret_control!(packet, handle_replay_detected_packet, |packet| {
// make sure the rejected key id matches the credentials we're using
ensure!(packet.rejected_key_id == credentials.key_id, Ok(()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, why isn't this handled in the auth layer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the StaleKey thing i mentioned, this would allow an attacker to capture packets for the same queue ID but for previous key_ids. This check ensures it only pertains to the present stream.

let value = Self {
wire_version,
credential_id,
wire_version,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to reason through why we opted for a tag change rather than a wire version change. Can you elaborate why we can't use the wire version to add this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's a good question. My thoughts:

  • We don't actually have a way to negotiate wire versions right now and that would have to be added before we could make a change
  • In either case, we would likely want this field optional since not all interactions have a queue_id so we'd need a bit for that

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually have a way to negotiate wire versions right now and that would have to be added before we could make a change

This feels untrue -- we could just as easily add support for wire version 1 just like we're adding support for a new tag, right? I don't see much difference there.

In either case, we would likely want this field optional since not all interactions have a queue_id so we'd need a bit for that

Similarly we could and would keep supporting wire version 0 and 1 -- so that would be our "bit". (Perhaps it binds e.g. odd versions to be the ones with this field, but it's still not consuming a probably more valuable tag bit).

I don't have a super strong opinion though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see what you're saying - basically if we needed to send a queue ID, then we'd bump the wire version? That could work - I'm not just sure if we want to always send the queue ID for the flows that don't need it (datagrams, streams over TCP, etc).

@camshaft camshaft merged commit c64faf9 into main Aug 7, 2025
130 checks passed
@camshaft camshaft deleted the camshaft/udp-secret-control-pkts branch August 7, 2025 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants