Dealing with QUIC version updates #699
Comments
I'd like to avoid creating unnecessary TCP connections (each takes a file descriptor and can cause issues with connection-tracking firewalls). Instead, we should try to learn which transports work and use/announce those. Can the QUIC implementation forward connection ID changes to the load balancer?
That would add quite a lot of additional complexity.
It's strictly better than what we have now: now we'd use the same number of file descriptors as in the connection-racing case. However, if QUIC wins the race, we quickly (max. a few seconds later) release the file descriptor.
That's a hard problem, and I'm worried that there will be a small percentage of nodes in weird network settings that will experience connectivity problems as soon as we make QUIC the default transport. It's not only about UDP being blocked; middlebox vendors are already developing middleboxes that can selectively block QUIC. And there's no guarantee that the only options are "QUIC works" or "QUIC doesn't work" for any given peer.
It's strictly better but it's still crap. We shouldn't break interop with our default transport every time we upgrade it and handle that by racing with a fallback TCP connection.
We can always fall back if necessary. We can use something like (or even just use) our AutoNAT nodes to figure out which protocols appear to work. Then, we can announce all of them. On the dial side, we'd have to try them in some order of preference.
This has been resolved, since the multiaddr now encodes the QUIC version: multiformats/multiaddr#145
Overview: Versions in QUIC
Unlike TCP, QUIC has versioning built right into the core protocol. Long Header packets (used during the handshake) carry a version number, whereas Short Header packets (used after the handshake) omit all header fields except for the connection ID.
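To make the header distinction concrete, here is a minimal sketch of how a receiver could read the version from an incoming packet. It relies on the version-independent parts of the QUIC header layout: the most significant bit of the first byte marks a Long Header, and the following four bytes hold the version in network byte order. The names `packetVersion` and `errShortHeader` are made up for this illustration and are not part of quic-go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errShortHeader is returned for packets that carry no version field.
// (Hypothetical name, used only in this sketch.)
var errShortHeader = errors.New("short header packet: no version field")

// packetVersion extracts the 32-bit version from a Long Header packet.
// Short Header packets are rejected because they omit the version.
func packetVersion(b []byte) (uint32, error) {
	if len(b) == 0 {
		return 0, errors.New("empty packet")
	}
	// The top bit of the first byte is the header form: 1 = long, 0 = short.
	if b[0]&0x80 == 0 {
		return 0, errShortHeader
	}
	if len(b) < 5 {
		return 0, errors.New("long header packet too short for a version")
	}
	// Bytes 1..4 carry the version in big-endian order.
	return binary.BigEndian.Uint32(b[1:5]), nil
}

func main() {
	// A truncated Long Header packet carrying a draft version number.
	long := []byte{0xc0, 0xff, 0x00, 0x00, 0x1d, 0x08}
	v, err := packetVersion(long)
	fmt.Printf("version=%#x err=%v\n", v, err)

	// A Short Header packet: no version can be read from it.
	_, err = packetVersion([]byte{0x40, 0x01})
	fmt.Println("short header:", err)
}
```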
When a server receives a packet with an unsupported version, it sends a Version Negotiation Packet, which lists all versions that the server supports. The client may then start a new connection attempt using one of those, or abort the connection if there's no overlap in supported versions.
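The client's reaction to a Version Negotiation Packet can be sketched as picking the first version from its own preference list that the server also advertises. This is only an illustration of the decision described above; `chooseVersion` and the version numbers in `main` are assumptions for the example, not quic-go's API.

```go
package main

import "fmt"

// chooseVersion returns the first version in clientPrefs (ordered from most
// to least preferred) that also appears in serverSupported. The boolean is
// false when there is no overlap, in which case the client has to abort.
func chooseVersion(clientPrefs, serverSupported []uint32) (uint32, bool) {
	offered := make(map[uint32]struct{}, len(serverSupported))
	for _, v := range serverSupported {
		offered[v] = struct{}{}
	}
	for _, v := range clientPrefs {
		if _, ok := offered[v]; ok {
			return v, true
		}
	}
	return 0, false
}

func main() {
	// Example values only: the client prefers the newer draft version.
	client := []uint32{0xff00001d, 0xff00001c}
	server := []uint32{0xff00001b, 0xff00001c}

	if v, ok := chooseVersion(client, server); ok {
		fmt.Printf("retry with version %#x\n", v)
	} else {
		fmt.Println("no common version, aborting")
	}
}
```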
While the IETF QUIC working group is working towards the final QUIC RFC, each draft comes with a new version number. In principle, implementations can support multiple versions at the same time (and quic-go used to do so at some point in the past). Depending on the diff between two draft versions, this adds a lot of complexity, though.
Question: I haven't found any information specifying how long we guarantee backwards compatibility. Have we made any promises for libp2p downstream users at all?
Options for libp2p
We want to switch to QUIC (see #688) as soon as possible. We need to consider what a QUIC version update means for libp2p users.
No Backwards Compatibility
The easiest option. When establishing a new connection to a peer, we first try dialing QUIC. If the QUIC handshake fails, we fall back to TCP. Unfortunately, this will cost us one round-trip every time we roll out a new quic-go version (if it drops support for previously supported versions).
Furthermore, it discourages being an early adopter: if you're the first peer in the network speaking a new QUIC version, all your handshakes will need to fall back.
Supporting multiple QUIC versions by using multiple quic-go releases
We could run two releases of quic-go in parallel. While each release just handles one (or a few) QUIC versions, together they'd span the range of versions that we want to support.
There are two ways to do this:
Happy-Eyeballs-style Connection Racing
We could implement a Happy Eyeballs-style connection establishment: If we know a QUIC and a TCP multiaddr of a peer, we can race two connection attempts (maybe even giving QUIC a headstart of 50ms or so). Whichever handshake finishes first wins, and the client silently kills the other connection.
Racing connections would be a valuable feature even after QUIC becomes more mature. According to Google's measurements (see paragraph 7.2 of their Sigcomm paper), UDP is blocked for ~4% of their users, mostly in enterprise networks. For these users, racing two connections would prevent them from (consciously) running into the QUIC handshake timeout (which by default happens after 10 seconds).
Since happy-eyeballing tends to mask any connection problems, we would probably want to collect some handshake statistics, containing (at least):