From 83c79c6b7efa2a660e43cc37532d7c0f12a45496 Mon Sep 17 00:00:00 2001
From: Max Inden
Date: Mon, 12 Jul 2021 14:50:28 +0200
Subject: [PATCH 01/26] protocol-select/: Add first draft

This commit adds a first draft of the _Protocol Select_ specification.

> _Protocol Select_ is a protocol negotiation protocol. It is aimed at
> negotiating libp2p protocols on connections and streams. It replaces the
> _[Multistream Select]_ protocol.
---
 protocol-select/README.md | 536 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 536 insertions(+)
 create mode 100644 protocol-select/README.md

diff --git a/protocol-select/README.md b/protocol-select/README.md
new file mode 100644
index 000000000..cad1d70bd
--- /dev/null
+++ b/protocol-select/README.md
@@ -0,0 +1,536 @@
+# Protocol Select
+
+| Lifecycle Stage | Maturity       | Status | Latest Revision |
+|-----------------|----------------|--------|-----------------|
+| 1A              | Working Draft  | Active | r0, 2021-XX-XX  |
+
+Authors: [@marten-seemann], [@mxinden]
+
+Interest Group:
+
+[@marten-seemann]: https://github.com/marten-seemann
+[@mxinden]: https://github.com/mxinden
+
+See the [lifecycle document][lifecycle-spec] for context about maturity level
+and spec status.
+
+[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md
+
+## Table of Contents
+
+- [Protocol Select](#protocol-select)
+    - [Table of Contents](#table-of-contents)
+    - [Introduction](#introduction)
+        - [Improvements over _[Multistream Select]_](#improvements-over-_multistream-select_)
+    - [High-Level Overview](#high-level-overview)
+        - [Secure Channel Selection](#secure-channel-selection)
+        - [TCP Simultaneous Open](#tcp-simultaneous-open)
+            - [Coordinated TCP Simultaneous Open](#coordinated-tcp-simultaneous-open)
+            - [Uncoordinated TCP Simultaneous Open](#uncoordinated-tcp-simultaneous-open)
+        - [Stream Multiplexer Selection](#stream-multiplexer-selection)
+            - [Process](#process)
+            - [Early data optimization](#early-data-optimization)
+            - [Monoplexed connections](#monoplexed-connections)
+            - [0-RTT](#0-rtt)
+    - [Transitioning from [Multistream Select]](#transitioning-from-multistream-select)
+        - [Multiphase Rollout](#multiphase-rollout)
+            - [Phase 1](#phase-1)
+            - [Phase 2](#phase-2)
+            - [Phase 3](#phase-3)
+        - [Additional Rollout Mechanisms](#additional-rollout-mechanisms)
+        - [Heuristics](#heuristics)
+            - [TCP](#tcp)
+            - [QUIC](#quic)
+            - [Protocol Differentiation](#protocol-differentiation)
+    - [Protocol Specification](#protocol-specification)
+        - [Multiplexer Protocol Negotiation](#multiplexer-protocol-negotiation)
+        - [Stream Protocol Negotiation](#stream-protocol-negotiation)
+            - [Initiator](#initiator)
+            - [Listener](#listener)
+        - [Protocol Names vs. Protocol IDs](#protocol-names-vs-protocol-ids)
+            - [Migration from _Protocol Name_ to _Protocol ID_](#migration-from-_protocol-name_-to-_protocol-id_)
+    - [FAQ](#faq)
+
+## Introduction
+
+_Protocol Select_ is a protocol negotiation protocol. It is aimed at negotiating
+libp2p protocols on connections and streams. It replaces the _[Multistream
+Select]_ protocol.
+
+### Improvements over _[Multistream Select]_
+
+- **Downgrade attacks** and **censorship resistance**
+
+  Given that **[Multistream Select]** negotiates a connection's security
+  protocol unencrypted and unauthenticated, it is prone to [downgrade attack]s.
+  In addition, a man-in-the-middle can detect that a given connection is used
+  to carry libp2p traffic, allowing attackers to censor such connections.
+
+  **Protocol Select** is combined with a change to the Multiaddr format,
+  advertising the secure channel protocol through the latter instead of
+  negotiating it in-band. Thus [Downgrade attack]s are no longer possible at
+  the protocol negotiation level and a man-in-the-middle can no longer detect
+  a connection being used for libp2p traffic through the negotiation process.
+
+- **Connection establishment**
+
+  In addition to making us vulnerable to downgrade attacks, negotiating the
+  security protocol takes one round-trip in the common case with **[Multistream
+  Select]**. On top of that, negotiating a stream multiplexer (on TCP) takes
+  another round-trip.
+
+  **Protocol Select** on the other hand depends on security protocols being
+  advertised, thereby eliminating the need for negotiating them. For optimized
+  implementations, stream muxer negotiation will take zero round-trips for the
+  client (depending on the details of the cryptographic handshake protocol). In
+  that case, the client will be able to immediately open a stream after
+  completing the cryptographic handshake. In addition, the protocol supports
+  zero-round-trip optimistic stream protocol negotiation when proposing a single
+  protocol.
+
+- **Data schema**
+
+  The **[Multistream Select]** protocol is defined as a plaintext protocol
+  with no strict schema definition, making both implementation and protocol
+  evolution time-consuming and error-prone. See [rust-libp2p/1795] showcasing
+  the complexity for implementors and [specs/196] showcasing the difficulty of
+  evolving the protocol.
+
+  The **Protocol Select** protocol will use a binary data format defined in a
+  machine parseable schema language allowing protocol evolution at the schema
+  level.
+
+- **Bandwidth**
+
+  **Multistream Select** is not as bandwidth efficient as it could be. For
+  example negotiating a protocol requires sending the protocol name back and
+  forth. For human readability protocol names are usually long strings (e.g.
+  `/ipfs/kad/1.0.0`).
+
+  **Protocol Select** will include the option to improve bandwidth efficiency
+  e.g. around protocol names in the future. While _Protocol Select_ will not
+  solve this in the first iteration, the protocol should be designed with this
+  optimization in mind, and allow for a smooth upgrade in a future iteration.
+
+## High-Level Overview
+
+### Secure Channel Selection
+
+Conversely to [Multistream Select], secure channel protocols are not dynamically
+negotiated in-band. Instead, they are announced upfront in the peer multiaddrs
+(**TODO**: add link to multiaddr spec). This way, implementations can jump
+straight into a cryptographic handshake, thus curtailing the possibility of
+packet-inspection-based censorship and dynamic downgrade attacks.
+
+Given that there is no in-band security protocol negotiation, nodes have to
+listen on different ports for each offered security protocol. As an example a
+node supporting both [Noise] and [TLS] over TCP will need to listen on two TCP
+ports e.g. `/ip6/2001:DB8::/tcp/9090/noise` and `/ip6/2001:DB8::/tcp/443/tls`.
+
+Advertising the secure channel protocol through the peer's Multiaddr instead of
+negotiating the protocol in-band forces users to advertise an updated Multiaddr
+when changing the secure channel protocol in use. This is especially cumbersome
+when using hardcoded Multiaddresses. Users may leverage the [dnsaddr] Multiaddr
+protocol as well as using a new UDP or TCP port for the new protocol to ease the
+transition.
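
As a rough illustration of how a dialer might read the secure channel protocol off an advertised address, here is a minimal sketch. It works on the textual multiaddr only and assumes the secure channel component is the last path segment, as in the two example addresses above. The helper name `security_protocol` and the `SECURITY_PROTOCOLS` set are hypothetical and not part of any libp2p or multiaddr library.

```python
from typing import Optional

# Assumption: only the two secure channels named in the examples above are known.
SECURITY_PROTOCOLS = {"noise", "tls"}


def security_protocol(multiaddr: str) -> Optional[str]:
    """Return the trailing secure channel component of a textual multiaddr.

    Returns None when the last component is not a known secure channel, in
    which case the address does not advertise one in the assumed position.
    """
    last = multiaddr.rstrip("/").rsplit("/", 1)[-1]
    return last if last in SECURITY_PROTOCOLS else None


noise_addr = "/ip6/2001:DB8::/tcp/9090/noise"
tls_addr = "/ip6/2001:DB8::/tcp/443/tls"
plain_addr = "/ip4/127.0.0.1/tcp/4001"
```

With such a helper, a dialer could start the matching cryptographic handshake directly after the TCP connection is established, without any in-band negotiation.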
+
+Note: A peer MAY advertise a Multiaddr that includes a secure channel handshake
+protocol like `/noise` even if it doesn't support Protocol Select. See the
+[Heuristics section](#heuristics) below for details on how listeners can
+differentiate the negotiation protocol spoken by the dialer on incoming
+connections.
+
+### TCP Simultaneous Open
+
+TCP allows the establishment of a single connection if two endpoints start
+initiating a connection at the same time. This is called _TCP Simultaneous
+Open_. Since many application protocols running on top of a connection (most
+notably the secure channel protocols e.g. TLS) assume their role (client /
+server) based on who initiated the connection, TCP Simultaneous Open connections
+need special handling. This special handling is described below, differentiating
+between two cases of TCP Simultaneous Open: coordinated and uncoordinated.
+
+#### Coordinated TCP Simultaneous Open
+
+When doing Hole Punching over TCP, the [_Direct Connection Upgrade through
+Relay_][DCUTR] protocol coordinates the two nodes to _simultaneously_ dial each
+other, thus, when successful, resulting in a TCP Simultaneous Open connection.
+The two nodes are assigned their role (client / server) out-of-band by the
+[_Direct Connection Upgrade through Relay_][DCUTR] protocol.
+
+#### Uncoordinated TCP Simultaneous Open
+
+In the uncoordinated case, where two nodes coincidentally simultaneously dial
+each other, resulting in a TCP Simultaneous Open connection, the secure channel
+protocol (e.g. TLS) will fail, given that both nodes assume to be in the
+initiating / client role. Nodes SHOULD close the connection and back off for a
+random amount of time before trying to reconnect.
+
+### Stream Multiplexer Selection
+
+This section only applies if Protocol Select is run over a transport that is not
+natively multiplexed. For transports that provide stream multiplexing on the
+transport layer (e.g. QUIC) this section should be ignored.
+
+#### Process
+
+First off, both endpoints, client and server, send a list of supported stream
+multiplexer protocols. Nodes SHOULD order the list by preference. Once an
+endpoint receives the list, the stream multiplexer to be used on the connection
+is determined by intersecting one's own and the remote list, as follows:
+
+1. All stream multiplexers that aren't supported by both endpoints are removed
+   from the client's list of stream multiplexers.
+
+2. The stream multiplexer chosen is then the first protocol of the client's
+   list.
+
+If there is no overlap between the two lists, the two endpoints can not
+communicate and thus both endpoints MUST close the connection.
+
+#### Early data optimization
+
+Some handshake protocols (TLS 1.3, Noise) support sending of *Early Data*. We
+use the term *Early Data* to mean any application data that is sent before the
+proper completion of the handshake.
+
+In Protocol Select endpoints make use of Early Data to speed up stream
+multiplexer selection. As soon as an endpoint reaches a state during the
+handshake where it can send encrypted application data, it sends a list of
+supported stream multiplexers. Note that depending on the handshake protocol
+used (and the optimisations implemented), either the client or the server might
+arrive at this state first.
+
+When using TLS 1.3, the server can send Early Data after it receives the
+ClientHello. Early Data is encrypted, but at this point of the handshake the
+client's identity is not yet verified.
+
+While Noise in principle allows sending of unencrypted data, endpoints MUST NOT
+use this to send their list of stream multiplexers. An endpoint MAY send it as
+soon as it is possible to send encrypted data, even if the peer's identity is
+not verified at that point.
+
+Handshake protocols (or implementations of handshake protocols) that don't
+support sending of Early Data will have to run the stream multiplexer selection
+after the handshake completes.
+
+#### Monoplexed connections
+
+This negotiation scheme allows peers to negotiate a "monoplexed" connection,
+i.e. a connection that doesn't use any stream multiplexer, if we decide to add
+support for this in the future. Endpoints can offer support for monoplexed
+connections by offering the `/monoplex` stream multiplexer.
+
+#### 0-RTT
+
+When using 0-RTT session resumption as offered by TLS 1.3 and Noise, clients
+SHOULD remember the stream multiplexer they used before and optimistically offer
+that muxer only. A client can then optimistically send application data, not
+waiting for the list of supported multiplexers by the server. If the server
+still supports the muxer, it will choose the muxer offered by the client when
+intersecting the two lists, and proceed with the connection. If not, the list
+intersection fails and the connection is closed, which needs to be handled by
+the upper protocols.
+
+## Transitioning from [Multistream Select]
+
+Protocol Select is incompatible with [Multistream Select], both in its
+semantics and on the wire. Live libp2p-based networks, currently using
+[Multistream Select], will need to follow the multiphased roll-out strategy
+detailed below to guarantee a smooth transition.
+
+### Multiphase Rollout
+
+#### Phase 1
+
+In the first phase of the transition from [Multistream Select] to Protocol
+Select, nodes in the network are upgraded to support both [Multistream Select]
+and Protocol Select when accepting inbound connections, i.e. when acting as a
+listener. Differentiating the two protocols as a listener is detailed in the
+[Heuristics](#heuristics) section below. Nodes, when dialing, MUST NOT yet use
+Protocol Select, but instead continue to use [Multistream Select].
+
+Once a large enough fraction of the network has upgraded, one can transition to
+phase 2.
+
+#### Phase 2
+
+With a large enough fraction of the network supporting Protocol Select on
+inbound connections, nodes MAY start using Protocol Select on outbound
+connections.
+
+After a large enough fraction of the network has upgraded, i.e. uses Protocol
+Select for outbound connections, one can transition to phase 3.
+
+#### Phase 3
+
+Given that a large enough fraction of the network uses Protocol Select for both
+out- and inbound connections, nodes can drop support for [Multistream Select]
+concluding the transition.
+
+### Additional Rollout Mechanisms
+
+When attempting to upgrade an outbound connection with Protocol Select to a node
+that does not yet support Protocol Select, the connection attempt will fail:
+
+* TCP: when the cryptographic handshake is started
+
+* QUIC: when the first stream is opened
+
+Implementations MAY implement a fallback mechanism: If the step described above
+fails, they close the current connection and dial a new connection, this time
+using [Multistream Select].
+
+Note that it takes one connection attempt to discover this failure and an
+additional attempt to perform the fallback. Thus this upgrade mechanism should
+only be used in addition to the above multiphase rollout to ease the transition.
+
+### Heuristics
+
+When accepting a connection, an endpoint doesn't know whether the remote peer is
+going to speak [Multistream Select] or Protocol Select to negotiate connection
+or stream protocols. The below first describes **at which stage** of a TCP and
+QUIC connection the two protocol negotiation protocols need to be
+differentiated, followed by **how** one can differentiate the two.
+
+#### TCP
+
+Note: Since we decouple the multiaddr change (TODO: Be more specific. What is
+the multiaddr change?) from support for Protocol Select, dialing a TCP based
+address that contains the security handshake protocol *does not* imply that
+we'll speak Protocol Select.
+
+The first message received on a freshly established and secured TCP connection
+will be a message trying to negotiate the stream muxer using either Protocol
+Select or [Multistream Select].
+
+#### QUIC
+
+Since QUIC neither negotiates a security nor a stream muxer protocol, we'll have
+to wait a bit longer before we can distinguish between [Multistream Select] and
+Protocol Select, namely until the client opens the first stream. Conversely,
+this means that a server won't be able to open a stream until it has determined
+which protocol is used.
+
+#### Protocol Differentiation
+
+Both Protocol Select and [Multistream Select] prefix their messages with the
+varint encoded message length. The first message sent by [Multistream Select] is
+`/multistream/1.0.0`. Implementations should read the first few bytes and
+proceed with either [Multistream Select] or Protocol Select depending on whether
+it equals `/multistream/1.0.0` or not.
+
+## Protocol Specification
+
+Messages are encoded according to the Protobuf definition below using the
+`proto2` syntax. The encoded messages are prefixed with their length in bytes,
+encoded as an unsigned variable length integer as defined by the [multiformats
+unsigned-varint spec][uvarint-spec].
+
+The top level message type is `ProtoSelect`. Both the `Offer` and the `Use`
+messages are wrapped with the `ProtoSelect` message at all times.
+
+```protobuf
+// Wraps every message
+message ProtoSelect {
+    oneof message {
+        Offer offer = 1;
+        Use use = 2;
+    }
+}
+
+// Select a list of protocols.
+message Offer {
+    message Protocol {
+        oneof protocol {
+            string name = 1;
+            uint64 id = 2;
+        }
+    }
+    repeated Protocol protocols = 1;
+}
+
+// Declare that a protocol is used on this stream.
+message Use {
+    message Protocol {
+        oneof protocol {
+            string name = 1;
+            uint64 id = 2;
+        }
+    }
+    optional Protocol protocol = 1;
+}
+```
+
+### Multiplexer Protocol Negotiation
+
+When negotiating a stream multiplexer protocol each endpoint MUST send the
+`Offer` message exactly once as the first message on a transport that does not
+support native stream multiplexing. This message MUST NOT be sent on transports
+that support native stream multiplexing (e.g. QUIC). Once an endpoint has both
+sent and received the `Offer` message, it determines the stream multiplexer to
+use on the connection as described in the [Stream Multiplexer
+Selection](#stream-multiplexer-selection) section. From this moment on, it
+has a multiplexed connection that can be used to exchange application data.
+
+Note that, since either peer may arrive at this point in the connection first
+depending on the handshake protocol in use, this is the only occurrence in the
+protocol where both peers send an `Offer` message, and the `Use` message is not
+used.
+
+An endpoint MUST treat the receipt of an empty `Offer` message as a connection
+error.
+
+### Stream Protocol Negotiation
+
+#### Initiator
+
+As the initiator of a new stream an endpoint uses the `Offer` message to
+initiate a conversation on a new stream. The `Offer` message can be used in two
+distinct scenarios:
+
+1. **Optimistic Protocol Negotiation**: The endpoint knows exactly which
+   protocol it wants to use. It then only lists this protocol in the `Offer`
+   message. It MAY start sending application data right after the protobuf
+   message. Since it has not received confirmation via a `Use` message stating
+   whether the remote peer actually supports the protocol, any such data might
+   be lost if the remote peer does not. This is also referred to as
+   _Optimistic Protocol Negotiation_.
+
+2. **Multi Protocol Negotiation**: The endpoint wants to use any of a set of
+   protocols, and lets the remote peer decide which one.
+   It then lists the set
+   of protocols in the `Offer` message. It MUST wait for the peer's choice of
+   the application protocol via a `Use` message before sending application data.
+
+An endpoint MUST treat both the receipt of a `Use` message before having sent an
+`Offer` message on the stream and the receipt of an empty `Use` message as a
+connection error.
+
+#### Listener
+
+The `Use` message is sent in response to the `Offer` message by the listening
+endpoint of a new stream. If the listening endpoint supports none of the
+protocol(s) listed in the `Offer` message, the endpoint MUST reset both the
+send- and the receive-side of the stream.
+
+1. **Optimistic Protocol Negotiation**: If an endpoint receives an `Offer`
+   message that only offers a single protocol, it accepts this protocol by
+   sending a `Use` message with that protocol only.
+
+2. **Multi Protocol Negotiation**: If an endpoint receives an `Offer` message
+   that offers multiple protocols, it chooses an application protocol that it
+   would like to speak on this stream. It informs the peer about its choice by
+   sending its selection in the `protocol` field of the `Use` message.
+
+An endpoint MUST treat the receipt of an empty `Offer` message as an error and
+close the stream. To ease protocol evolution, an empty `Offer` message is not
+considered a connection error.
+
+### Protocol Names vs. Protocol IDs
+
+Protocol Select allows specifying a protocol both by its _Protocol Name_ and
+by its _Protocol ID_, the former being human readable, the latter being
+bandwidth efficient. _Name_ and _ID_ of a protocol can be used interchangeably.
+The mapping between the two is defined in . Users might define
+their own _Protocol Name_ and _Protocol ID_ without updating ,
+though, to prevent conflicts with future libp2p standard protocols, _ID_ above
+XXX SHOULD be chosen. Protocols negotiated via [Multistream Select] today should
+use the [Multistream Select] protocol name as the Protocol Select _Protocol
+Name_.
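
The length-prefix framing from the Protocol Specification section and the check from the Protocol Differentiation section can be sketched as follows. This is an illustrative sketch, not a reference implementation: the helper names (`encode_uvarint`, `read_uvarint`, `frame`, `read_frame`, `speaks_multistream`) are hypothetical, the protobuf payload is treated as opaque bytes, and tolerating a trailing newline after `/multistream/1.0.0` is an assumption about what Multistream Select peers send.

```python
import io
from typing import BinaryIO

MULTISTREAM_HEADER = b"/multistream/1.0.0"


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint, low 7-bit group first."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_uvarint(stream: BinaryIO) -> int:
    """Read one unsigned varint from the stream, accepting at most 9 bytes."""
    result = 0
    for shift in range(0, 63, 7):
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a uvarint")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return result
    raise ValueError("uvarint longer than 9 bytes")


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length in bytes."""
    return encode_uvarint(len(payload)) + payload


def read_frame(stream: BinaryIO) -> bytes:
    """Read one length-prefixed message from the stream."""
    length = read_uvarint(stream)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside a message")
    return payload


def speaks_multistream(first_message: bytes) -> bool:
    """Decide which negotiation protocol the remote uses from its first message."""
    # Assumption: a trailing newline after the header is tolerated.
    return first_message.rstrip(b"\n") == MULTISTREAM_HEADER


first = read_frame(io.BytesIO(frame(b"/multistream/1.0.0\n")))
```

A listener could call `speaks_multistream(first)` once and then hand the already-read message to whichever negotiation implementation matches, so no bytes are lost by peeking.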
+
+#### Migration from _Protocol Name_ to _Protocol ID_
+
+To migrate protocols of a live network to be negotiated using their _Protocol
+IDs_ instead of _Protocol Names_ one might either:
+
+- Migrate along with the Protocol Select roll-out described in the
+  [Transitioning from Multistream
+  Select](#transitioning-from-multistream-select) section.
+
+- Use a separate phased roll-out strategy similar to the one described in the
+  [Transitioning from Multistream
+  Select](#transitioning-from-multistream-select) section.
+
+- Optimistically use _Protocol IDs_, retrying with _Protocol Names_ on failure.
+
+When not choosing the first, users likely want to combine the last two.
+
+## FAQ
+
+* _Why don't we define something more sophisticated for uncoordinated TCP
+  Simultaneous Open?_
+
+  We make use of TCP Simultaneous Open for NAT Traversal. In this situation, we
+  coordinate the roles of client and server using the DCUtR protocol, so there's
+  no need to do anything beyond that. The only situation where a Simultaneous
+  Open might otherwise occur in the wild is when two peers happen to dial each
+  other at the same time. This should occur rarely, and if it happens, a sane
+  strategy would be to re-dial the peer after a (randomized) exponential
+  backoff.
+
+* _Why don't we use the presence of a security protocol in the multiaddr to
+  signal support for Protocol Select?_
+
+  First of all, it's nice to keep unrelated parts of the system independent from
+  each other. More importantly though, the proposed logic only works with TCP
+  addresses. For QUIC, we didn't plan to change the multiaddr, so we'd have to
+  build logic to distinguish between multistream and Protocol Select anyway. We
+  _could_ change the multicodec for QUIC, but that would be yet another change
+  we'd tie into Protocol Select.
+
+* _Why statically out-of-band specify Protocol Name and Protocol ID mapping, why
+  not negotiate mapping in-band?_
+
+  An alternative approach to the proposed _Protocol Name_ _Protocol ID_ mapping
+  would be to have a dialer use a _Protocol Name_ at first. A listener could,
+  when replying with a `Use` specify a _Protocol Name_ _Protocol ID_ mapping.
+  One could then use the _Protocol ID_ instead of the _Protocol Name_ for future
+  negotiations on that same connection.
+
+  While this approach would relieve us of the need to specify the _Protocol Name_
+  _Protocol ID_ mapping in e.g. libp2p/specs, it does add state to be kept
+  across negotiations, thus complicating implementations and potentially
+  resulting in state-mismatch edge-cases. Another argument for the current
+  approach is that one already has to specify the _Protocol Name_ for each
+  protocol, the effort to specify a _Protocol ID_ in addition thus seems
+  negligible.
+
+* _Why not Offer and OfferMultiplexer?_
+* _Why did you use proto2 and not proto3?_
+* _Why Protocol IDs_?
+
+[Multistream Select]: https://github.com/multiformats/multistream-select
+[Noise]: https://github.com/libp2p/specs/tree/master/noise
+[TLS]: https://github.com/libp2p/specs/blob/master/tls/tls.md
+[DCUtR]: https://github.com/libp2p/specs/pull/173
+[uvarint-spec]: https://github.com/multiformats/unsigned-varint
+[dnsaddr]: https://github.com/multiformats/multiaddr/blob/master/protocols/DNSADDR.md
+
+

From 14d5f36966681eea3b6aa6e1ce89e281df698a9f Mon Sep 17 00:00:00 2001
From: Max Inden
Date: Tue, 13 Jul 2021 10:23:26 +0200
Subject: [PATCH 02/26] protocol-select/: Extend on uncoordinated TCP Simultaneous Open

---
 protocol-select/README.md | 41 +++++++++++++++------------------------
 1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/protocol-select/README.md b/protocol-select/README.md
index cad1d70bd..bc3790469 100644
--- a/protocol-select/README.md
+++ b/protocol-select/README.md
@@ -171,27 +171,12 @@ The two nodes are assigned their role (client / server) out-of-band by the

 In the uncoordinated case, where two nodes coincidentally simultaneously dial
 each other, resulting in a TCP Simultaneous Open connection, the secure channel
-protocol (e.g. TLS) will fail, given that both nodes assume to be in the
-initiating / client role. Nodes SHOULD close the connection and back off for a
-random amount of time before trying to reconnect.
-
-
+protocol handshake will fail, given that both nodes assume to be in the
+initiating / client role. E.g. in the case of TLS the protocol will report the
+receipt of a ClientHello while it expected a ServerHello. Once the security
+handshake failed due to TCP Simultaneous Open, i.e. due to both sides assuming
+to be the client, nodes SHOULD close the connection and back off for a random
+amount of time before trying to reconnect.

 ### Stream Multiplexer Selection

@@ -495,6 +480,14 @@ When not choosing the first, users likely want to combine the last two.
   strategy would be to re-dial the peer after a (randomized) exponential
   backoff.

+* _Why don't we use the peer IDs to break the tie on uncoordinated TCP
+  Simultaneous Open?_
+
+  We cannot assume that the remote peer knows our peer ID when it is dialing us.
+  While this is true in *most* cases, it is possible to dial a multiaddr without
+  knowing the peer's ID, and derive the ID from the information presented during
+  the handshake.
+
 * _Why don't we use the presence of a security protocol in the multiaddr to
   signal support for Protocol Select?_

@@ -504,7 +497,7 @@ When not choosing the first, users likely want to combine the last two.
   build logic to distinguish between multistream and Protocol Select anyway. We
   _could_ change the multicodec for QUIC, but that would be yet another change
   we'd tie into Protocol Select.
-
+
 * _Why statically out-of-band specify Protocol Name and Protocol ID mapping, why
   not negotiate mapping in-band?_

@@ -513,7 +506,7 @@ When not choosing the first, users likely want to combine the last two.
   when replying with a `Use` specify a _Protocol Name_ _Protocol ID_ mapping.
   One could then use the _Protocol ID_ instead of the _Protocol Name_ for future
   negotiations on that same connection.
-
+
   While this approach would relieve us of the need to specify the _Protocol Name_
   _Protocol ID_ mapping in e.g. libp2p/specs, it does add state to be kept
   across negotiations, thus complicating implementations and potentially
@@ -532,5 +525,3 @@ When not choosing the first, users likely want to combine the last two.
 [DCUtR]: https://github.com/libp2p/specs/pull/173
 [uvarint-spec]: https://github.com/multiformats/unsigned-varint
 [dnsaddr]: https://github.com/multiformats/multiaddr/blob/master/protocols/DNSADDR.md
-
-

From cb68970d6ca28d92cda18a525718620baf17416e Mon Sep 17 00:00:00 2001
From: Max Inden
Date: Tue, 13 Jul 2021 11:31:08 +0200
Subject: [PATCH 03/26] protocol-select/: Document protocol evolution

---
 protocol-select/README.md | 48 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/protocol-select/README.md b/protocol-select/README.md
index bc3790469..d1b1e79e7 100644
--- a/protocol-select/README.md
+++ b/protocol-select/README.md
@@ -339,15 +339,21 @@ Messages are encoded according to the Protobuf definition below using the
 encoded as an unsigned variable length integer as defined by the [multiformats
 unsigned-varint spec][uvarint-spec].

-The top level message type is `ProtoSelect`. Both the `Offer` and the `Use`
+The top level message type is `ProtoSelect`. With the current version of
+_Protocol Select_ detailed in this document, the `version` field of the
+`ProtoSelect` message is set to `1`. Implementations MUST reject messages
+with a `version` other than the current version. See [Protocol
+Evolution](#protocol-evolution) for details. Both the `Offer` and the `Use`
 messages are wrapped with the `ProtoSelect` message at all times.

 ```protobuf
 // Wraps every message
 message ProtoSelect {
+    optional uint32 version = 1;
+
     oneof message {
-        Offer offer = 1;
-        Use use = 2;
+        Offer offer = 2;
+        Use use = 3;
     }
 }

@@ -467,6 +473,41 @@ IDs_ instead of _Protocol Names_ one might either:

 When not choosing the first, users likely want to combine the last two.

+### Protocol Evolution
+
+While we can not foresee all future use-cases of _Protocol Select_, we can
+design _Protocol Select_ in a way to be easy to evolve, and thus be able to
+adapt _Protocol Select_ to support those unknown future use-cases.
+
+#### Non-Breaking Changes
+
+Non-breaking changes to the protocol can be done at the schema level, more
+specifically through the _Protocol Buffer_ framework. Instead of enumerating the
+various update mechanisms, we refer to the _[Updating a Message Type]_ section
+of the _Protocol Buffer_ specification.
+
+As an example for a non-breaking change, say we would like to exchange a made up
+name via the _Protocol Select_ protocol. We can simply extend the `ProtoSelect`
+message type by an `optional string name = 4;` field. Updated implementations
+would be able to extract the name from the payload, old implementations would
+simply ignore the new field.
+
+#### Breaking Changes
+
+When making breaking changes to the _Protocol Select_ protocol,
+implementations need to be able to differentiate the old and the new version on
+the wire. This is done via the `version` field in the `ProtoSelect` message,
+treated as an ordinal monotonically increasing number, with each increase
+identifying a new breaking version of the protocol.
+
+As an example for a made-up breaking change, say we would like the listed
+protocols in the `Offer` message to enumerate the protocols that the local node
+does *not* support. One would bump the `version` field by `1`. Implementations
+supporting both versions are able to differentiate an old and new version
+message. Implementations supporting only the old version would reject a new
+version message and fail the negotiation. Roll-out strategies need to cope with
+such negotiation failure, e.g. through retries with an older version.
+
 ## FAQ

 * _Why don't we define something more sophisticated for uncoordinated TCP
@@ -525,3 +566,4 @@ When not choosing the first, users likely want to combine the last two.
 [DCUtR]: https://github.com/libp2p/specs/pull/173
 [uvarint-spec]: https://github.com/multiformats/unsigned-varint
 [dnsaddr]: https://github.com/multiformats/multiaddr/blob/master/protocols/DNSADDR.md
+[Updating a Message Type]: https://developers.google.com/protocol-buffers/docs/proto#updating

From 03e9e27c8927bd64affd767ee24374be0b4065fb Mon Sep 17 00:00:00 2001
From: Max Inden
Date: Tue, 13 Jul 2021 12:38:49 +0200
Subject: [PATCH 04/26] protocol-select/: Restructure message flow

- Remove `Use` and `Offer` message type, embedding the list of protocols in the
  `ProtoSelect` message instead.
- Allow non-multiplexer protocols on first protocol negotiation.
- Mention nested stream protocol negotiation
- Send empty protocol list to say that one supports none of the offered
  protocols.
---
 protocol-select/README.md | 184 ++++++++++++++++----------------------
 1 file changed, 79 insertions(+), 105 deletions(-)

diff --git a/protocol-select/README.md b/protocol-select/README.md
index d1b1e79e7..ffa182dde 100644
--- a/protocol-select/README.md
+++ b/protocol-select/README.md
@@ -123,6 +123,23 @@ Select]_ protocol.

 ## High-Level Overview

+### Basic Flow
+
+Both endpoints, client and server, send a list of supported protocols. Whether
+an endpoint sends its list before or after it has received the remote's list
+depends on the context and is detailed below. Nodes SHOULD order the list by
+preference. Once an endpoint receives the list, the protocol to be used on the
+connection or stream is determined by intersecting one's own and the remote
+list, as follows:
+
+1. All protocols that aren't supported by both endpoints are removed from the
+   client's list of protocols.
+
+2. The protocol chosen is the first protocol of the client's list.
+
+If there is no overlap between the two lists, the two endpoints can not
+communicate and thus both endpoints MUST close the connection or stream.
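
The two selection steps above can be sketched as a small function. This is a minimal illustration under stated assumptions, not part of the specification: the name `select_protocol` is hypothetical, protocols are represented by their names only, and the multiplexer names in the usage lines are merely examples.

```python
from typing import Optional, Sequence


def select_protocol(client: Sequence[str], server: Sequence[str]) -> Optional[str]:
    """Return the first protocol of the client's list that the server also offers.

    Walking the client's list in order is equivalent to first removing every
    protocol not supported by both endpoints and then taking the first entry.
    """
    server_set = set(server)
    for protocol in client:  # the client's preference order decides
        if protocol in server_set:
            return protocol
    return None  # no overlap: the connection or stream has to be closed


client_list = ["/yamux/1.0.0", "/mplex/6.7.0"]
server_list = ["/mplex/6.7.0", "/yamux/1.0.0"]
chosen = select_protocol(client_list, server_list)
```

Because both endpoints know who the client is, each side can run the same function locally on the two lists and arrive at the same result without a further round-trip.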
+ ### Secure Channel Selection Conversely to [Multistream Select], secure channel protocols are not dynamically @@ -178,27 +195,15 @@ handshake failed due to TCP Simultaneous Open, i.e. due to both sides assuming to be the client, nodes SHOULD close the connection and back off for a random amount of time before trying to reconnect. -### Stream Multiplexer Selection +### Connection Protocol Negotiation This section only applies if Protocol Select is run over a transport that is not natively multiplexed. For transports that provide stream multiplexing on the transport layer (e.g. QUIC) this section should be ignored. -#### Process - -First off, both endpoints, client and server, send a list of supported stream -multiplexer protocols. Nodes SHOULD order the list by preference. Once an -endpoint receives the list, the stream multiplexer to be used on the connection -is determined by intersecting ones own and the remote list, as follows: - -1. All stream multiplexers that aren't supported by both endpoints are removed - from the clients' list of stream multiplexers. - -2. The stream multiplexer chosen is then the first protocol of the client's - list. - -If there is no overlap between the two lists, the two endpoints can not -communicate and thus both endpoints MUST close the connection. +While the first protocol to be negotiated on a non-multiplexed connection is +currently always a multiplexer protocol, future libp2p versions might want to +negotiate non-multiplexer protocols as the first protocol on a connection. #### Early data optimization @@ -206,43 +211,76 @@ Some handshake protocols (TLS 1.3, Noise) support sending of *Early Data*. We use the term *Early Data* to mean any application data that is sent before the proper completion of the handshake. -In Protocol Select endpoints make use of Early Data to speed up stream -multiplexer selection. 
As soon as an endpoints reaches a state during the -handshake where it can send encrypted application data, it sends a list of -supported stream multiplexers. Note that depending on the handshake protocol -used (and the optimisations implemented), either the client or the server might -arrive at this state first. +In _Protocol Select_ endpoints make use of Early Data to speed up protocol +negotiation. As soon as an endpoints reaches a state during the handshake where +it can send encrypted application data, it sends a list of supported protocols, +no matter whether it is in the role of a client or server. Note that depending +on the handshake protocol used (and the optimisations implemented), either the +client or the server might arrive at this state first. When using TLS 1.3, the server can send Early Data after it receives the ClientHello. Early Data is encrypted, but at this point of the handshake the client's identity is not yet verified. While Noise in principle allows sending of unencrypted data, endpoints MUST NOT -use this to send their list of stream multiplexers. An endpoint MAY send it as -soon it is possible to send encrypted data, even if the peers' identity is not -verified at that point. +use this to send their list of protocols. An endpoint MAY send it as soon it is +possible to send encrypted data, even if the peers' identity is not verified at +that point. Handshake protocols (or implementations of handshake protocols) that don't -support sending of Early Data will have to run the stream multiplexer selection -after the handshake completes. - -#### Monoplexed connections - -This negotiation scheme allows peers to negotiate a "monoplexed" connection, -i.e. a connection that doesn't use any stream multiplexer, if we decide to add -support for this in the future. Endpoints can offer support for monoplexed -connections by offering the `/monoplex` stream multiplexer. 
+support sending of Early Data will have to run the protocol negotiation after +the handshake completes. #### 0-RTT When using 0-RTT session resumption as offered by TLS 1.3 and Noise, clients -SHOULD remember the stream multiplexer they used before and optimistically offer -that muxer only. A client can then optimistically send application data, not -waiting for the list of supported multiplexers by the server. If the server -still supports the muxer, it will choose the muxer offered by the client when -intersecting the two lists, and proceed with the connection. If not, the list -intersection fails and the connection is closed, which needs to be handled by -the upper protocols. +SHOULD remember the protocol they used before and optimistically offer that +muxer only. A client can then optimistically send application data, not waiting +for the list of supported protocols by the server. If the server still supports +the muxer, it will choose the muxer offered by the client when intersecting the +two lists, and proceed with the connection. If not, the list intersection fails +and the connection is closed, which needs to be handled by the upper protocols. + +### Stream Protocol Negotiation + +Contrary to the above [Connection Protocol +Negotiation](#Connection-Protocol-Negotiation) and its early data optimization, +we assume that the initiator of a stream is always the endpoint able to send +data first. + +Note: While libp2p currently does not support nested stream protocols, e.g. a +compression protocol wrapping bitswap, future versions of libp2p might change +that. The above assumption of the initiator being the endpoint to send data +first, does not apply to protocol negotiations following the first negotiation +on a stream. + +#### Initiator + +The initiator of a new stream is the first endpoint to send a message. We +differentiate in the following two scenarios. + +1. 
**Optimistic Protocol Negotiation**: The endpoint knows exactly which + protocol it wants to use. It then only sends this protocol. It MAY start + sending application data right after the _Protocol Select_ protobuf message. + Since it has not received confirmation from the remote peer for the protocol, + any such data might be lost in such case. + +2. **Multi Protocol Negotiation**: The endpoint wants to use any of a set of + protocols, and lets the remote peer decide which one. It then sends the list + of protocols. It MUST wait for the peer's protocol choice before sending + application data. + +An initiator MUST treat the receipt of an empty list of protocols in response to +its list of protocols as a negotiation failure and thus a stream error. + +#### Listener + +The listening endpoint replies to a list of protocols from the initiator by +either: + +- Sending back a single entry list with the protocol it would like to speak. + +- Rejecting all proposed protocols by replying with an empty list of protocols. ## Transitioning from [Multistream Select] @@ -380,70 +418,6 @@ message Use { } ``` -### Multiplexer Protocol Negotiation - -When negotiating a stream multiplexer protocol each endpoint MUST send the -`Offer` message exactly once as the first message on a transport that does not -support native stream multiplexing. This message MUST NOT be sent on transports -that support native stream multiplexing (e.g. QUIC). Once an endpoint has both -sent and received the `Offer` message, it determines the stream multiplexer to -use on the connection as described in the [Stream Multiplexer -Selection](#Stream-Multiplexer-Selection) section. From this moment on, it now -has a multiplexed connection that can be used to exchange application data. 
- -Note that since depending on the handshake protocol in use, either peer may -arrive at this point in the connectionfirst, this is the only occurrence in the -protocol where both peers send an `Offer` message, and the `Use` message is not -used. - -An endpoint MUST treat the receipt of an empty `Offer` message as a connection -error. - -### Stream Protocol Negotiation - - -#### Initiator - -As the initiator of a new stream an endpoint uses the `Offer` message to -initiate a conversation on a new stream. The `Offer` message can be used in two -distinct scenarios: - -1. **Optimistic Protocol Negotiation**: The endpoint knows exactly which - protocol it wants to use. It then only lists this protocol in the `Offer` - message. It MAY start sending application data right after the protobuf - message. Since it has not received confirmation via an `Use` message stating - whether the remote peer actually supports the protocol, any such data might - be lost in such case. This is also referred to as _Optimistic Protocol - Negotiation_. - -2. **Multi Protocol Negotiation**: The endpoint wants to use any of a set of - protocols, and lets the remote peer decide which one. It then lists the set - of protocols in the `Offer` message. It MUST wait for the peer's choice of - the application protocol via a `Use` message before sending application data. - -An endpoint MUST treat the receipt of a `Use` message before having sent an -`Offer` message on the stream and an empty `Use` message as a connection error. - -#### Listener - -The `Use` message is sent in response to the `Offer` message by the listening -endpoint of a new stream. If the listening endpoint supports none of the -protocol(s) listed in the `Offer` message, the endpoint MUST reset both the -send- and the receive-side of the stream. - -1. 
**Optimistic Protocol Negotiation**: If an endpoint receives an `Offer` - message that only offers a single protocol, it accepts this protocol by - sending a `Use` message with that protocol only. - -2. **Multi Protocol Negotiation**: If an endpoint receives an `Offer` message - that offers multiple protocols, it chooses an application protocol that it - would like to speak on this stream. It informs the peer about its choice by - sending its selection in the `protocol` field of the `Use` message. - -An endpoint MUST treat the receipt of an empty `Offer` message as an error and -close the stream. To ease protocol evolution, an empty `Offer` message is not -considered a connection error. - ### Protocol Names vs. Protocol IDs Protocol Select allows to specify a protocol both by its _Protocol Name_ and From 780570b16856142e21da5614cf6d184a9b812aff Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 13 Jul 2021 13:51:38 +0200 Subject: [PATCH 05/26] protocol-select/: Move Protocol IDs to Extension section --- protocol-select/README.md | 121 +++++++++++++++++++------------------- 1 file changed, 59 insertions(+), 62 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index ffa182dde..43769217a 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -32,15 +32,17 @@ and spec status. 
- [Introduction](#introduction) - [Improvements over _[Multistream Select]_](#improvements-over-_multistream-select_) - [High-Level Overview](#high-level-overview) + - [Basic Flow](#basic-flow) - [Secure Channel Selection](#secure-channel-selection) - [TCP Simultaneous Open](#tcp-simultaneous-open) - [Coordinated TCP Simultaneous Open](#coordinated-tcp-simultaneous-open) - [Uncoordinated TCP Simultaneous Open](#uncoordinated-tcp-simultaneous-open) - - [Stream Multiplexer Selection](#stream-multiplexer-selection) - - [Process](#process) + - [Connection Protocol Negotiation](#connection-protocol-negotiation) - [Early data optimization](#early-data-optimization) - - [Monoplexed connections](#monoplexed-connections) - [0-RTT](#0-rtt) + - [Stream Protocol Negotiation](#stream-protocol-negotiation) + - [Initiator](#initiator) + - [Listener](#listener) - [Transitioning from [Multistream Select]](#transitioning-from-multistream-select) - [Multiphase Rollout](#multiphase-rollout) - [Phase 1](#phase-1) @@ -52,12 +54,11 @@ and spec status. - [QUIC](#quic) - [Protocol Differentiation](#protocol-differentiation) - [Protocol Specification](#protocol-specification) - - [Multiplexer Protocol Negotiation](#multiplexer-protocol-negotiation) - - [Stream Protocol Negotiation](#stream-protocol-negotiation) - - [Initiator](#initiator) - - [Listener](#listener) - - [Protocol Names vs. Protocol IDs](#protocol-names-vs-protocol-ids) - - [Migration from _Protocol Name_ to _Protocol ID_](#migration-from-_protocol-name_-to-_protocol-id_) + - [Protocol Evolution](#protocol-evolution) + - [Non-Breaking Changes](#non-breaking-changes) + - [Breaking Changes](#breaking-changes) + - [Extensions](#extensions) + - [Protocol IDs](#protocol-ids) - [FAQ](#faq) ## Introduction @@ -377,75 +378,28 @@ Messages are encoded according to the Protobuf definition below using the encoded as an unsigned variable length integer as defined by the [multiformats unsigned-varint spec][uvarint-spec]. 
-The top level message type is `ProtoSelect`. With the current version of -_Protocol Select_ detailed in this document, the `version` field of the -`ProtocolSelect` message is set to `1`. Implementations MUST reject messages +Messages are encoded via the `ProtoSelect` message type. With the current +version of _Protocol Select_ detailed in this document, the `version` field of +the `ProtocolSelect` message is set to `1`. Implementations MUST reject messages with a `version` other than the current version. See [Protocol Evolution](#Protocol-Evolution) for details. Both the `Offer` and the `Use` messages are wrapped with the `ProtocolSelect` message at all time. ```protobuf -# Wraps every message message ProtoSelect { uint32 version = 1; - oneof message { - Offer offer = 2; - Use use = 3; - } -} - -# Select a list of protocols. -message Offer { message Protocol { oneof protocol { string name = 1; - uint64 id = 2; } } - repeated Protocol protocols = 1; -} - -# Declare that a protocol is used on this stream. -message Use { - message Protocol { - oneof protocol { - string name = 1; - uint64 id = 2; - } - } - Protocol protocol = 1; + repeated Protocol protocols = 2; } ``` -### Protocol Names vs. Protocol IDs - -Protocol Select allows to specify a protocol both by its _Protocol Name_ and -_Protocol ID_. The former being human readable, the latter being bandwidth -efficient. _Name_ and _ID_ of a protocol can be used interchangeably. The -mapping between the two is defined in . Users might define -their own _Protocol Name_ and _Protocol ID_ without updating , -though, to prevent conflicts with future libp2p standard protocols, _ID_ above -XXX SHOULD be chosen. Protocols negotiated via [Multistream Select] today should -use the [Multistream Select] protocol name as the Protocol Select _Protocol -Name_. 
- -#### Migration from _Protocol Name_ to _Protocol ID_ - -To migrate protocols of a live network to be negotiated using their _Protocol -IDs_ instead of _Protocol Names_ one might either: - -- Migrate along with the Protocol Select roll-out described in the - [Transitioning from Multistream - Select](#Transitioning-from-Multistream-Select) section. - -- Use a separate phased roll-out strategy similar to the one described in - described in the [Transitioning from Multistream - Select](#Transitioning-from-Multistream-Select) section. - -- Optimistically use _Protocol IDs_, retrying with _Protocol Names_ on failure. - -When not choosing the first, users likely want to combine the last two. + ### Protocol Evolution @@ -482,6 +436,48 @@ message. Implementations supporting only the old version would reject a new version message and fail the negotiation. Roll-out strategies need to cope with such negotiation failure, e.g. through retries with an older version. + +## Extensions + +### Protocol IDs + +The first version of _Protocol Select_ will allow specifying protocols by their +_Protocol Name_, i.e. human readable string representation, only. In order to +optimize on bandwidth, future versions might introduce alternative +representations in a non-breaking manner. + +More specifically, this extension would allow specifying protocols by their +_Protocol ID_. A _Protocol ID_ is a [Multicodec] or a combination of +[Multicodec]s. Implementations can specify a protocol either via a _Protocol +Name_ or a _Protocol ID_ by extending the `Protocol` message type definition as +follows: + +```diff +message Protocol { + oneof protocol { + string name = 1; ++ uint64 id = 2; + } +} +``` + +_Protocol Name_ and _Protocol ID_ can be used interchangeably. 
To ease roll-out +of a _Protocol ID_ for a protocol that has previously been negotiated via its +_Protocol Name_, one might leverage one (or multiple) of the following +mechanisms: + +- Extending the libp2p identify protocol, allowing nodes to announce their + supported protocols both by _Protocol Name_ and _Protocol ID_, thus signaling + the support for the _Protocol ID_ extension for the concrete protocols. + +- Including a protocol both by its _Protocol Name_ and _Protocol ID_ in the list + of supported protocols. + + Note, when optimistically negotiating a stream protocol as an initiator, with a + remote which might or might not support a protocol's _Protocol ID_, one can + send a list containing both the _Protocol Name_ and the _Protocol ID_ for the + same protocol and directly optimistically send application data. + ## FAQ * _Why don't we define something more sophisticated for uncoordinated TCP @@ -541,3 +537,4 @@ such negotiation failure, e.g. through retries with an older version. [uvarint-spec]: https://github.com/multiformats/unsigned-varint [dnsaddr]: https://github.com/multiformats/multiaddr/blob/master/protocols/DNSADDR.md [Updating a Message Type]: https://developers.google.com/protocol-buffers/docs/proto#updating +[Multicodec]: https://github.com/multiformats/multicodec From c9f6814f404def8534b944df7372ef6021021233 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 13 Jul 2021 17:32:36 +0200 Subject: [PATCH 06/26] protocol-select: Replace mention of muxer with generic protocol --- protocol-select/README.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 43769217a..07e236020 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -119,8 +119,8 @@ Select]_ protocol. **Protocol Select** will include the option to improve bandwidth efficiency e.g. around protocol names in the future. 
While _Protocol Select_ will not - solve this in the first iteration, the protocol should be designed with this - optimization in mind, and allow for a smooth upgrade in a future iteration. + solve this in the first iteration, the protocol is designed with this + optimization in mind, and allows for a smooth upgrade in a future iteration. ## High-Level Overview @@ -129,9 +129,9 @@ Select]_ protocol. Both endpoints, client and server, send a list of supported protocols. Whether an endpoint sends its list before or after it has received the remote's list depends on the context and is detailed below. Nodes SHOULD order the list by -preference. Once an endpoint receives the list, the protocol to be used on the -connection or stream is determined by intersecting ones own and the remote list, -as follows: +preference. Once an endpoint receives a list from a remote, the protocol to be +used on the connection or stream is determined by intersecting ones own and the +remote list, as follows: 1. All protocols that aren't supported by both endpoints are removed from the clients' list of protocols. @@ -236,9 +236,9 @@ the handshake completes. When using 0-RTT session resumption as offered by TLS 1.3 and Noise, clients SHOULD remember the protocol they used before and optimistically offer that -muxer only. A client can then optimistically send application data, not waiting +protocol only. A client can then optimistically send application data, not waiting for the list of supported protocols by the server. If the server still supports -the muxer, it will choose the muxer offered by the client when intersecting the +the protocol, it will choose the protocol offered by the client when intersecting the two lists, and proceed with the connection. If not, the list intersection fails and the connection is closed, which needs to be handled by the upper protocols. @@ -252,8 +252,8 @@ data first. Note: While libp2p currently does not support nested stream protocols, e.g. 
a compression protocol wrapping bitswap, future versions of libp2p might change that. The above assumption of the initiator being the endpoint to send data -first, does not apply to protocol negotiations following the first negotiation -on a stream. +first, does not apply to protocol negotiations following the first negotiation - +a nested negotiation - on a stream. #### Initiator From aea6579db814d940171982bd0dcd85ccaa55dead Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 13 Jul 2021 17:37:20 +0200 Subject: [PATCH 07/26] protocol-select: Extend FAQ section --- protocol-select/README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 07e236020..4860f4a28 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -336,6 +336,7 @@ Note that it takes one connection attempt to discover this failure and an additional attempt to perform the fallback. Thus this upgrade mechanism should only be used in addition to the above multiphase rollout to ease the transition. + ### Heuristics When accepting a connection, an endpoint doesn't know whether the remote peer is @@ -526,9 +527,9 @@ mechanisms: protocol, the effort to specify a _Protocol ID_ in addition thus seems negligible. -* _Why not Offer and OfferMultiplexer?_ +* _Why not two messages, e.g. `Offer` and `Use`?_ * _Why did you use proto2 and not proto3?_ -* _Why Protocol IDs_? +* _Why not include Protocol IDs from the start_? 
[Multistream Select]: https://github.com/multiformats/multistream-select [Noise]: https://github.com/libp2p/specs/tree/master/noise From a814240033e2f0aac80cb599ce7236eb50172b96 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Mon, 19 Jul 2021 16:25:27 +0200 Subject: [PATCH 08/26] protocol-select/README.md: Fix typo Co-authored-by: Thomas Eizinger --- protocol-select/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 4860f4a28..afb69134e 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -430,7 +430,7 @@ treated as an ordinal monotonically increasing number, with each increase identifying a new breaking version of the protocol. As an example for a made-up breaking change, say we would like the listed -protocols in the `Offer` message to enumerate the protocols that the local node +protocols in the `ProtoSelect` message to enumerate the protocols that the local node does *not* support. One would bump the `version` field by `1`. Implementations supporting both versions are able to differentiate an old and new version message. Implementations supporting only the old version would reject a new From 1a9adc5244d63ec930f9365f93743517483a1b8f Mon Sep 17 00:00:00 2001 From: Max Inden Date: Mon, 19 Jul 2021 16:26:20 +0200 Subject: [PATCH 09/26] protocol-select/README.md: Fix typo Co-authored-by: Adrian Lanzafame --- protocol-select/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index afb69134e..518e64c6e 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -358,7 +358,7 @@ Select or [Multistream Select]. 
#### QUIC -Since QUIC neither negotiates a security nor a stream muxer protocol, we'll have +Since QUIC neither negotiates a security nor a stream muxer protocol, we'll have to wait a bit longer before we can distinguish between [Multistream Select] and Protocol Select, namely until the client opens the first stream. Conversely, this means that a server won't be able to open a stream until it has determined From 9bfe1159b4acb3e61c61ee0b5defb4ac5f1c164b Mon Sep 17 00:00:00 2001 From: Marten Seemann Date: Sun, 25 Jul 2021 16:50:48 +0200 Subject: [PATCH 10/26] split the multiaddr change out of this spec --- protocol-select/README.md | 69 +++++++-------------------------------- 1 file changed, 12 insertions(+), 57 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 518e64c6e..7a29a27a7 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -69,31 +69,11 @@ Select]_ protocol. ### Improvements over _[Multistream Select]_ -- **Downgrade attacks** and **censorship resistance** - - Given that **[Multistream Select]** negotiates a connection's security - protocol unencrypted and unauthenticated it is prone to [downgrade attack]s. - In addition, a man-in-the-middle can detect that a given connection is used - to carry libp2p traffic, allowing attackers to censor such connections. - - **Protocol Select** is combined with a change to the Multiaddr format, - advertising the secure channel protocol through the latter instead of - negotiating them in-band. Thus [Downgrade attack]s are no longer possible at - the protocol negotiation level and a man-in-the-middle can no longer detect - a connection being used for libp2p traffic through the negotation process. - -- **Connection establishment** - - In addition to making us vulnerable to downgrade attacks, negotiating the - security protocol takes one round-trip in the common case with **[Multistream - Select]**. 
On top of that negotiating a stream multiplexer (on TCP) takes - another round-trip. - - **Protocol Select** on the other hand depends on security protocols being - advertised, thereby eliminating the need for negotiating them. For optimized - implementations, stream muxer negotiation will take zero round-trips for the - client (depending on the details of the cryptographic handshake protocol). In - that case, the client will be able to immediately open a stream after +- **Protocol Select** requires security protocols to be advertised, and doesn't + allow negotiation them. For optimized implementations, stream muxer + negotiation will take zero round-trips for the client (depending on the + details of the cryptographic handshake protocol). + In that case, the client will be able to immediately open a stream after completing the cryptographic handshake. In addition the protocol supports zero-round-trip optimistic stream protocol negotiation when proposing a single protocol. @@ -144,28 +124,8 @@ communicate and thus both endpoints MUST close the connection or stream. ### Secure Channel Selection Conversely to [Multistream Select], secure channel protocols are not dynamically -negotiated in-band. Instead, they are announced upfront in the peer multiaddrs -(**TODO**: add link to multiaddr spec). This way, implementations can jump -straight into a cryptographic handshake, thus curtailing the possibility of -packet-inspection-based censorship and dynamic downgrade attacks. - -Given that there is no in-band security protocol negotiation, nodes have to -listen on different ports for each offered security protocol. As an example a -node supporting both [Noise] and [TLS] over TCP will need to listen on two TCP -ports e.g. `/ip6/2001:DB8::/tcp/9090/noise` and `/ip6/2001:DB8::/tcp/443/tls`. 
- -Advertising the secure channel protocol through the peer's Multiaddr instead of -negotiating the protocol in-band forces users to advertise an updated Multiaddr -when changing the secure channel protocol in use. This is especially cumbersome -when using hardcoded Multiaddresses. Users may leverage the [dnsaddr] Multiaddr -protocol as well as using a new UDP or TCP port for the new protocol to ease the -transition. - -Note: A peer MAY advertise a Multiaddr that includes a secure channel handshake -protocol like `/noise` even if it doesn't support Protocol Select. See -[Heuristic section](#heuristic) below for details on how listeners can -differentiate the negotiation protocol spoken by the dialer on incoming -connections. +negotiated in-band. Implementations MUST advertise multiaddr containing the +security protocol, as described in the [multiaddr spec](https://github.com/libp2p/specs/pull/353). ### TCP Simultaneous Open @@ -347,22 +307,17 @@ differentiated, followed by **how** one can differentiate the two. #### TCP -Note: Since we decouple the multiaddr change (TODO: Be more specific. What is -the multiaddr change?) from support for Protocol Select, dialing a TCP based -address that contains the security handshake protocol *does not* imply that -we'll speak Protocol Select. - The first message received on a freshly established and secured TCP connection will be a message trying to negotiate the stream muxer using either Protocol Select or [Multistream Select]. #### QUIC -Since QUIC neither negotiates a security nor a stream muxer protocol, we'll have to -wait a bit longer before we can distinguish between [Multistream Select] and -Protocol Select, namely until the client opens the first stream. Conversely, -this means that a server won't be able to open a stream until it has determined -which protocol is used. +Since QUIC provides native stream multiplexing, there's no need to negotiate +a stream multiplexer. 
We therefore have to wait a bit longer before we can +distinguish between [Multistream Select] and Protocol Select, namely until +the client opens the first stream. Conversely, this means that a server won't +be able to open a stream until it has determined which protocol is used. #### Protocol Differentiation From 4ccea37af7cb6377999be6d2ce0db13e60d26efc Mon Sep 17 00:00:00 2001 From: Marten Seemann Date: Sun, 25 Jul 2021 17:04:21 +0200 Subject: [PATCH 11/26] reword description of the challenges associated with upgrading multistream --- protocol-select/README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 518e64c6e..0115f3a4f 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -100,11 +100,11 @@ Select]_ protocol. - **Data schema** - The **[Multistream Select]** protocol is defined as a plaintext protocol - with no strict schema definition, making both implementation and protocol - evolution time consuming and error-prone. See [rust-libp2p/1795] showcasing - complexity for implementors and [specs/196] to showcase difficulty evolving - protocol. + The **[Multistream Select]** protocol is defined as a bespoke format that + doesn't clearly delineate between different atomic messages,making both + implementation and protocol evolution time consuming and error-prone. + See [rust-libp2p/1795] showcasing complexity for implementors and [specs/196] + to showcase difficulty evolving protocol. 
The **Protocol Select** protocol will use a binary data format defined in a machine parseable schema language allowing protocol evolution at the schema From 79138303d007afc9f721cba1d18cb05747eec6e7 Mon Sep 17 00:00:00 2001 From: Marten Seemann Date: Mon, 26 Jul 2021 21:07:23 +0200 Subject: [PATCH 12/26] add heading for Protocol Select connection establishment Co-authored-by: Max Inden --- protocol-select/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 7a29a27a7..32ac6ee83 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -69,7 +69,9 @@ Select]_ protocol. ### Improvements over _[Multistream Select]_ -- **Protocol Select** requires security protocols to be advertised, and doesn't +- **Connection establishment** + + **Protocol Select** requires security protocols to be advertised, and doesn't allow negotiation them. For optimized implementations, stream muxer negotiation will take zero round-trips for the client (depending on the details of the cryptographic handshake protocol). 
From f589bef0c7a47719d308da489d5d35d4a1af273f Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 11:37:49 +0200 Subject: [PATCH 13/26] protocol-select/README.md: Use bytes for Protocol ID Co-authored-by: Steven Allen --- protocol-select/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index d689d6a04..f4eb4790c 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -414,7 +414,7 @@ follows: message Protocol { oneof protocol { string name = 1; -+ uint64 id = 2; ++ bytes id = 2; } } ``` From 843e76097a50f7bad3693f31e24fe64031f0a865 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 11:49:21 +0200 Subject: [PATCH 14/26] protocols-select/: Describe remote accepting Protocol ID --- protocol-select/README.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index f4eb4790c..035a255bf 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -431,10 +431,12 @@ mechanisms: - Including a protocol both by its _Protocol Name_ and _Protocol ID_ in the list of supported protocols. - Note, when optimistically negotiating a stream protocol as an initiator, with a - remote which might or might not support a protocol's _Protocol ID_, one can + Note, when optimistically negotiating a stream protocol as an initiator, with + a remote which might or might not support a protocol's _Protocol ID_, one can send a list containing both the _Protocol Name_ and the _Protocol ID_ for the - same protocol and directly optimistically send application data. + same protocol and directly optimistically send application data. The remote + signals whether it supports the _Protocol Name_ or the _Protocol ID_ by + accepting either in their response. 
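To illustrate the note above, here is a rough sketch of how a listener might answer an offer that lists the same protocol by both _Protocol Name_ and _Protocol ID_. Everything in it is an assumption for illustration: the `Protocol` dataclass merely mirrors the `oneof` in the proposed message, `listener_reply` is a hypothetical helper, and picking the first supported entry in the initiator's order is one possible policy, not something the draft mandates.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Protocol:
    # Mirrors the proposed `oneof protocol`: exactly one field is expected to be set.
    name: Optional[str] = None
    id: Optional[bytes] = None


def listener_reply(offered: List[Protocol],
                   supported_names: Set[str],
                   supported_ids: Set[bytes]) -> List[Protocol]:
    """Return a single-entry list with the accepted protocol, or an empty list.

    An empty list signals that none of the offered protocols is supported.
    """
    for entry in offered:
        if entry.id is not None and entry.id in supported_ids:
            return [entry]
        if entry.name is not None and entry.name in supported_names:
            return [entry]
    return []
```

A listener without the _Protocol ID_ extension would have an empty `supported_ids` set and therefore accept the name entry, while an upgraded listener accepts whichever form the initiator listed first.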
## FAQ From b93ff637bee0fcacb60a66f54f59ba1fccc5ad6f Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 11:52:55 +0200 Subject: [PATCH 15/26] protocol-select/: Wrap lines --- protocol-select/README.md | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 035a255bf..ebc38b3bd 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -71,22 +71,21 @@ Select]_ protocol. - **Connection establishment** - **Protocol Select** requires security protocols to be advertised, and doesn't + **Protocol Select** requires security protocols to be advertised, and doesn't allow negotiation them. For optimized implementations, stream muxer negotiation will take zero round-trips for the client (depending on the - details of the cryptographic handshake protocol). - In that case, the client will be able to immediately open a stream after - completing the cryptographic handshake. In addition the protocol supports - zero-round-trip optimistic stream protocol negotiation when proposing a single - protocol. + details of the cryptographic handshake protocol). In that case, the client + will be able to immediately open a stream after completing the cryptographic + handshake. In addition the protocol supports zero-round-trip optimistic stream + protocol negotiation when proposing a single protocol. - **Data schema** The **[Multistream Select]** protocol is defined as a bespoke format that - doesn't clearly delineate between different atomic messages,making both - implementation and protocol evolution time consuming and error-prone. - See [rust-libp2p/1795] showcasing complexity for implementors and [specs/196] - to showcase difficulty evolving protocol. + doesn't clearly delineate between different atomic messages,making both + implementation and protocol evolution time consuming and error-prone. 
See + [rust-libp2p/1795] showcasing complexity for implementors and [specs/196] to + showcase difficulty evolving protocol. The **Protocol Select** protocol will use a binary data format defined in a machine parseable schema language allowing protocol evolution at the schema From 6eb00236cd4c25d244d41185a231f3122f59c2d8 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 11:57:29 +0200 Subject: [PATCH 16/26] protocol-select/: Add FAQ entry on Protocol IDs --- protocol-select/README.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/protocol-select/README.md b/protocol-select/README.md index ebc38b3bd..bdaf9baa7 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -489,6 +489,11 @@ mechanisms: * _Why did you use proto2 and not proto3?_ * _Why not include Protocol IDs from the start_? + _Protocol IDs_ are part of the initial _Protocol Select_ version to reduce + complexity and thus ease the initial roll-out. As detailed above, introducing + _Protocol IDs_ at a later stage can be done with low coordination and + performance overhead. + [Multistream Select]: https://github.com/multiformats/multistream-select [Noise]: https://github.com/libp2p/specs/tree/master/noise [TLS]: https://github.com/libp2p/specs/blob/master/tls/tls.md From 24c9ea2efc91738c9ee7105ac7165af7de59a5d1 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 12:40:40 +0200 Subject: [PATCH 17/26] protocol-select/: Add FAQ entry on single message type --- protocol-select/README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/protocol-select/README.md b/protocol-select/README.md index bdaf9baa7..f5bf41091 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -486,6 +486,14 @@ mechanisms: negligible. * _Why not two messages, e.g. 
`Offer` and `Use`?_ + + Differentiating the two roles of an endpoint offering a list of protocols and + an endpoint accepting a protocol at the type level through an `Offer` and a + `Use` message would increase type safety. At the same time, as described in + the [Basic Flow section](#basic-flow) whether there is a clear cut between the + two roles depends on the context. Thus, to reduce complexity a single message + type is used. + * _Why did you use proto2 and not proto3?_ * _Why not include Protocol IDs from the start_? From 849b9933520dc641118129229de02196ee75f734 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 12:48:58 +0200 Subject: [PATCH 18/26] protocol-select/: Add FAQ entry on proto2 vs proto3 --- protocol-select/README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/protocol-select/README.md b/protocol-select/README.md index f5bf41091..f087a83a2 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -343,6 +343,8 @@ Evolution](#Protocol-Evolution) for details. Both the `Offer` and the `Use` messages are wrapped with the `ProtocolSelect` message at all time. ```protobuf +syntax = "proto2"; + message ProtoSelect { uint32 version = 1; @@ -495,6 +497,12 @@ mechanisms: type is used. * _Why did you use proto2 and not proto3?_ + + By default [all libp2p protocols use proto2 over + proto3](../README.md#protocols). In addition, to the best of our knowledge, + there are no proto3 features that the _Protocol Select_ protocol could benefit + off. + * _Why not include Protocol IDs from the start_? 
_Protocol IDs_ are part of the initial _Protocol Select_ version to reduce From bb049b51ae5f939e0eeca6481e6aa4a92f16be4d Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:02:06 +0200 Subject: [PATCH 19/26] protocol-select/: Mention both NAT and firewalls --- protocol-select/README.md | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index f087a83a2..229764cd5 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -2,9 +2,6 @@ https://github.com/libp2p/specs/tree/master/connections#protocol-negotiation in Protocol Select pull request. --> - - # Protocol Select @@ -444,13 +441,13 @@ mechanisms: * _Why don't we define something more sophisticated for uncoordinated TCP Simultaneous Open?_ - We make use of TCP Simultaneous Open for NAT Traversal. In this situation, we - coordinate the roles of client and server using the DCUtR protocol, so there's - no need to do anything beyond that. The only situation where a Simultaneous - Open might otherwise occur in the wild is when two peers happen to dial each - other at the same time. This should occur rarely, and if it happens, a sane - strategy would be to re-dial the peer after a (randomized) exponential - backoff. + We make use of TCP Simultaneous Open for firewall and NAT Traversal. In this + situation, we coordinate the roles of client and server using the DCUtR + protocol, so there's no need to do anything beyond that. The only situation + where a Simultaneous Open might otherwise occur in the wild is when two peers + happen to dial each other at the same time. This should occur rarely, and if + it happens, a sane strategy would be to re-dial the peer after a (randomized) + exponential backoff. 
* _Why don't we use the peer IDs to break the tie on uncoordinated TCP Simultaneous Open?_ From 78867982dc6a23fe2edba1a3e347bfbcda712e82 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:05:35 +0200 Subject: [PATCH 20/26] protocol-select/: Use dialer/listener instead of client/server Using dialer/listener instead of client/server seems to be in line with most other libp2p specifications. --- protocol-select/README.md | 42 +++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 229764cd5..2249129cf 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -2,8 +2,6 @@ https://github.com/libp2p/specs/tree/master/connections#protocol-negotiation in Protocol Select pull request. --> - - # Protocol Select | Lifecycle Stage | Maturity | Status | Latest Revision | @@ -70,8 +68,8 @@ Select]_ protocol. **Protocol Select** requires security protocols to be advertised, and doesn't allow negotiation them. For optimized implementations, stream muxer - negotiation will take zero round-trips for the client (depending on the - details of the cryptographic handshake protocol). In that case, the client + negotiation will take zero round-trips for the dialer (depending on the + details of the cryptographic handshake protocol). In that case, the dialer will be able to immediately open a stream after completing the cryptographic handshake. In addition the protocol supports zero-round-trip optimistic stream protocol negotiation when proposing a single protocol. @@ -104,7 +102,7 @@ Select]_ protocol. ### Basic Flow -Both endpoints, client and server, send a list of supported protocols. Whether +Both endpoints, dialer and listener, send a list of supported protocols. Whether an endpoint sends its list before or after it has received the remote's list depends on the context and is detailed below. Nodes SHOULD order the list by preference. 
Once an endpoint receives a list from a remote, the protocol to be @@ -112,9 +110,9 @@ used on the connection or stream is determined by intersecting ones own and the remote list, as follows: 1. All protocols that aren't supported by both endpoints are removed from the - clients' list of protocols. + dialer's list of protocols. -2. The protocol chosen is the first protocol of the client's list. +2. The protocol chosen is the first protocol of the dialer's list. If there is no overlap between the two lists, the two endpoints can not communicate and thus both endpoints MUST close the connection or stream. @@ -130,8 +128,8 @@ security protocol, as described in the [multiaddr spec](https://github.com/libp2 TCP allows the establishment of a single connection if two endpoints start initiating a connection at the same time. This is called _TCP Simultaneous Open_. Since many application protocols running on top of a connection (most -notably the secure channel protocols e.g. TLS) assume their role (client / -server) based on who initiated the connection, TCP Simultaneous Open connections +notably the secure channel protocols e.g. TLS) assume their role (dialer / +listener) based on who initiated the connection, TCP Simultaneous Open connections need special handling. This special handling is described below, differentiating between two cases of TCP Simultaneous Open: coordinated and uncoordinated. @@ -140,7 +138,7 @@ between two cases of TCP Simultaneous Open: coordinated and uncoordinated. When doing Hole Punching over TCP, the [_Direct Connection Upgrade through Relay_][DCUTR] protocol coordinates the two nodes to _simultaneously_ dial each other, thus, when successful, resulting in a TCP Simultaneous Open connection. -The two nodes are assigned their role (client / server) out-of-band by the +The two nodes are assigned their role (dialer / listener) out-of-band by the [_Direct Connection Upgrade through Relay_][DCUTR] protocol. 
#### Uncoordinated TCP Simultaneous Open @@ -148,10 +146,10 @@ The two nodes are assigned their role (client / server) out-of-band by the In the uncoordinated case, where two nodes coincidentally simultaneously dial each other, resulting in a TCP Simultaneous Open connection, the secure channel protocol handshake will fail, given that both nodes assume to be in the -initiating / client role. E.g. in the case of TLS the protocol will report the +initiating / dialer role. E.g. in the case of TLS the protocol will report the receipt of a ClientHello while it expected a ServerHello. Once the security handshake failed due to TCP Simultaneous Open, i.e. due to both sides assuming -to be the client, nodes SHOULD close the connection and back off for a random +to be the dialer, nodes SHOULD close the connection and back off for a random amount of time before trying to reconnect. ### Connection Protocol Negotiation @@ -173,13 +171,13 @@ proper completion of the handshake. In _Protocol Select_ endpoints make use of Early Data to speed up protocol negotiation. As soon as an endpoints reaches a state during the handshake where it can send encrypted application data, it sends a list of supported protocols, -no matter whether it is in the role of a client or server. Note that depending +no matter whether it is in the role of a dialer or listener. Note that depending on the handshake protocol used (and the optimisations implemented), either the -client or the server might arrive at this state first. +dialer or the dialer might arrive at this state first. -When using TLS 1.3, the server can send Early Data after it receives the +When using TLS 1.3, the listener can send Early Data after it receives the ClientHello. Early Data is encrypted, but at this point of the handshake the -client's identity is not yet verified. +dialer's identity is not yet verified. While Noise in principle allows sending of unencrypted data, endpoints MUST NOT use this to send their list of protocols. 
An endpoint MAY send it as soon it is @@ -192,11 +190,11 @@ the handshake completes. #### 0-RTT -When using 0-RTT session resumption as offered by TLS 1.3 and Noise, clients +When using 0-RTT session resumption as offered by TLS 1.3 and Noise, dialers SHOULD remember the protocol they used before and optimistically offer that -protocol only. A client can then optimistically send application data, not waiting -for the list of supported protocols by the server. If the server still supports -the protocol, it will choose the protocol offered by the client when intersecting the +protocol only. A dialer can then optimistically send application data, not waiting +for the list of supported protocols by thelistener. If the listener still supports +the protocol, it will choose the protocol offered by the dialer when intersecting the two lists, and proceed with the connection. If not, the list intersection fails and the connection is closed, which needs to be handled by the upper protocols. @@ -314,7 +312,7 @@ Select or [Multistream Select]. Since QUIC provides native stream multiplexing, there's no need to negotiate a stream multiplexer. We therefore have to wait a bit longer before we can distinguish between [Multistream Select] and Protocol Select, namely until -the client opens the first stream. Conversely, this means that a server won't +the dialer opens the first stream. Conversely, this means that a listener won't be able to open a stream until it has determined which protocol is used. #### Protocol Differentiation @@ -442,7 +440,7 @@ mechanisms: Simultaneous Open?_ We make use of TCP Simultaneous Open for firewall and NAT Traversal. In this - situation, we coordinate the roles of client and server using the DCUtR + situation, we coordinate the roles of dialer and listener using the DCUtR protocol, so there's no need to do anything beyond that. 
The only situation where a Simultaneous Open might otherwise occur in the wild is when two peers happen to dial each other at the same time. This should occur rarely, and if From 77470f0921c46d8105815314c71ce1a89f15ad95 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:08:07 +0200 Subject: [PATCH 21/26] protocol-select: Remove links in headings --- protocol-select/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 2249129cf..dda8134e1 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -25,7 +25,7 @@ and spec status. - [Protocol Select](#protocol-select) - [Table of Contents](#table-of-contents) - [Introduction](#introduction) - - [Improvements over _[Multistream Select]_](#improvements-over-_multistream-select_) + - [Improvements over Multistream Select](#improvements-over-multistream-select) - [High-Level Overview](#high-level-overview) - [Basic Flow](#basic-flow) - [Secure Channel Selection](#secure-channel-selection) @@ -38,7 +38,7 @@ and spec status. - [Stream Protocol Negotiation](#stream-protocol-negotiation) - [Initiator](#initiator) - [Listener](#listener) - - [Transitioning from [Multistream Select]](#transitioning-from-multistream-select) + - [Transitioning from Multistream Select](#transitioning-from-multistream-select) - [Multiphase Rollout](#multiphase-rollout) - [Phase 1](#phase-1) - [Phase 2](#phase-2) @@ -62,7 +62,7 @@ _Protocol Select_ is a protocol negotiation protocol. It is aimed at negotiating libp2p protocols on connections and streams. It replaces the _[Multistream Select]_ protocol. -### Improvements over _[Multistream Select]_ +### Improvements over Multistream Select - **Connection establishment** @@ -239,7 +239,7 @@ either: - Rejecting all proposed protocols by replying with an empty list of protocols. 
-## Transitioning from [Multistream Select] +## Transitioning from Multistream Select Protocol Select is not compatible with [Multistream Select] both in its semantics as well as on the wire. Live libp2p-based networks, currently using From b6b5d2efa4cde7bcbb57ecc1b4960e2787c4f2ec Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:20:06 +0200 Subject: [PATCH 22/26] connections/: Replace Multistream Select with Protocol Select --- connections/README.md | 79 +++++++++++---------------------------- protocol-select/README.md | 4 -- 2 files changed, 21 insertions(+), 62 deletions(-) diff --git a/connections/README.md b/connections/README.md index 820bc6209..a90cc3bb7 100644 --- a/connections/README.md +++ b/connections/README.md @@ -26,7 +26,8 @@ and spec status. - [Overview](#overview) - [Definitions](#definitions) - [Protocol Negotiation](#protocol-negotiation) - - [multistream-select](#multistream-select) + - [Protocol Select](#protocol-select) + - [Multistream Select](#multistream-select) - [Upgrading Connections](#upgrading-connections) - [Opening New Streams Over a Connection](#opening-new-streams-over-a-connection) - [Practical Considerations](#practical-considerations) @@ -113,8 +114,7 @@ Each protocol supported by a peer is identified using a unique string called a **protocol id**. While any string can be used, the conventional format is a path-like structure containing a short name and a version number, separated by `/` characters. For example: `/mplex/1.0.0` identifies version 1.0.0 of the -[`mplex` stream multiplexing protocol][mplex]. multistream-select itself has a -protocol id of `/multistream/1.0.0`. +[`mplex` stream multiplexing protocol][mplex]. 
Including a version number in the protocol id simplifies the case where you want to concurrently support multiple versions of a protocol, perhaps a stable version @@ -125,10 +125,18 @@ receives the protocol id negotiated for each new stream, so it's possible to register the same handler for multiple versions of a protocol and dynamically alter functionality based on the version in use for a given stream. -### multistream-select +libp2p supports two protocol negotiation protocols, _Protocol Select_ and +_Multistream Select_, the former replacing the latter and the latter being +deprecated. -libp2p uses a protocol called multistream-select for protocol negotiation. Below -we cover the basics of multistream-select and its use in libp2p. For more +### Protocol Select + +The _Protocol Select_ protocol as well as how it is embedded in libp2p is +described in the [_Protocol Select_ specification][protocol-select]. + +### Multistream Select + +Below we cover the basics of multistream-select and its use in libp2p. For more details, see [the multistream-select repository][mss]. Before engaging in the multistream-select negotiation process, it is assumed @@ -152,6 +160,7 @@ hex): The first byte is the varint-encoded length (`0x03`), followed by `na` (`0x6e 0x61`), then the newline (`0x0a`). +Multistream-select itself has a protocol id of `/multistream/1.0.0`. The basic multistream-select interaction flow looks like this: @@ -182,6 +191,9 @@ traffic over the channel will adhere to the rules of the agreed-upon protocol. If a peer receives a `"na"` response to a proposed protocol id, they can either try again with a different protocol id or close the channel. +Note: In the case where both peers initially act as initiators, e.g. during NAT +hole punching, tie-breaking is done via the [multistream-select simultaneous +open protocol extension][simopen]. ## Upgrading Connections @@ -193,48 +205,17 @@ connections is called "upgrading" the connection. 
Because there are many valid ways to provide the libp2p capabilities, the connection upgrade process uses protocol negotiation to decide which specific protocols to use for each capability. The protocol negotiation process uses -multistream-select as described in the [Protocol +_Protocol Select_ as described in the [Protocol Negotiation](#protocol-negotiation) section. When raw connections need both security and multiplexing, security is always established first, and the negotiation for stream multiplexing takes place over the encrypted channel. -Here's an example of the connection upgrade process: - -![see conn-upgrade.plantuml for diagram source](conn-upgrade.svg) - -First, the peers both send the multistream protocol id to establish that they'll -use multistream-select to negotiate protocols for the connection upgrade. - -Next, the Initiator proposes the [TLS protocol][tls-libp2p] for encryption, but -the Responder rejects the proposal as they don't support TLS. - -The Initiator then proposes the [Noise protocol][noise-spec], which is supported -by the Responder. The Listener echoes back the protocol id for Noise to indicate -agreement. - -At this point the Noise protocol takes over, and the peers exchange the Noise -handshake to establish a secure channel. If the Noise handshake fails, the -connection establishment process aborts. If successful, the peers will use the -secured channel for all future communications, including the remainder of the -connection upgrade process. - -Once security has been established, the peers negotiate which stream multiplexer -to use. The negotiation process works in the same manner as before, with the -dialing peer proposing a multiplexer by sending its protocol id, and the -listening peer responding by either echoing back the supported id or sending -`"na"` if the multiplexer is unsupported. 
- Once security and stream multiplexing are both established, the connection upgrade process is complete, and both peers are able to use the resulting libp2p connection to open new secure multiplexed streams. -Note: In the case where both peers initially act as initiators, e.g. during NAT -hole punching, tie-breaking is done via the [multistream-select simultaneous -open protocol extension][simopen]. - - ## Opening New Streams Over a Connection Once we've established a libp2p connection to another peer, new streams are @@ -244,25 +225,12 @@ process](#upgrading-connections) if the transport lacks native multiplexing. Either peer can open a new stream to the other over an existing connection. When a new stream is opened, a protocol is negotiated using -`multistream-select`. The [protocol negotiation process](#protocol-negotiation) +_Protocol Select_. The [protocol negotiation process](#protocol-negotiation) for new streams is very similar to the one used for upgrading connections. However, while the security and stream multiplexing modules for connection upgrades are typically libp2p framework components, the protocols negotiated for new streams can be easily defined by libp2p applications. -Streams are routed to application-defined handler functions based on their -protocol id string. Incoming stream requests will propose a protocol id to use -for the stream using `multistream-select`, and the peer accepting the stream -request will determine if there are any registered handlers capable of handling -the protocol. If no handlers are found, the peer will respond to the proposal -with `"na"`. - -When registering protocol handlers, it's possible to use a custom predicate or -"match function", which will receive incoming protocol ids and return a boolean -indicating whether the handler supports the protocol. This allows more flexible -behavior than exact literal matching, which is the default behavior if no match -function is provided. 
- ## Practical Considerations This section will go over a few aspects of connection establishment and state @@ -354,12 +322,6 @@ See [hole punching][hole-punching] document. ## Future Work -A replacement for multistream-select is [being discussed][mss-2-pr] which -proposes solutions for several inefficiencies and shortcomings in the current -protocol negotiation and connection establishment process. The ideal outcome of -that discussion will require many changes to this document, once the new -multistream semantics are fully specified. - For connection management, there is currently a draft of a [connection manager specification][connmgr-v2-spec] that may replace the current [connmgr interface][connmgr-go-interface] in go-libp2p and may also form the basis of @@ -409,3 +371,4 @@ updated to incorporate the changes. [simopen]: ./simopen.md [resource-manager-issue]: https://github.com/libp2p/go-libp2p/issues/635 [hole-punching]: ./hole-punching.md +[protocol-select]: ../protocol-select/README.md diff --git a/protocol-select/README.md b/protocol-select/README.md index dda8134e1..876bbdcfe 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -1,7 +1,3 @@ - - # Protocol Select | Lifecycle Stage | Maturity | Status | Latest Revision | From 324ff28a8a3b85f55259a1e0cd2c6b2d8c359f1f Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:23:20 +0200 Subject: [PATCH 23/26] protocol-select/: Remove reference to Offer and Use message --- protocol-select/README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 876bbdcfe..951174eda 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -330,8 +330,7 @@ Messages are encoded via the `ProtoSelect` message type. With the current version of _Protocol Select_ detailed in this document, the `version` field of the `ProtocolSelect` message is set to `1`. 
Implementations MUST reject messages with a `version` other than the current version. See [Protocol -Evolution](#Protocol-Evolution) for details. Both the `Offer` and the `Use` -messages are wrapped with the `ProtocolSelect` message at all time. +Evolution](#Protocol-Evolution) for details. ```protobuf syntax = "proto2"; From 74ba8a324938435b13e31691618b76099e7742d6 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Tue, 27 Jul 2021 18:29:39 +0200 Subject: [PATCH 24/26] protocol-select/: Document Protocol Name structure --- protocol-select/README.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 951174eda..222e8d479 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -330,7 +330,11 @@ Messages are encoded via the `ProtoSelect` message type. With the current version of _Protocol Select_ detailed in this document, the `version` field of the `ProtocolSelect` message is set to `1`. Implementations MUST reject messages with a `version` other than the current version. See [Protocol -Evolution](#Protocol-Evolution) for details. +Evolution](#Protocol-Evolution) for details. The `Protocol` `name` field is a +UTF-8 encoded string identifying a specific protocol, same as in _Multistream +Select_. See [Protocol Negotiation +section](../connections/README.md#protocol-negotiation) for libp2p specific +_Protocol Name_ conventions such as path-like structure. 
```protobuf syntax = "proto2"; @@ -347,9 +351,6 @@ message ProtoSelect { } ``` - - ### Protocol Evolution While we can not foresee all future use-cases of _Protocol Select_, we can From 76d2d783531e4c92c62a58a60cdb63297bae6716 Mon Sep 17 00:00:00 2001 From: Marten Seemann Date: Wed, 25 Aug 2021 19:09:37 +0100 Subject: [PATCH 25/26] apply @yusefnapora's suggestions Co-authored-by: Yusef Napora --- protocol-select/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 222e8d479..619a98c69 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -169,7 +169,7 @@ negotiation. As soon as an endpoints reaches a state during the handshake where it can send encrypted application data, it sends a list of supported protocols, no matter whether it is in the role of a dialer or listener. Note that depending on the handshake protocol used (and the optimisations implemented), either the -dialer or the dialer might arrive at this state first. +dialer or the listener might arrive at this state first. When using TLS 1.3, the listener can send Early Data after it receives the ClientHello. Early Data is encrypted, but at this point of the handshake the @@ -253,7 +253,7 @@ listener. Differentiating the two protocols as a listener is detailed in the [Heuristics](#heuristics) section below. Nodes, when dialing, MUST NOT yet use Protocol Select, but instead continue to use [Multistream Select]. -Once a large enogh fraction of the network has upgraded, one can transition to +Once a large enough fraction of the network has upgraded, one can transition to phase 2. #### Phase 2 @@ -470,7 +470,7 @@ mechanisms: One could then use the _Protocol ID_ instead of the _Protocol Name_ for future negotiations on that same connection. 
- While this approach would reliev us of the need to specify the _Protocol Name_ + While this approach would relieve us of the need to specify the _Protocol Name_ _Protocol ID_ mapping in e.g. libp2p/specs, it does add state to be kept across negotiations, thus complicating implementations and potentially resulting in state-mismatch edge-cases. Another argument for the current @@ -496,7 +496,7 @@ mechanisms: * _Why not include Protocol IDs from the start_? - _Protocol IDs_ are part of the initial _Protocol Select_ version to reduce + _Protocol IDs_ are not part of the initial _Protocol Select_ version to reduce complexity and thus ease the initial roll-out. As detailed above, introducing _Protocol IDs_ at a later stage can be done with low coordination and performance overhead. From 6e947b85cc5528bea4985530c30e1911507cc012 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Sat, 27 Nov 2021 21:57:54 +0100 Subject: [PATCH 26/26] protocol-select/README: Mark `version` field as `required` Fields in proto2 have to be either `required`, `optional` or `repeated`. Marking `version` as `required` as it should be set at all times. --- protocol-select/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/protocol-select/README.md b/protocol-select/README.md index 619a98c69..7b839721f 100644 --- a/protocol-select/README.md +++ b/protocol-select/README.md @@ -340,7 +340,7 @@ _Protocol Name_ conventions such as path-like structure. syntax = "proto2"; message ProtoSelect { - uint32 version = 1; + required uint32 version = 1; message Protocol { oneof protocol {