Draft of noise-libp2p spec #202

Merged
merged 9 commits into from
Aug 18, 2019

yusefnapora
Contributor

Hey all, this is what I've got so far to address #195 and specify a Noise security protocol for libp2p. It basically follows the outline I put in the issue comments there and is very similar to the rust-libp2p implementation.

I'll fill in the Design Consideration sections soon, but didn't want to commit before we're done considering designs :)

Does this seem like a good direction?

@raulk @tomaka @romanb - I added you all to the interest group. If anyone else wants in, let me know!

@shahankhatch

If possible, could I please be added to the interest group? Thanks in advance!

@Mikerah
Contributor

Mikerah commented Aug 5, 2019

Count me in the interest group!

@raulk
Member

raulk commented Aug 6, 2019

I’d like to nominate @burdges and @zmanian for the interest group too. Read about the role of the interest group here: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md

@raulk
Member

raulk commented Aug 6, 2019

Also @arnetheduck and @AgeManning, if they’re so inclined.

@AgeManning
Contributor

If you'd like extra eyes on it, happy to help, however my contribution may be low as I'm traveling for the next stretch of time.

Contributor

@marten-seemann marten-seemann left a comment

Thanks for writing this up. I added a few comments.

Why are we restricting ourselves to just one DH, cipher suite and hash function? This will make it harder to roll out an upgrade if we ever have to deprecate one of them (e.g. in case it is broken).

noise/README.md Outdated
### Supported Cipher Functions

noise-libp2p supports the ChaChaPoly cipher functions [as defined in the Noise
spec][npf-cipher-chachapoly].

Contributor

AESGCM is a lot faster when running on hardware that has AES-support. For TLS, @Kubuxu came up with a nifty trick to select AES if both peers have hardware support.

Is there any reason to exclude AESGCM here?

Member

@Kubuxu Kubuxu Aug 6, 2019

@marten-seemann it was mostly about selecting ChaCha/Salsa on hardware that doesn't support hardware AES. Hardware AES vs Salsa family are very similar in speed.

Member

Could add a MAY for AESGCM support here:

Implementations running on hardware-accelerated AES environments MAY offer the AESGCM cipher function, by advertising an alternate protocol suite (see Handshake Negotiation). Note that doing so may leak details about the peer's runtime environment.

Contributor

@marten-seemann marten-seemann Aug 6, 2019

If I remember correctly, the difference was quite large. I don't have the benchmark data at hand any more, but I did a quick Google search and found this article (unfortunately from 2014): https://www.zeitgeist.se/2014/08/23/optimize-aes-and-chacha20-usage-with-boringssl/. In that benchmark, AES beats ChaCha by a factor of 2, if the CPU has hardware support.

What's the reason for making ChaCha required and AES-GCM optional? Just for reference, that's the opposite of what TLS1.3 is requiring.

Contributor Author

I just went with what's currently in rust-libp2p and figured we'd sort it out in discussion. I think we probably should add AESGCM since hardware support is pretty pervasive.

Is there a reason not to line up with TLS 1.3 and make AES required, with a recommendation to also support ChaCha? Seems fairly sensible.

Member

@Kubuxu Kubuxu Aug 8, 2019

From my runs on an i5-4200M (the only one I have available right now), I'm getting:

BenchmarkChacha20Poly1305-4   	  100000	    466381 ns/op	1124.16 MB/s
BenchmarkAES128GCM-4          	  100000	    315214 ns/op	1663.27 MB/s
BenchmarkAES256GCM-4          	  100000	    360130 ns/op	1455.83 MB/s

Here is Go's benchmark: https://github.com/Kubuxu/go-crypto-bench/
It would be best to also compare on more powerful Intel and AMD processors, and on some ARM chips.

Contributor

@Kubuxu Thanks for the benchmark! I ran it on my Macbook, and it looks like the difference is a lot bigger here:

goos: darwin
goarch: amd64
BenchmarkChacha20Poly1305-12    	   30000	    321145 ns/op	1632.56 MB/s
BenchmarkAES128GCM-12           	  100000	    145071 ns/op	3614.00 MB/s
BenchmarkAES256GCM-12           	   50000	    156157 ns/op	3357.44 MB/s
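
To put the two runs side by side, the throughput ratios can be read off the MB/s columns above. The following is a small Go sketch of that arithmetic; the `result` type and `speedup` function are illustrative names, not part of the linked benchmark repository.

```go
package main

import "fmt"

// result holds one line of the benchmark output quoted above.
type result struct {
	name string
	mbps float64 // throughput in MB/s, as printed by `go test -bench`
}

// speedup returns how many times faster a is than b.
func speedup(a, b result) float64 {
	return a.mbps / b.mbps
}

func main() {
	// First run (i5-4200M) and second run (MacBook), copied from the thread.
	i5ChaCha := result{"ChaCha20Poly1305", 1124.16}
	i5AES128 := result{"AES128GCM", 1663.27}
	macChaCha := result{"ChaCha20Poly1305", 1632.56}
	macAES128 := result{"AES128GCM", 3614.00}

	fmt.Printf("i5-4200M: %s is %.2fx %s\n",
		i5AES128.name, speedup(i5AES128, i5ChaCha), i5ChaCha.name)
	fmt.Printf("MacBook:  %s is %.2fx %s\n",
		macAES128.name, speedup(macAES128, macChaCha), macChaCha.name)
}
```

On these numbers AES-128-GCM comes out at roughly 1.5x ChaCha20-Poly1305 on the i5-4200M and roughly 2.2x on the MacBook, both with hardware AES available.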

I'm not a real cryptographer (and I hope @burdges can correct me where I'm wrong), but I have a strong preference for ChaCha-Poly constructions over AES-GCM, for the following reasons:

  1. I think we should also at least benchmark Chacha-Poly vs AES on some smartphone CPUs. Those usually don't have AES-NI or comparable hardware acceleration, and I expect the results to be much in favor of Chacha (since it was benchmarked "in software" in your tests as well, and performs at least comparable to AES "in hardware"). Chacha is based on add-rotate-xor (ARX) PRNG, which can be relatively efficiently implemented in most hardware infrastructures.
    I would pretty much expect that mobile (or even embedded and IoT) devices should be in scope for libp2p design choices?

  2. Continuing the previous point, we would require AES implementations "in software" which don't suck, and, surprise-surprise, those are rare: there are a lot of potential footguns with timing attacks, and AES in general is not very compiler- or developer-friendly. It's much easier to achieve constant-time execution for Chacha than for AES, and the implementation will be much simpler (taking into account that you probably shouldn't trust your compiler for the critical parts anyway, and should roll your own platform-specific assembly instead; I would expect ARX in assembly to be much, much simpler).

  3. AFAIU there're a couple of cryptography-related footguns in AES-GCM as well, so AFAIR NIST doesn't recommend using it for more than 2³² messages with the same key, even if you use cryptographically-strong nonces, and do everything else correctly.
    Not sure how much of a practical limitation it is for libp2p, but it's at least worth pointing out.

noise/README.md Outdated
Following a successful `XX` handshake, both peers may store the static Noise
public key of the other. Future connections can attempt the `IK` handshake using
the stored static key, which offers the benefits of zero-RTT encryption and
fewer message round-trips.
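
The caching behaviour described in this paragraph could be sketched roughly as follows. This is only an illustration of the bookkeeping, not a Noise implementation: `staticKeyCache`, `remember`, `forget` and `choosePattern` are hypothetical names, and peers are keyed by a plain string for simplicity.

```go
package main

import "fmt"

// staticKeyCache remembers the Noise static public key observed for each
// peer after a successful XX handshake. (Hypothetical helper type.)
type staticKeyCache map[string][]byte

// remember stores the remote static key learned from a completed handshake.
func (c staticKeyCache) remember(peerID string, staticKey []byte) {
	c[peerID] = append([]byte(nil), staticKey...) // store a copy
}

// forget drops a cached key, e.g. after an IK attempt fails because the
// remote peer has rotated its static key.
func (c staticKeyCache) forget(peerID string) {
	delete(c, peerID)
}

// choosePattern returns "IK" plus the cached key when one is known,
// and "XX" otherwise.
func (c staticKeyCache) choosePattern(peerID string) (string, []byte) {
	if key, ok := c[peerID]; ok {
		return "IK", key
	}
	return "XX", nil
}

func main() {
	cache := staticKeyCache{}

	pattern, _ := cache.choosePattern("peerA")
	fmt.Println("first connection:", pattern)

	cache.remember("peerA", []byte{0x01, 0x02, 0x03}) // placeholder key bytes
	pattern, _ = cache.choosePattern("peerA")
	fmt.Println("later connection:", pattern)
}
```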

Contributor

Am I right to assume that the security properties of zero-RTT data are equivalent to those of 0-RTT data in TLS1.3, i.e. can be replayed?

Member

It depends on the handshake pattern. If you have prior knowledge of the responder's static key (*K handshakes), the initiator can encrypt 0-RTT data. You could counteract replay attacks in various manners, with different security guarantees.

Contributor Author

@marten-seemann I think you're right that the IK handshake is susceptible to replay. It has this destination security property:

Encryption to a known recipient, forward secrecy for sender compromise only, vulnerable to replay. This payload is encrypted based only on DHs involving the recipient's static key pair. If the recipient's static private key is compromised, even at a later date, this payload can be decrypted. This message can also be replayed, since there's no ephemeral contribution from the recipient.

Contributor

Thanks @yusefnapora. I think we should document these security properties carefully.
This is something that an implementation will need to account for in the API in one way or the other.

Member

Let’s be careful not to replicate the Noise spec here, but rather reference it.

Member

@raulk raulk Aug 8, 2019

Note that lack of encryption of 0-RTT data, or its replayability, does not pose a threat for us in practice (besides censorship resistance, insofar as encryption is concerned).

Remember this handshake takes place in the context of a libp2p connection establishment, 0-RTT data will presumably be used by multiselect 2.0 to preemptively negotiate the multiplexer. We should not expose 0-RTT facilities to the application itself, IMO.

Notes:

  1. Replaying the handshake itself is fruitless, even in the case of IX or IK (1-RTT, both susceptible to replay), because the attacker does not know the initiator's private keys unless they are compromised (in which case it's a different attack).
  2. The 0-RTT data we'll presumably send has no side-effects, unlike in TLS 1.3 with HTTP, where you could be sending an HTTP POST.

noise/README.md Outdated
peer id][peer-id-spec], which we will refer to as their "identity keypair." To
avoid potential static key reuse, and to allow libp2p peers with any type of
identity keypair to use Noise, noise-libp2p uses a separate static keypair for
Noise that is distinct from the peer's identity keypair.

Here is the way I interpret the relationship between the handshake static DH keypair and the libp2p signing keypair.

For each signing identity key, there will be one or more handshake static keys.

If the application is trying to initiate a connection to a specific identity key, it can only determine whether it has a secure connection to that identity key after a handshake has taken place, by receiving a certificate for the static public keys that were used in the connection.

This is how I understand it as well. However, "signed identity payload" instead of "certificate" are the words I would lean towards. I believe that the "signed identity payload" may then be used for whichever key type and payload content is sent, which could be one or more certificates or other public key forms.

Contributor Author

@zmanian yes, I think that sums it up well - I'll try to clarify it some more in the write up.

I like the phrase "signed identity payload", that does seem better than "certificate".

noise/README.md Outdated
produced by the private libp2p identity key. The Noise static key is encoded
into a byte array according to the rules defined in [section 5 of RFC
7748][rfc-7748-sec-5] and signed as described in the [peer id
spec][peer-id-spec].

Member

We should be using domain separation, so that a message signed in another domain (e.g. IPNS) cannot be reused here. See https://github.com/libp2p/specs/blob/master/tls/tls.md#libp2p-public-key-extension
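
As a rough illustration of what domain separation means for the signed static key, here is a minimal Go sketch using Ed25519 from the standard library. The prefix string, the helper names and the choice of Ed25519 as the identity key type are all assumptions made for the example; the thread does not fix any of them.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// noisePrefix is a hypothetical domain-separation tag. Prepending it means a
// signature made for this purpose cannot be mistaken for one made elsewhere.
const noisePrefix = "noise-libp2p-static-key:"

// newIdentity generates a throwaway Ed25519 identity keypair for the demo.
func newIdentity() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// signStaticKey signs prefix || noiseStatic with the identity private key.
func signStaticKey(identity ed25519.PrivateKey, noiseStatic []byte) []byte {
	msg := append([]byte(noisePrefix), noiseStatic...)
	return ed25519.Sign(identity, msg)
}

// verifyStaticKey checks a signature over prefix || noiseStatic.
func verifyStaticKey(identityPub ed25519.PublicKey, noiseStatic, sig []byte) bool {
	msg := append([]byte(noisePrefix), noiseStatic...)
	return ed25519.Verify(identityPub, msg, sig)
}

func main() {
	pub, priv := newIdentity()

	// Stand-in for a 32-byte Noise static public key.
	static := make([]byte, 32)
	if _, err := rand.Read(static); err != nil {
		panic(err)
	}

	sig := signStaticKey(priv, static)
	fmt.Println("prefixed signature verifies:", verifyStaticKey(pub, static, sig))

	// A signature over the bare key bytes, as some other protocol might
	// produce, is rejected because the prefix is missing.
	bare := ed25519.Sign(priv, static)
	fmt.Println("bare signature verifies:", verifyStaticKey(pub, static, bare))
}
```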

Member

@raulk raulk left a comment

Solid first iteration! <3 Some initial notes:

  • Let's adopt normative language. Implementers will welcome clear instructions. Talk in terms of what implementations have to adhere to, versus describing noise-libp2p.
  • Specify how implementations should manage the DH static key. IMO, they SHOULD persist it across restarts to profit from the Noise Pipeline handshake choreography, but MAY discard it anytime.
  • Consider adding support for the IX pattern. XX avoids dumping the initiator's public key in plaintext (initiator privacy), but it adds 0.5 RTT in contrast to IX.
  • Unclear how we'll transport/encode message data, other than the certificate.
  • Offer guidance/requirements in terms of the user API.
    • Users should be able to choose the handshake patterns they wish to participate in (considering the tradeoffs).
    • How to attach message data (I don't have an answer for this). multiselect 2.0 will leverage this to preempt multiplexer selection.
    • User-supplied static key.

@Warchant

Warchant commented Aug 6, 2019

How about PSK? With PSK it would be possible to create private networks.

@shahankhatch

Regarding transport and encoding of post-handshake data, I haven't seen much/any reference to the chaining key that is obtained once the handshake completes, and which is used for the transport messages. Is this part of the protocol relevant?
https://noiseprotocol.org/noise.html#overview-of-handshake-state-machine

@Kubuxu
Member

Kubuxu commented Aug 6, 2019

PSK could be a good idea (0 overhead private networks) and protection against the quantum break of forward secrecy.

@raulk
Member

raulk commented Aug 6, 2019

@shahankhatch good point. @yusefnapora – besides the handshake procedure, I believe this spec should also cover how the agreed key is used for encryption/decryption.

One aspect worth considering is rekeying, but coordinating on this could require a meta-message of some kind on a dedicated stream (à la CRYPTO frame in QUIC). Might not be worth it for v1 of this secure channel.

EDIT -- continuous rekeying is an option, but could be expensive depending on the cipher suite.

@yusefnapora
Contributor Author

Thanks for all the great feedback so far everyone.

Here's what I'll plan to focus on for today / tomorrow:

  • add a section on PSK / private networks
  • add AESGCM as a required cipher function.
    • maybe make ChaChaPoly optional?
  • clarify identity payload section & add domain separation for signature.
  • more detail about transport encryption keys, message framing, and encryption / decryption process
  • write up user-facing API

@shahankhatch about the chaining key, I think that it gets deleted after the handshake, since it's part of the SymmetricState object. The two CipherState objects are what we keep around and use for encryption of transport messages. There's also a handshake hash for channel binding after the handshake completes, which could be interesting.

@raulk

Consider adding support for the IX pattern. XX avoids dumping the initiator's public key in plaintext (initiator privacy), but it adds 0.5 RTT in contrast to IX.

I think this is a good idea. Users / applications that don't care about initiator privacy can skip the whole Noise Pipes dance entirely and just use IX. I'll also add an overview of the security and identity hiding properties of each handshake pattern, so it's more clear what the tradeoffs are.

@marten-seemann
Contributor

One aspect worth considering is rekeying, but coordinating on this could require a meta-message of some kind on a dedicated stream (à la CRYPTO frame in QUIC). Might not be worth it for v1 of this secure channel.

Most AEADs should be safe up to several TB of data, according to today's research. However, we never know how a connection is used, and further research will lower this bound in the future.

EDIT -- continuous rekeying is an option, but could be expensive depending on the cipher suite.

I benchmarked this when implementing it for QUIC. The cost of computing the new (AES) key is about 20x the cost of opening / sealing a full-size packet. So it should be negligible as long as you don't do it for every single message you send.
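
To make the "negligible unless you do it for every message" point concrete, the amortized cost can be worked out directly. The sketch below takes the 20x figure quoted above at face value; the function name and the chosen intervals are illustrative.

```go
package main

import "fmt"

// rekeyOverhead returns the extra work caused by rekeying, relative to the
// work of sealing the messages themselves, when one rekey costs rekeyCost
// seal operations and a rekey happens once every interval messages.
func rekeyOverhead(rekeyCost float64, interval int) float64 {
	return rekeyCost / float64(interval)
}

func main() {
	const rekeyCost = 20 // figure from the comment above: one rekey ~ 20 seal/open operations

	for _, interval := range []int{1, 100, 10000} {
		fmt.Printf("rekey every %5d messages: +%.2f%% work\n",
			interval, 100*rekeyOverhead(rekeyCost, interval))
	}
}
```

Rekeying on every message would add 2000% work, while rekeying every 10000 messages adds 0.2%.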

@yusefnapora
Contributor Author

Quick RFC: I'm not sure I've really thought through the replay attack scenarios, or if I'm thinking about it correctly. In the bit about the signed identity payload, I wrote

Encrypting the handshake payload is required to avoid replay attacks, as there is no timestamp or other validity criteria in the payload itself apart from the signature.

But after some more reading of the Noise spec, I think that you are only protected against replay attacks if you've performed a DH with an ephemeral key belonging to the recipient.

The IK handshake's first message is "vulnerable to replay" according to the spec, since there's no ephemeral DH contributing to the encryption. And in IX, there's no encryption of the first message at all, so the initiator's identity payload would have to be sent in the clear as well.

How much of a problem is this? Is it acceptable to have the identity payload in plaintext for the IX handshake? There's no identity hiding in that handshake pattern anyway, although it seems like if you compromised the Noise static key you could impersonate a node without needing their libp2p identity key by replaying a handshake message.

One way to add a bit of extra assurance that you actually possess the libp2p identity key would be to sign the handshake hash with the identity key after the handshake completes, as described in the channel binding section of the noise spec. We could then require the token to be present in each encrypted transport message.

Other thoughts?

@Kubuxu
Member

Kubuxu commented Aug 7, 2019

We could then require the token to be present in each encrypted transport message.

The signature of the token could be sent just once.

@Kubuxu
Member

Kubuxu commented Aug 8, 2019

Regarding rekeying: In case of Salsa family, it is not needed, you can transfer up to: 64 * 2^64 bytes = 1ZiB over a stream without rekeying. I'm not sure about AES, it very much depends on the scheme used for it.
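
The figure above is 64-byte blocks times a 64-bit block counter, i.e. 2^6 * 2^64 = 2^70 bytes, which is exactly one zebibyte. A short Go check of that arithmetic (function names are illustrative):

```go
package main

import (
	"fmt"
	"math/big"
)

// maxStreamBytes returns blockSize * 2^counterBits, the number of bytes a
// stream cipher can produce before its block counter wraps.
func maxStreamBytes(blockSize int64, counterBits uint) *big.Int {
	n := new(big.Int).Lsh(big.NewInt(1), counterBits)
	return n.Mul(n, big.NewInt(blockSize))
}

// zebibyte returns 2^70, the number of bytes in one ZiB.
func zebibyte() *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), 70)
}

func main() {
	limit := maxStreamBytes(64, 64)
	fmt.Println("limit in bytes:", limit)
	fmt.Println("equals 1 ZiB:", limit.Cmp(zebibyte()) == 0)
}
```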

@yusefnapora
Contributor Author

yusefnapora commented Aug 8, 2019

@Kubuxu good call on just sending the token once.

Here's a thought... if we end up sending the channel binding token in the first transport message because the signed identity payloads are replay-able (or in plaintext), why not just send the identity payload in the first transport message instead of the handshake and include the channel binding token in the payload?

The problem with that is that it requires both sides to send a transport message before the connection is authenticated. But if we don't trust the connection until we've received the channel binding token, that's the case anyway...

@Kubuxu
Member

Kubuxu commented Aug 8, 2019

Hmm, in theory, if we are storing a static key for some peer, this static key was proven in the past to correspond to the given peer id (either through signing the key or signing the session token to that peer).

So in the case of 0-RTT you don't really need to reconfirm that the session is bound to the given peer.

@raulk
Member

raulk commented Aug 9, 2019

Thoughts from a sync discussion with @yusefnapora:

  1. 0-RTT or message data facilities should NOT be exposed to the application layer. They should stay internal in libp2p. 0-RTT data should be purely informational and should not lead to side effects or mutation.

    • We plan to use 0-RTT to expedite multiplexer selection. That way, two peers can establish a capable libp2p connection (encrypted, multiplexed) within a single RTT, by announcing and intersecting supported multiplexers in message data.
      • This may be subject to downgrade attacks, though, but I think it's unavoidable.
      • It's definitely censorable in IX (plaintext), but not in XX.
    • Unreplayability is not a requirement if data is purely informational.
    • At the very least, implementations MUST ensure that 0-RTT data is signed to prevent MITM.
  2. Assuming the motivation for peer A to open a connection to peer B is to perform an RPC, IX and XX are equivalent in terms of round trips, whilst XX offers better security properties. Rationale: A can pipeline message 3 of XX + the initial message (encrypted).

  3. Exposing IK separately may be useless. A use case would be bootstrappers or other nodes whose identity we know beforehand, but the Noise static key is separate from the libp2p identity key, and we don't know it beforehand.

  4. I'm really for Noise Pipes.
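
The "announcing and intersecting supported multiplexers" step from point 1 might look something like the sketch below. It assumes each side sends an ordered list of muxer protocol IDs as early data and that the initiator's preference order wins; both the function name and that tie-breaking rule are assumptions, since the thread leaves the details to multiselect 2.0.

```go
package main

import "fmt"

// selectMuxer returns the first multiplexer in the initiator's preference
// order that the responder also supports. The boolean is false when the two
// lists have nothing in common.
func selectMuxer(initiator, responder []string) (string, bool) {
	supported := make(map[string]struct{}, len(responder))
	for _, m := range responder {
		supported[m] = struct{}{}
	}
	for _, m := range initiator {
		if _, ok := supported[m]; ok {
			return m, true
		}
	}
	return "", false
}

func main() {
	// Example protocol IDs; any strings would do for the sketch.
	initiator := []string{"/yamux/1.0.0", "/mplex/6.7.0"}
	responder := []string{"/mplex/6.7.0"}

	if muxer, ok := selectMuxer(initiator, responder); ok {
		fmt.Println("selected:", muxer)
	} else {
		fmt.Println("no common multiplexer")
	}
}
```

Because the lists are purely informational, replaying them has no side effects, which matches the requirement stated in point 1.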

@yusefnapora
Contributor Author

Thanks for capturing that raúl!

Talking through the replay scenarios helped clarify my thinking a lot. Like you mention earlier in the thread, replaying the first message doesn't let you complete a handshake without the private keys, in which case the game is over anyway. So long as accepting an initial handshake message has no potential side-effects, replays aren't a big concern.

Using the handshake messages to advertise multiplexers seems really useful, and it suggests that the internal API should be able to accept some "early data" to include in the handshake payloads. We should think about how that ought to look in practice, since it feels pretty intertwined with the multiselect 2 design. We might just want to note the possibility now, and note that the signed handshake payload might be extended in the future to include muxer negotiation data.

@yusefnapora
Contributor Author

yusefnapora commented Aug 13, 2019

I just pushed a big commit that restructures things a fair bit - I ended up ripping things up too much for a tidy commit history, and there are still some missing pieces, but I wanted to put it up and see what people think.

The biggest change is separating out the negotiation of Noise protocols from the rest of libp2p's protocol negotiation stack. multistream-select is still used to decide whether to use Noise at all, but the noise-libp2p implementation is in charge of selecting which concrete Noise protocol to use.

Choosing a Noise protocol happens using the Noise Socket protocol, which is a pretty simple message framing protocol that includes negotiation data. The Noise Pipes 0-RTT with fallback pattern is pretty simple to implement using Noise Sockets, which is nice.

I personally like this approach a bit better than defining a libp2p protocol ID for each Noise protocol and letting multistream-select figure it out. It seems a bit cleaner, and the negotiation data is fed into the Noise prologue to prevent tampering by a MITM. I put some bits of rationale at the end, but I'm very happy to hear counterpoints. It definitely pushes some complexity into the noise-libp2p implementation compared to letting multistream-select figure things out.

Other changes:

  • AESGCM is now required, ChaChaPoly is a SHOULD
  • the handshake payload can contain an opaque "early data" byte string (signed by the identity key)
  • the message framing is more clearly specified, largely thanks to being stolen from Noise Socket
  • Noise Pipes is communicated as optional, with the IK and XXfallback patterns dependent on it

I haven't addressed PSK at all yet but have been thinking about it. It seems like you should be able to initialize the module with a PSK at boot, which would convert all the handshake patterns you support into a psk variant, probably psk0. Anyone attempting a Noise handshake with you would then require the key, or else you will fail to decrypt the initial message and the handshake will fail.

@Kubuxu
Member

Kubuxu commented Aug 13, 2019

It makes sense to use psk0 patterns. There is also a bit of a problem with how it would work with current private networks implementation.

Private Network in libp2p right now is implemented with an additional layer of encryption with the PSK key (before anything else). This means that initial multiselect for crypto types is already encrypted and inclusion of PSK in Noise is meaningless (additional overhead already exists).

We could transition off the private networks as currently implemented, to having every security protocol to allow the inclusion of PSK in the negotiation/shared secret but until then this path would be dormant.

My recommendation is to spec this out for the future when we transition to implementing PSK on a transport security layer and not before it.

@marten-seemann AFAIK TLS 1.3 allows for the inclusion of PSK, does QUIC also support it?

@burdges

burdges commented Aug 13, 2019

AESGCM is now required, ChaChaPoly is a SHOULD

As noted in #202 (comment) I think AES-GCM should be required on platforms with AES-NI, and recommended when some degree of hardware acceleration and timing protections exist, à la https://blog.mozilla.org/security/2017/09/29/improving-aes-gcm-performance/ but we should recommend against AES-GCM when no hardware or timing protections exist. And ChaCha-Poly1305 should be required on all platforms.

That said, individual libp2p based networks should set their own policies. If you only run validator nodes on high end hardware, then sure require AES-GCM.

We should never negotiate about the handshakes or ciphersuites when using Noise.

TLS 1.3 makes negotiation secure. We lack the resources and buy-in by cryptographers to do negotiation correctly. Just use TLS 1.3 anytime you want negotiation.

Noise provides fast niche handshakes with precise control over when you reveal static keys. You only profit from Noise when you know exactly what you're doing, which any negotiation breaks.

@burdges

burdges commented Aug 13, 2019

Also, Noise supports signatures, certificates, etc. with early lesser protected protocol messages, so one must expose precise control over these messages, and never send data with the wrong message.

In polkadot's distant future, there are some situations where some node might contact another node, and anonymously* prove it possesses some credential, likely by winning some ring VRF contest. In this, the ring VRF would act as a certificate while we'd authenticate the connection with some ring KEX or ring signature, or maybe use another ring implicit certificate.

We'd select Noise for anything like this because we first select some handshake that mostly reveals everything exactly when we like, and then insert the ring blabla into the appropriate early messages.

(* It's not anonymous if you always use the same IP address, but we might care enough about performance that we'd avoid anything more anonymous.)

@yusefnapora
Contributor Author

We should never negotiate about the handshakes or ciphersuites when using Noise.

This would make me pretty happy, tbh, since it would simplify this spec & implementations enormously.

I feel like there are two motivations for negotiation that we should examine; maybe we can address or dismiss them and get rid of a lot of potential pitfalls.

The first motivation is runtime flexibility - if we want to support multiple handshake patterns and ciphers we need a way to distinguish between them. That doesn't necessarily mean that we need to negotiate protocols though; we just need to identify them. Instead of allowing the responder to propose a fallback protocol, we could just have them accept / reject. I'm pretty sure we could still adopt Noise Pipes without requiring the ability to negotiate arbitrary protocols by using trial decryption.

The second motivation is "future proofing", or the ability to evolve the protocol after it's been deployed in the wild. There seems to be a trend in modern security protocols against this goal, which I'm pretty sympathetic to. Allowing "upgrades" also allows downgrades and introduces potentially exploitable complexity in a sensitive area. We could take the position that future-proofing is a non-goal; breaking changes to the security protocol should break the deployed network in this mindset.

I think the runtime flexibility motivation is less compelling than future proofing, but the problem with future proofing is that you don't know what the future holds, so there's no guarantee that the complexity you add now will actually pay off. And in the meantime, it has a concrete cost in the form of edge cases and increased attack surface.

I'm not a Real Cryptographer by any means, so I mostly view my role in this process as flushing out discussions like this and writing up the outcome. Despite writing up how negotiation could work, I'm not tied to the idea at all and would be happy to scrap it.

Thanks for promoting that viewpoint @burdges - I think this is a really important point and I'm glad you bring it up. What does everyone think?

@kirushik

I really don't think that having cypher agility is desirable in this particular case: it goes against the major principle of robust crypto design, "have a single joint and keep it well-oiled".
Sometimes agility is inevitable (like in the TLS case, but even there they became pretty aggressive in reducing the number of supported variants in the latest versions), but if we can get away without it, we definitely should. Otherwise we will end up with a poor, incomplete and buggy reimplementation of TLS, and maybe we should've picked the actual TLS instead then?

This also applies to the future-proofing; can't we just specify this as noise-v1 or something, hardcode the cypher choice, and then upgrade to something different in noise-v2 when we feel like it's needed?
Luckily, crypto protocols (decent ones, I mean) are almost never broken suddenly, and instead we observe a gradually shifting state-of-the-art, weakening the security guarantees of the particular algorithm (something like this).
This will give us enough time (like, a couple of years) to specify and implement the replacement cyphersuite selection to be deployed in any maintained software.

We can then avoid this weird negotiation-inside-of-negotiation setup, and completely rely on multistream protocol selection (both current and redesigned, no additional requirements on multiselect2 imposed); our multiaddresses will still be readable (/ip4/1.2.3.4/tcp/42/noise2 is not any worse than just /ip4/1.2.3.4/tcp/42/noise); the resulting code will be much simpler (resulting both in more diverse and less buggy implementations); there will be less real-life problems with cypher selection incompatibilities (I've seen weird problems even in SSH cypher negotiation in real life).
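
A minimal sketch of the "one protocol ID, one hardcoded suite" idea described above: multistream-select picks the ID, and the ID alone fixes the handshake pattern and cipher suite, with nothing further to negotiate. The protocol IDs and the suite assigned to each are hypothetical choices for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// suites maps a libp2p protocol ID to the single Noise protocol it implies.
// Both entries are illustrative; "/noise2" stands in for some future
// breaking revision.
var suites = map[string]string{
	"/noise":  "Noise_XX_25519_ChaChaPoly_SHA256",
	"/noise2": "Noise_XX_25519_AESGCM_SHA256",
}

var errUnknownProtocol = errors.New("unknown security protocol")

// suiteFor returns the fixed Noise protocol name for a protocol ID.
// There is no fallback: an unknown ID is simply rejected.
func suiteFor(protocolID string) (string, error) {
	name, ok := suites[protocolID]
	if !ok {
		return "", errUnknownProtocol
	}
	return name, nil
}

func main() {
	for _, id := range []string{"/noise", "/noise2", "/noise3"} {
		name, err := suiteFor(id)
		if err != nil {
			fmt.Printf("%s: %v\n", id, err)
			continue
		}
		fmt.Printf("%s -> %s\n", id, name)
	}
}
```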

PS. I still think that Chacha-Poly should be the cypher to go for the current revision of noise in libp2p; at least its speed shouldn't be considered a limiting factor, at all.
For example, Wireguard uses a combination of Noise and Chacha-Poly, and it easily saturates a 1GB link on average hardware.
And to add another datapoint to my "hardcoded cyphersuites make software easier to implement and deploy", Wireguard is incredibly easy to set up (in contrast to, say, OpenVPN, which supports cypher agility and in general tends to delegate crypto-related choices to the end-user).

@yusefnapora
Contributor Author

@kirushik that sounds like a very reasonable take to me - defining a new multiaddr code or libp2p protocol id for infrequent breaking changes to the spec seems acceptable, and removing agility makes things so much simpler.

ChaChaPoly does seem like the cipher with the most "reach", in the sense that a solid implementation should be available on all platforms, whereas AES more-or-less needs hardware support. If we define a single cipher, that seems like the sensible choice.

The more I think about it, the more I dislike the "weird negotiation in negotiation" scheme I proposed :) I felt compelled to write it up because using multistream for Noise protocol negotiation seemed like it wasn't quite right, but maybe that's because we really shouldn't be doing Noise protocol negotiation at all...

@yusefnapora
Contributor Author

BTW, just want to quickly thank everyone for being so involved in this thread and helping me "think out loud" about the design. I don't trust myself to do it well by myself, but I do trust us to figure it out. 💯

@burdges

burdges commented Aug 14, 2019

We should support TLS 1.3 as a transport and TLS 1.3 addresses for numerous reasons, including that people might want runtime flexibility and future proofing TLS-style @yusefnapora

I think Noise should handle both flexibility and future proofing at the public key infrastructure level. In other words, there is an implicit negotiation of sorts when the code using libp2p first learns some node's public key and decides to ask libp2p to open a connection, because the requesting code must tell libp2p exactly which Noise handshake and ciphersuite to use.

You might fold relevant information into some multi-address format, so any libp2p network changing ciphersuite preferences happens in the defaults their code supplies to the multi-address writer and reader code.

Yes, we'll witness some networks making mistakes like distributing builds without AES-GCM but later deciding it should be the default, and making the transition too quickly so the nodes without AES-GCM were still running while nodes started decoding multi-address with AES-GCM as the default. Yet honestly, if they choose Noise over TLS 1.3 then they signed up for being careful about upgrade paths.

It's also possible the multi-address could express the protocol version for the code invoking libp2p, so that users only need to maintain a mapping from protocol versions and "roles" to noise configurations. In other words, a versioned-address instead of a multi-address.

Also, all developers understand versioning, so actually the more restrictive form designed and explained via versioning might prove more usable than any fancy magic negotiation schemes.


After seeing kirushik's comment, I'm saying there are several parameters so full names might resemble noise-xx-(curve25519)-aesgcm. At some level, the caller should always supply the xx since this really comes from the node's network role. You could optionally simplify the aesgcm part in some way, not by adding it into every multi-address, but by asking callers to provide a version mapping, so they give their own code something more like noise-role-joe_file_sharing-v3 and the address has something more like noise-joe_file_sharing-v2, so their own mapping determines the handshake and ciphersuites for talking between these two versions. I do not think this needs to be any sort of priority, but I'm saying it's ideologically how noise was designed to work, and if the goal is to make noise more user-friendly then one can do so without leaving this model.
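
The caller-supplied version mapping described above could look roughly like the following Python sketch. The label format, `NOISE_CONFIGS`, and `config_for` are hypothetical names, and the table entries are invented for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical application-owned table:
# (one side's version label, other side's version label) -> Noise protocol name.
# The application, not libp2p, decides what each pair of versions should speak.
NOISE_CONFIGS: Dict[Tuple[str, str], str] = {
    ("joe_file_sharing-v3", "joe_file_sharing-v3"): "Noise_XX_25519_AESGCM_SHA256",
    ("joe_file_sharing-v3", "joe_file_sharing-v2"): "Noise_XX_25519_ChaChaPoly_SHA256",
    ("joe_file_sharing-v2", "joe_file_sharing-v2"): "Noise_XX_25519_ChaChaPoly_SHA256",
}


def config_for(local_version: str, remote_version: str) -> str:
    """Look up the Noise protocol to use between two application versions."""
    # The table is treated as symmetric, so try both orderings of the pair.
    for key in ((local_version, remote_version), (remote_version, local_version)):
        if key in NOISE_CONFIGS:
            return NOISE_CONFIGS[key]
    raise LookupError(f"no Noise configuration for {local_version} <-> {remote_version}")
```

With a table like this, the address only needs to carry the remote version label; changing a ciphersuite default becomes an ordinary version bump in the application's own mapping.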

@tomaka
Member

tomaka commented Aug 15, 2019

Since we agree that being flexible over the cipher suite is a bad idea, and that the cipher suite should be known ahead of time, why exactly aren't we doing the same for the format of the static public key by restricting the use of Noise to Ed25519 and/or X25519 keys, rather than having nodes generate a separate key just for Noise?

I've asked this question before, to which the answer was "we can be flexible, so why not be flexible". But what I'd like to raise is: why be flexible?

@yusefnapora
Contributor Author

My objection to forcing Ed25519 keys isn't really about wanting flexible key types; it's about avoiding static key reuse with other security protocols. If we use the identity key as the static Noise key, I think we also have to tell users that if they use Noise, they can only use Noise and shouldn't allow other protocols like SECIO or TLS to use the same key.

I can't personally think of a scenario where reusing the key with SECIO would be a problem, but I also don't really think I'm qualified to evaluate that and so am taking a very conservative view.

@yusefnapora
Contributor Author

BTW, when I said a moment ago

I think we also have to tell users that if they use Noise, they can only use Noise and shouldn't allow other protocols like SECIO or TLS to use the same key.

Maybe this is okay? If the people that want to use Noise really don't care about using other protocols and are fine with generating new identities if they ever change their mind, then maybe this isn't a problem. In that case, I don't really have a problem with making this spec as "rigid" as possible, including the key type.

If we restrict to Ed25519 keys, we should still send the original public key in the handshake payload, since conversion to X25519 loses the sign bit of the key. But we don't need to send any signatures, which is a big simplification.
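
The lost sign bit can be seen directly in the standard birational map between the two curves, u = (1 + y) / (1 - y) mod p, which uses only the y-coordinate of the Ed25519 point. The sketch below is plain integer arithmetic for illustration (the function name is made up, and there is no point validation or constant-time behaviour), so it is not a substitute for a vetted crypto library.

```python
P = 2**255 - 19  # field prime shared by Ed25519 and Curve25519


def ed25519_pub_to_x25519(pub: bytes) -> bytes:
    """Map a 32-byte Ed25519 public key to the corresponding X25519 public key."""
    if len(pub) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    n = int.from_bytes(pub, "little")
    # The top bit encodes the sign of x; the map below never looks at it.
    y = n & ((1 << 255) - 1)
    if y >= P:
        raise ValueError("non-canonical y-coordinate")
    # u = (1 + y) / (1 - y) mod p, with the division done via Fermat inversion.
    inv = pow((1 - y) % P, P - 2, P)
    u = (1 + y) * inv % P
    return u.to_bytes(32, "little")
```

Two Ed25519 encodings that differ only in the top bit (a point and its negation) produce the same X25519 key, which is why the original Ed25519 key would still have to be sent in the handshake payload.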

@raulk
Member

raulk commented Aug 15, 2019

@tomaka the Noise spec advises against reusing the static key beyond Noise handshaking (e.g. for identity).

I concur with all the above comments that choosing a specific cipher suite and tagging that as noise-libp2p v1 is the way forward.

We can choose Curve25519, however restricting the libp2p host identity key to the same curve on the grounds of reusing it as the Noise static key is discouraged by the Noise spec itself.

So I’d rather follow that advice and link/assert the static key with the (polymorphic) libp2p identity key via a signature during handshake.
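
A rough sketch of that linkage, assuming the identity key signs the Noise static public key together with a domain-separation prefix. The prefix string, the `HandshakePayload` fields, and the `sign` / `verify` callables are placeholders: they stand in for whatever the identity key type provides and are not the spec's wire format.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical domain-separation prefix, so the signature is not valid in other contexts.
SIG_PREFIX = b"example-noise-static-key:"

Sign = Callable[[bytes], bytes]                 # signs with the identity private key
Verify = Callable[[bytes, bytes, bytes], bool]  # (identity_pub, message, signature) -> ok


@dataclass(frozen=True)
class HandshakePayload:
    identity_key: bytes  # serialized libp2p identity public key, any key type
    signature: bytes     # identity-key signature over prefix + Noise static public key


def make_payload(identity_pub: bytes, sign: Sign, noise_static_pub: bytes) -> HandshakePayload:
    """Build the payload that vouches for our own Noise static key."""
    return HandshakePayload(identity_pub, sign(SIG_PREFIX + noise_static_pub))


def check_payload(payload: HandshakePayload, verify: Verify, remote_static_pub: bytes) -> bool:
    """True if the identity key vouches for the static key seen in the Noise handshake."""
    return verify(payload.identity_key, SIG_PREFIX + remote_static_pub, payload.signature)
```

Because the signature is checked against the static key actually observed in the handshake, the identity key can be of any type and is never used for Diffie-Hellman itself.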

@raulk
Member

raulk commented Aug 15, 2019

In terms of message data, the proposal was to NOT make this feature available to the application layer. Rather, it would stay “confined” inside libp2p-land, and we’d use it to advertise and intersect multiplexers. This allows us to complete the entire connection bootstrapping process in a single round trip.
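
As an illustration of "advertise and intersect", here is a small Python sketch. It assumes both sides apply the same rule (the initiator's preference order wins) so that they reach the same answer without another round trip; that rule and the `pick_muxer` name are assumptions, not something settled in this thread.

```python
from typing import Optional, Sequence


def pick_muxer(initiator_order: Sequence[str],
               responder_supported: Sequence[str]) -> Optional[str]:
    """Return the initiator's most preferred multiplexer that the responder supports."""
    supported = set(responder_supported)
    for muxer in initiator_order:
        if muxer in supported:
            return muxer
    return None  # no overlap: fall back to a separate negotiation step, or fail
```

For example, with an initiator order of `["/yamux/1.0.0", "/mplex/6.7.0"]` and a responder that only supports `/mplex/6.7.0`, both peers would settle on `/mplex/6.7.0`.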

If we want to make it available to the application, we’d need to craft that API very carefully because the details of the Noise handshake are abstracted away, yet the application needs to be made a participant in the process.

I have ideas, but they create complexity that I’m not convinced is necessary.

@kirushik @burdges could you walk us through scenarios where you’d use this feature, and how you’d vary your usage depending on the available guarantees?

@yusefnapora
Contributor Author

the Noise spec advises against reusing the static key beyond Noise handshaking (e.g. for identity).

The way you phrase that reminds me that users might well be using their identity keypair for other purposes (e.g. signing pubsub messages, etc) and we can't assume that restricting the security protocol to Noise is enough to be safe.

So with that in mind I think using the identity key is a non-starter, and we might as well support all identity key types.

@yusefnapora
Contributor Author

About restricting the early data to "libp2p-land", since the set of multiplexers should be static, we can have the API accept the early data when you init the noise module. So there won't be an opportunity for any connection-specific data to ever end up in the payload.

If you do change your supported multiplexers at runtime, you'd have to re-init the module, but that seems fine.
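
One way to picture that API, as a hedged Python sketch: the early data is fixed when the module is constructed and the object is immutable afterwards. `NoiseSecurity`, `with_muxers`, and the newline-joined encoding are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NoiseSecurity:
    """Hypothetical noise module whose early data is fixed at construction time."""
    early_data: bytes

    @classmethod
    def with_muxers(cls, muxers: Tuple[str, ...]) -> "NoiseSecurity":
        # Encode the static muxer list once; newline-joined is an arbitrary choice.
        return cls("\n".join(muxers).encode("utf-8"))

    def handshake_early_data(self) -> bytes:
        # Every connection gets the same bytes; there is no per-connection input.
        return self.early_data
```

Changing the supported multiplexers then means building a new `NoiseSecurity` instance, which matches the "re-init the module" behaviour described above.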

@yusefnapora
Contributor Author

The latest commit removes explicit negotiation of Noise protocols, defines a "cipher suite" of 25519_ChaChaPoly_SHA256, fills in the libp2p api section, and adds some more color around treatment of early handshake data.

I also removed the optional IX handshake, since Noise Pipes can provide similar efficiency without sacrificing initiator privacy, and in many cases the final handshake message can be "pipelined" with the initiator's initial transport message, effectively removing the cost of the final 0.5 RTT in XX. Also, without explicit protocol negotiation, Noise Pipes is much simpler to implement without IX in the mix, since it's easier to distinguish between handshakes.

Still no PSK - I think we should defer this to the next round, when we promote to Candidate Recommendation. I'll add a "future work" section in a bit to cover this and other things like potential integration with QUIC.

I'll be off on vacation next week, but will be around reading feedback and chiming in.

Member

@raulk raulk left a comment


This is a solid document. I believe we should merge it as a working draft, in compliance with our specs lifecycle maturity process.

We have a fully-formed interest group (5 members), and implementations are already in progress in jvm-libp2p (@shahankhatch). go-libp2p is being bountied in ETHBerlinZwei.

A few aspects I would revisit:

  • Clarify the prose around the support of XX, IK and Noise Pipes. An implementation should either support XX or Noise Pipes. I think this is what we're trying to say, but we can benefit from being more explicit for clarity.
  • Refine the rationale for separating the Noise static key from the identity key. The conclusion is correct, but I think the explanation is imprecise.
  • Clarify that even though we specify several "Valid Noise Protocol Names", we don't actually use them in our wire messages.
  • Revisit the encrypted packet format, and particularly the usage and length-delimiting of the padding.

@raulk raulk merged commit c4a47d5 into master Aug 18, 2019
@raulk raulk deleted the rfc/noise branch August 18, 2019 12:21
@raulk raulk restored the rfc/noise branch August 18, 2019 12:21
@raulk raulk deleted the rfc/noise branch August 18, 2019 12:21
@cheatfate cheatfate mentioned this pull request Oct 22, 2019
37 tasks