SECIO spec #106

Merged
merged 10 commits into from
Jun 11, 2019
338 changes: 338 additions & 0 deletions secio/README.md
@@ -0,0 +1,338 @@
# SECIO 1.0.0

> A stream security transport for libp2p. Streams wrapped by SECIO use secure
> sessions to encrypt all traffic.

| Lifecycle Stage | Maturity Level | Status | Latest Revision |
|-----------------|----------------|--------|-----------------|
| 3A | Recommendation | Active | r0, 2019-05-27 |

Authors: [@jbenet], [@bigs], [@yusefnapora]

Interest Group: [@Stebalien], [@richardschneider], [@tomaka], [@raulk]

[@jbenet]: https://github.com/jbenet
[@bigs]: https://github.com/bigs
[@yusefnapora]: https://github.com/yusefnapora
[@Stebalien]: https://github.com/Stebalien
[@richardschneider]: https://github.com/richardschneider
[@tomaka]: https://github.com/tomaka
[@raulk]: https://github.com/raulk

See the [lifecycle document](../00-framework-01-spec-lifecycle.md) for context
about maturity level and spec status.

## Table of Contents

- [SECIO 1.0.0](#secio-100)
    - [Table of Contents](#table-of-contents)
    - [Implementations](#implementations)
    - [Algorithm Support](#algorithm-support)
        - [Exchanges](#exchanges)
        - [Ciphers](#ciphers)
        - [Hashes](#hashes)
    - [Data Structures](#data-structures)
    - [Protocol](#protocol)
        - [Prerequisites](#prerequisites)
        - [Message framing](#message-framing)
        - [Proposal Generation](#proposal-generation)
        - [Determining Roles and Algorithms](#determining-roles-and-algorithms)
        - [Key Exchange](#key-exchange)
            - [Key marshaling](#key-marshaling)
        - [Shared Secret Generation](#shared-secret-generation)
        - [Key Stretching](#key-stretching)
        - [Creating the Cipher and HMAC signer](#creating-the-cipher-and-hmac-signer)
        - [Initiate Secure Channel](#initiate-secure-channel)
            - [Secure Message Framing](#secure-message-framing)
            - [Initial Packet Verification](#initial-packet-verification)

## Implementations

- [js-libp2p-secio](https://github.com/libp2p/js-libp2p-secio)
- [go-secio](https://github.com/libp2p/go-libp2p-secio)
- [rust-libp2p](https://github.com/libp2p/rust-libp2p/tree/master/protocols/secio)

## Algorithm Support

SECIO allows participating peers to support a subset of the following
algorithms.

### Exchanges

The following elliptic curves are used for ephemeral key generation:

- P-256
- P-384
- P-521

### Ciphers

> Member: Ciphers used for key stretching and for message encryption once SECIO channel is established.

> @GriffinMB (Apr 11, 2019): Why is a deprecated cipher (Blowfish) supported? From the Go package docs:
>
> > Blowfish is a legacy cipher and its short block size makes it vulnerable to birthday bound attacks (see https://sweet32.info). It should only be used where compatibility with legacy systems, not security, is the goal.
> >
> > Deprecated: any new system should use AES (from crypto/aes, if necessary in an AEAD mode like crypto/cipher.NewGCM) or XChaCha20-Poly1305 (from golang.org/x/crypto/chacha20poly1305).
>
> Since the spec is not yet finalized, it would be cool if support was removed. If there is some mitigating circumstance (I haven't gone through the source very thoroughly!), a note about it would be helpful.

> Member: 👍

> It would also be worth noting what mode the AES ciphers should use. From the Go implementation, it looks like it's CTR!
>
> Also, more generally, is this an implementation of any well-reviewed specification? If not, is there a reason why? I would imagine there are a number of specs for low-overhead, encrypted channels which may prevent future pitfalls.

> Member: This is definitely not well-reviewed and will be deprecated in favor of TLS quite soon. See: https://github.com/libp2p/go-libp2p-tls/.
>
> (go-ipfs now has experimental support which we'll likely upgrade to default support after a release or two).

> Gotcha, makes sense! And that answers my follow-up questions as well :)


The following symmetric ciphers are used for encryption of messages once
the SECIO channel is established:

- AES-256
- AES-128

Note that current versions of `go-libp2p` support the Blowfish cipher; however,
support for Blowfish will be dropped in future releases, and it should not be
considered part of the SECIO spec.

### Hashes

> Member: Hashes used for key stretching, and for HMACs once SECIO channel is established.

> Contributor: What mode and padding is AES-* using? Are there any parameters for Blowfish?


The following hash algorithms are used for key stretching and for HMACs once
the SECIO channel is established:

- SHA256
- SHA512

## Data Structures

The SECIO wire protocol features two message types defined in the version 2 syntax of the
[protobuf description language](https://developers.google.com/protocol-buffers/docs/proto).

```protobuf
syntax = "proto2";

message Propose {
  optional bytes rand = 1;
  optional bytes pubkey = 2;
  optional string exchanges = 3;
  optional string ciphers = 4;
  optional string hashes = 5;
}

message Exchange {
  optional bytes epubkey = 1;
  optional bytes signature = 2;
}
```


These two messages, `Propose` and `Exchange`, are the only serialized types
required to implement SECIO.

> Member: Let's dump the current state of the protobufs here. Relying on a reference that can mutate can render the spec incoherent at a later time. Also, we're seeking to version specs in general, so capturing the current state and versioning the spec as it evolves is fair play.

> Member: Nevermind, I see this feedback is recurrent below.

## Protocol

### Prerequisites

Prior to undertaking the SECIO handshake described below, it is assumed that
we have already established a dedicated bidirectional channel between both
parties, and that both have agreed to proceed with the SECIO handshake
using [multistream-select][multistream-select] or some other form of protocol
negotiation.

### Message framing

All messages sent over the wire are prefixed with the message length in bytes,
encoded as an unsigned variable length integer as defined
by the [multiformats unsigned-varint spec][unsigned-varint].
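
The framing rule above can be sketched in Go as follows. This is a minimal illustration, not taken from any SECIO implementation: `frame` and `unframe` are hypothetical helper names, and the sketch assumes that Go's `encoding/binary` varint encoding (base-128, least significant group first) is acceptable for the lengths involved. The multiformats spec adds constraints, such as a maximum encoded size, that this sketch does not enforce.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frame prefixes msg with its length in bytes, encoded as an unsigned varint.
func frame(msg []byte) []byte {
	prefix := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(prefix, uint64(len(msg)))
	return append(prefix[:n], msg...)
}

// unframe reads one length-prefixed message from the front of data and
// returns the message together with any bytes that follow it.
func unframe(data []byte) (msg, rest []byte, err error) {
	size, n := binary.Uvarint(data)
	if n <= 0 {
		return nil, nil, errors.New("invalid varint length prefix")
	}
	if uint64(len(data)-n) < size {
		return nil, nil, errors.New("truncated message")
	}
	end := n + int(size)
	return data[n:end], data[end:], nil
}

func main() {
	framed := frame([]byte("hello"))
	msg, rest, err := unframe(framed)
	fmt.Printf("framed=%x msg=%q rest=%d err=%v\n", framed, msg, len(rest), err)
}
```

A message shorter than 128 bytes gets a one-byte prefix; longer messages use additional prefix bytes.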

### Proposal Generation

SECIO channel negotiation begins with a proposal phase.

Each side will construct a `Propose` protobuf message (as defined [above](#data-structures)),
setting the fields as follows:

| field | value |
|-------------|--------------------------------------------------------------------------------------|
| `rand` | A 16-byte random nonce, generated with a cryptographically secure random number generator |
| `pubkey` | The sender's public key, serialized [as described in the peer-id spec][peer-id-spec] |
| `exchanges` | A list of supported [key exchanges](#exchanges) as a comma-separated string |
| `ciphers` | A list of supported [ciphers](#ciphers) as a comma-separated string |
| `hashes` | A list of supported [hashes](#hashes) as a comma-separated string |


Both parties serialize this message and send it over the wire. If either party
has prior knowledge of the other party's peer id, they may attempt to validate
that the given public key can be used to generate the same peer id, and may
close the connection if there is a mismatch.
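
As a rough sketch of the proposal step, the Go code below fills in the fields from the table. The `Propose` struct here is a hand-written stand-in that mirrors the protobuf message; a real implementation would use the type generated by a protobuf compiler and serialize it. `newProposal` is a hypothetical helper name, and the public key bytes are a placeholder.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"strings"
)

// Propose mirrors the fields of the protobuf message of the same name.
// It is a stand-in for illustration, not a generated protobuf type.
type Propose struct {
	Rand      []byte
	Pubkey    []byte
	Exchanges string
	Ciphers   string
	Hashes    string
}

// newProposal draws a fresh 16-byte nonce from the operating system's
// secure random source and joins each algorithm list with commas.
func newProposal(pubkey []byte, exchanges, ciphers, hashes []string) (*Propose, error) {
	nonce := make([]byte, 16)
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return &Propose{
		Rand:      nonce,
		Pubkey:    pubkey,
		Exchanges: strings.Join(exchanges, ","),
		Ciphers:   strings.Join(ciphers, ","),
		Hashes:    strings.Join(hashes, ","),
	}, nil
}

func main() {
	p, err := newProposal(
		[]byte("placeholder public key bytes"),
		[]string{"P-256", "P-384", "P-521"},
		[]string{"AES-256", "AES-128"},
		[]string{"SHA256", "SHA512"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("nonce=%x exchanges=%s ciphers=%s hashes=%s\n",
		p.Rand, p.Exchanges, p.Ciphers, p.Hashes)
}
```

The order of each list matters, because the next step selects the first mutually supported entry from the preferred peer's list.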


### Determining Roles and Algorithms

Next, the peers use a deterministic formula to compute their roles in the coming
exchanges. Each peer computes:

```
oh1 := sha256(concat(remotePeerPubKeyBytes, myNonce))
oh2 := sha256(concat(myPubKeyBytes, remotePeerNonce))
```

Where `myNonce` is the `rand` component of the local peer's `Propose` message,
and `remotePeerNonce` is the `rand` field from the remote peer's proposal.

> Contributor: This is the first use of the word "nonce". Does it refer to the rand field of the Propose message?

> Member: Yes.

With these hashes, determine which peer's preferences to favor. This peer will
be referred to as the "preferred peer". If `oh1 == oh2`, then the peer is
communicating with itself and should return an error. If `oh1 < oh2`, use the
remote peer's preferences. If `oh1 > oh2`, prefer the local peer's preferences.

Given our preference, we now sort through each of the `exchanges`, `ciphers`,
and `hashes` provided by both peers, selecting the first item from our preferred
peer's set that is also shared by the other peer.
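
The role computation and algorithm selection can be sketched in Go as below. The function names `order` and `selectBest` are hypothetical, and the sketch assumes that the two digests are compared bytewise in lexicographic order, which the text above does not state explicitly.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
	"strings"
)

// hashConcat returns sha256(concat(a, b)).
func hashConcat(a, b []byte) []byte {
	h := sha256.New()
	h.Write(a)
	h.Write(b)
	return h.Sum(nil)
}

// order returns 1 when the local peer's preferences win and -1 when the
// remote peer's preferences win. Equal digests are reported as an error.
// Assumption: digests are compared bytewise (lexicographically).
func order(localPub, localNonce, remotePub, remoteNonce []byte) (int, error) {
	oh1 := hashConcat(remotePub, localNonce)
	oh2 := hashConcat(localPub, remoteNonce)
	cmp := bytes.Compare(oh1, oh2)
	if cmp == 0 {
		return 0, errors.New("peer appears to be communicating with itself")
	}
	return cmp, nil
}

// selectBest returns the first entry of the preferred peer's
// comma-separated list that also appears in the other peer's list.
func selectBest(preferred, other string) (string, error) {
	for _, p := range strings.Split(preferred, ",") {
		for _, o := range strings.Split(other, ",") {
			if p == o {
				return p, nil
			}
		}
	}
	return "", errors.New("no algorithm in common")
}

func main() {
	cmp, err := order([]byte("local key"), []byte("local nonce"),
		[]byte("remote key"), []byte("remote nonce"))
	if err != nil {
		panic(err)
	}
	local, remote := "AES-256,AES-128", "AES-128,AES-256"
	if cmp < 0 {
		// The remote peer is preferred, so its list drives the choice.
		local, remote = remote, local
	}
	cipher, err := selectBest(local, remote)
	fmt.Println(cmp, cipher, err)
}
```

Because each peer's `oh1` is the other peer's `oh2`, both sides reach opposite comparison results and therefore agree on which peer is preferred.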

### Key Exchange

Now the peers prepare a key exchange.

Both peers generate an ephemeral keypair using the elliptic curve algorithm that was
chosen from the proposed `exchanges` in the previous step.

With keys generated, both peers create an `Exchange` message. First, each peer
generates a "corpus" that it will sign.

```
corpus := concat(myProposalBytes, remotePeerProposalBytes, ephemeralPubKey)
```

The `corpus` is then signed using the permanent private key associated with the local
peer's peer id, producing a byte array `signature`.


| field | value |
|-------------|---------------------------------------------------------------------------|
| `epubkey` | The ephemeral public key, marshaled as described [below](#key-marshaling) |
| `signature` | The `signature` of the `corpus` described above |


The peers serialize their `Exchange` messages and write them over the wire. Upon
receiving the remote peer's `Exchange`, the local peer will compute the remote peer's
expected `corpus` using the known proposal bytes and the ephemeral public key sent by
the remote peer in the `Exchange`. The `signature` can then be validated using the
permanent public key of the remote peer obtained in the initial proposal.

Peers MUST close the connection if the signature does not validate.
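
A minimal Go sketch of building, signing, and verifying the corpus follows. It assumes the permanent identity key is an Ed25519 key purely for illustration; the peer-id spec allows other key types. All function names are hypothetical, and the proposal and ephemeral key bytes are placeholders.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// newIdentity creates a throwaway Ed25519 key pair that stands in for a
// peer's permanent identity key (an assumption made for this sketch).
func newIdentity() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// corpus concatenates the sender's serialized proposal, the receiver's
// serialized proposal, and the sender's marshaled ephemeral public key.
func corpus(senderProposal, receiverProposal, senderEpub []byte) []byte {
	out := make([]byte, 0, len(senderProposal)+len(receiverProposal)+len(senderEpub))
	out = append(out, senderProposal...)
	out = append(out, receiverProposal...)
	return append(out, senderEpub...)
}

// signExchange produces the signature field of the local Exchange message.
func signExchange(priv ed25519.PrivateKey, myProposal, remoteProposal, myEpub []byte) []byte {
	return ed25519.Sign(priv, corpus(myProposal, remoteProposal, myEpub))
}

// verifyExchange rebuilds the remote peer's corpus from the receiver's point
// of view (remote proposal first) and checks the remote signature.
func verifyExchange(remotePub ed25519.PublicKey, remoteProposal, myProposal, remoteEpub, sig []byte) bool {
	return ed25519.Verify(remotePub, corpus(remoteProposal, myProposal, remoteEpub), sig)
}

func main() {
	pub, priv, err := newIdentity()
	if err != nil {
		panic(err)
	}
	propA, propB := []byte("proposal A bytes"), []byte("proposal B bytes")
	epubA := []byte("ephemeral key A bytes")

	sig := signExchange(priv, propA, propB, epubA)
	// Peer B verifies with A's proposal first, then its own.
	fmt.Println("valid:", verifyExchange(pub, propA, propB, epubA, sig))
}
```

Note that the proposal order in the corpus depends on who is signing: each peer places its own proposal first, so the verifier must swap the order relative to its own corpus.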

#### Key marshaling

Within the `Exchange` message, ephemeral public keys are marshaled into the
uncompressed form specified in section 4.3.6 of ANSI X9.62.

This is the behavior provided by the Go standard library's
[`elliptic.Marshal`](https://golang.org/pkg/crypto/elliptic/#Marshal) function.
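
As a hedged illustration of the uncompressed encoding, the sketch below uses Go's `crypto/ecdh` package (Go 1.20 or later) rather than `elliptic.Marshal`. For the NIST curves, `PublicKey.Bytes` returns the uncompressed point: a `0x04` byte followed by the X and Y coordinates. The helper names `curveByName` and `generateEphemeral` are hypothetical.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// curveByName maps an exchange name from the proposal to a curve.
func curveByName(name string) (ecdh.Curve, error) {
	switch name {
	case "P-256":
		return ecdh.P256(), nil
	case "P-384":
		return ecdh.P384(), nil
	case "P-521":
		return ecdh.P521(), nil
	}
	return nil, fmt.Errorf("unsupported exchange %q", name)
}

// generateEphemeral creates an ephemeral key pair on the named curve and
// returns the private key plus the public key in uncompressed form.
func generateEphemeral(name string) (*ecdh.PrivateKey, []byte, error) {
	curve, err := curveByName(name)
	if err != nil {
		return nil, nil, err
	}
	priv, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	return priv, priv.PublicKey().Bytes(), nil
}

func main() {
	for _, name := range []string{"P-256", "P-384", "P-521"} {
		_, pub, err := generateEphemeral(name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %d bytes, first byte 0x%02x\n", name, len(pub), pub[0])
	}
}
```

The marshaled key is one byte longer than twice the curve's coordinate size, for example 65 bytes for P-256.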

### Shared Secret Generation

Peers now generate their shared secret by combining their ephemeral private key with the
remote peer's ephemeral public key.

First, the remote ephemeral public key is unmarshaled into a point on the elliptic curve
used in the agreed-upon exchange algorithm. If the point is not valid for the agreed-upon
curve, secret generation fails and the connection must be closed.

The remote ephemeral public key is then combined with the local ephemeral private key
by means of elliptic curve scalar multiplication. The result of the multiplication is
the shared secret, which will then be stretched to produce MAC and cipher keys, as
described in the next section.
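
The following Go sketch shows one way to perform this step with `crypto/ecdh` (Go 1.20 or later). `NewPublicKey` rejects encodings that are not valid points on the curve, which corresponds to the validity check above, and `ECDH` performs the scalar multiplication. The sketch assumes the shared secret is the fixed-length X coordinate that `ECDH` returns; deployed implementations may encode the coordinate differently (for example, without leading zero bytes), so check the implementation you need to interoperate with. The helper names are hypothetical.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// curveByName maps an exchange name from the proposal to a curve.
func curveByName(name string) (ecdh.Curve, error) {
	switch name {
	case "P-256":
		return ecdh.P256(), nil
	case "P-384":
		return ecdh.P384(), nil
	case "P-521":
		return ecdh.P521(), nil
	}
	return nil, fmt.Errorf("unsupported exchange %q", name)
}

// newEphemeral returns a fresh private key and its marshaled public key.
func newEphemeral(name string) (*ecdh.PrivateKey, []byte, error) {
	curve, err := curveByName(name)
	if err != nil {
		return nil, nil, err
	}
	priv, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	return priv, priv.PublicKey().Bytes(), nil
}

// sharedSecret unmarshals the remote ephemeral public key, failing if it is
// not a valid point on the agreed curve, and combines it with the local
// ephemeral private key.
func sharedSecret(name string, local *ecdh.PrivateKey, remoteEpub []byte) ([]byte, error) {
	curve, err := curveByName(name)
	if err != nil {
		return nil, err
	}
	remote, err := curve.NewPublicKey(remoteEpub)
	if err != nil {
		return nil, fmt.Errorf("invalid remote ephemeral key: %w", err)
	}
	return local.ECDH(remote)
}

func main() {
	privA, pubA, err := newEphemeral("P-256")
	if err != nil {
		panic(err)
	}
	privB, pubB, err := newEphemeral("P-256")
	if err != nil {
		panic(err)
	}
	secretA, errA := sharedSecret("P-256", privA, pubB)
	secretB, errB := sharedSecret("P-256", privB, pubA)
	fmt.Println(string(secretA) == string(secretB), errA, errB)
}
```

Both peers arrive at the same secret without ever sending it over the wire.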

### Key Stretching

The key stretching process uses an HMAC algorithm to derive encryption and MAC keys
and a stream cipher initialization vector from the shared secret.

Key stretching produces the following three values for each peer:

- A MAC key used to initialize an HMAC algorithm for message verification
- A cipher key used to initialize a block cipher
- An initialization vector (IV), used to generate a CTR stream cipher from the block cipher

The key stretching function will return two data structures `k1` and `k2`, each containing
the three values above.

Before beginning the stretching process, the size of the IV and cipher key are determined
according to the agreed-upon cipher algorithm. The sizes (in bytes) used are as follows:

| cipher type | cipher key size | IV size |
|-------------|-----------------|---------|
| AES-128 | 16 | 16 |
| AES-256 | 32 | 16 |

The generated MAC key will always have a size of 20 bytes.

Once the sizes are known, we can compute the total size of the output we need to generate
as `outputSize := 2 * (ivSize + cipherKeySize + macKeySize)`.

The stretching algorithm will then proceed as follows:

First, an HMAC instance is initialized using the agreed-upon hash function, with the shared secret as the HMAC key.

A fixed seed value of `"key expansion"` (encoded into bytes as UTF-8) is fed into the HMAC
to produce an initial digest `a`.

Then, the following process repeats until `outputSize` bytes have been generated:

- reset the HMAC instance or generate a new one using the same hash function and shared secret
- compute digest `b` by feeding `a` and the seed value into the HMAC:
    - `b := hmac_digest(concat(a, "key expansion"))`
- append `b` to previously generated output (if any).
    - if, after appending `b`, the generated output exceeds `outputSize`, the output is truncated to `outputSize` and generation ends.
- reset the HMAC and feed `a` into it, producing a new value for `a` to be used in the next iteration
    - `a = hmac_digest(a)`
- repeat until `outputSize` is reached

Having generated `outputSize` bytes, the output is then split into six parts to
produce the final return values `k1` and `k2`:

```
| k1.IV | k1.CipherKey | k1.MacKey | k2.IV | k2.CipherKey | k2.MacKey |
```

The size of each field is determined by the cipher key and IV sizes detailed above.
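
The stretching procedure described above can be sketched in Go as follows. The names `StretchedKeys` and `stretch` are chosen for this example, and the cipher and hash identifiers are assumed to be the strings listed under Algorithm Support.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"crypto/sha512"
	"errors"
	"fmt"
	"hash"
)

// StretchedKeys holds the three values derived for one peer.
type StretchedKeys struct {
	IV, CipherKey, MacKey []byte
}

const macKeySize = 20 // the MAC key is always 20 bytes

// stretch derives k1 and k2 from the shared secret.
func stretch(cipherType, hashType string, secret []byte) (k1, k2 StretchedKeys, err error) {
	var ivSize, keySize int
	switch cipherType {
	case "AES-128":
		ivSize, keySize = 16, 16
	case "AES-256":
		ivSize, keySize = 16, 32
	default:
		return k1, k2, errors.New("unsupported cipher: " + cipherType)
	}
	var newHash func() hash.Hash
	switch hashType {
	case "SHA256":
		newHash = sha256.New
	case "SHA512":
		newHash = sha512.New
	default:
		return k1, k2, errors.New("unsupported hash: " + hashType)
	}

	seed := []byte("key expansion")
	outputSize := 2 * (ivSize + keySize + macKeySize)

	mac := hmac.New(newHash, secret)
	mac.Write(seed)
	a := mac.Sum(nil) // initial digest a

	out := make([]byte, 0, outputSize+mac.Size())
	for len(out) < outputSize {
		mac.Reset()
		mac.Write(a)
		mac.Write(seed)
		out = mac.Sum(out) // append b = hmac_digest(concat(a, seed))

		mac.Reset()
		mac.Write(a)
		a = mac.Sum(nil) // a = hmac_digest(a)
	}
	out = out[:outputSize] // drop any excess from the last digest

	split := func(b []byte) StretchedKeys {
		return StretchedKeys{
			IV:        b[:ivSize],
			CipherKey: b[ivSize : ivSize+keySize],
			MacKey:    b[ivSize+keySize:],
		}
	}
	half := outputSize / 2
	return split(out[:half]), split(out[half:]), nil
}

func main() {
	k1, k2, err := stretch("AES-128", "SHA256", []byte("example shared secret"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("k1: iv=%x key=%x mac=%x\n", k1.IV, k1.CipherKey, k1.MacKey)
	fmt.Printf("k2: iv=%x key=%x mac=%x\n", k2.IV, k2.CipherKey, k2.MacKey)
}
```

Because the derivation is deterministic, both peers compute identical `k1` and `k2` from the same shared secret, hash, and cipher choice.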

### Creating the Cipher and HMAC signer

With `k1` and `k2` computed, swap the two values if the remote peer is the
preferred peer. After swapping if necessary, `k1` becomes the local peer's key
and `k2` the remote peer's key.

Each peer now generates an HMAC signer using the agreed-upon hash algorithm and the
`MacKey` produced by the key stretcher.

Each peer will also initialize the agreed-upon block cipher using the generated
`CipherKey`, and will then initialize a CTR stream cipher from the block cipher
using the generated initialization vector `IV`.
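
A minimal Go sketch of this step is shown below. It assumes AES was negotiated as the cipher and SHA256 as the hash; `channelKeys`, `assignKeys`, and `newStreamAndMAC` are hypothetical names, and the all-zero keys in `main` are placeholders for the stretched key material.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/sha256"
	"errors"
	"fmt"
	"hash"
)

// channelKeys holds one peer's output from the key stretcher.
type channelKeys struct {
	IV, CipherKey, MacKey []byte
}

// assignKeys swaps k1 and k2 when the remote peer is the preferred peer,
// so that the first return value is always the local peer's key set.
func assignKeys(k1, k2 channelKeys, remotePreferred bool) (local, remote channelKeys) {
	if remotePreferred {
		return k2, k1
	}
	return k1, k2
}

// newStreamAndMAC builds the CTR stream cipher and the HMAC signer for one
// key set. The AES variant follows from the key length (16 or 32 bytes).
// Assumption: SHA256 was the negotiated hash.
func newStreamAndMAC(k channelKeys) (cipher.Stream, hash.Hash, error) {
	block, err := aes.NewCipher(k.CipherKey)
	if err != nil {
		return nil, nil, err
	}
	if len(k.IV) != block.BlockSize() {
		return nil, nil, errors.New("IV length must equal the cipher block size")
	}
	return cipher.NewCTR(block, k.IV), hmac.New(sha256.New, k.MacKey), nil
}

func main() {
	// Placeholder key material; real values come from key stretching.
	k1 := channelKeys{IV: make([]byte, 16), CipherKey: make([]byte, 32), MacKey: make([]byte, 20)}
	k2 := channelKeys{IV: make([]byte, 16), CipherKey: make([]byte, 32), MacKey: make([]byte, 20)}

	local, _ := assignKeys(k1, k2, false)
	stream, mac, err := newStreamAndMAC(local)
	if err != nil {
		panic(err)
	}
	ciphertext := make([]byte, 5)
	stream.XORKeyStream(ciphertext, []byte("hello"))
	mac.Write(ciphertext)
	fmt.Printf("ciphertext=%x tag=%x\n", ciphertext, mac.Sum(nil))
}
```

Each peer would typically build two such pairs: one from its local keys for outgoing data and one from the remote keys for incoming data.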

### Initiate Secure Channel

With the cipher and HMAC signer created, the secure channel is ready to be
opened.

#### Secure Message Framing

To communicate over the channel, peers send packets containing an encrypted
body and an HMAC signature of the encrypted body.

The encrypted body is produced by applying the stream cipher initialized
previously to an arbitrary plaintext message payload. The encrypted data
is then fed into the HMAC signer to produce the HMAC signature.

Once the encrypted body and HMAC signature are known, they are concatenated
together, and their combined length is prefixed to the resulting payload.

Each packet is of the form:

```
[uint32 length of packet | encrypted body | hmac signature of encrypted body]
```

The packet length is in bytes, and it is encoded as an unsigned 32-bit integer
in network (big-endian) byte order.

> Member:
>
> - We encode the lengths as big-endian (network-order).
> - Need to specify what "encrypt" means.
> - Need to specify how we mac.

> Contributor: Does the padding style for encrypt need to be specified?

> Member: Really, we should specify everything. Ideally, we'd point to an RFC. However, we should optimize for merging something that's correct rather than waiting for something that's perfect.

> @richardschneider (Contributor, Nov 13, 2018): I'm assuming that the length includes the encrypted body and the hmac signature.
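
The packet layout can be sketched in Go as follows. This is an illustration under stated assumptions, not a reference implementation: it assumes AES in CTR mode with HMAC-SHA256, that the HMAC is reset before each packet, and that the length field covers the encrypted body plus the HMAC signature. The type `half` and its methods are hypothetical names.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
	"hash"
)

// half is one direction of the channel: a CTR stream plus an HMAC signer.
type half struct {
	stream cipher.Stream
	mac    hash.Hash
}

func newHalf(key, iv, macKey []byte) (*half, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	if len(iv) != block.BlockSize() {
		return nil, errors.New("IV length must equal the cipher block size")
	}
	return &half{cipher.NewCTR(block, iv), hmac.New(sha256.New, macKey)}, nil
}

// seal encrypts plaintext, signs the ciphertext, and prepends the length.
func (h *half) seal(plaintext []byte) []byte {
	body := make([]byte, len(plaintext))
	h.stream.XORKeyStream(body, plaintext)
	h.mac.Reset() // assumption: each packet is signed independently
	h.mac.Write(body)
	tag := h.mac.Sum(nil)

	packet := make([]byte, 4, 4+len(body)+len(tag))
	binary.BigEndian.PutUint32(packet, uint32(len(body)+len(tag)))
	packet = append(packet, body...)
	return append(packet, tag...)
}

// open checks the length and the HMAC signature, then decrypts the body.
func (h *half) open(packet []byte) ([]byte, error) {
	if len(packet) < 4 {
		return nil, errors.New("packet too short")
	}
	payload := packet[4:]
	if uint64(binary.BigEndian.Uint32(packet)) != uint64(len(payload)) || len(payload) < h.mac.Size() {
		return nil, errors.New("bad packet length")
	}
	split := len(payload) - h.mac.Size()
	body, tag := payload[:split], payload[split:]
	h.mac.Reset()
	h.mac.Write(body)
	if !hmac.Equal(tag, h.mac.Sum(nil)) {
		return nil, errors.New("HMAC verification failed")
	}
	plaintext := make([]byte, len(body))
	h.stream.XORKeyStream(plaintext, body)
	return plaintext, nil
}

func main() {
	// All-zero key material is used only to keep the demo short.
	key, iv, macKey := make([]byte, 16), make([]byte, 16), make([]byte, 20)
	sender, err := newHalf(key, iv, macKey)
	if err != nil {
		panic(err)
	}
	receiver, err := newHalf(key, iv, macKey)
	if err != nil {
		panic(err)
	}
	packet := sender.seal([]byte("ping"))
	plaintext, err := receiver.open(packet)
	fmt.Printf("%d-byte packet -> %q (err=%v)\n", len(packet), plaintext, err)
}
```

The sketch verifies the HMAC before decrypting, so a tampered packet is rejected without advancing the receiver's stream cipher state.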

#### Initial Packet Verification

The first packet transmitted by each peer must be the remote peer's nonce.

Each peer will decrypt the message body and validate the HMAC signature,
comparing the decrypted output to the nonce it sent in its own initial
`Propose` message. If either peer is unable to validate the initial
packet against the known nonce, they must abort the connection.

If both peers successfully validate the initial packet, the secure channel has
been opened and is ready for use, using the framing rules described
[above](#secure-message-framing).


[peer-id-spec]: https://github.com/libp2p/specs/peer-ids/peer-ids.md

[multistream-select]: https://github.com/multiformats/multistream-select
[unsigned-varint]: https://github.com/multiformats/unsigned-varint