Skip to content
Tony Arcieri edited this page Apr 13, 2014 · 12 revisions

The Problem

In order for two parties to communicate securely, they must somehow establish a shared secret which is used to encrypt a message. We can establish this shared secret using what's known as a Diffie-Hellman exchange. This lets us swap public keys with the person we intend to communicate with. Then we can combine our private key with the other person's public key to derive a shared secret. They do the same thing on their end with our public key and their secret key and arrive on the same shared secret.

However, unless we have some way to ensure we have the correct public key, a man-in-the-middle can easily swap his key for the one we're expecting. To have a secure communication channel, we must bootstrap from some other secure communication channel.

Even if we pull this off successfully, there's a possibility this shared secret will get leaked at some point in the future. In some messaging systems, people maintain long-term secrets that, if compromised, would reveal the contents of all previous messages. We'd like to minimize this risk if possible.

Improving on Historical Failures

We conjecture that systems which prompt users to verify a public key fingerprint and provide a clickthrough dialogue for doing so are fundamentally flawed, because they do not incentivize users to actually verify public key fingerprints, and that in practice public key fingerprints go unchecked because users simply do not care to verify these fingerprints.

To remedy this, we propose a system where no communication can take place unless a secret value has been communicated between users. We also seek to structure the system in such a way that even if this secret bootstrap value is communicated across an insecure channel and exposed to malicious users, the confidentiality of the system is not affected (however metadata about the parties communicating will be leaked)

Establishing a Secure Channel

Systems like GPG would typically have you verify what's known as a public key fingerprint to make sure you have someone's correct public key. This can be done in many ways: in person, via a telephone call (if you actually recognize the person's voice and provided you're not talking to a voice impersonator), or through an existing secure channel (e.g. OTR-encrypted IM, provided you've taken the aforementioned steps to verify someone's OTR public key.

The problem is, in practice, people quite often don't verify public key fingerprints. No mechanism in the software forces you to do this. People can simply click through prompts asking them to confirm the authenticity of a public key fingerprint without ever bothering to check if they have the right one. This undermines the security of the software.

We'll be taking a slightly different approach. We'll be randomly generating a password (as well as a salt) and using this to encrypt a set of ephemeral keys.

Ephemeral Keys

Two parties interested in communicating will be agreeing on a set of ephemeral keys. Confusion generates 128 of them per key exchange. Once they have all been used, another key exchange will have to take place.

All of the ephemeral private keys are generated randomly and independently using the randombytes implementation from libsodium. This will use /dev/urandom on *IX operating systems, and CryptGenRandom on Windows. We specifically avoid the use of KDFs to derive the entire set of private keys from some seed value to improve the forward secrecy of the protocol, and attempt to ensure that there is no deterministic relationship between the keys which can easily be leaked (although e.g. /dev/urandom is likely to be a deterministic PRNG underneath. We can't help that)

Confusion's ephemeral keys are presently based on the Curve25519 high-speed elliptic curve Diffie-Hellman algorithm. The protocol is designed to be upgradable, and future versions of the software may switch to the Ed25519 or Curve3617 algorithms.

Public Key IDs

Along with each of the public keys, we'll generate a public key ID which they are paired with. This is to avoid exposing public keys on the wire. While an attacker who obtains a public key should not be able to derive its shared secret without knowing a private key because, at least to our knowledge, no one has solved the discrete logarithm problem yet, we're going to try to leak as little information as possible and try to keep the public keys off the wire unencrypted. We're paranoid.

In order to do this, we're going to generate a random number associated with each key, then hash it. We'll be matching the key sizes of Curve25519: the public key ID will be based on a 256-bit random number which we then produce a 256-bit digest of.

It's hoped in this way that public key IDs are largely unrelated to the public keys they're associated with, although they may ultimately stem from the same output stream of an underlying PRNG.

Keyset Exchange

To establish a secure communication channel with a third party, a person (let's call her Alice) generates what's known as a keyset which consists of a set of Curve25519 public keys:

-----BEGIN CONFUSION KEYSET-----
qz2imk66nb53oqogcwe7jwjcwv2y6ski6ebq6idajrtnpu7yh2qq
b265kyrkwu26x23lqflafuodzxdpcilkmpbynlwnl7bb3fcxif5a
zhs2pqfrdkkmapjtahekc7diresb6vv2u7zg4jcc75wk66kye2oa
7a6sg4m3q6dq5rwlgq2nfb72hbyxjgw5jk7csdcg4eh2n4vughea
rqownarf4ootqk2525vecl3s75vcibhowbeqd26adfmlwkyrl6pa
[...]
-----END CONFUSION KEYSET-----

There should be exactly 2048 of these pairs inside of the file. The first 1024 are used for sending messages to Alice, and the last 1024 are keys Alice uses to send a specific recipient messages. All values are encoded in Base32 which has been converted to lower case.

The Confusion software itself then randomly selects a password. We have the software generate a password rather than relying on the user to pick one because users pick notoriously bad passwords. The passwords we select look like the following:

words random chosen entropy sufficient enough

This example intends to show that we select 6 English words at random from a large dictionary. This password will be used as the basis for a symmetric key which we will use to establish trust between two parties.

To derive a shared symmetric key, we will use the scrypt algorithm which is designed for use as a password-based key derivation function. We'll also generate a random salt, use this as an input to scrypt along with the password, and compute a symmetric key which is used to encrypt the keyset block.

We will encrypt the keyset block plaintext using the secretbox authenticated symmetric encryption primitive, which is based on the Salsa20 stream cipher and the Poly1305 message authentication code (MAC). By using a cryptographic MAC we ensure both that an attacker cannot compromise the integrity of the handshake in transit without us noticing, and also that the recipient has correctly typed the password.

After encrypting the plaintext of the keyset block, we encode it in ORDO encrypted block format. The resulting file looks like the following:

-----BEGIN ENCRYPTED ORDO BLOCK-----
kdf:
  ordo.pbkdf:///scrypt-400-8-36;pd2k42md65qrsq36y3hflivzfdofn742pugn
  v66x3ppetwjifq4kicyugtui6jgpka
scheme:
  ordo.pbkdf+xsalsa20poly1305

[...]

This file now contains an authenticated encrypted version of the ephemeral keyset which can now be sent to the other party.

Handshake Example

We'll now describe the handshake in terms of two parties named Alice and Bob.

Alice wants to send encrypted messages to Bob using Confusion. The first thing she'll do is request a set of keys from the software. Confusion will generate 2048 random Diffie-Hellman keypairs. The private keys will be kept locally. Confusion then chooses a passphrase we'll call P with which to encrypt Alice's set of 2048 public keys. We'll call the encrypted file containing these public keys C.

The Confusion software posts C to the central Tombstone server. The ciphertext is made publicly available to all users of Tombstone, alongside all other recently submitted D-H rendezvous. All Confusion clients eagerly consume the newly submitted D-H keysets, masking any individual attempt by a client to obtain a particular keyset. Clients try all active rendezvous passwords on the keysets they've downloaded, and if they're successful, begin part of the rendezvous "dance".

Bob wants to rendevous with Alice, who has posted keys to Tombstone, and gives her a phone call. We'll assume that Bob can recognize Alice's voice, and that Bob's ability to recognize Alice's voice is his way of authenticating that he's talking to the real Alice. We will also note that Alice is a privacy-conscious person and keeps her voice conversations confidential using an encrypted voice application (such as RedPhone or any other ZRTP-based voice app)

Alice now gives Bob the passphrase P to unlock the file. Now anyone who has access to C and P (i.e. Alice, Bob, and any eavesdroppers) can decrypt the set of D-H keys and use them to exchange messages.