Threat Model

Confusion is an encrypted messaging system which seeks to provide the following guarantees:

end-to-end confidentiality: an eavesdropper should not be able to learn anything about the contents of messages
end-to-end integrity: a man-in-the-middle should not be able to undetectably alter ciphertexts
pseudonymity: messages do not contain any long-term identifiers (e.g. long-term public keys or their fingerprints), as these could tie messages to a particular sender or receiver
forward secrecy: Confusion uses unique keys for each message and never exposes public keys on the wire
replay prevention: an attacker should never be able to confuse Confusion (pardon the pun) into accepting a replay of an old message as a new message

Confusion 101

Confusion provides end-to-end, client-side encryption. Clients rendezvous via a store-and-forward service (henceforth referred to as "Tombstone") that sees only ciphertexts and key fingerprints (i.e. hashes of public keys), never keys of any kind. Here's a simple text diagram:

[Alice]     <--- TLS <=== (crypto_box<message>) ===> TLS ---> [Tombstone]
[Tombstone] <--- TLS <=== (crypto_box<message>) ===> TLS ---> [Bob]

The following crypto protocols are used:

data-in-motion: Transport Layer Security (v1.2)
data-at-rest: NaCl's crypto_box

How The Bad Guys Want To Screw With Us

Here are some attacks and how much we feel they're worth mitigating:

Passive MitM

Priority: High
Likelihood: High

A passive eavesdropper should not be able to learn the contents of messages, or observe any type of repeat-use key ID/fingerprint/public key sent in plaintext. We mitigate this first with authenticated public-key encryption of the message (using NaCl's crypto_box), and additionally use TLS when communicating with the store-and-forward service.

We do nothing to mask IP addresses or the lengths of the messages. An attacker that is able to observe the lengths of messages and track the flow of information can likely deduce the sender and recipient. To mitigate this we suggest using an anonymizing overlay network like Tor.

Active MitM

Priority: High
Likelihood: High

Active attackers have wide ranging powers. We claim to mitigate the following threats:

Replay Attacks: Communication with the Tombstone store-and-forward service occurs over TLS, which mitigates replay attacks. Additionally, Confusion's messaging model is idempotent, so even a successful TLS replay will have no effect.
Chosen Ciphertext Attacks: TLS has a somewhat spotty track record here, but fortunately we depend on NaCl's crypto_box to ensure our ciphertexts are authentic.
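
To make the idempotency claim concrete, here is a minimal sketch of one way a store could behave so that a replayed delivery changes nothing. It assumes messages are stored under their single-use fingerprint with a first-write-wins rule; the `Mailbox` class and its `put` method are hypothetical names for illustration, not Tombstone's actual storage design.

```python
class Mailbox:
    """Hypothetical in-memory stand-in for a store-and-forward service."""

    def __init__(self):
        # Maps a single-use key fingerprint to the ciphertext stored under it.
        self._slots = {}

    def put(self, fingerprint: bytes, ciphertext: bytes) -> bool:
        """Store ciphertext under fingerprint.

        Returns True on first delivery and False if the slot is already
        filled, in which case the store is left unchanged.
        """
        if fingerprint in self._slots:
            return False
        self._slots[fingerprint] = ciphertext
        return True


box = Mailbox()
first = box.put(b"fp-1", b"opaque ciphertext")     # original delivery
replayed = box.put(b"fp-1", b"opaque ciphertext")  # replay of the same request
```

Under these assumptions the replayed request is a no-op: the store holds exactly one copy of the message, however many times the request is repeated.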

Denial of Service

Priority: Low
Likelihood: Directly proportional to the service's popularity

We'll try to stop DoS attacks if we can, but we don't care that much. This is a research project. We're not trying to run a 99.99999% service here. This isn't rocket surgery.

Tombstone Compromise

Priority: Low
Likelihood: High

It's rather likely Tombstone will be compromised. We'll try to make sure it doesn't happen, but you should assume it will. But... it's okay if it gets compromised! Tombstone is a "Trust No One" service (i.e. Tombstone itself is considered untrusted) and we're planning for that scenario. That's why we don't prioritize mitigating a Tombstone compromise.

Here is some of what an attacker might learn if they manage to compromise Tombstone:

Addresses of Tombstone users: We'll try hard to ensure these aren't logged, but we can't promise that. An attacker who has compromised Tombstone without our knowing will at least be able to monitor who it is talking to from the moment of compromise (T0) onward.
Message Ciphertexts: All Confusion messages are encrypted and identify their recipient only by a single-use public key fingerprint (i.e. a hash of a public key, not the raw public key). An attacker that compromises Tombstone will be able to see single-use public key fingerprints and crypto_box encrypted ciphertexts. Nothing else.
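
As a rough picture of the message data exposed by such a compromise, the sketch below models a stored message as a record with exactly two fields. The `StoredMessage` type and `visible_fields` helper are hypothetical and only illustrate the description above; the real on-disk or wire format is not specified here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StoredMessage:
    """Hypothetical record of what the service holds for one message."""

    fingerprint: bytes  # hash of a single-use public key, never the key itself
    ciphertext: bytes   # crypto_box output, opaque to the service


def visible_fields() -> list:
    """Names of everything an attacker reading the store could see per message."""
    return [f.name for f in fields(StoredMessage)]
```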

tl;dr: Messages to and from Tombstone are additionally wrapped in TLS for paranoia's sake, but other than that the encryption that actually matters occurs in a completely end-to-end capacity.

Endpoint Compromise

Priority: Low
Likelihood: High

If your box is pwned, you're pwned, sorry. One day we hope to have a free dial-an-infosec-ninja service where your very own infosec ninja will descend from a rope lowered out of a Chinook helicopter which has just flown to your house on-demand. He will scurry about your home checking all your equipment for potential NSA compromises, probably ripping a lot of it out and replacing it with Ninja-certified NinjaGear.

Until then, sorry, we're not responsible for endpoint compromises. And that's probably how you'll get owned.

tl;dr: Don't fuck with the NSA or other nation state-level adversaries, period. You'll lose unless you are literally MacGyver.

Target Channels

Our system will use two communication channels which are of interest to attackers:

Network: the messaging network over which most communications will be performed (a.k.a. the Internet)
OOB: an out-of-band channel we will use to establish trust by communicating a shared secret

Attacker Capabilities

We conjecture that attackers against this system are capable of observing both Network and OOB channels.

We will assume the attacker has basic man-in-the-middle capabilities and can both observe and manipulate traffic en route.

Goals

Given the attacker capabilities listed above, we would like to provide the following:

Message Confidentiality: The attacker should not be able to read the messages being sent between participants. In addition, we will strive for confidentiality even in the event that the OOB channel (which is used to establish trust) is being monitored.
Message Integrity: The attacker should not be able to undetectably manipulate messages.
Pseudonymity: We avoid using repeat identifiers (i.e. long-term keys) in messages.

Extras

We will take some additional steps that add a modicum of defense-in-depth. These are paranoid features with arguably nebulous security value, but we'll take them anyway. It's likely that none of the following are serious threats, and that if this software is compromised, it will happen through a much more practical vector than any of these, but hey: why not?

Public keys never sent in plaintext: It's unlikely we'll have an adversary that can solve the discrete logarithm problem, which would be necessary to derive a private key from a public key. However, we'll avoid sending public keys in the clear anyway, instead preferring a key fingerprint derived as H(pk), where pk is a Diffie-Hellman public key and H() is a cryptographically secure hash function (we use BLAKE2b).
Random nonces never sent in plaintext: A common technique in cryptography is to use a random nonce or initialization vector and send it in the clear along with the message. The downside of this approach is that if a poor random number generator is used, the output provided by the RNG can potentially be used by an attacker to figure out things like keys. While we strive to use good random number generators, we avoid giving the attacker information about the state of a (broken) RNG by design: we use a unique key per message, which is never sent in the clear (see above).
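
The fingerprint H(pk) described above can be sketched with the BLAKE2b implementation in Python's standard library. This is an illustration under assumptions: the 32-byte digest size, the `fingerprint` function name, and the random bytes standing in for a Diffie-Hellman public key are all hypothetical, and Confusion's actual parameters may differ.

```python
import hashlib
import os


def fingerprint(public_key: bytes, digest_size: int = 32) -> bytes:
    """Return H(pk) using BLAKE2b. The digest size is an assumed parameter."""
    return hashlib.blake2b(public_key, digest_size=digest_size).digest()


# Stand-in for a 32-byte single-use public key. A real client would obtain
# this from its crypto library rather than from os.urandom.
pk = os.urandom(32)
fp = fingerprint(pk)
```

Only `fp` would appear on the wire. Because each message uses a fresh key, each message also carries a fresh fingerprint, so fingerprints cannot be used to link messages to one another.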
