From 49741f498805fe57ffb880e37ba026350a34d4f7 Mon Sep 17 00:00:00 2001
From: The8472
Date: Thu, 8 Oct 2015 19:18:43 +0200
Subject: [PATCH 01/14] first draft of encrypted torrents

---
 html/beps/bep_encrypted_data.rst | 229 +++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)
 create mode 100644 html/beps/bep_encrypted_data.rst

diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst
new file mode 100644
index 0000000..a6ec2bb
--- /dev/null
+++ b/html/beps/bep_encrypted_data.rst
@@ -0,0 +1,229 @@
+:BEP: ??
+:Title: Encrypted Torrent Payload
+:Version: $Revision$
+:Last-Modified: $Date$
+:Author: The 8472
+:Status: Draft
+:Type: Standards Track
+:Content-Type: text/x-rst
+:Created: 04-Oct-2015
+:Post-History:
+
+
+Abstract
+========
+
+This BEP specifies a way to apply symmetric encryption to torrent payload at the storage layer and additionally encrypt some metadata with the following goals:
+
+* confidentiality
+* limited privacy
+
+and non-goals:
+
+* forward-secrecy
+* anonymity
+* signature-based authentication
+* authentication of peer connections
+
+
+
+Rationale
+=========
+
+BitTorrent swarms are mostly an open system well-suited for mass-distribution of data to the public.
+
+Some use-cases require that the data is only distributed to a closed, trusted group of peers.
+In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt[robots]_ that it should not be announced to the world by web crawlers.
+
+
+While the private flag [BEP-27]_ may be sufficient in a controlled environment to prevent information about the torrent (e.g. its infohash) from escaping and thus preventing others from connecting to the swarm this is a very brittle form of security which also prevents the use of public infrastructure such as open trackers, PEX or the DHT.
+Similarly Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping. + + +Instead of attempting to restrict access to the swarm or metadata itself this BEP proposes to simply making all data opaque to 3rd parties by encrypting it with a shared secret that is not communicated by any torrent-related protocol, i.e. must be obtained separately by the user. + +In principle the same properties can be provided simply by storing the data in an encrypted archive and using nondescript filename, but that requires users to store the data twice or use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. + +Metadata format +=============== + + +.. parsed-literal:: + + { + info: { + bepXX: { + probe: *<32bytes of hash output (string)>*, + salt: *<32bytes of random binary data (string)>*, + shadow: ** + v: , + }, + length: **, + name: **, + piece length: ** + pieces: ** + }, + } + + +``salt`` + the random data must be generated by a cryptographically secure RNG to avoid IV reuse. + +``v`` + The current version, ``1``. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. + +``shadow`` + bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary. implementations should only shadow a whitelist of keys which they know to be shadowable. shadowable keys suggested by this BEP: ``files, name, comment``. + +``probe`` + used to quickly verify keys + +``name`` + the name field is mandatory. 
an implementation may either provide a random string, provide a public name that reveals less information than the actual name in the shadow dictionary or may elect to not encrypt the name at all (i.e. not have a shadow name) + +``files`` + the files list is optional. Until an implementation knows whether a shadow file list exists it should treat a public file list as purely decorative. Only when the shadow dictionary is absent or has been decrypted can the implementation know for certain how the canonical layout for the decrypted data looks like. + + +To protect privacy an implementation should use shadowing for any additional keys that reveal information about the payload + + +Encryption +========== + +Cryptographic primitives: SHA256, AES256 + + Key.root = random key + + Key.payload = PBKDF2(HMAC−SHA256, Key.root, salt || "payload", 4096, 256) + + Key.shadow = sha256(Key.payload || "shadow") + + probe = sha256(Key.shadow) + + IV.payload = truncate_64(sha256(salt || "payload")) + + IV.shadow = truncate_64(sha256(salt || "shadow")) + + +All data is encrypted with AES256-CTR, with the respective IVs occupying the first 64bits of the nonce and the AES-block counter occupying the lower 64 bits. + +The optional ``shadow`` dictionary is encrypted with ``Key.shadow`` and ``IV.shadow`` . + +Before calculating the ``pieces`` hashes all files are concated in ``files`` order (if there is more than one) and encrypted with ``Key.payload`` and ``IV.payload``. + +Encryption/Decryption of the payload happens at a lower layer than the ``pieces`` hash calculation. I.e. ``files -(concat)-> pieces`` has been replaced with ``files -(concat)-> encryption -> pieces``. + +An implementation unaware of this BEP would simply store the ciphertext to the disk in a ``length``-sized file with the public name. + +This scheme only provides authentication for the ciphertext through the ``pieces`` hashes. 
An incorrect key could result in garbage plaintext, but this does not introduce a new problem since bittorrent never guaranteed that the files contain what the metadata claims. + +Key reuse and hierarchy +----------------------- + +The usage of a salt to derive the payload key from the root key allows the root key to be reused across several torrents while still generating distinct payload keys for each. But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse. + +An implementation may provide the option to attempt to decrypt a torrent with the same key as another torrent in case a key is only communicated once and individual torrents are later distributed without explicitly providing keys. + +In some circumstances it may make sense to reveal a particular key lower in the hierarchy without revealing an upper key. For example a user may upload a torrent to an indexing site and provide the shadow key so it can extract keywords for fulltext search. + +Or a user may want to share a particular torrent without revealing the root key used to protect multiple other torrents, in that case revealing the payload key for that torrent will be sufficient. + +The probe value can be used to determine to which level of the hierarchy a key belongs by first assuming it is the shadow key and checking if the hash matches matches the probe, then assuming it is the payload key and then double-hashing it etc. + +Key sharing +=========== + +Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in an unstructured ways. + +They MAY also allow a torrent to be converted between plain- and ciphertext storage mode on demand. This enables use-cases where the key is shared at a later point in time or where the user does not want to permanently keep the plaintext around. 
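The key hierarchy and probe check that this first draft describes under "Encryption" and "Key reuse and hierarchy" can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a reference implementation: the function names are hypothetical, ``Key.root`` is assumed to be the PBKDF2 password with ``salt || "payload"`` as the PBKDF2 salt, the trailing ``256`` is read as an output length in bits, and ``truncate_64`` is assumed to keep the first 8 bytes of the digest (the draft does not say which 64 bits are kept).

```python
import hashlib
import hmac
import os


def derive_payload_key(root_key: bytes, salt: bytes) -> bytes:
    """Key.payload = PBKDF2(HMAC-SHA256, Key.root, salt || "payload", 4096, 256)."""
    # Assumption: Key.root is the PBKDF2 password and 256 means 256 bits (32 bytes).
    return hashlib.pbkdf2_hmac("sha256", root_key, salt + b"payload", 4096, dklen=32)


def derive_shadow_key(payload_key: bytes) -> bytes:
    """Key.shadow = sha256(Key.payload || "shadow")."""
    return hashlib.sha256(payload_key + b"shadow").digest()


def compute_probe(shadow_key: bytes) -> bytes:
    """probe = sha256(Key.shadow)."""
    return hashlib.sha256(shadow_key).digest()


def derive_iv(salt: bytes, label: bytes) -> bytes:
    """IV.<label> = truncate_64(sha256(salt || label))."""
    # Assumption: truncate_64 keeps the first 8 bytes of the digest.
    return hashlib.sha256(salt + label).digest()[:8]


def classify_key(candidate: bytes, salt: bytes, probe: bytes):
    """Walk the hierarchy bottom-up; return "shadow", "payload", "root" or None."""
    # First assume the candidate is the shadow key itself.
    if hmac.compare_digest(compute_probe(candidate), probe):
        return "shadow"
    # Then assume it is the payload key and derive the shadow key from it.
    if hmac.compare_digest(compute_probe(derive_shadow_key(candidate)), probe):
        return "payload"
    # Finally assume it is the root key and derive both lower keys.
    payload_key = derive_payload_key(candidate, salt)
    if hmac.compare_digest(compute_probe(derive_shadow_key(payload_key)), probe):
        return "root"
    return None


if __name__ == "__main__":
    salt = os.urandom(32)
    root = os.urandom(32)
    payload = derive_payload_key(root, salt)
    shadow = derive_shadow_key(payload)
    probe = compute_probe(shadow)
    # The last candidate is an unrelated random key and should not match.
    for key in (root, payload, shadow, os.urandom(32)):
        print(classify_key(key, salt, probe))
```

Checking the cheapest hypothesis first (a single hash for the shadow key) keeps the common case fast; the PBKDF2 step only runs when the candidate is neither a shadow nor a payload key.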
+ +They MAY provide a way to export/import them in a machine-readable way but SHOULD only do so after user opt-in or highlighting the secret part to avoid accidental publication. + +Proposed format for magnet links ``&key=`` where the key in hex-encoded form. + +Similarly an implementation may include a ``bepXX key`` dictionary key in torrent root dictionary, albeit in raw binary form, if the user explicitly requests the torrent to be exported that way, e.g. for archival purposes. Since it is in the root dictionary and not the info dictionary it will not leak via metadata exchange. +But to avoid accidental publication the file should be appropriately named to highlight that it contains a secret key. + +Web services that request that users reveal keys for a specific use-case (e.g. metadata extraction) SHOULD NOT rely on the .torrent format. It is less fault-prone and thus safer to specify the requested keys separately in an upload form than attempting to strip them from the torrent after they have been included. + + + +Security Properties +=================== + +The goal is to provide security equivalent to publicly distributing an encrypted archive where the file index is encrypted with a separate key that can be revealed without revealing the payload key. + +In particular that means: + +* swarms remain open, anyone can participate in a swarm, with or without access to the secrets +* an observer without access to the secrets does not know what data is being shared +* correctness of the metadata cannot be confirmed without access to both secrets +* observing that someone participated in a swarm and uploaded data is no longer equivalent to knowing that they had access to the plaintext or knowledge of the metadata +* the ciphertext is accessible to the public. this may be desirable to provide upload bandwidth without knowledge of the content, e.g. 
to allow untrusted servers to distribute confidential data to trusted clients or to enable hosting without the need to proactively moderate user content.
+
+
+Limitations:
+
+* there is no forward secrecy. should the secrets become available to an unauthorized party at some future point they will be able to decrypt ciphertext they have downloaded in the past and retroactively associate content with observed users
+* deniability is fairly weak, if someone learns the shared secrets or has knowledge how it is distributed they may also draw conclusions whether a particular participant in a swarm could have had access to it.
+
+
+Summary of UI concerns
+======================
+
+Torrent creation
+----------------
+
+1. user selects whether he wants to use encryption at all
+2. if yes then offer
+   * to generate a random key. user may instead opt to reuse a key from another torrent
+   * to provide a meaningful public name distinct from the shadow name
+   * to only encrypt the payload and not shadow any metadata
+
+
+Key input
+---------
+
+* option to use the root key of another decrypted torrent
+* immediate feedback whether the key matches the probe and what kind of key it was (root, payload, shadow)
+
+Torrent/Magnet/Key export
+-------------------------
+
+Provide option to
+
+* not include key
+* include shadow key only (if there is any shadowed metadata)
+* include torrent-specific key
+* include root key (decrypts X additional torrents known by client, possibly more)
+
+
+Test Vectors
+============
+
+## TODO
+
+
+References
+==========
+
+## TODO
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+
+..
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: + From 3458fae88b37ab89adab6c6eefea90f610412a5a Mon Sep 17 00:00:00 2001 From: The8472 Date: Fri, 9 Oct 2015 22:37:36 +0200 Subject: [PATCH 02/14] - replace probe with mac - remove section on keys in torrents - specify separate torrent key files instead - better single file / multi file handling --- html/beps/bep_encrypted_data.rst | 135 ++++++++++++++++++++++--------- 1 file changed, 97 insertions(+), 38 deletions(-) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst index a6ec2bb..234b15f 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_encrypted_data.rst @@ -15,14 +15,14 @@ Abstract This BEP specifies a way to apply symmetric encryption to torrent payload at the storage layer and additionally encrypt some metadata with the following goals: -* confidentiality +* confidentiality * limited privacy and non-goals: * forward-secrecy * anonymity -* signature-based authentication +* signature-based authentication, already covered by [BEP 35] * authentication of peer connections @@ -40,9 +40,9 @@ While the private flag [BEP-27]_ may be sufficient in a controlled environment t Similarly Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping. -Instead of attempting to restrict access to the swarm or metadata itself this BEP proposes to simply making all data opaque to 3rd parties by encrypting it with a shared secret that is not communicated by any torrent-related protocol, i.e. must be obtained separately by the user. +Instead of attempting to restrict access to the swarm or metadata this BEP proposes to make all data opaque to 3rd parties by encrypting it with a shared secret that is not available through any torrent-related protocol, i.e. must be obtained separately by the user. 
-In principle the same properties can be provided simply by storing the data in an encrypted archive and using nondescript filename, but that requires users to store the data twice or use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. +In principle the same properties can be provided by simply storing the data in an encrypted archive and using nondescript a filename, but that requires users to store the data twice or to use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. Metadata format =============== @@ -53,10 +53,10 @@ Metadata format { info: { bepXX: { - probe: *<32bytes of hash output (string)>*, + mac: *<32bytes of hmac output (string)>*, salt: *<32bytes of random binary data (string)>*, shadow: ** - v: , + v: **, }, length: **, name: **, @@ -64,8 +64,8 @@ Metadata format pieces: ** }, } - - + + ``salt`` the random data must be generated by a cryptographically secure RNG to avoid IV reuse. @@ -73,16 +73,21 @@ Metadata format The current version, ``1``. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. ``shadow`` - bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary. implementations should only shadow a whitelist of keys which they know to be shadowable. shadowable keys suggested by this BEP: ``files, name, comment``. + bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary. 
+ If it is absent only the payload is encrypted and all info dictionary entries are non-shadowed. + Implementations should only shadow a whitelist of keys which they have a shadowing strategy. + Shadowable keys suggested by this BEP: ``length, files, name, comment``. -``probe`` - used to quickly verify keys +``mac`` + message authentication code covering the info dictionary ``name`` - the name field is mandatory. an implementation may either provide a random string, provide a public name that reveals less information than the actual name in the shadow dictionary or may elect to not encrypt the name at all (i.e. not have a shadow name) - -``files`` - the files list is optional. Until an implementation knows whether a shadow file list exists it should treat a public file list as purely decorative. Only when the shadow dictionary is absent or has been decrypted can the implementation know for certain how the canonical layout for the decrypted data looks like. + the name field is a mandatory part of [BEP 3]_, therefore a placeholder MUST be provided if a shadow name is used. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. + +``files``, ``length`` + The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow file information then the public information is canonical. + If the ``files`` or ``length`` are shadowed then the overall payload length MUST be consistent with the public version. + If a shadow dictionary is present the public information should be treated as decorative / advisory until it can be determined whether it has been shadowed, i.e. until the shadow data can be decrypted. 
To protect privacy an implementation should use shadowing for any additional keys that reveal information about the payload @@ -93,22 +98,23 @@ Encryption Cryptographic primitives: SHA256, AES256 - Key.root = random key + Key.root = random key (recommended length: 256bits) Key.payload = PBKDF2(HMAC−SHA256, Key.root, salt || "payload", 4096, 256) Key.shadow = sha256(Key.payload || "shadow") - probe = sha256(Key.shadow) + mac = HMAC−SHA256(, Key.shadow) IV.payload = truncate_64(sha256(salt || "payload")) IV.shadow = truncate_64(sha256(salt || "shadow")) +AES256-CTR is used for encryption, with the respective IVs occupying the first 64bits of the nonce and the AES-block counter occupying the lower 64 bits. -All data is encrypted with AES256-CTR, with the respective IVs occupying the first 64bits of the nonce and the AES-block counter occupying the lower 64 bits. +The optional ``shadow`` dictionary is encrypted with ``Key.shadow`` and ``IV.shadow``. -The optional ``shadow`` dictionary is encrypted with ``Key.shadow`` and ``IV.shadow`` . +The ``mac`` is calculated over the bencoded info-dictionary excluding the ``mac`` key value pair. Before calculating the ``pieces`` hashes all files are concated in ``files`` order (if there is more than one) and encrypted with ``Key.payload`` and ``IV.payload``. @@ -129,24 +135,66 @@ In some circumstances it may make sense to reveal a particular key lower in the Or a user may want to share a particular torrent without revealing the root key used to protect multiple other torrents, in that case revealing the payload key for that torrent will be sufficient. -The probe value can be used to determine to which level of the hierarchy a key belongs by first assuming it is the shadow key and checking if the hash matches matches the probe, then assuming it is the payload key and then double-hashing it etc. 
+The mac can also be used to determine to which level of the hierarchy a key belongs by first assuming it is the shadow key and attempting to verify the info-dictionary against it, then assuming it is the payload key, deriving the shadow key and then attempting to verify it etc. Key sharing =========== -Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in an unstructured ways. +Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in unstructured ways. The hex-encoded form should be used for this purpose. + +Encouraging users to share keys without bundling them with torrents or magnets in a structured way allows them to exchange them over separate channels and also makes it slightly more difficult to crawl the internet for unintentionally disclosed keys. + +Web services that request that users reveal keys for a specific use-case (e.g. metadata extraction) can ask for the key in a separate input field in their forms / APIs. +They SHOULD NOT store or in turn reveal the keys to visitors if that is not essential for their use-case. + +Keys MUST NOT be included in .torrent files in any form. Too much infrastructure for crawling and automatic mass-distribution of .torrent files exists and to a user it would not be obvious whether a torrent contains keys or not, thus making accidental disclosure likely. + +Magnets +------- + +Clients should only include a key if the user explicitly requests it or if the secret part has been sufficiently highlighted to make him aware of what type of secret he is sharing. + +To include a key in magnet links the parameter ``&key=`` can be added where the key is in hex-encoded form. + +The importing client can determine which type of key it is based on the ``mac`` in the metadata. + +Key files +--------- + +To export keys to a file, e.g. 
for archival purposes or for bulk torrent migration between clients, the following bencoded format can be used: -They MAY also allow a torrent to be converted between plain- and ciphertext storage mode on demand. This enables use-cases where the key is shared at a later point in time or where the user does not want to permanently keep the plaintext around. +.. parsed-literal:: + + { + torrent-keys: { + *<20 bytes infohash>*: { + root: **, + payload: **, + shadow: ** + }, + ... + }, + } + + +``.torrent-keys`` should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user. -They MAY provide a way to export/import them in a machine-readable way but SHOULD only do so after user opt-in or highlighting the secret part to avoid accidental publication. +A file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. -Proposed format for magnet links ``&key=`` where the key in hex-encoded form. -Similarly an implementation may include a ``bepXX key`` dictionary key in torrent root dictionary, albeit in raw binary form, if the user explicitly requests the torrent to be exported that way, e.g. for archival purposes. Since it is in the root dictionary and not the info dictionary it will not leak via metadata exchange. -But to avoid accidental publication the file should be appropriately named to highlight that it contains a secret key. -Web services that request that users reveal keys for a specific use-case (e.g. metadata extraction) SHOULD NOT rely on the .torrent format. It is less fault-prone and thus safer to specify the requested keys separately in an upload form than attempting to strip them from the torrent after they have been included. +Storage layer +============= +This BEP does not mandate how an implementation should store encrypted or decrypted data on disk. 
+ +However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following: + +* clients can be in 3 different knowledge states: no keys, shadow key only, keys that decrypt plaintext; two encryption states: encrypted, decrypted; 3 file layout 3 states: encrypted, multi-file, single-fil +* a user may start downloading a torrent before he has access to the keys. this requires a way to input keys and to convert between encrypted and decrypted storage +* to reduce the amount of data that a compromised system could reveal a seeder may want to import plaintext data, convert it to encrypted form and request that the client discards the keys. + +Since encrypted torrents may contain confidential / private data implementations may also want to set more restrictive file permissions when decrypting data to reduce exposure in multi-user environments. Security Properties @@ -166,11 +214,20 @@ In particular that means: Limitations: * there is no forward secrecy. should the secrets become available to an unauthorized party at some future point they will be able to decrypt ciphertext they have downloaded in the past and retroactively associate content with observed users -* deniability is fairly weak, if someone learns the shared secrets or has knowledge how it is distributed they may also draw conclusions whether a particular participant in a swarm could have had access to it. +* deniability is fairly weak, if someone learns the shared secrets or has knowledge how they are distributed they may also draw conclusions whether a particular participant in a swarm could have had access to it. + + +UI concerns +=========== + +This section is advisory. + +Shared secrets are handled by many parties, therefore the system is as weak as the weakest human. 
Thus making intentional, correct handling of secrets simple and convenient while making unintentional disclosure hard is an important aspect of keeping the system secure. +Information that a client may want to make visible: -Summary of UI concerns -====================== +* encrypted/decrypted status of a torrent +* which keys it knows Torrent creation ---------------- @@ -185,18 +242,20 @@ Torrent creation Key input --------- -* option to use the root key of another decrypted torrent -* immediate feedback whether the key matches the probe and what kind of key it was (root, payload, shadow) +* input choices: manual, magnet link, .torrent-keys file, reusing key from another torrent +* immediate feedback whether keys match the mac and what kind of key was imported (root, payload, shadow) +* option to decrypt data or leave it encrypted + * offer directory layout choices that would normally be offered when a torrent is imported -Torrent/Magnet/Key export -------------------------- +Magnet/Key export +----------------- Provide option to -* not include key -* include shadow key only (if there is any shadowed metadata) -* include torrent-specific key -* include root key (decrypts X additional torrents known by client, possibly more) +* not include key [default] +* include shadow key only, if there is any shadowed metadata +* include payload key. +* include root key. 
if the client knows that the key has been reused for other torrents it should indicate this to the user Test Vectors From 33a7baa1dc409c0923b5d9fee26fa18cc4afeb03 Mon Sep 17 00:00:00 2001 From: The8472 Date: Sat, 10 Oct 2015 15:30:50 +0200 Subject: [PATCH 03/14] - ChaCha20 instead of AES - don't include infohashes in key file --- html/beps/bep_encrypted_data.rst | 57 ++++++++++++++++++-------------- 1 file changed, 33 insertions(+), 24 deletions(-) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst index 234b15f..c27d22b 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_encrypted_data.rst @@ -42,7 +42,7 @@ Similarly Message Stream Encryption provides limited protection from passive eav Instead of attempting to restrict access to the swarm or metadata this BEP proposes to make all data opaque to 3rd parties by encrypting it with a shared secret that is not available through any torrent-related protocol, i.e. must be obtained separately by the user. -In principle the same properties can be provided by simply storing the data in an encrypted archive and using nondescript a filename, but that requires users to store the data twice or to use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. +In principle the same properties can be provided by simply storing the data in an encrypted archive and using a nondescript filename, but that requires users to store the data twice or to use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. 
Metadata format =============== @@ -58,7 +58,7 @@ Metadata format shadow: ** v: **, }, - length: **, + length *or* files: **, name: **, piece length: ** pieces: ** @@ -70,12 +70,12 @@ Metadata format the random data must be generated by a cryptographically secure RNG to avoid IV reuse. ``v`` - The current version, ``1``. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. + The protocol version used to encrypt the torrent, currently *1*. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. ``shadow`` bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary. If it is absent only the payload is encrypted and all info dictionary entries are non-shadowed. - Implementations should only shadow a whitelist of keys which they have a shadowing strategy. + Implementations should only shadow a whitelist of keys which they have a shadowing strategy and ignore other keys. Shadowable keys suggested by this BEP: ``length, files, name, comment``. ``mac`` @@ -85,7 +85,7 @@ Metadata format the name field is a mandatory part of [BEP 3]_, therefore a placeholder MUST be provided if a shadow name is used. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. ``files``, ``length`` - The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow file information then the public information is canonical. 
+ The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow either key then the public information is canonical. If the ``files`` or ``length`` are shadowed then the overall payload length MUST be consistent with the public version. If a shadow dictionary is present the public information should be treated as decorative / advisory until it can be determined whether it has been shadowed, i.e. until the shadow data can be decrypted. @@ -96,33 +96,37 @@ To protect privacy an implementation should use shadowing for any additional key Encryption ========== -Cryptographic primitives: SHA256, AES256 +Building blocks used: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, PBKDF2[rfc2898]_ - Key.root = random key (recommended length: 256bits) +.. parsed-literal:: + + Key.root = random key, recommended strength: 256bits Key.payload = PBKDF2(HMAC−SHA256, Key.root, salt || "payload", 4096, 256) Key.shadow = sha256(Key.payload || "shadow") - mac = HMAC−SHA256(, Key.shadow) + mac = HMAC−SHA256(info-dict without mac, Key.shadow) IV.payload = truncate_64(sha256(salt || "payload")) IV.shadow = truncate_64(sha256(salt || "shadow")) -AES256-CTR is used for encryption, with the respective IVs occupying the first 64bits of the nonce and the AES-block counter occupying the lower 64 bits. +PBKDF2 key derivation is used in case root keys with less entropy than the recommended are used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. + +ChaCha20 is used to both encrypt the shadow dictionary and the torrent payload. -The optional ``shadow`` dictionary is encrypted with ``Key.shadow`` and ``IV.shadow``. +The optional ``shadow`` dictionary is encrypted after bencoding with ``Key.shadow`` and ``IV.shadow``. 
-The ``mac`` is calculated over the bencoded info-dictionary excluding the ``mac`` key value pair. +The ``mac`` is calculated over the bencoded info-dictionary including the ``bepXX`` dictionary but excluding the ``mac`` key value pair. -Before calculating the ``pieces`` hashes all files are concated in ``files`` order (if there is more than one) and encrypted with ``Key.payload`` and ``IV.payload``. +Before calculating the ``pieces`` hashes all files are concatenated in ``files`` order (if there is more than one) and encrypted with ``Key.payload`` and ``IV.payload``. Encryption/Decryption of the payload happens at a lower layer than the ``pieces`` hash calculation. I.e. ``files -(concat)-> pieces`` has been replaced with ``files -(concat)-> encryption -> pieces``. An implementation unaware of this BEP would simply store the ciphertext to the disk in a ``length``-sized file with the public name. -This scheme only provides authentication for the ciphertext through the ``pieces`` hashes. An incorrect key could result in garbage plaintext, but this does not introduce a new problem since bittorrent never guaranteed that the files contain what the metadata claims. +This scheme only provides integrity verification for the ciphertext through the ``pieces`` hashes, i.e. correct decryption is not verified. An incorrect key could result in garbage plaintext, but this does not introduce a new problem since bittorrent never guaranteed that the files contain what the metadata claims. Key reuse and hierarchy ----------------------- @@ -167,19 +171,22 @@ To export keys to a file, e.g. for archival purposes or for bulk torrent migrati { torrent-keys: { - *<20 bytes infohash>*: { + **: { root: **, payload: **, shadow: ** }, - ... + ... }, } -``.torrent-keys`` should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user. 
+*torrent identifier* + A unique, use-specific identifier calculated from the torrent's mac via ``SHA256(mac || ".torrent-keys")``. This allows a torrent client to locate keys for a metadata file while preventing reverse lookups for those who do not have access to the metadata. -A file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. +``.torrent-keys`` should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user. + +A key file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. Keys must be included in their raw, unencoded form. @@ -190,7 +197,7 @@ This BEP does not mandate how an implementation should store encrypted or decryp However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following: -* clients can be in 3 different knowledge states: no keys, shadow key only, keys that decrypt plaintext; two encryption states: encrypted, decrypted; 3 file layout 3 states: encrypted, multi-file, single-fil +* clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that decrypt plaintext; two encryption states: encrypted, decrypted; 3 file layout 3 states: encrypted, multi-file, single-file * a user may start downloading a torrent before he has access to the keys. this requires a way to input keys and to convert between encrypted and decrypted storage * to reduce the amount of data that a compromised system could reveal a seeder may want to import plaintext data, convert it to encrypted form and request that the client discards the keys. 
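The *torrent identifier* introduced in this hunk is a single hash over the torrent's ``mac``. A minimal sketch, assuming the ``mac`` is the 32-byte HMAC output and that the ``".torrent-keys"`` suffix is appended as plain ASCII bytes (the draft does not state the encoding); the function name ``torrent_identifier`` is hypothetical:

```python
import hashlib


def torrent_identifier(mac: bytes) -> bytes:
    """Return SHA256(mac || ".torrent-keys") as raw bytes.

    Assumption: the suffix is ASCII-encoded; the draft only gives the formula.
    """
    if len(mac) != 32:
        raise ValueError("expected a 32-byte mac")
    return hashlib.sha256(mac + b".torrent-keys").digest()


if __name__ == "__main__":
    # Placeholder mac for illustration only, not a real torrent's mac.
    example_mac = bytes(32)
    print(torrent_identifier(example_mac).hex())
```

Because the identifier is derived from the ``mac`` rather than the infohash, someone holding only a key file cannot map its entries back to a swarm without also having the metadata.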
@@ -227,24 +234,26 @@ Shared secrets are handled by many parties, therefore the system is as weak as t Information that a client may want to make visible: * encrypted/decrypted status of a torrent -* which keys it knows +* which keys it knows (+ option to discard if storage is encrypted) Torrent creation ---------------- 1. user selects whether he wants to use encryption at all -2. if yes then offer - * to generate a random key. user may instead opt to reuse a key from another torrent - * to provide a meaningful public name distinct from the shadow name - * to only encrypt the payload and not shadow any metadata +2. if yes then offer to + + * generate a random key. user may instead opt to reuse a key from another torrent + * provide a meaningful public name distinct from the shadow name + * only encrypt the payload and not shadow any metadata Key input --------- -* input choices: manual, magnet link, .torrent-keys file, reusing key from another torrent +* input choices: manual, magnet link, ``.torrent-keys`` file, reusing key from another torrent * immediate feedback whether keys match the mac and what kind of key was imported (root, payload, shadow) * option to decrypt data or leave it encrypted + * offer directory layout choices that would normally be offered when a torrent is imported Magnet/Key export From c44fcd06c9ee3df5e900111b4a8d2df55991bad5 Mon Sep 17 00:00:00 2001 From: The8472 Date: Mon, 8 Aug 2016 20:21:54 +0200 Subject: [PATCH 04/14] incorporate feedback --- html/beps/bep_encrypted_data.rst | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst index c27d22b..7b78b09 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_encrypted_data.rst @@ -30,7 +30,7 @@ and non-goals: Rationale ========= -BitTorrent swarms are mostly an open system well-suited for mass-distribution of data to the public. 
+In general BitTorrent swarms are an open system well-suited for mass-distribution of data to the public. Some use-cases require that the data is only distributed to a closed, trusted group of peers. In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt[robots]_ that it should not be announced to the world by web crawlers. @@ -73,16 +73,16 @@ Metadata format The protocol version used to encrypt the torrent, currently *1*. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. ``shadow`` - bencoded-then-encrypted dictionary of key-value pairs that shadow entries in the info dictionary. - If it is absent only the payload is encrypted and all info dictionary entries are non-shadowed. - Implementations should only shadow a whitelist of keys which they have a shadowing strategy and ignore other keys. - Shadowable keys suggested by this BEP: ``length, files, name, comment``. + bencoded-then-encrypted dictionary whose key-value pairs shadow entries in the info dictionary. + If it is absent only the payload is encrypted and no info dictionary entries are shadowed. + Implementations should only shadow a whitelist of keys for which they have a shadowing strategy and ignore other keys. + Shadowable keys suggested by this BEP: ``length``, ``files``, ``name``, ``comment``. ``mac`` message authentication code covering the info dictionary ``name`` - the name field is a mandatory part of [BEP 3]_, therefore a placeholder MUST be provided if a shadow name is used. 
An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. + the name field is a mandatory part of [BEP 3]_. If a shadow name is used then a placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. ``files``, ``length`` The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow either key then the public information is canonical. @@ -96,7 +96,9 @@ To protect privacy an implementation should use shadowing for any additional key Encryption ========== -Building blocks used: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, PBKDF2[rfc2898]_ +Building blocks used in version 1: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, PBKDF2[rfc2898]_ + +``||`` is the concat operator .. parsed-literal:: @@ -106,7 +108,7 @@ Building blocks used: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, P Key.shadow = sha256(Key.payload || "shadow") - mac = HMAC−SHA256(info-dict without mac, Key.shadow) + mac = HMAC−SHA256(info-dict with mac placeholder, Key.shadow) IV.payload = truncate_64(sha256(salt || "payload")) @@ -118,13 +120,13 @@ ChaCha20 is used to both encrypt the shadow dictionary and the torrent payload. The optional ``shadow`` dictionary is encrypted after bencoding with ``Key.shadow`` and ``IV.shadow``. -The ``mac`` is calculated over the bencoded info-dictionary including the ``bepXX`` dictionary but excluding the ``mac`` key value pair. - -Before calculating the ``pieces`` hashes all files are concatenated in ``files`` order (if there is more than one) and encrypted with ``Key.payload`` and ``IV.payload``. 
+The ``mac`` is calculated over the bencoded info-dictionary with an 32 zero bytes as placeholder for the ``mac`` value itself. If other extensions perform similar hashing over intermediate representations of the metadata the order in which they are applied needs to be specified. -Encryption/Decryption of the payload happens at a lower layer than the ``pieces`` hash calculation. I.e. ``files -(concat)-> pieces`` has been replaced with ``files -(concat)-> encryption -> pieces``. +The encryption is applied when file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the encrypted data using ``Key.payload`` and ``IV.payload``. +The key stream of the cipher applied according ot the position of the data in the piece space. I.e. any padding, holes or alignment of piece data also affects which part of the key stream is used. +This BEP only covers pieces representing file entries. Should future extensions put other data into the piece address space the interaction with this BEP will need to be defined. -An implementation unaware of this BEP would simply store the ciphertext to the disk in a ``length``-sized file with the public name. +An implementation unaware of this BEP will simply store the ciphertext to the disk in a ``length``-sized file with the public name. This scheme only provides integrity verification for the ciphertext through the ``pieces`` hashes, i.e. correct decryption is not verified. An incorrect key could result in garbage plaintext, but this does not introduce a new problem since bittorrent never guaranteed that the files contain what the metadata claims. 
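The placeholder-based ``mac`` calculation from this hunk can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: the tiny ``bencode`` helper, the ``compute_mac`` and ``verify_mac`` names and the ``ext_key`` parameter are hypothetical, and the extension dictionary is assumed to still be called ``bepXX`` as in this revision.

```python
import hashlib
import hmac

MAC_PLACEHOLDER = bytes(32)  # 32 zero bytes stand in for the mac value


def bencode(value) -> bytes:
    """Minimal bencoder covering ints, byte strings, text, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def compute_mac(info: dict, shadow_key: bytes, ext_key: str = "bepXX") -> bytes:
    """HMAC-SHA256 over the bencoded info dict with the mac zeroed out."""
    ext = dict(info[ext_key])
    ext["mac"] = MAC_PLACEHOLDER
    staged = dict(info)
    staged[ext_key] = ext
    return hmac.new(shadow_key, bencode(staged), hashlib.sha256).digest()


def verify_mac(info: dict, shadow_key: bytes, ext_key: str = "bepXX") -> bool:
    """Constant-time comparison of the stored mac against a recomputed one."""
    return hmac.compare_digest(info[ext_key]["mac"],
                               compute_mac(info, shadow_key, ext_key))
```

Substituting a fixed-size placeholder keeps the length of the bencoded dictionary the same before and after the real ``mac`` is written, which is presumably why it replaced the earlier rule of leaving the key out.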
@@ -198,8 +200,8 @@ This BEP does not mandate how an implementation should store encrypted or decryp However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following: * clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that decrypt plaintext; two encryption states: encrypted, decrypted; 3 file layout 3 states: encrypted, multi-file, single-file -* a user may start downloading a torrent before he has access to the keys. this requires a way to input keys and to convert between encrypted and decrypted storage -* to reduce the amount of data that a compromised system could reveal a seeder may want to import plaintext data, convert it to encrypted form and request that the client discards the keys. +* a user may start downloading a torrent before keys are available. this requires a way to input keys and to convert between encrypted and decrypted storage +* for performance or security reasons a seeder may want to import plaintext data, encrypt it and then discard the keys to directly seed the encrypted data from disk. Since encrypted torrents may contain confidential / private data implementations may also want to set more restrictive file permissions when decrypting data to reduce exposure in multi-user environments. 
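For reference, the key hierarchy as it stands at the end of this patch (the PBKDF2 variant; a later patch in the series switches to scrypt) can be sketched with the standard library alone. The assumptions here are that the trailing ``256`` in the PBKDF2 formula is the output length in bits, that ``truncate_64`` keeps the first 8 bytes of the digest, and that the string labels are ASCII. ``derive_keys`` and ``DerivedKeys`` are hypothetical names.

```python
import hashlib
from typing import NamedTuple


class DerivedKeys(NamedTuple):
    payload_key: bytes  # 32 bytes
    shadow_key: bytes   # 32 bytes
    payload_iv: bytes   # 8 bytes
    shadow_iv: bytes    # 8 bytes


def derive_keys(root_key: bytes, salt: bytes) -> DerivedKeys:
    """Derive the lower keys and IVs from Key.root and the per-torrent salt."""
    # Key.payload = PBKDF2(HMAC-SHA256, Key.root, salt || "payload", 4096, 256)
    payload_key = hashlib.pbkdf2_hmac(
        "sha256", root_key, salt + b"payload", 4096, dklen=32)
    # Key.shadow = sha256(Key.payload || "shadow")
    shadow_key = hashlib.sha256(payload_key + b"shadow").digest()
    # IV.* = truncate_64(sha256(salt || label)); first 8 bytes is an assumption
    payload_iv = hashlib.sha256(salt + b"payload").digest()[:8]
    shadow_iv = hashlib.sha256(salt + b"shadow").digest()[:8]
    return DerivedKeys(payload_key, shadow_key, payload_iv, shadow_iv)
```

The derivation only runs downwards: the shadow key follows from the payload key, and the payload key from the root key, so exporting a lower key never reveals a higher one.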
From bd7e81ca31eaf890a2f18f80361310263df6374b Mon Sep 17 00:00:00 2001 From: The8472 Date: Sat, 27 Aug 2016 20:09:22 +0200 Subject: [PATCH 05/14] reduce complexity - mandate single-file layout for public data - mandatory shadow dictionary --- html/beps/bep_encrypted_data.rst | 66 +++++++++++++++++++++++--------- 1 file changed, 47 insertions(+), 19 deletions(-) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst index 7b78b09..f9edd30 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_encrypted_data.rst @@ -55,10 +55,10 @@ Metadata format bepXX: { mac: *<32bytes of hmac output (string)>*, salt: *<32bytes of random binary data (string)>*, - shadow: ** + shadow: **, v: **, }, - length *or* files: **, + length: **, name: **, piece length: ** pieces: ** @@ -74,23 +74,51 @@ Metadata format ``shadow`` bencoded-then-encrypted dictionary whose key-value pairs shadow entries in the info dictionary. - If it is absent only the payload is encrypted and no info dictionary entries are shadowed. - Implementations should only shadow a whitelist of keys for which they have a shadowing strategy and ignore other keys. - Shadowable keys suggested by this BEP: ``length``, ``files``, ``name``, ``comment``. ``mac`` message authentication code covering the info dictionary ``name`` - the name field is a mandatory part of [BEP 3]_. If a shadow name is used then a placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. + the name field is a mandatory part of [BEP 3]_. A placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. 
-``files``, ``length`` - The shadow dictionary MAY override the single/multifile nature indicated by the public info dictionary. If it does not shadow either key then the public information is canonical. - If the ``files`` or ``length`` are shadowed then the overall payload length MUST be consistent with the public version. - If a shadow dictionary is present the public information should be treated as decorative / advisory until it can be determined whether it has been shadowed, i.e. until the shadow data can be decrypted. +``length`` + The info dictionary describes the piece space layout in its ciphertext form. Currently there is no need for anything but a contiguous range of pieces, therefore the info dictionary MUST be created in single file mode. + Future revisions of this BEP may change this requirement if non-contiguous ciphertext representations become necessary. + +Shadow Dictionary +----------------- + +While the info dictionary represents the torrent in its ciphertext form the shadow dictionary represents the plaintext. +In general entries in the shadow dictionary have the same semantics as keys in the info dictionary and take precedence over them, +with the restriction that implementations should only shadow a whitelist of keys for which they have a shadowing strategy and ignore other keys. + +At a minimum clients should support shadowing of the following info dictionary keys: ``length``, ``files``, ``name``, ``comment``. +To protect privacy shadowing should also be used for any implementation-specific keys that reveal information about the payload. + + +.. parsed-literal:: + { + comment: **, + length: **, + name: **, + files: **, + ... + } + +``length`` or ``files`` + These fields represent the plaintext file layout in single or multi-file layout. This means that while the ciphertext is represented as a single file the plaintext can have a different layout. + The overall length of the plaintext MUST be consistent with the ciphertext length. 
To obfuscate file sizes BEP 47 padding files can be used. + +Interaction with padding files +------------------------------ -To protect privacy an implementation should use shadowing for any additional keys that reveal information about the payload +Since the public representation is single-file there is no padding in the ciphertext. + +The shadow file layout can contain padding files, which consist of zeroes in the *plaintext*. + +A client that has access to the shadow data should still download the padding data at least up to the next piece boundary (allowing paddings larger than a single piece to be partially skipped) to avoid leaking information about actual file sizes or knowledge of the file metadata. +Similarly clients should avoid prioritizing individual pieces or sequential downloading because they would otherwise reveal their knowledge of the file layout. Encryption @@ -114,16 +142,16 @@ Building blocks used in version 1: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC IV.shadow = truncate_64(sha256(salt || "shadow")) -PBKDF2 key derivation is used in case root keys with less entropy than the recommended are used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. +PBKDF2 key derivation is used in case root keys with less entropy than recommended is used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. ChaCha20 is used to both encrypt the shadow dictionary and the torrent payload. The optional ``shadow`` dictionary is encrypted after bencoding with ``Key.shadow`` and ``IV.shadow``. -The ``mac`` is calculated over the bencoded info-dictionary with an 32 zero bytes as placeholder for the ``mac`` value itself. 
If other extensions perform similar hashing over intermediate representations of the metadata the order in which they are applied needs to be specified. +The ``mac`` is calculated over the bencoded info-dictionary with 32 zero bytes as placeholder for the ``mac`` value itself. If other extensions perform similar hashing operations over incomplete representations of the metadata the order in which they are applied needs to be specified. -The encryption is applied when file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the encrypted data using ``Key.payload`` and ``IV.payload``. -The key stream of the cipher applied according ot the position of the data in the piece space. I.e. any padding, holes or alignment of piece data also affects which part of the key stream is used. +The encryption is applied while file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the ciphertext using ``Key.payload`` and ``IV.payload``. +The key stream of the cipher applied according to the position of the data in the piece space. I.e. any padding, holes or alignment of piece data also affects which part of the key stream is used. This BEP only covers pieces representing file entries. Should future extensions put other data into the piece address space the interaction with this BEP will need to be defined. An implementation unaware of this BEP will simply store the ciphertext to the disk in a ``length``-sized file with the public name. @@ -133,7 +161,7 @@ This scheme only provides integrity verification for the ciphertext through the Key reuse and hierarchy ----------------------- -The usage of a salt to derive the payload key from the root key allows the root key to be reused across several torrents while still generating distinct payload keys for each. But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse. 
+The salt in the payload key derivation allows the root key to be reused across several torrents while still generating distinct payload keys for each. But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse. An implementation may provide the option to attempt to decrypt a torrent with the same key as another torrent in case a key is only communicated once and individual torrents are later distributed without explicitly providing keys. @@ -199,13 +227,14 @@ This BEP does not mandate how an implementation should store encrypted or decryp However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following: -* clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that decrypt plaintext; two encryption states: encrypted, decrypted; 3 file layout 3 states: encrypted, multi-file, single-file +* clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that can decrypt the payload; two encryption states: encrypted, decrypted; 3 file layout states: encrypted, multi-file, single-file * a user may start downloading a torrent before keys are available. this requires a way to input keys and to convert between encrypted and decrypted storage * for performance or security reasons a seeder may want to import plaintext data, encrypt it and then discard the keys to directly seed the encrypted data from disk. Since encrypted torrents may contain confidential / private data implementations may also want to set more restrictive file permissions when decrypting data to reduce exposure in multi-user environments. + Security Properties =================== @@ -246,7 +275,6 @@ Torrent creation * generate a random key. 
user may instead opt to reuse a key from another torrent * provide a meaningful public name distinct from the shadow name - * only encrypt the payload and not shadow any metadata Key input @@ -264,7 +292,7 @@ Magnet/Key export Provide option to * not include key [default] -* include shadow key only, if there is any shadowed metadata +* include shadow key. * include payload key. * include root key. if the client knows that the key has been reused for other torrents it should indicate this to the user From 71a2e410af25986874558dff90cb6f3158944159 Mon Sep 17 00:00:00 2001 From: The8472 Date: Sun, 1 Jan 2017 21:59:41 +0100 Subject: [PATCH 06/14] finish TODOs, update text to match prototype --- html/beps/bep_encrypted_data.rst | 146 +++++++++++++++++++++++++------ 1 file changed, 119 insertions(+), 27 deletions(-) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_encrypted_data.rst index f9edd30..6190fa1 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_encrypted_data.rst @@ -22,7 +22,7 @@ and non-goals: * forward-secrecy * anonymity -* signature-based authentication, already covered by [BEP 35] +* signature-based authentication, already covered by BEP 35 [#BEP-35]_ * authentication of peer connections @@ -36,7 +36,7 @@ Some use-cases require that the data is only distributed to a closed, trusted gr In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt[robots]_ that it should not be announced to the world by web crawlers. -While the private flag [BEP-27]_ may be sufficient in a controlled environment to prevent information about the torrent (e.g. its infohash) from escaping and thus preventing others from connecting to the swarm this is a very brittle form of security which also prevents the use of public infrastructure such as open trackers, PEX or the DHT. 
+While the private flag [#BEP-27]_ may be sufficient in a controlled environment to prevent information about the torrent (e.g. its infohash) from escaping and thus preventing others from connecting to the swarm this is a very brittle form of security which also prevents the use of public infrastructure such as open trackers, PEX or the DHT. Similarly Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping. @@ -52,7 +52,7 @@ Metadata format { info: { - bepXX: { + encrypted: { mac: *<32bytes of hmac output (string)>*, salt: *<32bytes of random binary data (string)>*, shadow: **, @@ -61,13 +61,13 @@ Metadata format length: **, name: **, piece length: ** - pieces: ** + pieces: ** }, } ``salt`` - the random data must be generated by a cryptographically secure RNG to avoid IV reuse. + unique to each torrent. it must be generated by a cryptographically secure RNG. ``v`` The protocol version used to encrypt the torrent, currently *1*. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. @@ -79,7 +79,7 @@ Metadata format message authentication code covering the info dictionary ``name`` - the name field is a mandatory part of [BEP 3]_. A placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. + the name field is a mandatory part of BEP 3 [#BEP-3]_. A placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. 
``length`` The info dictionary describes the piece space layout in its ciphertext form. Currently there is no need for anything but a contiguous range of pieces, therefore the info dictionary MUST be created in single file mode. @@ -95,6 +95,8 @@ with the restriction that implementations should only shadow a whitelist of keys At a minimum clients should support shadowing of the following info dictionary keys: ``length``, ``files``, ``name``, ``comment``. To protect privacy shadowing should also be used for any implementation-specific keys that reveal information about the payload. +Additionally clients should embed BEP 47 [#BEP-47]_ ``sha1`` values of the plaintext files into the shadow dictionary to simplify deduplication, which would otherwise have to attempt encrypting candidate files before checking them against the piece hashes which represent the ciphertext. + .. parsed-literal:: @@ -108,14 +110,14 @@ To protect privacy shadowing should also be used for any implementation-specific ``length`` or ``files`` These fields represent the plaintext file layout in single or multi-file layout. This means that while the ciphertext is represented as a single file the plaintext can have a different layout. - The overall length of the plaintext MUST be consistent with the ciphertext length. To obfuscate file sizes BEP 47 padding files can be used. + The plaintext length may be shorter than the ciphertext so that the ciphertext length can be rounded up to an integer multiple of the piece length to obfuscate file sizes. The plaintext is zero-padded in that case. BEP 47 [#BEP-47]_ can also be used for this purpose in multi-file mode, but since there currently is no way to pad in single-file mode this discrepancy is allowed. -Interaction with padding files ------------------------------- +Interaction with paddings +------------------------- Since the public representation is single-file there is no padding in the ciphertext. 
-The shadow file layout can contain padding files, which consist of zeroes in the *plaintext*. +The shadow file layout can contain padding files or implicit padding due to the length discrepancy. Those paddings consist of zeroes in the *plaintext*. A client that has access to the shadow data should still download the padding data at least up to the next piece boundary (allowing paddings larger than a single piece to be partially skipped) to avoid leaking information about actual file sizes or knowledge of the file metadata. Similarly clients should avoid prioritizing individual pieces or sequential downloading because they would otherwise reveal their knowledge of the file layout. @@ -124,34 +126,36 @@ Similarly clients should avoid prioritizing individual pieces or sequential down Encryption ========== -Building blocks used in version 1: SHA2-256[rfc6234]_, ChaCha20[rfc7539]_, HMAC[rfc2104]_, PBKDF2[rfc2898]_ +Building blocks used in version 1: SHA2-256 [#rfc6234]_, ChaCha20 [#chacha]_, HMAC [#rfc2104]_, scrypt [#rfc7914]_ ``||`` is the concat operator .. parsed-literal:: - Key.root = random key, recommended strength: 256bits + root_key = reusable key or password from which other values are derived. recommended strength: 256bits - Key.payload = PBKDF2(HMAC−SHA256, Key.root, salt || "payload", 4096, 256) + payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (salt || "payload")) - Key.shadow = sha256(Key.payload || "shadow") + shadow_key = sha256(payload_key || "shadow") - mac = HMAC−SHA256(info-dict with mac placeholder, Key.shadow) + mac = HMAC−SHA256(message: info-dict with mac placeholder, key: shadow_key) - IV.payload = truncate_64(sha256(salt || "payload")) + payload_nonce = sha256(salt || "payload")[0..8] - IV.shadow = truncate_64(sha256(salt || "shadow")) + shadow_nonce = sha256(salt || "shadow")[0..8] + +``salt``, ``payload_key``, ``shadow_key`` and ``mac`` are 32 bytes each. The nonces are 8 bytes each. 
``root_key`` does not have a fixed size. -PBKDF2 key derivation is used in case root keys with less entropy than recommended is used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. +scrypt key derivation is used in case root keys with less entropy than recommended are used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. -ChaCha20 is used to both encrypt the shadow dictionary and the torrent payload. +ChaCha20 with a 64bit nonce, 64bit internal block counter and 256bit key is used to encrypt both the shadow dictionary and the torrent payload. A longer nonce is not needed since a new payload key is already derived for each torrent and using the alternative 96bit nonce/32bit block counter version would also limit the payload size to 256GiB. -The optional ``shadow`` dictionary is encrypted after bencoding with ``Key.shadow`` and ``IV.shadow``. +The ``shadow`` dictionary is encrypted after bencoding with ``shadow_key`` and ``shadow_nonce``. The ``mac`` is calculated over the bencoded info-dictionary with 32 zero bytes as placeholder for the ``mac`` value itself. If other extensions perform similar hashing operations over incomplete representations of the metadata the order in which they are applied needs to be specified. -The encryption is applied while file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the ciphertext using ``Key.payload`` and ``IV.payload``. -The key stream of the cipher applied according to the position of the data in the piece space. I.e. any padding, holes or alignment of piece data also affects which part of the key stream is used.
+The encryption is applied while file data is loaded into the piece address space, which means the ``pieces`` hashes are calculated over the ciphertext produced with ``payload_key`` and ``payload_nonce``. +The key stream of the cipher is applied according to the absolute offset of the data in the piece space. I.e. any padding, holes or alignment in the plaintext is included in the key stream seek position. This BEP only covers pieces representing file entries. Should future extensions put other data into the piece address space the interaction with this BEP will need to be defined. An implementation unaware of this BEP will simply store the ciphertext to the disk in a ``length``-sized file with the public name. @@ -216,7 +220,7 @@ To export keys to a file, e.g. for archival purposes or for bulk torrent migrati ``.torrent-keys`` should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user. -A key file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. Keys must be included in their raw, unencoded form. +A key file can contain keys for multiple torrents. Only one key needs to be included per torrent, as the lower keys can be derived. Keys must be included in their binary form. @@ -297,17 +301,105 @@ Provide option to * include root key. if the client knows that the key has been reused for other torrents it should indicate this to the user -Test Vectors -============ +Test Data +========= + +The test data was built as follows: + +1. filled ./foo/a with 18 * 16KiB of the character ``a`` +2. filled ./foo/b with 2 * 16KiB of the character ``b`` +3. chose root key: 4b6cc4770ff57005d597a8f01e83679d2f2b2ce86490ab5cf10e71f4ef7533e2 +4.
generated from file structure shadow dictionary:: + + { + "name":"foo", + "files":[ + { + "sha1":0x5B63C06D350BB4BE82F00B170B822A7BF3F5B190, + "path":["a"], + "length":294912 + }, { + "sha1":0x5B94E57E8BC842A56BB6BD628F3309A6D9092421, + "path":["b"], + "length":32768 + }, { + "length":229376, + "attr":"p" + } + ] + } + + + +5. generated torrent from previous values, salt and public name:: + + { + "info":{ + "pieces":0x28EC203A4435B4DAC7582B598E5A1F11F6060B7202FA5D9E68E003006697AEABF48386B6D86AA9A9, + "encrypted":{ + "salt":0x1053F898E1917EAB461616F895BC2F50ADFFE48F7F4C92AD547E6849B7D27DF7, + "shadow":0x1508C10EF6558D344A645299E4B2C3A3FDC5FAF799FF71E9045E2617C887773D05E6458E0A14D5CC953374F7AA3C944023CD5D87C3B12E3F316FB32A1890FBE37B0F1482217B3E8B77E339D2003A12ADA04940E7BDBFA029EE652450BA512C45DFC7EC3A331FAF661D80AABE08281F2685675B5302FEBA8DE99B7453CEDE920E36C863F4860F0901FA4E99DF7840489B6C97F813F6E9FE97B2B8B19116C15367C3F1EA77, + "v":1, + "mac":0x4CC30CEB6458CFE7598AFDA186C955E75142E726C51AD14FE4760B1F09FAB9E5 + }, + "length":557056, + "name":"Public Name", + "piece length":278528 + } + } + + + +Additional intermediate values: + +payload nonce + 381d28f55eb87e2e5ed2f64234f5953f3b1ebf7adc3efc85cede251fc30e4c6b + +shadow nonce + 3824dc7d0e71dd38903f1107c8eff226ee562bd05c4a0aa068f6be07d523aaef + +torrent key + afaf3eb80291b13546814af8cacf0ae5150b5505e6c0633954bf9daa17363a83 + +metainfo key + 237b2116dc9397a053ff17811d260f02368bc0a704e558d671c33bd015e15f5f + +sha1sum of the bencoded shadow dictionary + e36ffe11188a878147ca72bf9e70b40067451333 + +sha1sum of the bencoded torrent + 2c68945f1bcca4bd071d2f5ec0626180d7a6471f + +infohash + 2668b09faba9888dc0c182cb0527b3aa2f31c4f7 -## TODO References ========== -## TODO +.. [#BEP-3] BEP_0003. The BitTorrent Protocol Specification + (http://bittorrent.org/beps/bep_0003.html) + +.. [#BEP-27] BEP_0027. Private Torrents + (http://bittorrent.org/beps/bep_0027.html) + +.. [#BEP-35] BEP_0035. 
Torrent Signing + (http://bittorrent.org/beps/bep_0035.html) + +.. [#BEP-47] BEP_0047. Padding files and extended file attributes + (http://bittorrent.org/beps/bep_0047.html) + +.. [#chacha] ChaCha20 by Daniel J. Bernstein + (https://cr.yp.to/chacha.html) + +.. [#RFC-2119] RFC-2119. http://www.ietf.org/rfc/rfc2119.txt + +.. [#rfc6234] RFC 6234. http://www.ietf.org/rfc/rfc6234.txt + +.. [#rfc2104] RFC 2104. http://www.ietf.org/rfc/rfc2104.txt +.. [#rfc7914] RFC 7914. http://www.ietf.org/rfc/rfc7914.txt Copyright ========= From fc8d0cb2c20093809827124ae272ec1abdbc878f Mon Sep 17 00:00:00 2001 From: The8472 Date: Sun, 1 Jan 2017 22:15:13 +0100 Subject: [PATCH 07/14] assign number, add TOC --- html/beps/{bep_encrypted_data.rst => bep_0052.rst} | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) rename html/beps/{bep_encrypted_data.rst => bep_0052.rst} (99%) diff --git a/html/beps/bep_encrypted_data.rst b/html/beps/bep_0052.rst similarity index 99% rename from html/beps/bep_encrypted_data.rst rename to html/beps/bep_0052.rst index 6190fa1..880e6da 100644 --- a/html/beps/bep_encrypted_data.rst +++ b/html/beps/bep_0052.rst @@ -1,4 +1,4 @@ -:BEP: ?? +:BEP: 52 :Title: Encrypted Torrent Payload :Version: $Revision$ :Last-Modified: $Date$ @@ -24,6 +24,9 @@ and non-goals: * anonymity * signature-based authentication, already covered by BEP 35 [#BEP-35]_ * authentication of peer connections + + +.. contents:: From 30e808196bd6acf02d5ee270a35e422f6b276996 Mon Sep 17 00:00:00 2001 From: The8472 Date: Sun, 1 Jan 2017 22:27:37 +0100 Subject: [PATCH 08/14] remove unused reference --- html/beps/bep_0052.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/html/beps/bep_0052.rst b/html/beps/bep_0052.rst index 880e6da..71a7812 100644 --- a/html/beps/bep_0052.rst +++ b/html/beps/bep_0052.rst @@ -396,8 +396,6 @@ References .. [#chacha] ChaCha20 by Daniel J. Bernstein (https://cr.yp.to/chacha.html) -.. [#RFC-2119] RFC-2119. http://www.ietf.org/rfc/rfc2119.txt - ..
[#rfc6234] RFC 6234. http://www.ietf.org/rfc/rfc6234.txt .. [#rfc2104] RFC 2104. http://www.ietf.org/rfc/rfc2104.txt From eb8487e285b1e26cd4e0b7e317de163ba89f6233 Mon Sep 17 00:00:00 2001 From: The8472 Date: Sat, 4 Feb 2017 12:14:57 +0100 Subject: [PATCH 09/14] base64 representation, passwords, simplify mac --- html/beps/bep_0052.rst | 131 +++++++++++++++++++++++++++-------------- 1 file changed, 87 insertions(+), 44 deletions(-) diff --git a/html/beps/bep_0052.rst b/html/beps/bep_0052.rst index 71a7812..2f25d4f 100644 --- a/html/beps/bep_0052.rst +++ b/html/beps/bep_0052.rst @@ -36,7 +36,7 @@ Rationale In general BitTorrent swarms are an open system well-suited for mass-distribution of data to the public. Some use-cases require that the data is only distributed to a closed, trusted group of peers. -In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt[robots]_ that it should not be announced to the world by web crawlers. +In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt that it should not be announced to the world by web crawlers. While the private flag [#BEP-27]_ may be sufficient in a controlled environment to prevent information about the torrent (e.g. its infohash) from escaping and thus preventing others from connecting to the swarm this is a very brittle form of security which also prevents the use of public infrastructure such as open trackers, PEX or the DHT.
@@ -55,8 +55,8 @@ Metadata format { info: { + enc mac: *<32bytes of hmac output (string)>*, encrypted: { - mac: *<32bytes of hmac output (string)>*, salt: *<32bytes of random binary data (string)>*, shadow: **, v: **, @@ -70,7 +70,7 @@ Metadata format ``salt`` - unique to each torrent. it must be generated by a cryptographically secure RNG. + a new, unique value must be chosen whenever the encrypted contents of a torrent are created or modified. It must be generated by a cryptographically secure RNG. Reuse, e.g. in a decrypt-modify-encrypt operation, would compromise the data. ``v`` The protocol version used to encrypt the torrent, currently *1*. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. @@ -78,8 +78,8 @@ Metadata format ``shadow`` bencoded-then-encrypted dictionary whose key-value pairs shadow entries in the info dictionary. -``mac`` - message authentication code covering the info dictionary +``enc mac`` + message authentication code covering parts of the info dictionary ``name`` the name field is a mandatory part of BEP 3 [#BEP-3]_. A placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. @@ -135,27 +135,27 @@ Building blocks used in version 1: SHA2-256 [#rfc6234]_, ChaCha20 [#chacha]_, H .. parsed-literal:: - root_key = reusable key or password from which other values are derived. recommended strength: 256bits + byte[] root_key = reusable key or password from which other values are derived. 
recommended strength: 256bits - payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (salt || "payload")) + byte[32] payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (salt || "payload")) - shadow_key = sha256(payload_key || "shadow") + byte[32] shadow_key = sha256(payload_key || "shadow") - mac = HMAC−SHA256(message: info-dict with mac placeholder, key: shadow_key) + byte[32] mac = HMAC−SHA256(message: bencode(info["length"]) || bencode(merkle_root) || bencode(info["encrypted"]), key: shadow_key) - payload_nonce = sha256(salt || "payload")[0..8] + byte[8] payload_nonce = sha256(salt || "payload")[0..8] - shadow_nonce = sha256(salt || "shadow")[0..8] + byte[8] shadow_nonce = sha256(salt || "shadow")[0..8] ``salt``, ``payload_key``, ``shadow_key`` and ``mac`` are 32 bytes each. The nonces are 8 bytes each. ``root_key`` does not have a fixed size. -scrypt key derivation is used in case root keys with less entropy than recommended are used, e.g. for password-based schemes. But for general use this BEP assumes that the root key consists of random binary data and hence mandates hexadecimal encoding when the keys need to be displayed in a human-readable format. +scrypt key derivation is used in case root keys with less entropy than recommended are used, e.g. for password-based schemes. Note that arbitrary binary data is allowed for keys, so this proposal distinguishes between a key and human-readable passwords. More on that below. ChaCha20 with a 64bit nonce, 64bit internal block counter and 256bit key is used to both encrypt the shadow dictionary and the torrent payload. A longer nonce is not needed since a new payload key is already derived for each torrent and using the alternative 96bit nonce/32bit block counter version would also limit the payload size to 256TiB. -The ``shadow`` dictionary is encrypted after bencoding with ``shadow_key`` and ``shadow_nonce``. 
+The ``shadow`` dictionary is bencoded and then encrypted with ``shadow_key`` and ``shadow_nonce``. -The ``mac`` is calculated over the bencoded info-dictionary with 32 zero bytes as placeholder for the ``mac`` value itself. If other extensions perform similar hashing operations over incomplete representations of the metadata the order in which they are applied needs to be specified. +The ``enc mac`` is calculated as HMAC over the concatenated bencoded representations of the ``length`` value, merkle root [#BEP-30]_ and ``encrypted`` value. The merkle root can be derived from the ``pieces`` array. This allows a torrent to be converted between merkle and flat pieces layout without access to the keys. The encryption is applied while file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the ciphertext using ``payload_key`` and ``payload_nonce``. The key stream of the cipher applied according to the absolute offset of the data in the piece space. I.e. any padding, holes or alignment in the plaintext is included in the key stream seek position. @@ -181,7 +181,11 @@ The mac can also be used to determine to which level of the hierarchy a key belo Key sharing =========== -Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in unstructured ways. The hex-encoded form should be used for this purpose. +Implementations SHOULD provide a way to view and input the different keys for a torrent so users can share them in unstructured ways. 
To allow for both arbitrary binary data - which is necessary for intermediate keys - and human-readable passphrases two encodings are necessary: + +a) url-safe base64 encoding +b) a valid unicode string where the utf8-representation is used as root key + Encouraging users to share keys without bundling them with torrents or magnets in a structured way allows them to exchange them over separate channels and also makes it slightly more difficult to crawl the internet for unintentionally disclosed keys. @@ -193,12 +197,17 @@ Keys MUST NOT be included in .torrent files in any form. Too much infrastructure Magnets ------- -Clients should only include a key if the user explicitly requests it or if the secret part has been sufficiently highlighted to make him aware of what type of secret he is sharing. +While directly including the secrets in a magnet is **discouraged** - they should be conveyed separately - this proposal nevertheless specifies a format to ensure that keys can be transmitted unambiguously when it cannot be avoided. -To include a key in magnet links the parameter ``&key=`` can be added where the key is in hex-encoded form. +To include a key in magnet links the parameter ``&key=`` can be added where the key is in the url-safe base64-encoded form, minus padding to avoid percent-escaping the ``=`` padding. The importing client can determine which type of key it is based on the ``mac`` in the metadata. +If the root key can be utf8-decoded to a valid unicode string it can also be passed as ``&pw=``. Since user agents may process magnet URIs into Internationalized Resource Identifiers (IRIs) for increased readability clients should be prepared to handle IRI input. + + + + Key files --------- @@ -207,19 +216,23 @@ To export keys to a file, e.g. for archival purposes or for bulk torrent migrati .. parsed-literal:: { - torrent-keys: { - **: { - root: **, - payload: **, - shadow: ** + torrent-keys: [ + { + "key": ** + "hints": [ + **, + ... + ] }, ... 
- }, + ] } +Each dictionary in the ``torrent-keys`` list represents one key and optional implementation-defined fields associated with that key. + +*torrent hint* + An identifier calculated from a torrent's mac via ``SHA256(mac || ".torrent-keys")[0..8]``. This allows a torrent client to locate keys for a metadata file without having to attempt key-derivation. -*torrent identifier* - A unique, use-specific identifier calculated from the torrent's mac via ``SHA256(mac || ".torrent-keys")``. This allows a torrent client to locate keys for a metadata file while preventing reverse lookups for those who do not have access to the metadata. ``.torrent-keys`` should be used as file extension. By default filesystem permissions should be set appropriately to restrict access to key files to the current user. @@ -227,6 +240,8 @@ A key file can contain keys for multiple torrents. Only one key needs to be incl + + Storage layer ============= @@ -250,10 +265,10 @@ The goal is to provide security equivalent to publicly distributing an encrypted In particular that means: * swarms remain open, anyone can participate in a swarm, with or without access to the secrets -* an observer without access to the secrets does not know what data is being shared +* an observer without access to the secrets can not confirm that any published metadata does indeed match the torrent * correctness of the metadata cannot be confirmed without access to both secrets -* observing that someone participated in a swarm and uploaded data is no longer equivalent to knowing that they had access to the plaintext or knowledge of the metadata -* the ciphertext is accessible to the public. this may be desirable to provide upload bandwidth without knowledge of the content, e.g. to allow untrusted servers to distribute confidential data to trusted clients or to enable hosting without the need to proactively moderate user content. 
+* observing that someone participated in a swarm and uploaded data is no longer equivalent to knowing that they had access to the plaintext or knowledge of the metadata. +* the ciphertext is accessible to the public. this may be desirable to provide upload bandwidth without knowledge of the content, e.g. to allow untrusted servers to distribute confidential data to trusted clients, to enable hosting without the need to proactively moderate user content or to operate content-agnostic caches. Limitations: @@ -307,11 +322,11 @@ Provide option to Test Data ========= -The test data was built as follows: +The test data is generated as follows: -1. filled ./foo/a with 18 * 16KiB of the character ``a`` -2. filled ./foo/b with 2 * 16KiB of the character ``b`` -3. chose root key: 4b6cc4770ff57005d597a8f01e83679d2f2b2ce86490ab5cf10e71f4ef7533e2 +1. fill ./foo/a with 18 * 16KiB of the character ``a`` +2. fill ./foo/b with 2 * 16KiB of the character ``b`` +3. use root key: 0x4b6cc4770ff57005d597a8f01e83679d2f2b2ce86490ab5cf10e71f4ef7533e2 4. generated from file structure shadow dictionary:: { @@ -334,47 +349,72 @@ The test data was built as follows: -5. generated torrent from previous values, salt and public name:: +5. 
generated torrent from salt and previous values: { "info":{ "pieces":0x28EC203A4435B4DAC7582B598E5A1F11F6060B7202FA5D9E68E003006697AEABF48386B6D86AA9A9, + "sha1":0x8B5C9069F227DED25CE1CAD65CA0DF29812BECA6, + "enc mac":0xF2614F94AE0408138B5ED75B45B626B3424E65A83BAB0D3468A649E5792631DA, "encrypted":{ "salt":0x1053F898E1917EAB461616F895BC2F50ADFFE48F7F4C92AD547E6849B7D27DF7, "shadow":0x1508C10EF6558D344A645299E4B2C3A3FDC5FAF799FF71E9045E2617C887773D05E6458E0A14D5CC953374F7AA3C944023CD5D87C3B12E3F316FB32A1890FBE37B0F1482217B3E8B77E339D2003A12ADA04940E7BDBFA029EE652450BA512C45DFC7EC3A331FAF661D80AABE08281F2685675B5302FEBA8DE99B7453CEDE920E36C863F4860F0901FA4E99DF7840489B6C97F813F6E9FE97B2B8B19116C15367C3F1EA77, - "v":1, - "mac":0x4CC30CEB6458CFE7598AFDA186C955E75142E726C51AD14FE4760B1F09FAB9E5 + "v":1 }, "length":557056, "name":"Public Name", "piece length":278528 } } + -Additional intermediate values: +Additional intermediate values in hex: payload nonce - 381d28f55eb87e2e5ed2f64234f5953f3b1ebf7adc3efc85cede251fc30e4c6b + 381d28f55eb87e2e shadow nonce - 3824dc7d0e71dd38903f1107c8eff226ee562bd05c4a0aa068f6be07d523aaef + 3824dc7d0e71dd38 -torrent key +torrent key afaf3eb80291b13546814af8cacf0ae5150b5505e6c0633954bf9daa17363a83 metainfo key 237b2116dc9397a053ff17811d260f02368bc0a704e558d671c33bd015e15f5f -sha1sum of the bencoded shadow dictionary +sha1sum of the plaintext bencoded shadow dictionary e36ffe11188a878147ca72bf9e70b40067451333 + -sha1sum of the bencoded torrent - 2c68945f1bcca4bd071d2f5ec0626180d7a6471f +Key, Password and Magnet representations +---------------------------------------- + +The following strings are all part of the same key hierarchy and generated using the following salt ``1db9b1aed1d3ba1d892d9afd52ea6ba158a986e785d3ed7f4203b834f499a922`` + +base64 root key + ``UGFzc3fDuHJ0LeODkeOCueODr-ODvOODiQ`` + +base64 torrent key + ``dEBBM6zLgPd8OCPCEAgtK0F55CZsOiLm_h-neGTgSY8`` + +base64 meta key + ``AY81p-wPMHNSXpI1w_dMjBqETWsUzmrGXfajHjExlfY`` + 
+human-readable root key + ``Passwørt-パスワード`` + +magnet with torrent key + ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&key=dEBBM6zLgPd8OCPCEAgtK0F55CZsOiLm_h-neGTgSY8`` + +magnet uri with human-readable root key + ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&pw=Passw%C3%B8rt-%E3%83%91%E3%82%B9%E3%83%AF%E3%83%BC%E3%83%89`` + +magnet iri with human-readable root key + ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&pw=Passwørt-パスワード`` + -infohash - 2668b09faba9888dc0c182cb0527b3aa2f31c4f7 @@ -387,6 +427,9 @@ References .. [#BEP-27] BEP_0027. Private Torrents (http://bittorrent.org/beps/bep_0027.html) +.. [#BEP-30] BEP_0030. Merkle tree torrent extension + (http://bittorrent.org/beps/bep_0030.html) + .. [#BEP-35] BEP_0035. Torrent Signing (http://bittorrent.org/beps/bep_0035.html) From b91a915edc3a8990622d720ceb5505f4bc755ead Mon Sep 17 00:00:00 2001 From: The8472 Date: Sun, 13 Aug 2017 16:45:14 +0200 Subject: [PATCH 10/14] move to match upstream directory structure --- html/beps/bep_0052.rst => beps/bep_0054.rst | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename html/beps/bep_0052.rst => beps/bep_0054.rst (100%) diff --git a/html/beps/bep_0052.rst b/beps/bep_0054.rst similarity index 100% rename from html/beps/bep_0052.rst rename to beps/bep_0054.rst From eb9d441392c25b0c7730ee7c30a380d88c04883d Mon Sep 17 00:00:00 2001 From: The8472 Date: Sun, 13 Aug 2017 21:32:47 +0200 Subject: [PATCH 11/14] use content-derived IVs, base spec on BEP52 instead of BEPs 3 and 47 --- beps/bep_0054.rst | 283 ++++++++++++++-------------------------------- 1 file changed, 83 insertions(+), 200 deletions(-) diff --git a/beps/bep_0054.rst b/beps/bep_0054.rst index 2f25d4f..3bf296a 100644 --- a/beps/bep_0054.rst +++ b/beps/bep_0054.rst @@ -1,4 +1,4 @@ -:BEP: 52 +:BEP: 54 :Title: Encrypted Torrent Payload :Version: $Revision$ :Last-Modified: $Date$ @@ -45,138 +45,108 @@ Similarly Message Stream Encryption provides limited protection 
from passive eav Instead of attempting to restrict access to the swarm or metadata this BEP proposes to make all data opaque to 3rd parties by encrypting it with a shared secret that is not available through any torrent-related protocol, i.e. must be obtained separately by the user. -In principle the same properties can be provided by simply storing the data in an encrypted archive and using a nondescript filename, but that requires users to store the data twice or to use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. +In principle the same properties can be provided by simply storing the data in an encrypted archive and using a nondescript filename, but that requires users to store the data twice or to use additional filesystem layers to transparently access the data, which is even more cumbersome when encryption is involved. It also prevents bittorrent clients from reusing already-downloaded files in a multi-file torrent. -Metadata format -=============== - - -.. parsed-literal:: - - { - info: { - enc mac: *<32bytes of hmac output (string)>*, - encrypted: { - salt: *<32bytes of random binary data (string)>*, - shadow: **, - v: **, - }, - length: **, - name: **, - piece length: ** - pieces: ** - }, - } +Encryption +========== +Building blocks used in version 1: -``salt`` - a new, unique value must be chosen whenever the encrypted contents of a torrent are created or modified. It must be generated by a cryptographically secure RNG. Reuse, e.g. in a decrypt-modify-encrypt operation, would compromise the data. +H: SHA2-256 [#rfc6234]_, +E: XChaCha20 [#xchacha]_, +M: HMAC using H [#rfc2104]_, +scrypt [#rfc7914]_, +``||`` the concat operator -``v`` - The protocol version used to encrypt the torrent, currently *1*. 
New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. -``shadow`` - bencoded-then-encrypted dictionary whose key-value pairs shadow entries in the info dictionary. +Conceptually an encrypted torrent is created as follows: -``enc mac`` - message authentication code covering parts of the info dictionary -``name`` - the name field is a mandatory part of BEP 3 [#BEP-3]_. A placeholder MUST be provided. An implementation may either generate a random string consisting of filesystem-friendly characters or allow the user to choose a public name that reveals less information than the shadow name. +1. create a BEP 52 [#BEP-52]_ (non-hybrid) torrent representing the plaintext +2. let ``infohash_plain`` be the infohash of the plaintext torrent calculated according to the hashing method appropriate for its ``meta version`` +3. generate a random 32-byte salt +4. ``SIV = H(infohash_plain || salt)`` +5. create a single file ``plaintext_padded`` by padding all non-empty files in the plaintext torrent to a full piece size and then concatenating them in the order specified by the ``file tree`` +6. let ``root_key`` be an arbitrary byte sequence that will be used as base secret to derive additional secrets +7. ``payload_key = scrypt(N: 2^14, r: 8, p: 1, password: root_key, salt: (SIV || "payload key"))`` +8. ``payload_nonce = H(infohash_plain || salt || "payload")[0..24]`` +9. ``payload_encrypted = E(payload_nonce, payload_key, plaintext_padded)`` +10. choose a filename for the ciphertext file and a public name for the encrypted torrent +11. create a BEP 52 (optionally hybrid) torrent for ``payload_encrypted``. its ``meta version`` must match that of the plaintext torrent. its piece size must be the same or smaller than the plaintext torrent's. +12. 
``shadow_nonce = H(SIV || "shadow")[0..24]`` +13. ``shadow_key = H(payload_key || "shadow")`` +14. ``shadow = E(shadow_nonce, shadow_key, salt || bencode(plaintext_torrent["info"]) || padding)`` where padding is a random sequence of 1 to 512 bytes (inclusive). +15. add the following key value pair to the info dictionary of the encrypted torrent: ``"encrypted": {"siv": SIV, "shadow": shadow, "v": 1}`` +16. ``mac = M(key: shadow_key, message: bencode(encrypted_torrent["info"]["file tree"]) || bencode(encrypted_torrent["info"]["encrypted"]))`` +17. add the following key value pair to the info dictionary of the encrypted torrent: ``"enc mac": mac`` -``length`` - The info dictionary describes the piece space layout in its ciphertext form. Currently there is no need for anything but a contiguous range of pieces, therefore the info dictionary MUST be created in single file mode. - Future revisions of this BEP may change this requirement if non-contiguous ciphertext representations become necessary. - -Shadow Dictionary ------------------ +This construction +* obscures the exact size of the plaintext by rounding up to a multiple of the piece size +* obscures the size of the plaintext metadata by adding padding +* uses nonces that are derived from content, making them difficult to misuse +* does not reveal any hashes of the plaintext that could be crosschecked by outside observers without knowledge of the keys +* allows clients unaware of this BEP to still share the data and decrypt it through external tools -While the info dictionary represents the torrent in its ciphertext form the shadow dictionary represents the plaintext. -In general entries in the shadow dictionary have the same semantics as keys in the info dictionary and take precedence over them, -with the restriction that implementations should only shadow a whitelist of keys for which they have a shadowing strategy and ignore other keys. 
- -At a minimum clients should support shadowing of the following info dictionary keys: ``length``, ``files``, ``name``, ``comment``. -To protect privacy shadowing should also be used for any implementation-specific keys that reveal information about the payload. - -Additionally clients should embed BEP 47 [#BEP-47]_ ``sha1`` values of the plaintext files into the shadow dictionary to simplify deduplication, which would otherwise have to attempt encrypting candidate files before checking them against the piece hashes which represent the ciphertext. +The info dictionary of the encrypted torrent will contain the following additional keys .. parsed-literal:: { - comment: **, - length: **, - name: **, - files: **, - ... + info: { + enc mac: *<32bytes of hmac output (string)>*, + encrypted: { + siv: *<32byte IV used for shadow nonce and payload key derivation (string)>*, + shadow: **, + v: **, + }, + ... + }, } -``length`` or ``files`` - These fields represent the plaintext file layout in single or multi-file layout. This means that while the ciphertext is represented as a single file the plaintext can have a different layout. - The plaintext length may be shorter than the ciphertext so that the ciphertext length can be rounded up to an integer multiple of the piece length to obfuscate file sizes. The plaintext is zero-padded in that case. BEP 47 [#BEP-47]_ can also be used for this purpose in multi-file mode, but since there currently is no way to pad in single-file mode this discrepancy is allowed. - -Interaction with paddings -------------------------- - -Since the public representation is single-file there is no padding in the ciphertext. - -The shadow file layout can contain padding files or implicit padding due to the length discrepancy. Those paddings consist of zeroes in the *plaintext*. 
- -A client that has access to the shadow data should still download the padding data at least up to the next piece boundary (allowing paddings larger than a single piece to be partially skipped) to avoid leaking information about actual file sizes or knowledge of the file metadata. -Similarly clients should avoid prioritizing individual pieces or sequential downloading because they would otherwise reveal their knowledge of the file layout. - - -Encryption -========== - -Building blocks used in version 1: SHA2-256 [#rfc6234]_, ChaCha20 [#chacha]_, HMAC [#rfc2104]_, scrypt [#rfc7914]_ - -``||`` is the concat operator - -.. parsed-literal:: - - byte[] root_key = reusable key or password from which other values are derived. recommended strength: 256bits - byte[32] payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (salt || "payload")) - - byte[32] shadow_key = sha256(payload_key || "shadow") - - byte[32] mac = HMAC−SHA256(message: bencode(info["length"]) || bencode(merkle_root) || bencode(info["encrypted"]), key: shadow_key) - - byte[8] payload_nonce = sha256(salt || "payload")[0..8] +``v`` + The version used to encrypt the torrent, currently *1*. New versions may be introduced by updates to this BEP if cryptographic weaknesses necessitate incompatible changes. + Implementations should check if they support the version indicated in the metadata file and otherwise inform the user that they can download the data but not decrypt it. - byte[8] shadow_nonce = sha256(salt || "shadow")[0..8] - -``salt``, ``payload_key``, ``shadow_key`` and ``mac`` are 32 bytes each. The nonces are 8 bytes each. ``root_key`` does not have a fixed size. -scrypt key derivation is used in case root keys with less entropy than recommended are used, e.g. for password-based schemes. Note that arbitrary binary data is allowed for keys, so this proposal distinguishes between a key and human-readable passwords. More on that below. 
+Key reuse and hierarchy +----------------------- -ChaCha20 with a 64bit nonce, 64bit internal block counter and 256bit key is used to both encrypt the shadow dictionary and the torrent payload. A longer nonce is not needed since a new payload key is already derived for each torrent and using the alternative 96bit nonce/32bit block counter version would also limit the payload size to 256TiB. +The SIV in the payload key derivation allows the root key to be reused across several torrents while still generating distinct payload keys for each. But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse. -The ``shadow`` dictionary is bencoded and then encrypted with ``shadow_key`` and ``shadow_nonce``. +An implementation may provide the option to attempt to decrypt a torrent with the same key as another torrent in case a key is only communicated once and individual torrents are later distributed without explicitly providing keys. -The ``enc mac`` is calculated as HMAC over the concatenated bencoded representations of the ``length`` value, merkle root [#BEP-30]_ and ``encrypted`` value. The merkle root can be derived from the ``pieces`` array. This allows a torrent to be converted between merkle and flat pieces layout without access to the keys. +In some circumstances it may make sense to reveal a particular key lower in the hierarchy without revealing an upper key. For example a user may upload a torrent to an indexing site and provide the shadow key so it can extract keywords for fulltext search. -The encryption is applied while file data is loaded into the piece address space. Which means the ``pieces`` hashes are calculated over the ciphertext using ``payload_key`` and ``payload_nonce``. -The key stream of the cipher applied according to the absolute offset of the data in the piece space. I.e. any padding, holes or alignment in the plaintext is included in the key stream seek position. 
-This BEP only covers pieces representing file entries. Should future extensions put other data into the piece address space the interaction with this BEP will need to be defined. +Or a user may want to share a particular torrent without revealing the root key used to protect multiple other torrents, in that case revealing the payload key for that torrent will be sufficient. -An implementation unaware of this BEP will simply store the ciphertext to the disk in a ``length``-sized file with the public name. -This scheme only provides integrity verification for the ciphertext through the ``pieces`` hashes, i.e. correct decryption is not verified. An incorrect key could result in garbage plaintext, but this does not introduce a new problem since bittorrent never guaranteed that the files contain what the metadata claims. +Decryption +========== -Key reuse and hierarchy ------------------------ +1. obtain a shadow, payload or root key +2. extract ``SIV`` and ``mac`` +3. test available key against ``mac`` to determine whether it is a shadow key. If the check fails assume it is a payload key, derive the shadow key and test again. If necessary repeat assuming it is a root key +4. derive shadow nonce, decrypt the shadow value +5. extract salt from decrypted shadow value +6. use a bdecoder that can ignore tail data beyond the end of the root dictionary to extract plaintext torrent info dictionary from the decrypted shadow value +7. calculate ``infohash_plain`` +8. verify ``SIV`` +9. derive ``payload_nonce`` from ``infohash_plain`` and ``salt`` +10. if ``payload_key`` is available decrypt the payload to obtain ``plaintext_padded`` +11. split ``plaintext_padded`` according to file layout information in the plaintext info dictionary -The salt in the payload key derivation allows the root key to be reused across several torrents while still generating distinct payload keys for each. 
But UI design SHOULD encourage random key generation for each new torrent and require explicit user action for key reuse. -An implementation may provide the option to attempt to decrypt a torrent with the same key as another torrent in case a key is only communicated once and individual torrents are later distributed without explicitly providing keys. +Shadow Dictionary +----------------- -In some circumstances it may make sense to reveal a particular key lower in the hierarchy without revealing an upper key. For example a user may upload a torrent to an indexing site and provide the shadow key so it can extract keywords for fulltext search. +If a client has access to at least a shadow key it may want to check consistency, such as the length and number of pieces, between the encrypted representation and the plaintext metadata in the shadow dictionary. +It may also want to display the metadata of the plaintext to the user instead of the encrypted representation. +Since the shadow dictionary also contains merkle roots for each file correct decryption can also be verified at the file granularity level. Transfer of plaintext merkle layers is not supported, but clients can still use deduplication if they other files with identical plaintext. -Or a user may want to share a particular torrent without revealing the root key used to protect multiple other torrents, in that case revealing the payload key for that torrent will be sufficient. +Implementations may be tempted to optimize requests based on shadow dictionary information, e.g. skipping parts that are padding in the plaintext or prioritize downloading of specific files, but this may be inadvisable since it would reveal knowledge of the metadata. 
-The mac can also be used to determine to which level of the hierarchy a key belongs by first assuming it is the shadow key and attempting to verify the info-dictionary against it, then assuming it is the payload key, deriving the shadow key and then attempting to verify it etc.
 Key sharing
 ===========
@@ -240,8 +210,6 @@ A key file can contain keys for multiple torrents. Only one key needs to be incl
-
-
 Storage layer
 =============
@@ -249,7 +217,7 @@ This BEP does not mandate how an implementation should store encrypted or decryp
 However, if a client wants to be more flexible than either ignoring this BEP (thus storing ciphertext on disk) or always requiring the keys before starting a torrent it will have to consider the following:
-* clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that can decrypt the payload; two encryption states: encrypted, decrypted; 3 file layout states: encrypted, multi-file, single-file
+* clients can be in 3 states regarding key knowledge: no keys, shadow key only, keys that can decrypt the payload; two encryption states: encrypted, decrypted
 * a user may start downloading a torrent before keys are available. this requires a way to input keys and to convert between encrypted and decrypted storage
 * for performance or security reasons a seeder may want to import plaintext data, encrypt it and then discard the keys to directly seed the encrypted data from disk.
@@ -318,112 +286,24 @@ Provide option to
 * include payload key.
 * include root key. if the client knows that the key has been reused for other torrents it should indicate this to the user
+When a format including keys is chosen the secret part should be highlighted as such.
+
 Test Data
 =========
-The test data is generated as follows:
-
-1. fill ./foo/a with 18 * 16KiB of the character ``a``
-2. fill ./foo/b with 2 * 16KiB of the character ``b``
-3. use root key: 0x4b6cc4770ff57005d597a8f01e83679d2f2b2ce86490ab5cf10e71f4ef7533e2
-4. generated from file structure shadow dictionary::
-
-  {
-    "name":"foo",
-    "files":[
-      {
-        "sha1":0x5B63C06D350BB4BE82F00B170B822A7BF3F5B190,
-        "path":["a"],
-        "length":294912
-      }, {
-        "sha1":0x5B94E57E8BC842A56BB6BD628F3309A6D9092421,
-        "path":["b"],
-        "length":32768
-      }, {
-        "length":229376,
-        "attr":"p"
-      }
-    ]
-  }
-
-
-
-5. generated torrent from salt and previous values:
-
-  {
-    "info":{
-      "pieces":0x28EC203A4435B4DAC7582B598E5A1F11F6060B7202FA5D9E68E003006697AEABF48386B6D86AA9A9,
-      "sha1":0x8B5C9069F227DED25CE1CAD65CA0DF29812BECA6,
-      "enc mac":0xF2614F94AE0408138B5ED75B45B626B3424E65A83BAB0D3468A649E5792631DA,
-      "encrypted":{
-        "salt":0x1053F898E1917EAB461616F895BC2F50ADFFE48F7F4C92AD547E6849B7D27DF7,
-        "shadow":0x1508C10EF6558D344A645299E4B2C3A3FDC5FAF799FF71E9045E2617C887773D05E6458E0A14D5CC953374F7AA3C944023CD5D87C3B12E3F316FB32A1890FBE37B0F1482217B3E8B77E339D2003A12ADA04940E7BDBFA029EE652450BA512C45DFC7EC3A331FAF661D80AABE08281F2685675B5302FEBA8DE99B7453CEDE920E36C863F4860F0901FA4E99DF7840489B6C97F813F6E9FE97B2B8B19116C15367C3F1EA77,
-        "v":1
-      },
-      "length":557056,
-      "name":"Public Name",
-      "piece length":278528
-    }
-  }
-
-
-
-
-Additional intermediate values in hex:
-
-payload nonce
-  381d28f55eb87e2e
-
-shadow nonce
-  3824dc7d0e71dd38
-
-torrent key
-  afaf3eb80291b13546814af8cacf0ae5150b5505e6c0633954bf9daa17363a83
-
-metainfo key
-  237b2116dc9397a053ff17811d260f02368bc0a704e558d671c33bd015e15f5f
-
-sha1sum of the plaintext bencoded shadow dictionary
-  e36ffe11188a878147ca72bf9e70b40067451333
-
+TODO
 Key, Password and Magnet representations
 ----------------------------------------
-The following strings are all part of the same key hierarchy and generated using the following salt ``1db9b1aed1d3ba1d892d9afd52ea6ba158a986e785d3ed7f4203b834f499a922``
-
-base64 root key
-  ``UGFzc3fDuHJ0LeODkeOCueODr-ODvOODiQ``
-
-base64 torrent key
-  ``dEBBM6zLgPd8OCPCEAgtK0F55CZsOiLm_h-neGTgSY8``
-
-base64 meta key
-  ``AY81p-wPMHNSXpI1w_dMjBqETWsUzmrGXfajHjExlfY``
-human-readable root key
-  ``Passwørt-パスワード``
-
-magnet with torrent key
-  ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&key=dEBBM6zLgPd8OCPCEAgtK0F55CZsOiLm_h-neGTgSY8``
-
-magnet uri with human-readable root key
-  ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&pw=Passw%C3%B8rt-%E3%83%91%E3%82%B9%E3%83%AF%E3%83%BC%E3%83%89``
-
-magnet iri with human-readable root key
-  ``magnet:?xt=urn:btih:da39a3ee5e6b4b0d3255bfef95601890afd80709&pw=Passwørt-パスワード``
-
-
-
+TODO
 References
 ==========
-.. [#BEP-3] BEP_0003. The BitTorrent Protocol Specification
-   (http://bittorrent.org/beps/bep_0003.html)
-
+
 .. [#BEP-27] BEP_0027. Private Torrents
    (http://bittorrent.org/beps/bep_0027.html)
@@ -436,8 +316,11 @@ References
 .. [#BEP-47] BEP_0047. Padding files and extended file attributes
    (http://bittorrent.org/beps/bep_0047.html)
-.. [#chacha] ChaCha20 by Daniel J. Bernstein
-   (https://cr.yp.to/chacha.html)
+.. [#BEP-52] BEP_0052. The BitTorrent Protocol Specification v2
+   (http://bittorrent.org/beps/bep_0052.html)
+
+.. [#xchacha] XChaCha20 in libsodium
+   (https://download.libsodium.org/doc/advanced/xchacha20.html)
 .. [#rfc6234] RFC 6234. http://www.ietf.org/rfc/rfc2119.txt

From 317a2343396d7bce10f62a074c21f1e2cbaa9dd7 Mon Sep 17 00:00:00 2001
From: The8472
Date: Sun, 13 Aug 2017 21:49:12 +0200
Subject: [PATCH 12/14] formatting; remove unused references
---
 beps/bep_0054.rst | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/beps/bep_0054.rst b/beps/bep_0054.rst
index 3bf296a..0cc73eb 100644
--- a/beps/bep_0054.rst
+++ b/beps/bep_0054.rst
@@ -68,16 +68,22 @@ Conceptually an encrypted torrent is created as follows:
 4. ``SIV = H(infohash_plain || salt)``
 5. create a single file ``plaintext_padded`` by padding all non-empty files in the plaintext torrent to a full piece size and then concatenating them in the order specified by the ``file tree``
 6. let ``root_key`` be an arbitrary byte sequence that will be used as base secret to derive additional secrets
-7. ``payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (SIV || "payload key"))``
-8. ``payload_nonce = H(infohash_plain || salt || "payload")[0..24]
+7. .. parsed-literal::
+
+      payload_key = scrypt(N: 2\ :sup:`14`\ , r: 8, p: 1, password: root_key, salt: (SIV || "payload key"))
+
+8. ``payload_nonce = H(infohash_plain || salt || "payload")[0..24]``
 9. ``payload_encrypted = E(payload_nonce, payload_key, plaintext_padded)``
 10. choose a filename for the ciphertext file and a public name for the encrypted torrent
 11. create a BEP 52 (optionally hybrid) torrent for ``payload_encrypted``. its ``meta version`` must match that of the plaintext torrent. its piece size must be the same or smaller than the plaintext torrent's.
-12. ``shadow_nonce = H(SIV || "shadow")[0..24]
-13. ``shadow_key = H(payload_key || "shadow")
+12. ``shadow_nonce = H(SIV || "shadow")[0..24]``
+13. ``shadow_key = H(payload_key || "shadow")``
 14. ``shadow = E(shadow_nonce, shadow_key, salt || bencode(plaintext_torrent["info"]) || padding)`` where padding is a random sequence of 1 to 512 bytes (inclusive).
-15. add the following key value pair to the info dictionary of the encrypted torrent: ``"encrypted": {"siv": SIV, "shadow": shadow, "v": 1}``
-16. ``mac = M(key: shadow_key, message: bencode(encrypted_torrent["info"]["file tree"]) || bencode(encrypted_torrent["info"]["encrypted"]))
+15. add the following key value pair to the info dictionary of the encrypted torrent:
+
+    ``"encrypted": {"siv": SIV, "shadow": shadow, "v": 1}``
+
+16. ``mac = M(key: shadow_key, message: bencode(encrypted_torrent["info"]["file tree"]) || bencode(encrypted_torrent["info"]["encrypted"]))``
 17. add the following key value pair to the info dictionary of the encrypted torrent: ``"enc mac": mac``
 This construction
@@ -134,7 +140,7 @@ Decryption
 7. calculate ``infohash_plain``
 8. verify ``SIV``
 9. derive ``payload_nonce`` from ``infohash_plain`` and ``salt``
-10. if ``payload_key`` is available decrypt ``plaintext_padded`
+10. if ``payload_key`` is available decrypt ``plaintext_padded``
 11. split `plaintext_padded`` according to file layout information in the plaintext info dictionary
@@ -307,15 +313,9 @@ References
 .. [#BEP-27] BEP_0027. Private Torrents
    (http://bittorrent.org/beps/bep_0027.html)
-.. [#BEP-30] BEP_0030. Merkle tree torrent extension
-   (http://bittorrent.org/beps/bep_0030.html)
-
 .. [#BEP-35] BEP_0035. Torrent Signing
    (http://bittorrent.org/beps/bep_0035.html)
-.. [#BEP-47] BEP_0047. Padding files and extended file attributes
-   (http://bittorrent.org/beps/bep_0047.html)
-
 .. [#BEP-52] BEP_0052. The BitTorrent Protocol Specification v2
    (http://bittorrent.org/beps/bep_0052.html)

From c682500baf1260d3bb13468466485c7151dfc1fa Mon Sep 17 00:00:00 2001
From: The8472
Date: Sat, 26 Aug 2017 19:57:31 +0200
Subject: [PATCH 13/14] formatting, relax piece size restrictions
---
 beps/bep_0054.rst | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/beps/bep_0054.rst b/beps/bep_0054.rst
index 0cc73eb..6f15cf5 100644
--- a/beps/bep_0054.rst
+++ b/beps/bep_0054.rst
@@ -36,11 +36,11 @@ Rationale
 In general BitTorrent swarms are an open system well-suited for mass-distribution of data to the public.
 Some use-cases require that the data is only distributed to a closed, trusted group of peers.
-In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt that it should not be announced to the world by web crawlers.
+In other cases the content may be meant for open distribution within a community without intent of announcing the content to the whole world. This is analogous to web content that is open to human visitors but requests via robots.txt that it should not be announced to the world by web crawlers.
 While the private flag [#BEP-27]_ may be sufficient in a controlled environment to prevent information about the torrent (e.g. its infohash) from escaping and thus preventing others from connecting to the swarm this is a very brittle form of security which also prevents the use of public infrastructure such as open trackers, PEX or the DHT.
-Similarly Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping.
+Similarly Message Stream Encryption provides limited protection from passive eavesdroppers on the network layer but does not prevent the infohash from escaping.
 Instead of attempting to restrict access to the swarm or metadata this BEP proposes to make all data opaque to 3rd parties by encrypting it with a shared secret that is not available through any torrent-related protocol, i.e. must be obtained separately by the user.
@@ -59,14 +59,14 @@ scrypt [#rfc7914]_,
 ``||`` the concat operator
-Conceptually an encrypted torrent is created as follows:
+An encrypted torrent is created as follows:
 1. create a BEP 52 [#BEP-52]_ (non-hybrid) torrent representing the plaintext
 2. let ``infohash_plain`` be the infohash of the plaintext torrent calculated according to the hashing method appropriate for its ``meta version``
 3. generate a random 32byte salt
 4. ``SIV = H(infohash_plain || salt)``
-5. create a single file ``plaintext_padded`` by padding all non-empty files in the plaintext torrent to a full piece size and then concatenating them in the order specified by the ``file tree``
+5. create a single file ``plaintext_padded`` by zero-padding all non-empty files in the plaintext torrent to a full piece size and then concatenating them in the order specified by the ``file tree``
 6. let ``root_key`` be an arbitrary byte sequence that will be used as base secret to derive additional secrets
 7. .. parsed-literal::
@@ -75,7 +75,8 @@ An encrypted torrent is created as follows:
 8. ``payload_nonce = H(infohash_plain || salt || "payload")[0..24]``
 9. ``payload_encrypted = E(payload_nonce, payload_key, plaintext_padded)``
 10. choose a filename for the ciphertext file and a public name for the encrypted torrent
-11. create a BEP 52 (optionally hybrid) torrent for ``payload_encrypted``. its ``meta version`` must match that of the plaintext torrent. its piece size must be the same or smaller than the plaintext torrent's.
+11. create a BEP 52 (optionally hybrid) torrent for ``payload_encrypted``.
+    its ``meta version`` must match that of the plaintext torrent.
 12. ``shadow_nonce = H(SIV || "shadow")[0..24]``
 13. ``shadow_key = H(payload_key || "shadow")``
 14. ``shadow = E(shadow_nonce, shadow_key, salt || bencode(plaintext_torrent["info"]) || padding)`` where padding is a random sequence of 1 to 512 bytes (inclusive).
@@ -87,11 +88,13 @@ An encrypted torrent is created as follows:
 17. add the following key value pair to the info dictionary of the encrypted torrent: ``"enc mac": mac``
 This construction
+
 * obscures the exact size of the plaintext by rounding to the nearest piece size
 * obscures the size of the plaintext metadata by adding padding
 * uses nonces that are derived from content, making them difficult to misuse
 * does not reveal any hashes of the plaintext that could be crosschecked by outside observers without knowledge of the keys
 * allows clients unaware of this BEP to still share the data and decrypt it through external tools
+* maintains a 1:1 mapping between ciphertext and plaintext offsets in the piece address space, which makes it trivial to apply the encryption at the I/O layer
 The info dictionary of the encrypted torrent will contain the following additional keys
@@ -137,11 +140,13 @@ Decryption
 4. derive shadow nonce, decrypt the shadow value
 5. extract salt from decrypted shadow value
 6. use a bdecoder that can ignore tail data beyond the end of the root dictionary to extract plaintext torrent info dictionary from the decrypted shadow value
-7. calculate ``infohash_plain``
-8. verify ``SIV``
-9. derive ``payload_nonce`` from ``infohash_plain`` and ``salt``
-10. if ``payload_key`` is available decrypt ``plaintext_padded``
-11. split `plaintext_padded`` according to file layout information in the plaintext info dictionary
+7. validate that ``meta version`` matches and that the ciphertext is at least as long as the padded plaintext length
+8. calculate ``infohash_plain``
+9. verify ``SIV``
+10. derive ``payload_nonce`` from ``infohash_plain`` and ``salt``
+11. if ``payload_key`` is available decrypt ``plaintext_padded``
+12. split ``plaintext_padded`` according to file layout information in the plaintext info dictionary
+13. verify plaintext files based on plaintext ``pieces root`` hashes
 Shadow Dictionary
@@ -149,9 +154,14 @@ Shadow Dictionary
 If a client has access to at least a shadow key it may want to check consistency, such as the length and number of pieces, between the encrypted representation and the plaintext metadata in the shadow dictionary.
 It may also want to display the metadata of the plaintext to the user instead of the encrypted representation.
-Since the shadow dictionary also contains merkle roots for each file correct decryption can also be verified at the file granularity level. Transfer of plaintext merkle layers is not supported, but clients can still use deduplication if they other files with identical plaintext.
+Since the shadow dictionary also contains merkle roots for each file correct decryption can also be verified at the file granularity level.
+Transfer of plaintext merkle layers is not supported, but clients can still use deduplication if they other files with identical plaintext. Note that deduplication may leak information.
+
+Implementations may be tempted to optimize requests based on shadow dictionary information, e.g. skipping parts that are padding in the plaintext or prioritize downloading of specific files, especially when there is significant padding overhead.
+But such optimizations reveal knowledge of the plain text layout to some participants in the swarm and thus pose a performance-security tradeoff.
-Implementations may be tempted to optimize requests based on shadow dictionary information, e.g. skipping parts that are padding in the plaintext or prioritize downloading of specific files, but this may be inadvisable since it would reveal knowledge of the metadata.
+Note that the shadow dictionary can be turned into a full-fledged torrent and implementations may do so to reuse existing machinery to process them. But this could leak information if the client were for example to perform DHT lookups for the plaintext torrent.
+So as a precaution they may want to treat it *as if* it were a private torrent until the need to actually connect the plaintext torrent to the network arises.
 Key sharing

From 944cd59f83379540edd05f0a91507b0006ccd121 Mon Sep 17 00:00:00 2001
From: The8472
Date: Mon, 28 Aug 2017 00:42:37 +0200
Subject: [PATCH 14/14] pad shadow to a power of two
---
 beps/bep_0054.rst | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/beps/bep_0054.rst b/beps/bep_0054.rst
index 6f15cf5..6fbe527 100644
--- a/beps/bep_0054.rst
+++ b/beps/bep_0054.rst
@@ -79,7 +79,7 @@ An encrypted torrent is created as follows:
     its ``meta version`` must match that of the plaintext torrent.
 12. ``shadow_nonce = H(SIV || "shadow")[0..24]``
 13. ``shadow_key = H(payload_key || "shadow")``
-14. ``shadow = E(shadow_nonce, shadow_key, salt || bencode(plaintext_torrent["info"]) || padding)`` where padding is a random sequence of 1 to 512 bytes (inclusive).
+14. ``shadow = E(shadow_nonce, shadow_key, salt || bencode(plaintext_torrent["info"]) || padding)`` where padding is a sequence of zero or more random bytes chosen so that the length of ``shadow`` is a power of two.
 15. add the following key value pair to the info dictionary of the encrypted torrent:
     ``"encrypted": {"siv": SIV, "shadow": shadow, "v": 1}``
@@ -136,17 +136,18 @@ Decryption
 1. obtain a shadow, payload or root key
 2. extract ``SIV`` and ``mac``
-3. test available key against ``mac`` to determine whether it is a shadow key. If the check fails assume it is a payload key and derive the shadow key and test again. If necessary repeat again assuming it is a root key
-4. derive shadow nonce, decrypt the shadow value
-5. extract salt from decrypted shadow value
-6. use a bdecoder that can ignore tail data beyond the end of the root dictionary to extract plaintext torrent info dictionary from the decrypted shadow value
-7. validate that ``meta version`` matches and that the ciphertext is at least as long as the padded plaintext length
-8. calculate ``infohash_plain``
-9. verify ``SIV``
-10. derive ``payload_nonce`` from ``infohash_plain`` and ``salt``
-11. if ``payload_key`` is available decrypt ``plaintext_padded``
-12. split ``plaintext_padded`` according to file layout information in the plaintext info dictionary
-13. verify plaintext files based on plaintext ``pieces root`` hashes
+3. verify that ``shadow`` length is a power of two
+4. test available key against ``mac`` to determine whether it is a shadow key. If the check fails assume it is a payload key and derive the shadow key and test again. If necessary repeat again assuming it is a root key
+5. derive shadow nonce, decrypt the shadow value
+6. extract salt from decrypted shadow value
+7. extract the plaintext info dictionary in the decrypted shadow value between the salt and the padding, this requires a bdecoder that can ignore additional bytes after the root value
+8. validate that ``meta version`` matches and that the ciphertext is at least as long as the padded plaintext length
+9. calculate ``infohash_plain``
+10. verify ``SIV``
+11. derive ``payload_nonce`` from ``infohash_plain`` and ``salt``
+12. if ``payload_key`` is available decrypt ``plaintext_padded``
+13. split ``plaintext_padded`` according to file layout information in the plaintext info dictionary
+14. verify plaintext files based on plaintext ``pieces root`` hashes
 Shadow Dictionary