Confidentiality layer for IPFS decentralized storage
CIPFS is an encryption layer on top of the existing IPFS. A key is generated for each upload (derived from the file by default, or random with --random-key) and appended to the name used to address files. For the sender and receiver, the UX is the same, except that the file reference is 79 characters long instead of 46 characters. However, the nodes and surveilling entities can no longer read any of the files they are holding.
CIPFS syntax mimics IPFS - just add a 'c' in front.
Run: cipfs add <plaintext filename>
(displays the CIPFS ID at the end)
Run: cipfs get <CIPFS ID>
(writes file to current directory)
Run: $ ./cipfs.sh add CIPFS_logo.png
Output:
*********************************
Confidential IPFS, v0.1
@isthmus (github.com/mitchellpkt)
*********************************
... generated key
... encrypted test.txt
... uploading to IPFS
19.92 KiB / 19.92 KiB [===============================================] 100.00%
*********************************
Retrieve file with this CIPFS ticket:
CQmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77k0OnHY9fOEzVw15bERThqoKu6zJxgolc
*********************************
Run: $ ./cipfs.sh get CQmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77k0OnHY9fOEzVw15bERThqoKu6zJxgolc
Output:
PROGRESS:
19.92 KiB / 19.92 KiB [============================================] 100.00% 0s
... downloaded from IPFS
... decrypted file
*********************************
Retrieved file name:
CQmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77
Retrieved file SHA-256 sum:
79228e7f85692032b5c49285c95bc4af7e487e526d982a1dc66ddd8cb188e1a0
*********************************
The current implementation of IPFS offers no privacy by default - files are uploaded in plaintext, and accessible to strangers' nodes and anybody with the hash.
Let || represent string concatenation, and H(...) represent IPFS's hash function.
We have a (plaintext) message or file P that we would like to store on IPFS. IPFS uses a global namespace with hash-based content-addressing, so we upload P and access it later by querying H(P).
There are several downsides:
- IPFS nodes can read P.
- Anybody with H(P) can retrieve P.
- Anybody with a file P can calculate H(P) and test whether the file was uploaded to IPFS previously.
^^ Note that these are largely features & design decisions rather than bugs. There are always privacy / convenience / scaling tradeoffs, and CIPFS provides privacy (i.e. it solves the 3 issues listed above, the third only with --random-key) at the expense of a longer ID string and, with --random-key, the loss of file deduplication.
Suppose Arlene wishes to share plaintext document P with Boris using IPFS. The CIPFS client performs this procedure:
- Generate random number R from a pseudorandom number generator (if using the --random-key flag).
- Symmetric encrypt P with R to generate ciphertext X.
- Upload X to IPFS (which will index it by H(X)).
- Display CIPFS_ticket, which is C || H(X) || R.
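The final step of the upload procedure is plain string concatenation. Here is a minimal sketch, assuming the 46-character IPFS hash and 32-character key seen in the example ticket earlier. `make_ticket` is a hypothetical helper name for illustration, not a function taken from cipfs.sh.

```shell
#!/bin/sh
# Hypothetical helper: build a CIPFS ticket as C || H(X) || R.
# Assumption: a 46-character IPFS hash and a 32-character key,
# matching the lengths seen in the example ticket.
make_ticket() {
    hash=$1
    key=$2
    if [ "${#hash}" -ne 46 ] || [ "${#key}" -ne 32 ]; then
        echo "make_ticket: expected 46-char hash and 32-char key" >&2
        return 1
    fi
    # Literal prefix 'C', then the pointer, then the key.
    printf 'C%s%s\n' "$hash" "$key"
}

make_ticket "QmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77" \
            "k0OnHY9fOEzVw15bERThqoKu6zJxgolc"
```

The lengths add up to the 79 characters mentioned at the top: 1 (prefix) + 46 (hash) + 32 (key).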
Note: Default behavior is deterministic key generation (i.e. symmetric encrypt with the first 32 chars of the file's SHA-256 sum). This means that identical files ==> identical ciphertexts ==> no IPFS bloat. If your threat model is sensitive to IPFS storage hypothesis testing, then add the --random-key flag to generate a unique one-time key.
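The two key modes can be sketched as follows. This is an illustration under assumptions, not the code in cipfs.sh: `make_key` is a hypothetical name, the deterministic key is assumed to come from the SHA-256 sum of the plaintext file, and the random key is drawn as 32 hex characters from /dev/urandom (the key in the example ticket is mixed-case alphanumeric, so the real generator evidently differs).

```shell
#!/bin/sh
# Hypothetical sketch of the two key modes described in the note.
# Usage: make_key <file> [--random-key]
make_key() {
    file=$1
    mode=${2:-}
    if [ "$mode" = "--random-key" ]; then
        # One-time key: 16 random bytes rendered as 32 hex characters.
        head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
        echo
    else
        # Deterministic key (assumption: taken from the plaintext file):
        # first 32 hex characters of its SHA-256 sum.
        sha256sum "$file" | cut -c1-32
    fi
}

# Print a key when invoked with arguments, e.g.: sh make_key.sh notes.txt
if [ "$#" -gt 0 ]; then
    make_key "$@"
fi
```

With the deterministic mode, two people adding the same file get the same key and therefore the same ciphertext. With --random-key, every call yields a fresh key, so the ciphertexts differ.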
Now, Arlene gives the CIPFS_ticket string containing the pointer and the key to Boris. Boris's CIPFS client performs this procedure:
- Break CIPFS_ticket into H(X) and R.
- Query IPFS for H(X) and download X.
- Symmetric decrypt X using R to produce P.
- Display/save P.
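The first step of the download procedure, breaking the ticket apart, can be sketched with fixed character offsets. This assumes the layout of the example ticket (1 prefix character, 46-character hash, 32-character key). The helper names `check_ticket`, `ticket_hash` and `ticket_key` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: split a CIPFS ticket (C || H(X) || R) into its parts.
# Assumption: fixed layout of 1 prefix char + 46-char hash + 32-char key.

# Succeeds only if the ticket starts with 'C' and is 79 characters long.
check_ticket() {
    case $1 in
        C*) [ "${#1}" -eq 79 ] ;;
        *)  return 1 ;;
    esac
}

ticket_hash() { printf '%s\n' "$1" | cut -c2-47; }   # H(X): characters 2..47
ticket_key()  { printf '%s\n' "$1" | cut -c48-79; }  # R: characters 48..79

ticket="CQmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77k0OnHY9fOEzVw15bERThqoKu6zJxgolc"
if check_ticket "$ticket"; then
    echo "hash: $(ticket_hash "$ticket")"
    echo "key:  $(ticket_key "$ticket")"
fi
```

The hash is what gets queried on IPFS, and the key is what the symmetric decryption step consumes.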
A few nice characteristics emerge:
- No read access for infrastructure: Only Arlene and Boris have R, so the IPFS nodes cannot decrypt X to read P.
- If using --random-key, then an attacker with P cannot perform hypothesis testing about file existence on IPFS, since without R they cannot generate X or H(X).
- Similarly, censorship and surveillance resistance arise since R introduces ciphertext unlinkability. Even if I am an attacker with P and total surveillance over IPFS, I cannot prove which encrypted files & transmissions are related.
- Arlene can share a linked commit by posting H(X) to prove that X exists on IPFS. She can later reveal R to unlock P.
- Arlene can also share an unlinked commit by posting H(C||H(X)||R), which is also H(CIPFS_ticket). Others cannot verify whether or not X exists on IPFS until Arlene reveals CIPFS_ticket.
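The unlinked commit is a standard commit-and-reveal pattern, sketched below. This is an illustration only: SHA-256 via sha256sum stands in for H (IPFS's own hash encoding differs), and `commit_ticket` and `verify_reveal` are hypothetical names.

```shell
#!/bin/sh
# Sketch of the unlinked commit: publish H(CIPFS_ticket) now, reveal the ticket later.
# Assumption: plain SHA-256 stands in for H; IPFS encodes its hashes differently.

# Print the commitment for a ticket.
commit_ticket() { printf '%s' "$1" | sha256sum | cut -d' ' -f1; }

# Usage: verify_reveal <commitment> <revealed ticket>
# Succeeds if the revealed ticket hashes to the earlier commitment.
verify_reveal() {
    [ "$(commit_ticket "$2")" = "$1" ]
}

ticket="CQmZw7LBEQn2HkKXR4NjwMsZSF7c6ThRTLyKSjEzWpsjY77k0OnHY9fOEzVw15bERThqoKu6zJxgolc"
commitment=$(commit_ticket "$ticket")   # Arlene posts this first

if verify_reveal "$commitment" "$ticket"; then
    echo "reveal matches commitment"
fi
```

Until the ticket is revealed, the commitment exposes neither H(X) nor R, so observers cannot look up X on IPFS.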