
Request that secures file uploads #862

Merged
garrettr merged 5 commits into develop from request-that-secures-file-uploads on Feb 6, 2015

Conversation

garrettr
Contributor

@garrettr garrettr commented Feb 5, 2015

Background

Initially, this PR was intended to revert the temporary revert of request_that_secures_file_uploads.py in f42765e. That commit demonstrated using Flask's request wrappers to replace the stream class used for requests (including POSTs, which may contain submission plaintext) with a custom stream class.

Flask's default approach is to buffer requests in memory (with io.BytesIO) if they are small (less than 512KB), and otherwise buffer them to disk (with tempfile.TemporaryFile). This does not work well for us because large requests are usually POSTs containing submission plaintext, and we want to prevent plaintext from ever being written to disk due to the threat of forensic recovery.
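
This spill-to-disk policy closely resembles the standard library's tempfile.SpooledTemporaryFile, which keeps data in memory until it crosses a size threshold and then rolls over to a real file on disk. A minimal sketch of the idea (the 512KB threshold matches the one described above; `_rolled` is a CPython implementation detail, used here only to make the rollover visible):

```python
import tempfile

SPOOL_THRESHOLD = 512 * 1024  # 512KB, matching the threshold described above

def make_request_buffer():
    # Data stays in an in-memory buffer until it exceeds max_size,
    # then it is transparently rolled over to a temporary file on disk.
    return tempfile.SpooledTemporaryFile(max_size=SPOOL_THRESHOLD)

small = make_request_buffer()
small.write(b"x" * 1024)                   # well under the threshold
in_memory = not small._rolled              # True: never touched disk

large = make_request_buffer()
large.write(b"x" * (SPOOL_THRESHOLD + 1))  # crosses the threshold
spilled = large._rolled                    # True: now backed by a file on disk
```

Werkzeug's actual stream factory differs in its details, but the memory-then-disk rollover is the same idea.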

Original approach

The original approach in f42765e was to always buffer into memory, regardless of the upload size. Since SecureDrop servers have swap disabled, this prevents plaintext from being written to disk. This commit was temporarily reverted while I was investigating one of the issues from the audit (which had to do with forensic recovery of plaintext from memory).

Drawbacks

My original plan was simply to revert the temporary revert of the original commit, and return to using memory for buffering. However, I was somewhat concerned about the ability of this approach to scale for large submissions. Some testing confirmed that it does not: it would fail with MemoryError and other memory-related errors on sufficiently large submissions (100-200MB seemed to do it on a development VM with 1GB of memory allocated).

We actually needed 2x the submission size in RAM (!!) for each submission because:

  1. The initial submission needed to be fully buffered in RAM

  2. We store all file submissions in zipfiles (see store.save_file_submission) to:

    1. compress them if possible
    2. retain their original filename in a manner that is encrypted yet user-friendly to decrypt and use on Tails.

    Since we didn't want to write that zipfile (containing plaintext) to disk either, we also allocated it in memory. In the worst case, the submission was already compressed and so we would need to allocate memory equal to the full size of the submission to store the corresponding zipfile.

Our current maximum file size is 500MB (and we'd actually like to increase that to 1GB or even more). You can see how this all-in-memory technique implies:

  1. SecureDrop servers would need lots of RAM just to operate normally.
  2. They would be very easy to DoS: a couple of large concurrent uploads could trigger OOM and fail the uploads.

Other possibilities

Ideally, we would like to use Flask's original approach, including buffering large submissions to disk, which usually has much greater capacity than memory. However, we want to prevent a forensic attacker from recovering the plaintext of submissions. Once a submission has been received and encrypted, it would ideally be impossible for an attacker with access to the computer (but not the key used to encrypt the submission) to recover the plaintext - even if they have forensic disk analysis capabilities.

One possible solution that I considered (and actually implemented, but is not the final solution presented here) was to use a forensic disk erasure tool (e.g. shred) to erase the plaintext submission after it's been encrypted. The problem is that these tools are of uncertain efficacy, especially when used on (increasingly widespread) SSDs. I decided this uncertainty was unacceptable for such an important security property.

Final solution

The final solution implemented in this pull request is, at a high level:

  1. As in Flask, small requests (< 512KB) are buffered in memory.
  2. When a larger submission is received, SecureDrop:
    1. Generates a random, ephemeral AES key that is only stored in memory.
    2. Creates a temporary file on disk.
    3. Streams the submission onto disk, encrypting it with the ephemeral key using AES-CTR mode.
    4. When the submission needs to be read, it is similarly decrypted on the fly.
    5. When the stream class (SecureTemporaryFile) is closed:
      1. The file on disk is deleted (normally, no need for shred)
      2. The AES key is only stored in memory (on the stream class), and will eventually be GC'ed and overwritten. Unfortunately we can't precisely control this behavior, at least at the moment. We might be able to use this approach to zero-fill the key, but we would need to do some forensic analysis to make sure that really works as expected.
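
The steps above can be sketched with the standard library alone. To be clear, this is not SecureDrop's actual SecureTemporaryFile: a hash-based counter-mode keystream stands in for AES-CTR so the sketch has no dependencies, and the class name and methods are illustrative only.

```python
import hashlib
import os
import tempfile

class SecureTemporaryFileSketch:
    """Illustrative only: a random ephemeral key lives in memory, and every
    byte is XORed with a counter-mode keystream before touching disk.
    SHA-256 in counter mode stands in for AES-CTR to keep this stdlib-only."""

    BLOCK = 32  # bytes of keystream per counter value (one SHA-256 digest)

    def __init__(self):
        self.key = os.urandom(32)             # ephemeral, memory-only
        self.file = tempfile.TemporaryFile()  # unlinked as soon as it is created

    def _keystream(self, offset, n):
        # Keystream block i = H(key || i), so the keystream for any byte
        # range can be regenerated independently for on-the-fly decryption.
        out = b""
        for i in range(offset // self.BLOCK, (offset + n - 1) // self.BLOCK + 1):
            out += hashlib.sha256(self.key + i.to_bytes(16, "big")).digest()
        start = offset % self.BLOCK
        return out[start:start + n]

    def _xor(self, data, offset):
        return bytes(a ^ b for a, b in zip(data, self._keystream(offset, len(data))))

    def write(self, data):
        # Encrypt with the ephemeral key *before* the bytes reach disk.
        self.file.write(self._xor(data, self.file.tell()))

    def seek(self, pos):
        self.file.seek(pos)

    def read(self, n=-1):
        offset = self.file.tell()
        ciphertext = self.file.read(n)
        return self._xor(ciphertext, offset)  # decrypt on the fly

    def close(self):
        self.file.close()  # already unlinked; only ciphertext ever hit disk
        self.key = None    # drop the only reference to the key
```

The on-disk file never contains plaintext, and once the key is gone the ciphertext is unreadable, which is the property the shred-based approach could not guarantee.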

Additional changes required for zipfiles

An additional change was required to handle the zip file correctly (without buffering the whole thing into memory). Ideally, you would also be able to build a zipfile with this new SecureTemporaryFile class, and you can. Unfortunately, you cannot stream a file-like object into a zip file; you must instead write it all in one go with zipfile.ZipFile.writestr. This returns us to buffering the whole thing in memory (argh!) and caused stability problems in my testing.
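
The constraint is visible directly in the stdlib API: writestr takes the entire payload as a bytes object, so the full plaintext has to be materialized in memory first (a streaming ZipFile.open(..., mode="w") only arrived much later, in Python 3.6).

```python
import io
import zipfile

payload = b"entire submission" * 1000  # must already be fully in memory

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    # writestr() accepts only a complete str/bytes payload; there was no way
    # (in the Python of this era) to feed it a file-like object chunk by chunk.
    zf.writestr("report.pdf", payload)

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    roundtrip = zf.read("report.pdf")
```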

To fix this issue, I switched to using gzip, since it supports writing to and reading from file-like objects. This enabled me to build a gzip.GzipFile from a SecureTemporaryFile and write its output to another SecureTemporaryFile, which could then be read by gpg.encrypt to perform the final encryption.
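
In contrast to writestr, gzip.GzipFile wraps a file-like object on both the write and read side, so data can move through in fixed-size chunks. A sketch with io.BytesIO standing in for SecureTemporaryFile (the filename "report.pdf" is made up):

```python
import gzip
import io

source = io.BytesIO(b"submission plaintext " * 1000)  # stand-in for SecureTemporaryFile
compressed = io.BytesIO()                             # likewise

# `filename=` embeds the original name in the gzip header, which is how the
# original filename survives into the eventual .gz.gpg.
with gzip.GzipFile(filename="report.pdf", mode="wb", fileobj=compressed) as gz:
    while True:
        chunk = source.read(64 * 1024)  # bounded chunks: nothing fully buffered
        if not chunk:
            break
        gz.write(chunk)

compressed.seek(0)
with gzip.GzipFile(mode="rb", fileobj=compressed) as gz:
    recovered = gz.read()
```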

Using gzip achieves the other primary goals of using zipfiles as well:

  1. The files are compressed (if possible). gzip uses the same zlib compression as zipfile, so there is no tradeoff here.
  2. The original filename is embedded inside the gzip'ed file. This filename is encrypted inside the final .gz.gpg file, so it is not stored in plaintext on the Application Server's disk. Additionally, double-clicking a .gz in Tails leads to the same outcome (thank god) as double-clicking a .zip: it opens an archive window showing the file contained inside the archive (with the original filename), which can then be opened by double-clicking. So there is no change in the workflow and no usability tradeoff.

Additional changes to templates

While testing this, I noticed that the default filesizeformat template filter formats file sizes in base 10 by default. Most browsers (and most computer things in general...) format file sizes in base 2. This led to confusing discrepancies between the file sizes listed on the Document Interface and the file sizes shown in the browser UI when downloading a file (in both Firefox/Tor Browser and Chrome). Fortunately, this was easy to fix with a built-in argument meant just for this purpose 👍
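
The discrepancy is easy to reproduce. A hypothetical helper (mirroring what the template filter does; not SecureDrop code) shows how the same byte count reads under each convention:

```python
def human_size(num_bytes, binary=False):
    # base-10 (SI) units by default, base-2 (IEC) units when binary=True:
    # the same switch the filter's built-in argument flips.
    base = 1024 if binary else 1000
    units = ("KiB", "MiB", "GiB") if binary else ("kB", "MB", "GB")
    size = float(num_bytes)
    for unit in units:
        size /= base
        if size < base:
            return "%.1f %s" % (size, unit)
    return "%.1f %s" % (size, units[-1])

limit = 500 * 1000 ** 2  # the 500MB upload limit mentioned above
decimal = human_size(limit)                  # reads as "500.0 MB"
binary = human_size(limit, binary=True)      # reads as "476.8 MiB"
```

In a Jinja template the equivalent is passing the filter's binary flag, e.g. `{{ size|filesizeformat(true) }}`.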

Acknowledgements

Thanks to Globaleaks for thinking of this first (and implementing it)! I based this pull request in part on their implementation.

Also thanks to @jacksingleton for implementing the original request_that_secures_file_uploads.py and agitating to take this approach, like, a long time ago.

@dolanjs dolanjs mentioned this pull request Feb 5, 2015
@garrettr garrettr force-pushed the request-that-secures-file-uploads branch from 791d4c0 to 58e21cf on February 6, 2015 17:55
@garrettr
Contributor Author

garrettr commented Feb 6, 2015

Originally, this PR used the Python cryptography library to implement AES-CTR. Unfortunately, cryptography relies on libffi for its bindings to crypto libraries like OpenSSL, and libffi is incompatible with grsecurity's memory protections. While it is possible to use a combination of patches to Python and libffi to safely circumvent this problem, I decided instead to replace the use of cryptography with pycrypto, which we already have as a dependency for its convenient cryptographically secure random module.

I have tested this PR and confirmed that it works with grsecurity. It works with small and large files, and if there was any performance difference between the two libraries it was not noticeable.

garrettr added a commit that referenced this pull request Feb 6, 2015
…-uploads

Request that secures file uploads
@garrettr garrettr merged commit 7ed392a into develop Feb 6, 2015
dolanjs pushed a commit to dolanjs/securedrop that referenced this pull request Feb 6, 2015
added the apparmor entries required for pr freedomofpress#862
@psivesely
Contributor

Have you considered using AES in CBC mode instead of CTR mode for performance reasons?

Although the parameters are a bit wonky (1,000 16,384-byte blocks are encrypted), here are the results from the pct-speedtest.py test from pycrypto, run on my i7-5600U in a VM:

AES256-CBC key setup: 345.11 kKeys/sec
AES256-CBC encryption: 267.46 MBps
AES256-CTR key setup: 45.38 kKeys/sec
AES256-CTR encryption: 192.34 MBps

Since CBC does encryption serially, but decryption in parallel, decryption benchmarks should show an even more marked increase in performance. I can modify the file later to get some results with more realistic parameters and to show decryption as well.

I know CBC has some downsides in certain contexts, but I don’t believe any that I am aware of apply here. Some further contemplation is warranted, however:

(1) Padding oracle attacks shouldn’t be possible because the SecureDrop server in no way serves as a decryption oracle.
(2) Adaptive chosen-plaintext attacks are certainly not possible with random, ephemeral keys, so attacks such as the one described second here should not be possible.

If you have already considered CBC and found a reason to choose CTR over it, I’d be curious to know why.

Lastly, I think shred should still be called on these encrypted files, because they may still reveal valuable metadata to forensic analysts, including time of write and file size.

@redshiftzero redshiftzero deleted the request-that-secures-file-uploads branch November 9, 2016 01:11
redshiftzero added a commit that referenced this pull request Nov 8, 2017
These tests were added in PR #476, but will not pass as we do have
secrets in memory (but not on disk, see PR #862). Anyway, these tests
are clever, so let's keep them but mark them xfail to prevent
confusion in the future.