Ensure that files never touch an actual disk unencrypted #14
The link to the code ioerror references above has been broken :-( New link: https://github.com/freedomofpress/securedrop/blob/master/modules/deaddrop/files/deaddrop/crypto.py#L97
At GlobaLeaks we also had a discussion about the same kind of problem; in globaleaks/globaleaks-whistleblowing-software#672 you can find the various steps and analyses that have been done on the topic.
@fpietrosanti Thanks for the detailed design doc! I have encountered this problem in my own work and ultimately used solution c) from your list. I believe it would be a good solution for this project as well. Concretely, the two options that I see as being useful are:
In the second option, we would need to write a utility to prepare the temporary storage area and hook it into the web server/web framework in question to ensure that cleartext is never written to disk. It would probably have to be written in C or another low-level language so we can guarantee the ephemeral key is securely wiped from memory. This is non-trivial but certainly doable. The next step would be looking into the details of how web.py/our web server (nginx) handles file uploads. IIRC they typically stream requests into memory up to a certain request size, and larger requests are streamed onto disk (in the clear) instead.
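For what it's worth, the spill-to-disk behavior described above is easy to observe with Python's standard library: `tempfile.SpooledTemporaryFile` follows exactly this pattern, keeping data in memory up to a size threshold and then silently rolling it over to an on-disk temporary file in the clear. A minimal sketch (the threshold value is arbitrary for demonstration):

```python
import tempfile

# SpooledTemporaryFile keeps data in an in-memory buffer until the
# write position exceeds max_size, then transparently rolls over to
# a real temporary file on disk.
spool = tempfile.SpooledTemporaryFile(max_size=1024)

spool.write(b"x" * 512)           # under the threshold: stays in RAM
in_memory_before = not spool._rolled

spool.write(b"x" * 1024)          # crosses the threshold: spills to disk
on_disk_after = spool._rolled     # the cleartext now lives in a temp file

spool.close()
```

If a web framework buffers uploads this way, the cleartext lands on disk as soon as a submission exceeds the spool threshold, which is the exact scenario this issue is about.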
Perhaps it's a reference to cold boot attacks.
@garrettr I think that keeping everything on a CryptoFS is good, but creating a cryptofs for each uploaded file is not, because it would amplify the effect of a DoS from tons of file submissions done in parallel. I updated globaleaks/globaleaks-whistleblowing-software#672 with some final considerations coming from this thread about a possible complete solution.
@stsquad We can get around this by either a) encrypting the swap space, or b) not having swap at all, which is what we do now (the installation scripts call …).
@fpietrosanti There is a lot of overhead in that scheme, but it gets you perfect forward secrecy. It sounds like the best compromise would be to create a tmpfs and direct the web server to use that for caching requests. We could secure-wipe/zero-fill the tmpfs, either periodically or after every successfully processed upload, to mitigate cold boot attacks. I am not sure it would even be helpful to encrypt it (since a decrypted version would be maintained elsewhere in memory by the cryptfs anyway).
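The per-upload secure-wipe step could look like this sketch, assuming a dedicated temp directory for request bodies (the directory path and function names are hypothetical). Note that on tmpfs this is a best-effort mitigation: the old pages may linger in RAM until reused, so zero-filling reduces, but does not eliminate, cold-boot exposure:

```python
import os

def zerofill_and_unlink(path):
    """Overwrite a file with zeros, flush it, then unlink it.

    Best-effort mitigation only: on a tmpfs the previous page contents
    may still linger in RAM until the pages are reused.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        f.write(b"\x00" * size)
        f.flush()
        os.fsync(f.fileno())
    os.remove(path)

def wipe_upload_area(directory):
    # Run after every successfully processed upload, or periodically.
    for name in os.listdir(directory):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            zerofill_and_unlink(full)
```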
Another related concern is reading plaintext into processes whose memory we cannot securely erase, especially long-running processes like the Python web server (which garbage-collects and, depending on its activity level, may keep things in memory for a long time).
@garrettr You can achieve PFS also without creating a new loopfs; the only important element is that a "temporary encryption key per file" is used to symmetrically encrypt the files, even just through a simple AES pipe. The points IMHO are:
Setting up a LUKS container requires high privileges because it needs to interact with the kernel, so this operation must be an extraordinary action (likely done only once, at first boot of the server/application) and not an ordinary one (like for each file). So I'd suggest considering only methods that can be managed by the application itself, under its low privileges, with only application logic (while extraordinary steps such as mounting a LUKS container are something to be done by the root operator at first server/application startup time).
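To illustrate the "temporary encryption key per file" idea at application privilege level, here is a toy sketch. The keystream construction (SHA-256 in counter mode) is for illustration only and is not a substitute for a vetted cipher such as AES-GCM from a real crypto library:

```python
import hashlib
import os

def keystream(key, length):
    # Toy keystream: SHA-256 over key || counter. Illustration only --
    # a real implementation should use a vetted AEAD such as AES-GCM.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_stream(key, data):
    # XOR with the keystream; the same call decrypts.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# One ephemeral key per submitted file, generated entirely at
# application privilege level -- no LUKS/root interaction per file.
ephemeral_key = os.urandom(32)
ciphertext = encrypt_stream(ephemeral_key, b"submission contents")
plaintext = encrypt_stream(ephemeral_key, ciphertext)
# Discarding ephemeral_key after re-encrypting the file to the
# long-term public key is what gives per-file forward secrecy.
```

The point of the sketch is the privilege model: every step here is ordinary userspace application logic, with no kernel or root interaction per file.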
Named pipe: pipe through GPG, then over SSH. A FIFO does not (shouldn't) write to disk, so the file does not touch the disk unencrypted (or even encrypted). Double-check with strace. As for the concerns about plaintext remaining in memory mentioned by @garrettr, perhaps that should be a separate issue. For the concerns about DoSing the service, use some sort of polling/asynchronous I/O to make sure there is always a process ready on the "read" side of the named pipe, waiting to be written to.
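A minimal sketch of the FIFO idea in Python, with a reader thread standing in for the gpg side of the pipeline. The bytes travel through a kernel pipe buffer; only the FIFO's directory entry, never its data, exists on the filesystem (POSIX-only, since it relies on `os.mkfifo`):

```python
import os
import tempfile
import threading

# Create a named pipe: the directory entry is on disk, the data is not.
fifo_dir = tempfile.mkdtemp()
fifo_path = os.path.join(fifo_dir, "upload.fifo")
os.mkfifo(fifo_path)

received = []

def reader():
    # Stand-in for the "gpg --encrypt" side of the pipeline.
    with open(fifo_path, "rb") as f:
        received.append(f.read())

t = threading.Thread(target=reader)
t.start()

# Stand-in for the upload handler: open() blocks until the reader
# side is open, then the write goes through a kernel buffer.
with open(fifo_path, "wb") as f:
    f.write(b"cleartext passes through the kernel, not the disk")

t.join()
os.remove(fifo_path)
os.rmdir(fifo_dir)
```

As noted above, strace on both ends is still the right way to confirm that no participant in the pipeline spools to disk behind your back.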
@djon3s The problem with pipes comes from the need to do buffering, which will cause a DoS if a file bigger than RAM is uploaded. In GlobaLeaks we use a scheduler based on Python APScheduler to handle asynchronously all activities that can lead to a DoS in processing and that need to be managed with parallelism/threshold limits on execution.
@fpietrosanti With regard to pipes: I have a feeling that pipes have become more like streams, rather than copying data (https://lwn.net/Articles/118750/), and according to the man page, … With regard to piping through gpg / whatever else, that might be an issue. I could guess why the buffer might fill if we are encrypting to a public key (this is not a stream cipher, after all)... however, I did a test like the above with the following … If my reasoning is correct (confidence level: medium), I believe this would solve the issue of not touching the disk unencrypted, with the issue of submissions/sensitive data cached in RAM by the server mentioned by @garrettr still to resolve, and perhaps best opened as a new issue. Ways I could be wrong:
@fpietrosanti While this is a problem, it could be temporarily mitigated by limiting the maximum upload size to something reasonable and within the limits of server resources (say, 1 or 2 GB). While there are legitimate cases in which sources may want to upload larger files (video, etc.), we could special-case those in a variety of ways, since I do not expect them to make up the bulk of submissions. Two ways to do this would be:
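The size cap itself is straightforward to enforce at the WSGI layer before the body is ever read. A sketch (the class name, the 2 GB figure, and the wiring are illustrative, not SecureDrop's actual configuration):

```python
# Hypothetical WSGI middleware that rejects oversized uploads before
# the request body is read, so nothing gets spooled anywhere.
MAX_UPLOAD_BYTES = 2 * 1024 ** 3  # illustrative 2 GB cap

class LimitUploadSize:
    def __init__(self, app, limit=MAX_UPLOAD_BYTES):
        self.app = app
        self.limit = limit

    def __call__(self, environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        if length > self.limit:
            # Refuse up front; the body is never consumed.
            start_response("413 Request Entity Too Large",
                           [("Content-Type", "text/plain")])
            return [b"Upload too large\n"]
        return self.app(environ, start_response)
```

Frameworks typically expose the same knob directly (e.g. a maximum content length setting), but a middleware like this makes the cut-off explicit and framework-independent.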
@djon3s Why does this indicate to you that the file is not touching disk? To me, this seems indicative of one of two cases:
The only way to be sure is to examine the source and use strace.
I believe that the "pipe to GPG" method is the best way to solve this problem. To do this properly, we will need to check each point in the chain of programs handling the file to make sure they don't write anything to disk, by examining the source and using strace.
This is not necessarily a problem. GPG actually generates a random symmetric session key, encrypts the file with that, then encrypts the session key with the chosen asymmetric key and combines them to create the final encrypted file. So it is not impossible for it to encrypt a stream as it comes in... but we must confirm this.
Interesting discussion. Either way, /proc/sys/vm/swappiness should be set to 0 in sysctl.conf. Did a little research, and it appears neither Apache nor GPG buffers to disk in normal operation. How they behave in an OOM state is another question...
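For reference, the sysctl change is one line. Note that on modern kernels swappiness=0 strongly discourages but does not absolutely forbid swapping under memory pressure, so encrypted swap, or no swap at all, remains the stronger guarantee:

```ini
# /etc/sysctl.conf -- strongly discourage swapping pages (including
# any holding cleartext) to disk; apply with `sysctl -p`
vm.swappiness = 0
```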
Resolved in #862. |
@garrettr @redshiftzero: was the analysis of Apache2 completed in relation to this topic? I think that by default the same issue may exist in its multipart implementation, and this could cause plaintext data to be written to disk for both POST bodies and file uploads: https://github.com/apache/httpd/blob/5f32ea94af5f1e7ea68d6fca58f0ac2478cc18c5/server/apreq_parser_multipart.c Probably, while this is being analyzed, it would be better to enlarge the buffer size and limit request size to the size of the buffer? @jimjag @wrowe: would you please point us to the right person inside Apache who could help identify the best configuration possibilities and, if necessary, design a patch for this?
After your report @evilaliv3, I just uploaded a very large file in a SecureDrop production-like environment (to trigger the logic introduced in #862) and monitored Apache's behavior with strace (to look for open calls) and lsof (to see if it was writing files to disk in unexpected locations; fortunately Apache's capabilities here are limited by our AppArmor profile, though it can write to e.g. …
@redshiftzero: I'm actually not sure, because I'm still trying to understand whether Apache, when using libapache2-mod-wsgi, works in streaming mode, passing the stream directly to Flask (where #862 is effective), or whether it may be doing some multipart parsing as well. In that case I doubt that when the stream is bigger than the … @GrahamDumpleton: would you please take a look at this ticket?
Nothing in mod_wsgi should be trying to parse the content of a request. It gets passed through straight to the WSGI application as is. Whether the WSGI framework will send it to disk is a different matter. |
Thank you @GrahamDumpleton for checking this already! This is a relief. The implementation in the WSGI framework (Flask) was fixed in #862, so we should be fine, @redshiftzero!
If not already done, what you perhaps want to look at is any input filters for Apache that may sensibly be applied when using mod_wsgi. One that comes to mind is mod_deflate, when request content is compressed. That said, that is actually hard to use with WSGI applications, because you need extensions in the WSGI framework to step around the WSGI spec and ignore …
It appears that in https://github.com/deaddrop/deaddrop/blob/master/crypto.py#L97 the data is unencrypted on the disk. Even if the disk is encrypted, I think this is sub-optimal. I would suggest either a tmpfs (not so solid for various forensic reasons), encrypting the files on the client as they're uploaded, or encrypting them in chunks once they're in the server's memory.