Protect against malicious nested archives (aka zip bombs) #210
Please note that compression bombs only work with implementations that recursively extract the archive content, which, technically, we don't. I still think this is a good enhancement / hardening of unblob to protect against these files. I'll do some tests locally and get back with some notes :)
Instead of checking the extracted file size for the content of a single compressed archive, we should keep an eye on the overall extracted size. With the proposed approach, we could run into issues if a compressed archive has nested archives that are each below the threshold: 42.zip only extracts 530 kB in the first round, but if we continue recursively we will eventually extract a ton of 4.3 GB files, which by themselves may fly below the radar but which together fill up the disk. I'd suggest adding an option (e.g., --max-extraction-size) that overrides a reasonable default value (e.g., 100 GB). Alternatively, we could also monitor available disk space and abort once we have used e.g., 90% of the available free disk space.
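To make the cumulative-size idea concrete, here is a minimal sketch of how a shared extraction budget could look. The `ExtractionBudget` name and `charge` method are hypothetical (not existing unblob API); the 100 GB default and the 90%-free-space abort come from the suggestion above:

```python
# Hypothetical sketch of a process-wide extraction budget (not unblob API).
import shutil


class ExtractionBudget:
    """Counts bytes written across *all* extraction rounds, so nested
    archives cannot each stay below a per-archive threshold."""

    def __init__(self, max_size: int = 100 * 1024**3, min_free_ratio: float = 0.1):
        self.max_size = max_size               # would come from --max-extraction-size
        self.min_free_ratio = min_free_ratio   # abort below 10% free disk space
        self.total = 0

    def charge(self, nbytes: int, extract_dir: str = ".") -> None:
        """Called after each file is written; raises to abort extraction."""
        self.total += nbytes
        if self.total > self.max_size:
            raise RuntimeError("max extraction size exceeded, likely a zip bomb")
        usage = shutil.disk_usage(extract_dir)
        if usage.free / usage.total < self.min_free_ratio:
            raise RuntimeError("disk almost full, aborting extraction")
```

Charging one budget from every extraction round, rather than resetting it per archive, is what would catch 42.zip-style nesting.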
It is very hard to come up with a general solution for this. I'd suggest extracting to a separate volume or using quotas to mitigate this issue.
I agree with the recommendation on a separate volume and using quotas. I'm sure we can provide sound recommendations on that subject in our documentation.

I was toying with the idea of implementing a "best effort" protection against zip bombs by collecting size information during extraction. A basic implementation is available on this branch: main...zip-bomb

I'm saying "best effort" because of the granularity of the smallest unit we can work on. I got good results against zip bombs with this; the only thing missing is a proper cancellation/cleanup of running processes.

There's no urgency to it, but I would be glad if anyone from the team could have a look and share their thoughts.
I also had a weird idea: writing a wrapper that can be LD_PRELOAD-ed to intercept the file writes performed by extractor processes. This could also be used to wire files created by extractors into our reports in real time.
This would go against our attempt at supporting OSX (or at least Apple M1/M2). Another idea I had was to launch a "watcher process" registering inotify callbacks on the extracted directories recursively, killing the watched processes once the size limit is reached. However, inotify only exists on Linux, and the vast majority of inotify libraries in Python suck (although writing a modern one would not be super complex).
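For what it's worth, the watcher idea does not strictly need inotify: a portable fallback is to poll the extraction directory from a thread and kill the extractor once the size limit is hit. A rough sketch, with hypothetical names (`watch_extraction` is not unblob API), assuming the extractor runs as a `subprocess.Popen`:

```python
# Polling fallback for the "watcher process" idea (hypothetical, not unblob API).
import os
import threading
import time


def directory_size(root: str) -> int:
    """Best-effort recursive size of everything extracted so far."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # extractor may remove/rename files while we walk
    return total


def watch_extraction(root: str, process, max_size: int, interval: float = 0.5):
    """Kill `process` once the size extracted under `root` exceeds `max_size`."""

    def _watch():
        while process.poll() is None:   # extractor still running
            if directory_size(root) > max_size:
                process.kill()          # cleanup of partial output still needed
                return
            time.sleep(interval)

    thread = threading.Thread(target=_watch, daemon=True)
    thread.start()
    return thread
```

Polling is coarser than inotify and costs some CPU on huge trees, but it behaves the same on Linux and macOS.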
There is an equivalent of inotify on macOS, by the way.
We should check the extracted file size during chunk parsing to avoid filling up the disk when extracting malicious nested archives.
Samples can be found here: https://www.bamsoftware.com/hacks/zipbomb/
For zip bombs, a check can be implemented in `is_valid`. Something similar to this could work:
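The snippet originally attached here was lost; as a hedged reconstruction of the idea, a zip handler's `is_valid` could compare the declared uncompressed size against the compressed size using the standard zipfile module (the `MAX_RATIO` threshold below is illustrative, not from the issue):

```python
# Hedged reconstruction of the proposed is_valid check (illustrative only).
import io
import zipfile

MAX_RATIO = 1000  # reject archives claiming more than 1000x expansion


def is_valid(file: io.BufferedIOBase) -> bool:
    try:
        with zipfile.ZipFile(file) as zf:
            infos = zf.infolist()
    except zipfile.BadZipFile:
        return False
    compressed = sum(info.compress_size for info in infos)
    declared = sum(info.file_size for info in infos)
    if compressed == 0:
        return declared == 0
    # An absurd declared/compressed ratio is the signature of a zip bomb.
    return declared / compressed <= MAX_RATIO
```

This would catch the non-recursive bombs from the page linked above, which declare their huge output sizes honestly in the central directory; nested bombs would still need the cumulative checks discussed in the earlier comments.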
I'll check if similar behavior (i.e., "let's fill the whole disk") can be triggered with other formats.