fix: Use TarSafe for extracting backup tarball #57

Draft
wants to merge 1 commit into
base: main

fghaas
Contributor

@fghaas fghaas commented Jan 9, 2023

The tarfile.extractall() method is vulnerable to path traversal, which can be exploited by adding a member with a ../ path to the tarball. In our case, this might allow malicious data injection by someone who doesn't normally have access to the Open edX cluster, but does have write access to the S3 bucket. Extracting an archive crafted this way during an automated restore could then write files outside the intended target directory.

This shouldn't have particularly wide-ranging implications since the only filesystem affected by such an attack would be the restore job's container, which is by definition short-lived. And an attacker with access to the S3 bucket could already do far greater damage to the Open edX installation by simply modifying the MongoDB or MySQL data contained in the tarball.

Still, it does not hurt to use a safer (if slightly slower) approach that is provided by the tarsafe module.
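
To illustrate the kind of check involved, here is a minimal stdlib-only sketch of the path traversal described above. It is not the tarsafe implementation and not the code in this PR; `unsafe_members`, `safe_extractall`, and `build_tarball` are hypothetical names made up for this example, and the sketch only looks at member names (it does not inspect symlink or hard-link members).

```python
import io
import os
import tarfile


def unsafe_members(tar, dest):
    """Return the names of members that would land outside dest."""
    dest_root = os.path.realpath(dest)
    bad = []
    for member in tar.getmembers():
        # Resolve "../" components (and absolute names) against dest.
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if os.path.commonpath([dest_root, target]) != dest_root:
            bad.append(member.name)
    return bad


def safe_extractall(tar, dest):
    """Extract everything, but refuse archives with escaping members."""
    bad = unsafe_members(tar, dest)
    if bad:
        raise ValueError(f"refusing to extract, unsafe members: {bad}")
    tar.extractall(dest)


def build_tarball(names):
    """Build an in-memory tarball with one tiny file per name (demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    crafted = build_tarball(["dump/ok.txt", "../evil.txt"])
    with tarfile.open(fileobj=crafted) as tar:
        print(unsafe_members(tar, "/tmp/restore"))
```

The extra pass over all members before extraction is also where the additional cost comes from: the archive has to be scanned once for the check and then read again for the actual extraction.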

References:
python/cpython#73974
https://mail.python.org/pipermail/python-dev/2007-August/074290.html
https://nvd.nist.gov/vuln/detail/CVE-2007-4559

@fghaas fghaas marked this pull request as draft January 9, 2023 14:25
@fghaas
Contributor Author

fghaas commented Jan 9, 2023

@angonz I wonder if you could help test this. I know your backup tarballs are typically much larger than ours, and I'd like to know whether swapping in TarSafe for tarfile makes your restore operations unacceptably slower. Would you mind giving this a try, by rebuilding your Docker image from my topic branch and attempting a restore using one of your larger tarballs?
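
For a rough comparison outside the restore job, something like the following sketch could be used to time an extraction. `timed_extract` is a hypothetical helper written for this comment, not part of the PR, and it assumes the opener returns an object with the usual `tarfile.TarFile` context-manager and `extractall()` interface.

```python
import tarfile
import time


def timed_extract(opener, archive_path, dest):
    """Extract archive_path into dest and return the elapsed seconds."""
    start = time.perf_counter()
    with opener(archive_path) as tar:
        tar.extractall(dest)
    return time.perf_counter() - start


# Example call with the stdlib opener (paths are placeholders):
# seconds = timed_extract(tarfile.open, "backup.tar.gz", "/tmp/restore-test")
```

Passing a different opener (for instance the TarSafe one, assuming it is a drop-in replacement for `tarfile.open`) and extracting the same tarball into a fresh directory would give the number to compare against.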

@angonz
Contributor

angonz commented Jan 16, 2023

Hi Florian, sorry for the late reply. Sure, I will test it. Just give me some time, because the site is now in production and I will have to set up a test environment.

@fghaas
Contributor Author

fghaas commented Jan 16, 2023

That'd be excellent, thank you!
