Implement PEP 706 – Filter for tarfile.extractall #102950

Closed
1 task done
encukou opened this issue Mar 23, 2023 · 14 comments
Labels
type-feature A feature request or enhancement type-security A security issue

@encukou encukou added the type-feature A feature request or enhancement label Mar 23, 2023
encukou added a commit to encukou/cpython that referenced this issue Apr 25, 2023
Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end
is enough.
encukou added a commit that referenced this issue Apr 25, 2023
… sticky bit (GH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.
encukou added a commit to encukou/cpython that referenced this issue Apr 25, 2023
…et the sticky bit (pythonGH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.

Backport of c8c3956
encukou added a commit that referenced this issue Apr 28, 2023
…H-102953) (GH-103832)

See [Backporting & Forward Compatibility in PEP 706](https://peps.python.org/pep-0706/#backporting-forward-compatibility).

- Backport b52ad18
- Backport c8c3956
- Remove the DeprecationWarning
- Adjust docs
- Remove new `__all__` entries
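
Since the backports drop the DeprecationWarning and the new `__all__` entries, code that has to run on both patched and unpatched interpreters cannot rely on the version number or on a warning to tell whether filters exist. PEP 706 suggests feature detection with `hasattr(tarfile, 'data_filter')`. A minimal sketch of that idea follows; the helper name `extract_all_safely` is hypothetical and not part of the standard library.

```python
import tarfile
import warnings


def extract_all_safely(archive_path, dest):
    """Extract a tar archive, using the 'data' filter when it is available.

    `extract_all_safely` is an illustrative name, not a stdlib function.
    """
    with tarfile.open(archive_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Filters are supported (3.12+, or an older branch with the backport).
            tar.extractall(dest, filter="data")
        else:
            # No filter support: fall back to the legacy behaviour, but say so.
            warnings.warn(
                "tarfile extraction filters are unavailable; "
                "extracting without checks"
            )
            tar.extractall(dest)
```

On an interpreter without the backport the fallback branch still extracts without any checks, so this only avoids a `TypeError` for the unknown `filter` argument; it does not add protection there.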
@gpshead gpshead added the type-security A security issue label May 2, 2023
@mcepl
Contributor

mcepl commented May 3, 2023

And I really think #73974 (CVE-2007-4559) should be mentioned somewhere in this PR.

mcepl pushed a commit to openSUSE-Python/cpython that referenced this issue May 3, 2023
…et the sticky bit (pythonGH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.

Backport of c8c3956
@encukou
Member Author

encukou commented May 3, 2023

Well, the PR was open for a month, but now it's closed. Feel free to suggest an update to the docs.
Note that Python 3.12 doesn't “fix” CVE-2007-4559 (depending on how you define “fix”).

encukou added a commit that referenced this issue May 10, 2023
…H-102953) (GH-104128)

- Backport b52ad18
- Backport c8c3956
- Remove the DeprecationWarning
- Adjust docs
- Remove new `__all__` entries

Co-authored-by: Petr Viktorin <encukou@gmail.com>
encukou added a commit to encukou/cpython that referenced this issue May 11, 2023
…et the sticky bit (pythonGH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.

Backport of c8c3956
encukou added a commit to encukou/cpython that referenced this issue May 16, 2023
…et the sticky bit (pythonGH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.

Backport of c8c3956
@ned-deily
Member

Per the discussion in #104583, we have decided that it is not feasible to safely merge the proposed 3.7 version of this fix prior to 3.7's imminent end-of-life. Third-party distributors of CPython who plan to provide support for 3.7 past its official end-of-life are free, of course, to choose to merge or adapt the PR for their users.

@mcepl
Contributor

mcepl commented May 30, 2023

> Well, the PR was open for a month, but now it's closed. Feel free to suggest an update to the docs. Note that Python 3.12 doesn't “fix” CVE-2007-4559 (depending on how you define “fix”).

I would say “deals with”, which should cover everything. ;)

@mcepl
Contributor

mcepl commented Jun 5, 2023

Oh, you mean that the change is not 100% API-stable? We have learned the hard way, unfortunately.

@mcepl
Contributor

mcepl commented Jun 5, 2023

And yes, I will be working on 3.4 as well. Happy times!

encukou added a commit to encukou/cpython that referenced this issue Jun 7, 2023
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.
RHEL adds configuration options; by default it will warn and fail like 3.14 upstream.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
carlosroman added a commit to DataDog/cpython that referenced this issue Jun 22, 2023
* Post 3.8.16

* [3.8] Update copyright years to 2023. (pythongh-100852)

* [3.8] Update copyright years to 2023. (pythongh-100848).
(cherry picked from commit 11f9932)

Co-authored-by: Benjamin Peterson <benjamin@python.org>

* Update additional copyright years to 2023.

Co-authored-by: Ned Deily <nad@python.org>

* [3.8] Update copyright year in README (pythonGH-100863) (pythonGH-100867)

(cherry picked from commit 30a6cc4)

Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: HARSHA VARDHAN <75431678+Thunder-007@users.noreply.github.com>

* [3.8] Correct CVE-2020-10735 documentation (pythonGH-100306) (python#100698)

(cherry picked from commit 1cf3d78)
(cherry picked from commit 88fe8d7)

Co-authored-by: Jeremy Paige <ucodery@gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>

* [3.8] Bump Azure Pipelines to ubuntu-22.04 (pythonGH-101089) (python#101215)

(cherry picked from commit c22a55c)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>

* [3.8] pythongh-100180: Update Windows installer to OpenSSL 1.1.1s (pythonGH-100903) (python#101258)

* pythongh-101422: (docs) TarFile default errorlevel argument is 1, not 0 (pythonGH-101424)

(cherry picked from commit ea23271)

Co-authored-by: Owain Davies <116417456+OTheDev@users.noreply.github.com>

* [3.8] pythongh-95778: add doc missing in some places (pythonGH-100627) (python#101630)

(cherry picked from commit 4652182)

* [3.8] pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286) (python#101710)

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>

* [3.8] pythongh-101981: Fix Ubuntu SSL tests with OpenSSL (3.1.0-beta1) CI i… (python#102095)

[3.8] pythongh-101981: Fix Ubuntu SSL tests with OpenSSL (3.1.0-beta1) CI issue (pythongh-102079)

* [3.8] pythonGH-102306 Avoid GHA CI macOS test_posix failure by using the appropriate macOS SDK (pythonGH-102307)

[3.8] Avoid GHA CI macOS test_posix failure by using the appropriate macOS SDK.

* [3.8] pythongh-101726: Update the OpenSSL version to 1.1.1t (pythonGH-101727) (pythonGH-101752)

Fixes CVE-2023-0286 (High) and a couple of Medium security issues.
https://www.openssl.org/news/secadv/20230207.txt

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Ned Deily <nad@python.org>

* [3.8] pythongh-102627: Replace address pointing toward malicious web page (pythonGH-102630) (pythonGH-102667)

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>

* [3.8] pythongh-101997: Update bundled pip version to 23.0.1 (pythonGH-101998). (python#102244)

(cherry picked from commit 89d9ff0)

* [3.8] pythongh-102950: Implement PEP 706 – Filter for tarfile.extractall (pythonGH-102953) (python#104548)

Backport of c8c3956

* [3.8] pythongh-99889: Fix directory traversal security flaw in uu.decode() (pythonGH-104096) (python#104332)

(cherry picked from commit 0aeda29)

Co-authored-by: Sam Carroll <70000253+samcarroll42@users.noreply.github.com>

* [3.8] pythongh-104049: do not expose on-disk location from SimpleHTTPRequestHandler (pythonGH-104067) (python#104121)

Do not expose the local server's on-disk location from `SimpleHTTPRequestHandler` when generating a directory index. (unnecessary information disclosure)

(cherry picked from commit c7c3a60)

Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

* [3.8] pythongh-103935: Use `io.open_code()` when executing code in trace and profile modules (pythonGH-103947) (python#103954)

Co-authored-by: Tian Gao <gaogaotiantian@hotmail.com>

* [3.8] pythongh-68966: fix versionchanged in docs (pythonGH-105299)

* [3.8] Update GitHub CI workflow for macOS. (pythonGH-105302)

* [3.8] pythongh-105184: document that marshal functions can fail and need to be checked with PyErr_Occurred (pythonGH-105185) (python#105222)

(cherry picked from commit ee26ca1)

Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>

* [3.8] pythongh-102153: Start stripping C0 control and space chars in `urlsplit` (pythonGH-102508) (pythonGH-104575) (pythonGH-104592) (python#104593) (python#104895)

`urllib.parse.urlsplit` has already been partially respecting the WHATWG spec (pythonGH-25595).

This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).

I simplified the docs by eliding the state of the world explanatory
paragraph in this security release only backport.  (people will see
that in the mainline /3/ docs)

(cherry picked from commit d7f8a5f)
(cherry picked from commit 2f630e1)
(cherry picked from commit 610cc0a)
(cherry picked from commit f48a96a)

Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>

* [3.8] pythongh-103142: Upgrade binary builds and CI to OpenSSL 1.1.1u (pythonGH-105174) (pythonGH-105200) (pythonGH-105205) (python#105370)

Upgrade builds to OpenSSL 1.1.1u.

Also updates _ssl_data_111.h from OpenSSL 1.1.1u, _ssl_data_300.h from 3.0.9.

Manual edits to the _ssl_data_300.h file prevent it from removing any
existing definitions in case those exist in some people's builds and were
important (avoiding regressions during backporting).

(cherry picked from commit ede89af)
(cherry picked from commit e15de14)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Ned Deily <nad@python.org>

* Python 3.8.17

* Post 3.8.17

* Updated CI to build 3.8.17

---------

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Benjamin Peterson <benjamin@python.org>
Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: HARSHA VARDHAN <75431678+Thunder-007@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Jeremy Paige <ucodery@gmail.com>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@python.org>
Co-authored-by: Owain Davies <116417456+OTheDev@users.noreply.github.com>
Co-authored-by: Éric <earaujo@caravan.coop>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Dong-hee Na <donghee.na@python.org>
Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Pradyun Gedam <pradyunsg@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Sam Carroll <70000253+samcarroll42@users.noreply.github.com>
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: stratakis <cstratak@redhat.com>
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
rickprice added a commit to ActiveState/cpython that referenced this issue Sep 13, 2023
rickprice added a commit to ActiveState/cpython that referenced this issue Sep 14, 2023
icanhasmath pushed a commit to ActiveState/cpython that referenced this issue Sep 27, 2023
icanhasmath pushed a commit to ActiveState/cpython that referenced this issue Sep 29, 2023
@ben-spiller

Hi, I just upgraded from Python 3.11 to 3.12, and this PR is producing DeprecationWarnings (ok) that can't easily be avoided (not ok!) when calling shutil.unpack_archive():

> Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.

While a deprecation warning for tarfile with filter=None is clearly reasonable, its impact on shutil.unpack_archive() is really unfortunate and deserves more consideration. The whole point of unpack_archive is to unpack an archive without the caller having to special-case different archive formats. In Python 3.12 it is now impossible to use unpack_archive across formats without either

a) hitting the new 3.12 deprecation warning for not specifying filter='...' (imho it's not ok to ignore warnings!), or

b) special-casing how the caller invokes unpack_archive() for different archive types, i.e. passing filter='data' if the archive is a .tar/.tgz/.txz/etc. file but filter=None if it is a zipfile. Determining which case applies is not trivial for the caller of unpack_archive(), and should be taken care of by the library.

Both of those options are pretty gross and make this change very hard to adapt to without hacks or disabling deprecation warnings (not a great practice).

Some possible solutions:

  • provide some value of filter= that can be passed to both tarfile and other archive formats that don't yet/don't need to support tar-style safe extraction (as PEP-0706 states, "ZipFile.extract’s defaults are already similar to what a 'data' filter would do")
  • remove the deprecation warning asap until a future release when there's time to do the above or plan another solution. (nb: this deprecation is not documented in the "Important deprecations, removals or restrictions" section of release notes so removing it in a patch would seem reasonable)
  • make unpack_archive pass the filter= flag down to tarfile but not to other archive classes

Right now it's a breaking change and I can't see any nice way to work around it.
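
For reference, the caller-side special-casing described in option (b) could look roughly like the sketch below. The helper name `unpack_any` is hypothetical. It assumes that the file extension matches the actual content (since `shutil.unpack_archive` picks the format from the extension, while `tarfile.is_tarfile` looks at the content), and that only tar-based formats accept `filter`.

```python
import shutil
import tarfile


def unpack_any(archive_path, extract_dir):
    """Unpack an archive, passing filter='data' only for tar files.

    `unpack_any` is an illustrative name, not a stdlib function.
    """
    # is_tarfile() inspects the content, so compressed tarballs are covered too.
    # The hasattr() check skips the filter on interpreters without PEP 706 support.
    if tarfile.is_tarfile(archive_path) and hasattr(tarfile, "data_filter"):
        shutil.unpack_archive(archive_path, extract_dir, filter="data")
    else:
        # Zip archives and other formats: do not pass the filter argument at all.
        shutil.unpack_archive(archive_path, extract_dir)
```

This keeps a single call site for callers, at the cost of opening tar files twice and of relying on the extension and content agreeing.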

stratakis pushed a commit to stratakis/cpython that referenced this issue Feb 27, 2024
Implement PEP 706 – Filter for tarfile.extractall

Upstream issue: python#102950

Tracker bug: https://bugzilla.redhat.com/show_bug.cgi?id=263261
hroncok pushed a commit to fedora-python/cpython that referenced this issue Mar 7, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
stratakis pushed a commit to stratakis/cpython that referenced this issue Mar 11, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
stratakis pushed a commit to stratakis/cpython that referenced this issue Mar 20, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
stratakis pushed a commit to stratakis/cpython that referenced this issue Mar 25, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
hroncok pushed a commit to fedora-python/cpython that referenced this issue Mar 26, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
mcepl pushed a commit to openSUSE-Python/cpython that referenced this issue Mar 28, 2024
…et the sticky bit (pythonGH-103831)

Also remove explicit `type=tarfile.DIRTYPE`, the slash at the end is
enough.

Backport of c8c3956
mcepl pushed a commit to openSUSE-Python/cpython that referenced this issue Apr 2, 2024
Add API for allowing checks on the content of tar files, allowing callers to mitigate
directory traversal (CVE-2007-4559) and related issues.

Python 3.12 will warn if this API is not used.
Python 3.14 will fail if it's not used.

Backport from python#102950

Change document: https://peps.python.org/pep-0706/
icanhasmath pushed a commit to ActiveState/cpython that referenced this issue Jul 19, 2024