Skip to content

gh-121999: Change default tarfile filter to 'data' #122002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 26, 2024

Conversation

WilliamRoyNelson
Copy link
Contributor

@WilliamRoyNelson WilliamRoyNelson commented Jul 19, 2024

@ghost
Copy link

ghost commented Jul 19, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Jul 19, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@WilliamRoyNelson WilliamRoyNelson force-pushed the gh-121999-tarfile-filter branch from 0529960 to c74d0dc Compare July 19, 2024 04:24
@ZeroIntensity
Copy link
Member

CC @encukou, as this is PEP 706.

@sodle sodle force-pushed the gh-121999-tarfile-filter branch from d3c2acc to 15216b9 Compare July 20, 2024 09:36
@WilliamRoyNelson WilliamRoyNelson force-pushed the gh-121999-tarfile-filter branch from 1e3001d to 57c60e7 Compare July 21, 2024 23:02
@picnixz
Copy link
Member

picnixz commented Jul 22, 2024

I'm a bit confused but... arae there two people working on this PR simultaneously @WilliamRoyNelson and @sodle?

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

.. versionchanged:: 3.14
Set the default extraction filter to :func:`data <data_filter>`,
which disallows dangerous features such as links to absolute paths
or paths outside of the destination. Previously, the filter strategy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about this one but the outside of now feels weird to me :')

@AA-Turner As a native speaker (you're the only one I know...), should it be "outside the destination", or "outside of"? (or something else entirely?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "outside of" is conventional in North America, less conventional in the UK.
https://learningenglish.voanews.com/a/should-we-think-outside-or-outside-of-the-box-/6434530.html

I think I was reading from PEP 706 when I wrote that update.

Refuse to extract links (hard or soft) which end up linking to a path outside of the destination. (On systems that don’t support links, tarfile will, in most cases, fall back to creating regular files. This proposal doesn’t change that behaviour.)

@encukou
Copy link
Member

encukou commented Jul 22, 2024

I'll review later this week.

@sodle sodle force-pushed the gh-121999-tarfile-filter branch from c3638d4 to 619dc28 Compare July 22, 2024 14:46
@sodle
Copy link
Contributor

sodle commented Jul 22, 2024

I'm a bit confused but... arae there two people working on this PR simultaneously @WilliamRoyNelson and @sodle?

Yeah. Bill is a friend of mine and enlisted my help with writing the tests.

@encukou
Copy link
Member

encukou commented Jul 25, 2024

For the documentation, communicating via GitHub review comments wouldn't be effective, so I took the liberty of pushing a commit to this PR directly. I hope you don't mind.

The main themes are:

  • Passing filter='data' is still recommended, if there's any chance the code will run on Python 3.13 or below. When 3.14 is released, that's nearly all the code :)
  • All the security notes should lead to :ref:tarfile-extraction-filter(to explain things) and/or:ref:tarfile-further-verification (to explain that 'data' still isn't unconditionally “safe”). Hopefully we can do that without annoying people who read those already :)

For shutil: zipfile also has some safeties, though they haven't been reviewed in a while. IMO we can claim for both formats that the defaults “prevent the most dangerous of such security issues”.

Does this look good to you?

@encukou encukou merged commit dcafb36 into python:main Jul 26, 2024
34 checks passed
@encukou
Copy link
Member

encukou commented Jul 26, 2024

Thank you for the update!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants