Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove extras from compiled file #1608

Conversation

FlorentJeannot
Copy link
Contributor

This PR removes the extras from the compiled files.

Please note that this is a proposal that needs discussion.

Here's a comparative table of other tools:

Tool Result
pipenv lock -r Includes extras
pip freeze Does not include extras
poetry export Does not include extras

In my opinion the compiled files should not include extras.

Let's take an example:

Given a requirements.in containing typer with all extras:

typer[all]

When doing pip-sync, the compiled file is:

#
# This file is autogenerated by pip-compile with python 3.9
# To update, run:
#
#    pip-compile
#
click==8.1.2
    # via typer
colorama==0.4.4
    # via typer
shellingham==1.4.0
    # via typer
typer[all]==0.4.1
    # via -r requirements.in

Then pip freeze shows:

click==8.1.2
colorama==0.4.4
pep517==0.12.0
pip-tools==6.5.1
shellingham==1.4.0
tomli==2.0.1
typer==0.4.1

Now imagine that typer[all] did not contain colorama (which is an extra dependency) at the time we ran pip-compile.
The compiled file would look like that:

#
# This file is autogenerated by pip-compile with python 3.9
# To update, run:
#
#    pip-compile
#
click==8.1.2
    # via typer
shellingham==1.4.0
    # via typer
typer[all]==0.4.1
    # via -r requirements.in

Let's save the change in the compiled file and remove all installed packages in the venv.
Then let's do pip-sync and pip freeze again. Here's the output:

click==8.1.2
colorama==0.4.4
pep517==0.12.0
pip-tools==6.5.1
shellingham==1.4.0
tomli==2.0.1
typer==0.4.1

As you can see, while colorama was not explicitly defined in the compiled files, it has been installed because of the extra specified in the compiled file.

I do not know if it's possible to replace a package in PyPI with the same version.
If it is, then an attacker could inject an extra dependency and users would install it without realizing it. Personally, when my list of installed packages is huge, I don't check one by one all of them, I trust pip or pip-sync to install what's defined in the compiled file and nothing else. That's why I think this change makes sense.

Links mentioning this change

Contributor checklist
  • Provided the tests for the changes.
  • Assure PR title is short, clear, and good to be included in the user-oriented changelog
Maintainer checklist
  • Assure one of these labels is present: backwards incompatible, feature, enhancement, deprecation, bug, dependency, docs or skip-changelog as they determine changelog listing.
  • Assign the PR to an existing or new milestone for the target version (following Semantic Versioning).

@AndydeCleyre
Copy link
Contributor

AndydeCleyre commented Apr 3, 2022

IMO the extras should be preserved, as they are the only way within a single file to guarantee any order of installation, which may have significant effects on a given package's setup procedure.

If someone replaces a versioned PyPI package with a modification (and the user isn't hash checking it), then there's no need for the attacker to put the malicious code in a dependency, which in a sense makes it more visible than including it directly in the replaced package.

@FlorentJeannot
Copy link
Contributor Author

FlorentJeannot commented Apr 3, 2022

@AndydeCleyre About your second point, It makes sense that in reality, the attacker would include the change in the code and not as an extra. It doesn't change the fact that it may install unexpected extras (especially if we allow the extras to be included in direct reference of a VCS). Although in theory, a package should be pinned and the content should not change.

About your first point, I don't know well enough how that works to comment. Are you saying that pip will first try to install packages which have an extra?
If that so, we should be able to find people having this issue using pip freeze or poetry export right? I took a quick look in their github repositories and I didn't find anything about that. Do you think you could give an example of a compiled file which would break?

@AndydeCleyre
Copy link
Contributor

Are you saying that pip will first try to install packages which have an extra?

For pkgname[group1,group2], pip will install the the requirements from group1 and group2 before installing pkgname.

If those groups specify req1 and req2, and we run pip install req1 req2 pkgname, we lose that order guarantee.

If that so, we should be able to find people having this issue using pip freeze or poetry export right? I took a quick look in their github repositories and I didn't find anything about that. Do you think you could give an example of a compiled file which would break?

You can find some context for a good example in the case of gdal, which enables some features if and only if numpy is already installed at time of gdal's installation, at #992 and OSGeo/gdal#2158.

@FlorentJeannot
Copy link
Contributor Author

@AndydeCleyre Thank you for your explanation!
After reading the issue you shared, it now makes sense to me that we need to keep the extras in the compiled file.
Before I close this PR, I'll wait to hear from @atugushev as I'd like to know if he, too, agrees with this.

@atugushev
Copy link
Member

atugushev commented Apr 6, 2022

Currently, we have direct references without extras and pinned packages with extras in requirements.txt, which looks wrong and should be synced in some single way. Intuitively I still think that we should omit the extras, but I'd like to see a proper discussion in a separate issue with "pros and cons" and examples from other tools for each method. Who would like to open an issue?

@FlorentJeannot
Copy link
Contributor Author

@atugushev I could work on that Saturday unless @AndydeCleyre you'd like to do it?

@AndydeCleyre
Copy link
Contributor

Please go ahead, thanks!

@FlorentJeannot
Copy link
Contributor Author

I will try to run all the tools with gdal and numpy (with and without extras) which are mentioned in #992 and see what happens.

@ssbarnea
Copy link
Member

ssbarnea commented Oct 6, 2022

I will be closing this because we did not have a bug and an agreed behavior yet. Also the already existing --strip-extras should be enough to cover for it anyway.

@ssbarnea ssbarnea closed this Oct 6, 2022
@FlorentJeannot
Copy link
Contributor Author

@ssbarnea Sure thing. Also what do we do about #1582 which is kind of related?

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants