Implement PEP 643 to optimise pip download --no-binary
#10195
Comments
There are a few things we need to unpack here. First of all, pip does not build a wheel for metadata; it only asks the build backend to generate package metadata, which is expected to require minimal effort. Some packages are configured in a, well, let’s say suboptimal way, such that a metadata build request results in an entire package build, but that fact is opaque to pip, and the build result is usually still short of a wheel, so technically there’s no wheel to save.
Even if we ignore all of that and say let's just make pip build a wheel for metadata (which means we're unreasonably punishing well-configured projects for less cooperative ones, but let's assume we can magically avoid that), there is still the fundamental issue that a wheel build is not guaranteed to be stable. As you may have noticed, pip does not cache wheels it built from source for installation, because a compilation step is essentially remote code execution and there's nothing stopping a package from producing a different wheel when built a second time (and a lot of projects do that in the real world, in forms like build-time feature detection and conditional compilation). So making pip cache wheels built from source introduces a cache invalidation issue (when can a wheel be reused?), one of the “two most difficult things” in computer science.
In the end, everything boils down to one fundamental fact: without outside input, it is impossible to make sure a collection of source code is trustworthy in any way without building it from scratch. What pip needs is some additional flags in that source tree to tell pip what it can cache. And there is already a standard for that: PEP 643. If a source distribution implements that PEP, pip can determine at fine granularity which package metadata in it is trustworthy, and avoid the build step when it can.
So what I would suggest is to contribute to wheel builders (e.g. setuptools) to see PEP 643 implemented (see pypa/setuptools#2685), and once those projects start generating sdists with appropriate metadata, pip can start doing what you want. Without that, anything pip does to solve your problems will cause us to get complaints from another group of people.
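As a rough illustration of the PEP 643 idea described above (not pip's actual implementation), a frontend could inspect an sdist's PKG-INFO and skip the build only when the metadata version is 2.2 or later and none of the fields it needs are listed as Dynamic. The function name `can_skip_build` and the default `needed` fields are assumptions made for this sketch.

```python
from email.parser import HeaderParser


def can_skip_build(pkg_info_text, needed=("Requires-Dist", "Requires-Python")):
    """Hypothetical helper: decide whether sdist metadata can be trusted as-is.

    Per PEP 643, in an sdist with Metadata-Version 2.2 or later, any field
    not listed under "Dynamic" must have the same value in wheels built
    from that sdist (and an absent, non-dynamic field must stay absent).
    """
    msg = HeaderParser().parsestr(pkg_info_text)

    # Older metadata versions make no promise about sdist/wheel consistency.
    version = msg.get("Metadata-Version", "1.0")
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) < (2, 2):
        return False

    # Field names are compared case-insensitively.
    dynamic = {value.strip().lower() for value in msg.get_all("Dynamic", [])}
    return not any(field.lower() in dynamic for field in needed)
```

With this shape, an sdist that marks `Requires-Dist` as Dynamic would still trigger a metadata build, while one that declares its dependencies statically would not.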
I’ll just re-purpose this issue to track pip’s implementation of PEP 643. Even if there’s currently no one producing such sdists, we can do the work upfront so that once anyone does, they can get the benefits immediately.
+1 on this. It would be great to get PEP 643 support in place, even if backends are not yet producing PEP 643 compliant sdists, as having support in pip would act as encouragement for them to do so. I may try to take a look at this, but my open source time is pretty limited at the moment, so if someone else wants to pick this up, I'm fine with that.
Thanks for the detailed explanation. If I understand correctly, when |
After testing, it looks like |
#8387 (comment) says the contrary. Is the comment incorrect?
What are the best practices package maintainers can do to speed up building package metadata and avoid the 'suboptimal' way? Do the projects using PEP-517 need to implement a special 'prepare_metadata_for_build_wheel' hook to avoid building a wheel? I am looking at pip/src/pip/_vendor/pep517/wrappers.py Lines 172 to 188 in 1c4753f
See https://www.python.org/dev/peps/pep-0517/#prepare-metadata-for-build-wheel
Overall, given a source distribution, pip currently will always generate metadata for that package. If that package does not have a metadata generation hook, pip will generate an entire wheel instead.
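The fallback described above can be sketched roughly as follows. This is a simplified illustration, not pip's code: the function name `get_metadata_dir` is hypothetical, `backend` stands for any object exposing the PEP 517 hooks, and a real frontend would run the hooks in an isolated subprocess rather than calling them directly.

```python
import os
import tempfile
import zipfile


def get_metadata_dir(backend, out_dir):
    """Return the path of a .dist-info directory holding the package metadata."""
    hook = getattr(backend, "prepare_metadata_for_build_wheel", None)
    if hook is not None:
        # Cheap path: the optional PEP 517 hook writes a .dist-info
        # directory into out_dir and returns that directory's basename.
        return os.path.join(out_dir, hook(out_dir))

    # Expensive path: build the whole wheel, then extract only its
    # .dist-info directory.
    wheel_dir = tempfile.mkdtemp()
    wheel_name = backend.build_wheel(wheel_dir)
    with zipfile.ZipFile(os.path.join(wheel_dir, wheel_name)) as wheel:
        members = [
            name for name in wheel.namelist()
            if name.split("/")[0].endswith(".dist-info")
        ]
        wheel.extractall(out_dir, members)
    return os.path.join(out_dir, members[0].split("/")[0])
```

The second branch is why a backend without the optional hook (or one whose hook is itself slow) makes metadata resolution as costly as a full build.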
Thanks, @pradyunsg , this makes sense. It seems that before PEP-517, pip was generating package metadata by calling Would it be appropriate for packages that use setuptools to implement If calling egg_info is not a good way to go, do you by chance have an example of a good implementation of the |
The problem is not if they have Edit: And to answer your original question, no, you can't just call |
The numpy regression happened right when they switched to PEP-517 in numpy/numpy#14053. Running |
Huh? I feel like it's useful to point out that setuptools does implement |
And that numpy correctly handles |
That sounds more like there's an issue with setuptools' |
I am confused as to where |
Can we move this discussion to a new issue please? This seems to be purely about numpy / PEP 517, and has nothing to do with PEP 643's implementation work.
Yes, it's implemented by the build backend, not the package maintainer. And I agree with @pradyunsg, this should be taken to a different issue (as I say, maybe on the numpy or setuptools tracker).
Thanks for the feedback, opened pypa/setuptools#2814.
Currently, when pip downloads sdists, it is sometimes necessary for it to build a wheel in order to get trustworthy metadata (see #7995 and #1884). Ideally it wouldn't need to do so, but that is a difficult problem to solve.
I would instead suggest that when pip does need to build a wheel during download, it should default to saving the resulting wheel alongside the sdist. This would let users avoid having to build the wheel twice in some cases, which could save a considerable amount of time.
While there are other cases where that wheel wouldn't be useful to the user (they want to make modifications first, control build settings, etc.), it doesn't hurt them to have it: it was being built anyway, so the time and disk space requirements are unchanged. People using pip download in scripts and the like may have to make changes to account for the additional file, so a transition plan would be needed.
I think the closest existing alternative to this is pip download --build ./tmp --no-clean. The main downside to that is knowing and remembering that it is necessary. Users are often surprised that pip download builds packages. Even when aware of the issue it can be easy to forget, especially when an sdist is downloaded because no appropriate wheels are available rather than because --no-binary was used.
This might dovetail with #9769, which aims to always make wheels as an intermediate step in installing sdists.