Pip install --dry-run shouldn't download full wheels when metadata file available #12603
Comments
PRs welcome. However, one potential upside of downloading the wheels is that if the dry run is followed by an actual install, the install itself will be faster as everything needed will be cached. So it's not immediately obvious that this change would be beneficial overall.
I think there are two main use cases for dry run: one where you're validating what would install and then installing, and one where you're collecting the output of dry run as part of a larger environment management process. In the second case you may not be installing locally at all; you could be preparing files for a Docker build, or many other things. I think you're thinking of the first case. Even in that case, you may decide that the versions are wrong and want to update the requirements, and you may need to do this several times. If the wheels are very large, that's a lot of wasted download and time as you iterate towards the right requirements.
I've inserted the
Dry run works fine for me, in the sense that it doesn't install any packages. If you have an issue that appears to be a bug, create a new issue with steps to reproduce, including details of your environment such as your pip version.
@notatallshaw, you're right. I mistakenly assumed that it wasn't, because the output says "Downloading" (which I quickly interrupted). At the end, the output does say "would install" the downloaded package.
I started working on a PR and I think it's going to be a smaller change once #12300 lands, because there are some specific legacy version and specifier warnings that assume the full wheel file is there. So I'm going to wait until that is no longer a concern rather than trying to fix that code. Although this is going to be bigger than a 10 line change, because
x-ref #12186
That PR seems effectively dead; hopefully I can touch this code in a much simpler manner.
We have this use case as well. We use --dry-run with --report to figure out dependencies. The wheel downloads slow down the process significantly.
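For anyone unfamiliar with that workflow, here is a minimal sketch of it, assuming a reasonably recent pip that supports --dry-run and --report. The helper names `pins_from_report` and `resolve` are made up for illustration, and the sketch only reads the `install[].metadata` name and version fields of the JSON installation report. With current pip, the `resolve` call is exactly where the full wheels get downloaded.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def pins_from_report(report: dict) -> list[str]:
    """Turn a pip installation report into sorted 'name==version' pins."""
    pins = []
    for item in report.get("install", []):
        meta = item["metadata"]  # core metadata of the resolved distribution
        pins.append(f"{meta['name']}=={meta['version']}")
    return sorted(pins, key=str.lower)


def resolve(requirements: list[str]) -> list[str]:
    """Ask pip what it would install, without installing anything.

    Hypothetical helper: runs pip in a subprocess and needs network access.
    """
    with tempfile.TemporaryDirectory() as tmp:
        report_path = Path(tmp) / "report.json"
        subprocess.run(
            [
                sys.executable, "-m", "pip", "install",
                "--dry-run",           # resolve only, do not install
                "--ignore-installed",  # resolve as if the environment were empty
                "--quiet",
                "--report", str(report_path),
                *requirements,
            ],
            check=True,
        )
        return pins_from_report(json.loads(report_path.read_text(encoding="utf-8")))
```

Calling `resolve(["kaleido==0.2.1"])` would return the pinned set for the package from this issue; the parsing step is kept separate so it can be reused on a report file produced elsewhere.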
@notatallshaw, just checking: how are you getting along with the PR? I was surprised when I discovered that
Not had time to work on it, might revisit it in a few weeks, and I was only going to submit it if I could make a relatively simple change. If anyone else wants to give it a go, by all means don't wait for me.
fwiw, for the time being, I've been happily running #12186 on my machine:
I was just about to file the same issue :) P.S. @notatallshaw I can't edit the issue title. Could you fix the typo in the word “shouldn't”?
Title changed from “--dry-run shouln't download full wheels when metadata file available” to “--dry-run shouldn't download full wheels when metadata file available”
Done. Also, there was some recent progress in #12863. Personally, I'd quite like the work done by @cosmicexplorer to land.
Just sharing a concrete example that brought me here.
@notatallshaw please keep pinging me if you perceive any blocking on my part! I have extreme confidence in my approach for all of my open PRs, which has been refined and honed over years starting from #7819, ever since I realized pip was the right place for the optimization work I was doing for Twitter Cortex ML in pantsbuild/pants#8793 (which produced

If you apply all my current diffs (https://github.com/pypa/pip/pulls/cosmicexplorer) in series (the last one is #12258, sorry I need to rebase this, I'll do that now), you will get a truly fantastic performance improvement with minimal complexity, making use of the metadata resolution framework introduced in prior PRs to read metadata from cache (and even further). In addition to performance, it will also drastically reduce the number and magnitude of HTTP requests made against PyPI: see #12256.

Each of these subsequent PRs demonstrates a robust performance improvement, especially when resolving large binary wheels for ML frameworks like @mmartial discussed. The overall performance improvement is quite drastic (especially with

On a personal level, I really appreciate you advocating for my work, and I would love it if you could continue to help nudge me to make sure this gets done. I'm @hipsterelectron@circumstances.run on Mastodon, and if you DM me there or on Twitter I will be more likely to respond. I think this code is good, I think it's right for pip, and I will do my part to keep iterating on these PRs until they're pip quality.
I think it's an upside for regular Python apps, but with pytorch, xformers, accelerate, etc. it looks like this:
You can use one of @cosmicexplorer's branches:
They are being tracked here: #12921
Description
When running
pip install --dry-run {package}
pip downloads the metadata file and then the full wheel.
Expected behavior
Dry run installs don't need to download the full wheels
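To illustrate why the metadata file should be enough for resolution, here is a rough sketch, assuming the index serves PEP 658 metadata (the wheel's METADATA file published next to the wheel, at the wheel URL with ".metadata" appended). The function names and the example.invalid URL are hypothetical, and this is not how pip does it internally; it only shows that the dependency list can be read from the small metadata file alone.

```python
from email.parser import HeaderParser


def metadata_url(wheel_url: str) -> str:
    """Build the PEP 658 metadata URL for a wheel URL.

    Any '#sha256=...' fragment is dropped, since it describes the wheel
    itself rather than the metadata file.
    """
    return wheel_url.split("#", 1)[0] + ".metadata"


def requires_dist(core_metadata: str) -> list[str]:
    """Extract the Requires-Dist entries from core metadata text.

    Core metadata uses an email-header style format, so the standard
    library header parser can read it.
    """
    msg = HeaderParser().parsestr(core_metadata)
    return msg.get_all("Requires-Dist") or []


sample = (
    "Metadata-Version: 2.1\n"
    "Name: demo\n"
    "Version: 1.0\n"
    "Requires-Dist: requests>=2\n"
    "Requires-Dist: rich; extra == 'cli'\n"
    "\n"
    "Long description goes here.\n"
)
print(metadata_url("https://example.invalid/demo-1.0-py3-none-any.whl#sha256=abc"))
print(requires_dist(sample))
```

Fetching the URL returned by `metadata_url` and passing the text to `requires_dist` gives the same dependency information a resolver would otherwise get by downloading and opening the wheel.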
pip version
24.0
Python version
3.11
OS
Linux
How to Reproduce
pip install --dry-run kaleido==0.2.1
Output