Poetry downloading same wheels multiple times within a single invocation #2415
Comments
The first two downloads happen in poetry/src/poetry/puzzle/provider.py Lines 413 to 418 in 41a8a47
which indeed use a temporary location that is immediately thrown away. Presumably the right thing to share with would be the artifact cache as used by the |
What's the reasoning for dumping the downloads to a I'd be happy to try and contribute. Naively I'd check a cache wherever download_file is called ( |
Suspect that code fragment uses a temporary directory for no particularly good reason. poetry has a cache of downloaded files that it uses during installation, as managed by the curiously named Couple of problems though:
I'd start with an MR that updates the chef so that Then if that's accepted, follow up with some sort of rearrangement so that this cache can be shared by the chef and the solving code |
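The idea discussed above, checking a persistent cache before downloading so that the solver and the installer can share one copy, can be sketched roughly as follows. This is a minimal illustration and not Poetry's actual code: `cached_download`, `fake_download`, the cache layout, and the example URL are all hypothetical names chosen for the sketch.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_download(
    url: str, cache_dir: Path, download: Callable[[str, Path], None]
) -> Path:
    """Return a local path for url, calling download only on a cache miss."""
    # Assumed layout: one file per URL, named by the SHA-256 digest of the URL.
    target = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        partial = target.with_name(target.name + ".part")
        download(url, partial)
        # Publish the file only once the download has completed.
        partial.replace(target)
    return target


calls = []


def fake_download(url: str, dest: Path) -> None:
    """Stand-in for a real HTTP download; records each call."""
    calls.append(url)
    dest.write_bytes(b"wheel contents")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "artifacts"
    url = "https://example.invalid/pkg-1.0-py3-none-any.whl"
    # First consumer, e.g. the solver reading metadata and hashing the file.
    solver_path = cached_download(url, cache, fake_download)
    # Second consumer, e.g. the installer: served from the cache.
    installer_path = cached_download(url, cache, fake_download)
    same_path = solver_path == installer_path
    download_count = len(calls)
```

Because both callers pass the same cache directory, the second request finds the file on disk and the downloader runs only once. A throwaway temporary directory per call is exactly what defeats this.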
Thanks @dimbleby I'll take a look and see what I can do.
This is a serious problem with packages like PyTorch which are extremely large. Unless there's a workaround for this I will definitely never use Poetry.
Any update on a fix for this? I really like poetry but locking or adding a new dependency now takes > 5 minutes because I have to download wheels for torch, torchaudio, and torchvision. Is there a short-term workaround while a more permanent fix is made? Thank you.
I suspect many are reading this issue without actually having experienced the issue -- Poetry downloads Torch once for metadata + hashing, and a second time for actual installation. After the cache is created, Poetry will not re-download Torch. We are downloading distfiles more often than needed because two parts of the code do not share a common cache, but we are not downloading every time.
Thanks for the reply. I am an active Poetry user running 1.2.1, and I do experience the issue: the PyTorch wheel is downloaded every time I run add or lock, and the download takes around 80 seconds.
Every time I run It is added to
I think this is related: in a project I have these conditional URL dependencies defined
Every |
On my system also, this seemed to make Poetry re-download Since PyTorch URLs have to be hard-coded to install properly and PyTorch's wheel takes more than 1GB, this prevents me from migrating the team to Poetry.
Ah, looking at this, I realize that all the metadata caching happens in the repository layer. So if you're using direct URL dependencies, Poetry has no caching whatsoever. I personally got turned around here on whether this was a bug or as-designed behavior (currently, the latter is true). Ideally the artifacts cache could be made agnostic to repositories so that it is keyed on URLs only and we can share it, as @dimbleby has mentioned. On top of that, I wonder if some mechanism to cache metadata (maybe a |
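To illustrate what "keyed on URLs only" could mean, here is a small sketch of a repository-agnostic cache key. It is an assumption-laden example, not Poetry's real layout: `url_cache_key`, the two-level directory fan-out, and the example URLs are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_cache_key(url: str) -> PurePosixPath:
    """Derive a relative cache path from the URL alone.

    No repository name is involved, so the same URL maps to the same
    entry whether it comes from an index or a direct URL dependency.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    # Keep the original file name so the cached artifact stays recognizable.
    filename = PurePosixPath(urlsplit(url).path).name
    # Fan out over two directory levels to avoid one huge flat directory.
    return PurePosixPath(digest[:2], digest[2:4], digest[4:], filename)


url = "https://example.invalid/wheels/pkg-1.0-py3-none-any.whl"
key_a = url_cache_key(url)
key_b = url_cache_key(url)
key_other = url_cache_key("https://example.invalid/wheels/pkg-1.1-py3-none-any.whl")
```

Here `key_a` and `key_b` are identical while `key_other` differs, so any component that knows only the URL can locate a previously downloaded file.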
I'm also experiencing this issue and it's unfortunate as I now have to choose between installing specific It would be great if there were caching for direct URL dependencies as well, as neither option is ideal right now 😞
I'm having the same problem in my project: if any package is declared with { url = ... }, then poetry add, poetry lock, and poetry update download it again every time. As a temporary workaround, I'm using requirements.txt for the URL packages and pyproject.toml for the rest while waiting for a fix.
I think we've pretty firmly established what is going on and what is needed to improve -- I'd ask that people please refrain from "me too" as it's just adding noise right now.
This comment was marked as off-topic.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Issue
When adding a new dependency, it is downloaded multiple times; I observed three downloads, two of which are unnecessary.
Starting with a pyproject.toml as in the Gist given above, I run
Then I see the following output (XXX added as markers for explanation below):
At the positions where the marker XXX is inserted, the same 1.3GB download is done again and again.
Similarly, when adding another package later, XXX again marks the cursor position when the big download is done:
I'd expect the file to be downloaded at most once and reused.
Slightly related but different issues: #999, #2094