add prefetching of index in PEP503 repositories #5442
Conversation
Can we get similar results by applying a functools.lru_cache on _get_page instead?
@abn Nope! The problem here is that if you ask pytorch.org where to get foobar, it's going to throw an error back at you. No amount of caching is going to change that, except maybe cross-session, and we'd still have to get the error once. This asks pytorch.org "what do you have" and then we only ask for those things.
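To make the difference concrete, here is a minimal sketch comparing the two approaches against a stand-in server. Everything in it (FakeIndex, resolve_with_cache, resolve_with_prefetch) is hypothetical and written for illustration; it is not Poetry's implementation and does not use Poetry's _get_page.

```python
from functools import lru_cache


class FakeIndex:
    """Hypothetical stand-in for a PEP 503 host; counts every request it receives."""

    def __init__(self, packages):
        self.packages = set(packages)
        self.requests = 0

    def get_page(self, name):
        # One request per project page; unknown projects fail like a 404 would.
        self.requests += 1
        if name not in self.packages:
            raise LookupError(f"no page for {name}")
        return f"<links for {name}>"

    def list_projects(self):
        # One request for the root index listing every project the host serves.
        self.requests += 1
        return frozenset(self.packages)


def resolve_with_cache(index, wanted):
    """Cache successful pages; each missing package still costs a failing request."""

    @lru_cache(maxsize=None)
    def cached_page(name):
        return index.get_page(name)

    found = []
    for name in wanted:
        try:
            cached_page(name)
            found.append(name)
        except LookupError:
            # lru_cache does not store raised exceptions, so misses are never cached.
            pass
    return found


def resolve_with_prefetch(index, wanted):
    """Fetch the root index once, then only request pages known to exist."""
    available = index.list_projects()
    found = []
    for name in wanted:
        if name in available:
            index.get_page(name)
            found.append(name)
    return found
```

With four distinct dependencies of which the host serves only one, the cached variant sends four requests (three of them errors), while the prefetching variant sends two: the root index and the one page that exists.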
@tgolsson I should really read the descriptions fully before looking at code 🤣
No worries! For some context on why I landed on this solution: I did investigate alternative methods such as "explicit" sources, not querying secondary sources by default, etc. However, changing how sources are combined felt like a larger, potentially breaking or verbose change compared to relying on PEP503 behaviours. I do think it's weird that default/primary/secondary sources seemingly are mostly treated the same way, but again, anything changing that is a breaking change. This is 100% backwards compatible, a pure speed-up, and purely opt-in. The only way to get "worse" results than today is to opt in to indexing for a repository that doesn't have an index.
This is awesome!
@abn What's the release process for poetry-core so the dependency constraint can be updated here?
@tgolsson I am aiming for a new core release this week; so once that lands we can rebase this.
This can now be rebased.
0118da1 to 1437a69
@abn Rebased! I'm going to need to check how the docs render since that has changed quite a bit.
Deploy preview for website ready! ✅ Preview Built with commit a0c1846.
If anyone watching this PR has a known use-case for indexing, I'd love to know if it works for you! I've tested against torch and pypi, plus of course unit tests -- but I'm sure there are other cases out there that have... interesting configurations that I'm not dealing with correctly.
The new indexed keyword introduced for pre-fetching legacy repositories was not known to the Source class.
I just tried out your branch and it is working great. I can finally install PyTorch using poetry. Thank you for the work. I do however get an error when I try to add a repository to pyproject.toml using [...]. I made a pull request into tgolsson:ts/prefetch-legacy-repository. As far as I can see that fixed the problem.
Error
Command ran:
pyproject.toml
[tool.poetry]
name = "test"
version = "0.1.0"
description = ""
authors = ["test@test.test"]
readme = "README.md"
[tool.poetry.dependencies]
python = "~3.8"
numpy = "^1.22"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[[tool.poetry.source]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu115/"
default = false
secondary = true
indexed = true
Co-authored-by: Bjorn Neergaard <bjorn@neersighted.com>
Add indexed keyword to Source class
Thanks @neersighted for the feedback, and @Jinior for testing and the PR!
@abn / @neersighted I'd missed the follow-up to the convo but went ahead and did some restructuring, which I do think made it clearer! I don't think it's exactly what you asked for with a "SimpleIndexedRepositoryPage", though -- but I'm not sure I see how that would work.
Please let me know if there's anything I can do to get this merged...
@vikigenius It does solve it if users opt in. So for some users it's a perfect fix but otherwise it has no effect.
@tgolsson is this branch pinning the correct version of poetry core? I tried to run this against our private gitlab package registry and ran into the following
see the second argument, when I try to run with this branch without setting
I'm abandoning this PR as we've adopted PDM instead :-)
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Pull Request Check List
Resolves: #4885, partially
Description
This PR adds a new keyword, indexed, when defining a LegacyRepository (e.g. PEP503) source. The indexed keyword enables the use of a prefetched and cached index, which limits the number of unnecessary calls.
Currently, if one configures a secondary repository for Poetry, it gets queried for all dependencies, no matter whether it's default, primary, or secondary. For projects with lots of dependencies (transitive or direct) this leads to a lot of unnecessary calls to a host which can't serve the requested package. One such case is using GPU-based packages from https://download.pytorch.org/whl/, where most other packages should be served from PyPI. This confuses users with error messages and takes a lot of time.
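As a rough illustration of what a prefetched index could look like, the sketch below parses a PEP 503 root page (one anchor per project) into a set of normalized names and uses it to decide whether a project page is worth requesting. The normalization rule is the one PEP 503 specifies; the names RootIndexParser, parse_root_index and should_query are hypothetical and are not taken from this PR's code.

```python
import re
from html.parser import HTMLParser


def normalize(name):
    """Normalize a project name as PEP 503 specifies."""
    return re.sub(r"[-_.]+", "-", name).lower()


class RootIndexParser(HTMLParser):
    """Collects the text of every <a> element on a PEP 503 root page."""

    def __init__(self):
        super().__init__()
        self.projects = set()
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        # Anchor text is the project name; ignore whitespace between tags.
        text = data.strip()
        if self._in_anchor and text:
            self.projects.add(normalize(text))


def parse_root_index(html):
    """Return the set of normalized project names listed on the root page."""
    parser = RootIndexParser()
    parser.feed(html)
    parser.close()
    return frozenset(parser.projects)


def should_query(indexed_projects, requested_name):
    """Only request a project page if the prefetched index lists the project."""
    return normalize(requested_name) in indexed_projects


# Illustrative root page; a real one would be fetched from the source URL.
SAMPLE_ROOT = """<!DOCTYPE html><html><body>
<a href="/whl/torch/">torch</a>
<a href="/whl/Torch_Vision/">Torch_Vision</a>
</body></html>"""
```

Under these assumptions, should_query(parse_root_index(SAMPLE_ROOT), "numpy") is False, so a lookup for numpy would never reach the secondary host.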
While update time on a cold cache is dominated by downloading every possible GPU package, this PR changes no-op poetry update time from 30-40 seconds to <5 seconds. In total, this repository has 89 dependencies (reported by poetry show).
Also, it removes all errors from querying subpages that don't exist.
This depends on a PR to poetry-core, and thus a release there: python-poetry/poetry-core#323
Timing
Poetry==1.2.0b1
This PR