-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added caching to improve pipenv-resolver performance #3667
Conversation
Thanks for that patch. It is a great work, but IMO I am not willing to merge such big changes made to vendor libraries( |
I understand the reasoning, but I do think a similar solution would be beneficial for pipenv. It nearly halves the time needed for a fairly standard Django project for me. While pipenv is really great, it's really too slow for comfort. I'll see if I can somehow patch something closer instead but I couldn't find anything quickly accessible that would have a meaningful effect. |
In lieu of changes upstream, would a monkeypatch for these two small sections be an option? |
@wolph Yes, if you have an elegant solution instead of hacking into the vendors I will be glad to review. |
@frostming I believe this is about as elegant as monkey patching can get :) (to clarify, the last 5 runs were with cache enabled) If we want to get any really decent performance we'll have to implement multithreaded and/or async downloading though. Or perhaps even skipping the downloading of the release list and simply check if there's been updates. |
Ok, that last test confirms it. It's not my code, the tests are broken ;) So please review regardless of the test status :) |
@frostming since a large portion of the time is now spent in downloading, an async multithreaded downloading approach could be very beneficial. That would require a similar edit in notpip and doesn't have to be all that invasive I think. Simply replace the pipenv/pipenv/patched/notpip/_internal/index.py Lines 758 to 774 in 6c62d23
Would a multithreaded version of that code be acceptable? |
We use a custom, internal pypi hosted on GitHub pages, so this really hits us as GitHub pages has a fixed cache-length of 10 minutes. It would be help us if the offline cache length for the .whl files could be overridden to be infinite. Otherwise we end up downloading big things like numpy over and over. |
@techalchemy do you have time to review this by any chance? |
I will be able to have a look this week, sorry for the very long delay while all the moving pieces were being pulled together |
Almost a year since the last activity, what is the current status? |
I've personally mostly stopped using pipenv. While I like the idea, it's far too slow to be usable for a decently sized project. Even with all of these improvements it's only about twice as fast or so, but it needs to be about 10 times as fast to be feasible for me which would require a substantial rewrite. |
This would have to be revisited for modern python3 and modern pip -- perhaps there is something we can learn from this PR to provide some level of caching in the resolver -- but it would need to be redone. I have a different performance PR out that greatly reduces the amount of system load to perform a batch install in similar or less wall time than the current version, but it doesn't effect the resolver: #5301 The biggest improvements I can see to the resolver without trying the monkeypatch pip internals, would be to get more support for fetching packages hashes from private pypis without having to download all of the wheels or other artifacts to hash them. |
The issue
The
pipenv-resolver
performance was bugging me a bit so I had a few rounds with a profiler and added caching with a few of the heaviest calls. In my case these were doing over a 100k calls so even with little performance impact it still made a difference.I suppose it's related to several of these: https://github.com/pypa/pipenv/search?q=performance&type=Issues but specifically this one: #356
For what it's worth, this change made quite a lot of difference for me. My initial ~30 seconds
pipenv lock
time was decreased to about ~17 seconds. There might be more low-hanging fruit but this takes a stab at it :)The fix
It adds a few minor caches to a few key spots in the code which are called very often according to some
cProfile
runs.The checklist
news/
directory to describe this fix with the extension.bugfix
,.feature
,.behavior
,.doc
..vendor
. or.trivial
(this will appear in the release changelog). Use semantic line breaks and name the file after the issue number or the PR #.If this is a patch to the
vendor
directory…While these could be submitted to the upstream projects, I think they're mostly very specific to pipenv but feel free to disagree and I'll take care of it.
The case of the
Version()
modification could be modified from everywhere it's being called, but that's still within thevendor
packages so it wouldn't help. It's a rather minor change here but a clean version upstream would be pretty invasive.