-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download wheels in batch at the end of .prepare_linked_requirements_more() #8896
Download wheels in batch at the end of .prepare_linked_requirements_more() #8896
Conversation
de18d0d
to
4d02725
Compare
cc @pradyunsg @McSinyx wrote a tweet about much I loved your work on the lazy wheel fetching: https://twitter.com/hipsterelectron/status/1308306011409661954?s=20 |
I'm sorry to disappoint you, the lazy wheel isn't making download any faster at the moment. I suppose the other use-case (pip resolve) is possible with extra steps and this is what you're trying to do here. Regarding the news file, since pip's internal API is internal, I think this should be flagged as trivial. Concerning the approach, I can recall discussions with @pradyunsg and @chrahunt resulting in keeping the late download in the resolver (at least until the legacy resolver is removed, to keep the consistency of output by the two resolvers), in particular, within RequirementPreparer (see GH-8685). IIRC @pradyunsg told me that it's preferable to pass a dry_run option to the resolver instead—the reason Side comment: the use PartialRequirementDownloadCompleter is solely init then call it's only method, which I would love to see to be changed into a function if it turns out that my memory regarding the general approach said above isn't correct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me likey!
That was the wrong shortcut, on the wrong window. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me likey the general clarity that factoring out the logic for the batch download logic brings.
As @McSinyx mentioned, until the older resolver dies, we'd want to keep the "download everything else" inside the resolver -- maintaining the abstraction of "everything is ready". While the "prepare more" logic is indeed a no-op with the old resolver, it'll be nice to keep it inside to be able to reason about these things more easily until that old one goes away.
Let's keep the download logic inside the resolver.resolve
call, and add a flag to tell the resolver whether to fully-prepare partially prepared items. :)
I made a mess of the review here -- but that's what I get for reviewing at 1am. :) |
You couldn't possibly disappoint me!!! Sounds like it's time for some profiling 🔍 . Sorry for disappearing off the face of the earth for so long! Your adaptation was much much cleaner than what I had and that is a much better place to start off than the other way around (at least, in my opinion, in this case).
So my thought process was actually just that parallel downloading would be the key to actually improving the download performance!! And if not, I was thinking a further optimization would be to keep alive the connections used to download the metadata, in case
This is super super helpful especially with the link to the PR I missed!! Hardly gave @pradyunsg much else to review ^_^
Yes!
Will do!
This makes perfect sense! It's also less code ^_^
You reviewed at approximately the same time I posted it, then! We'll call it even. |
0d7e9e8
to
15e61d6
Compare
Instead of doing this:
I instead just added an extra method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... keeping things on RequirementPreparer itself... should get us nearly the same benefits (separation of "collect" vs "download" in prepare_more). I think we can get way without introducing changes on the resolver side?
We'd skip the entire "prepare_more" stage from the resolver during a dry run anyway, so... :)
Yes! This is a much nicer way to think about it. Will do.
Great point!! (along with the rest) |
34d27a5
to
50c5e51
Compare
Done, I think! I think this gets us the separation you noted above, paving the way for other methods of downloading requirements, as well as not downloading them at all! |
ping! |
50c5e51
to
e9ed0b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems clean enough but would need rebasing :)
create PartialRequirementDownloadCompleter, and use in wheel, install, and download add NEWS entry rename NEWS entry rename NEWS entry respond to review comments move the partial requirement download completion to the bottom of the prepare_more method
e9ed0b5
to
22406d4
Compare
Rebased! |
Perhaps @xavfernandez is the right person to ping here? |
Ping once more! ^_^ |
Let me know if there's someone specific I should contact! |
Nah -- I'll merge this once #8936 is done with (later this week). :) |
ping @pradyunsg! ❤️ |
ping! |
Pong! pip 20.3 hasn't been released yet, because, well, 2020. I will come back to this, once that's done. |
Not a problem!!! Thanks for the update. I will follow along with the 20.3 updates then! |
Come on, report your status, Azure. |
This PR corresponds to the second work section described in #7819 (comment) from the much-too-large #8448:
With some modifications to that prompt, it now consists of:
RequirementPreparer._complete_partial_requirements()
to fully download and prepare lazily-downloaded wheels..prepare_linked_requirements_more()
, separating the batch download from the rest of that method.