Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Designing a "goal" for build logic refactor #7259

Open
pradyunsg opened this issue Oct 25, 2019 · 12 comments
Open

Designing a "goal" for build logic refactor #7259

pradyunsg opened this issue Oct 25, 2019 · 12 comments
Labels
state: needs discussion This needs some more discussion type: refactor Refactoring code

Comments

@pradyunsg
Copy link
Member

In the third edition of "Pradyun dumps his thoughts in a GitHub issue", there are thoughts on where to end up in terms of a design/architecture view, after this refactor.

Since ISTM that @chrahunt has volunteered to do some cleanup work on RequirementPreparer (and prepare.py in general), I'm skipping working on that bit of code (can come back to it later, if needed) to avoid any potential duplicated effort, and the next stage for me becomes the installation logic.

Prior to working on that, I figured it'll help to dump my ideas for how to eventually structure the codebase, w.r.t. this refactoring's resulting state. I've been thinking about where the "build" and "install" logic in pip, should live.


As an initial proposal for discussion, here's one way we could structure the codebase:

operations
  build
    metadata.py -- current pip._internal.operations.generate_metadata
    metadata_legacy.py -- "legacy" parts of pip._internal.operations.generate_metadata
    wheel.py -- build logic from current pip._internal.wheel
  install
    legacy.py -- from source dists directly (setup.py install, setup.py develop)
    wheel.py -- unpacking wheels, for installation
    editable.py (future) -- for whatever future standard we have

models
  states -- @chrahunt's "projects" classes. Let's not bikeshed the name right now.
    -- current "pip._internal.distributions` package moves into this
    -- immutable -- makes them easier to reason about
    -- "transition" between states, via methods on these objects
    -- -> call an "operation", constructing a new state and returning it
    --   -> idempotent (ideally)

This is essentially moving the code in InstallRequirement into operations.{build,install} packages, separating code for different approaches we permit.

  • keeping "legacy" approaches (i.e. implementation-defined-standards) separated
    • makes it easier to reason about these things
    • easier to drop support when we want to
  • expecting the current pip._internal.wheel would likely be not-so-tricky to breakup into "install" vs "build" steps

I won't be introducing any "state" models for a bit, until the rest of the logic is fairly disentangled. The process will make it clearer if idempotence and and immutability for states, make sense given the current logic that we have. If you think we should have a detailed discussion about what "states" look like or do, I'll request you to file a new issue. :)

@pradyunsg pradyunsg added type: refactor Refactoring code state: needs discussion This needs some more discussion labels Oct 25, 2019
@chrahunt
Copy link
Member

chrahunt commented Oct 25, 2019

For reference (to satisfy curiosity, not derail discussion), the latest code related to state-based project models can be found here, with an overview here and a sample of associated unit tests here.

@pradyunsg
Copy link
Member Author

Since no one has holler'ed against this, I'll go ahead and close this since I don't think there's anything actionable here -- just an RFC.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Dec 2, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Dec 2, 2019
@pfmoore
Copy link
Member

pfmoore commented Jan 16, 2020

@chrahunt What's the current situation on the state-based project models you referred to above? Is that going to end up as a PR? Is it something you need help with? I'm looking at what next steps can be picked up on the refactor, and I'm not clear how your work there fits in.

@chrahunt
Copy link
Member

👀

@pfmoore
Copy link
Member

pfmoore commented Jan 23, 2020

Confused by the emoji, sorry - can you clarify?

@pradyunsg pradyunsg reopened this Jan 23, 2020
@pypa pypa unlocked this conversation Jan 23, 2020
@pradyunsg
Copy link
Member Author

(reopened since there's discussion happening now)

@chrahunt
Copy link
Member

Sorry, I was indicating that I had seen the comment and intended to respond but then forgot about this issue.

My gut says that its use will be pretty independent of the resolver abstraction we use since Projects internally are prepared as-needed to satisfy accessed properties. So as long as we can adapt the property access to whatever the resolver is expecting it should be a good solution to requirement preparation.

I don't see a way to introduce the subpackage as small usable pieces. It seems like I'd have to introduce all of it at once and then we can use it in the resolver. That doesn't seem like it would be easy to review. If we're on board that this looks good in general then I can start introducing individual parts to get reviewed even if they aren't necessarily used yet. Similar to what I did in #7576, which was followed up by @uranusjr in #7608 and which we could now probably actually hook into our CLI processing.

I think my general concern about introducing a bunch of code is mitigated some since we've factored a lot of the build and metadata preparation logic into reusable functions.

Maybe we could set up a time to go over this in real time?

@pradyunsg pradyunsg removed the auto-locked Outdated issues that have been locked by automation label Jan 25, 2020
@pfmoore
Copy link
Member

pfmoore commented Jan 25, 2020

Thanks, I see. My instinct here is that the resolver needs to be able to just say to the project/requirement "what are your dependencies" without worrying too much about whether we need to download/build or whatever. That's such a precise match for the "project" model that I'm currently having a hard time even thinking about the resolver without assuming the project model is in place :-)

That doesn't necessarily mean that we need the project model yet - it would be easy enough to write wrapper functions for now. So it's not a hard dependency, just a help for me in thinking about the design.

Maybe we could set up a time to go over this in real time?

That would be good, but I think it would probably be better if I got a clearer feel for the resolver abstractions and how the algorithms use the data first. I was planning on doing that this week, so maybe we could sort something out the week after?

@pfmoore
Copy link
Member

pfmoore commented Feb 18, 2020

I've been looking further at pip's abstractions with relation to resolvelib. The biggest issue I see at the moment is that the existing InstallRequirement object is built from a requirement line (something like resolvelib>=0.2.0) and the "get the best candidate" logic is all handled internally to the object. But for resolvelib, we want to invert that logic - get all applicable candidates for a requirement spec, and convert them as needed into InstallRequirement (or project) objects so that we can query them for metadata.

The first part is fine - plenty of refactoring to do, but the machinery is all there in the finder. But the second, not so much, as we get into a loop where we have a candidate, we want to build a Requirement from it, that then searches for the best candidate...

I feel at this point like I'm re-inventing thinking that's probably already been done by @pradyunsg and @chrahunt in designing the "project" model, and @chrahunt already offered to discuss this rel-time. Maybe we can schedule a time for it? @chrahunt and/or @pradyunsg do you want to ping me on Zulip so we can arrange a mutually convenient time?

@pradyunsg
Copy link
Member Author

we want to invert that logic - get all applicable candidates for a requirement spec, and convert them as needed into InstallRequirement

We can use PackageFinder.find_best_candidate().applicable_candidates for the "get all applicable candidates" part. It's called indirectly through populate_link() today. I don't think there's any refactoring needed here, since "use the finder method directly, instead of calling populate_link" would probably work.

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

@pfmoore
Copy link
Member

pfmoore commented Feb 18, 2020

I don't think there's any refactoring needed here, since "use the finder method directly, instead of calling populate_link" would probably work.

That bit, yes. Using the finder method directly is fine here, as you say. There's a bit of messy back-and-forth between parsed requirements, requirement strings, and things like that, but that's relatively easily tidied up. (At the moment, I'm prototyping code using the existing APIs, so I'm avoiding changes like that - but that's not a long-term constraint).

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

Hmm, maybe? I got lost in the maze of calls that triggered find_best_candidate and couldn't work out how to pull the bits apart (I'm concerned that we want an InstallRequirement for each candidate, not just the best one, when we have the new resolver). But I was tired (still am!) and maybe I'm not seeing something obvious. I'll try again tomorrow after some sleep 🙂

@pradyunsg
Copy link
Member Author

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

Heh, I find it mildly amusing that this is what we've ended up with, in the new resolver. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
state: needs discussion This needs some more discussion type: refactor Refactoring code
Projects
None yet
Development

No branches or pull requests

3 participants