Designing a "goal" for build logic refactor #7259

pradyunsg · 2019-10-25T06:24:24Z

In the third edition of "Pradyun dumps his thoughts in a GitHub issue", there are thoughts on where to end up in terms of a design/architecture view, after this refactor.

Since ISTM that @chrahunt has volunteered to do some cleanup work on RequirementPreparer (and prepare.py in general), I'm skipping working on that bit of code (can come back to it later, if needed) to avoid any potential duplicated effort, and the next stage for me becomes the installation logic.

Prior to working on that, I figured it'll help to dump my ideas for how to eventually structure the codebase, w.r.t. this refactoring's resulting state. I've been thinking about where the "build" and "install" logic in pip, should live.

As an initial proposal for discussion, here's one way we could structure the codebase:

operations
  build
    metadata.py -- current pip._internal.operations.generate_metadata
    metadata_legacy.py -- "legacy" parts of pip._internal.operations.generate_metadata
    wheel.py -- build logic from current pip._internal.wheel
  install
    legacy.py -- from source dists directly (setup.py install, setup.py develop)
    wheel.py -- unpacking wheels, for installation
    editable.py (future) -- for whatever future standard we have

models
  states -- @chrahunt's "projects" classes. Let's not bikeshed the name right now.
    -- current "pip._internal.distributions` package moves into this
    -- immutable -- makes them easier to reason about
    -- "transition" between states, via methods on these objects
    -- -> call an "operation", constructing a new state and returning it
    --   -> idempotent (ideally)

This is essentially moving the code in InstallRequirement into operations.{build,install} packages, separating code for different approaches we permit.

keeping "legacy" approaches (i.e. implementation-defined-standards) separated
- makes it easier to reason about these things
- easier to drop support when we want to
expecting the current pip._internal.wheel would likely be not-so-tricky to breakup into "install" vs "build" steps

I won't be introducing any "state" models for a bit, until the rest of the logic is fairly disentangled. The process will make it clearer if idempotence and and immutability for states, make sense given the current logic that we have. If you think we should have a detailed discussion about what "states" look like or do, I'll request you to file a new issue. :)

The text was updated successfully, but these errors were encountered:

chrahunt · 2019-10-25T09:03:49Z

For reference (to satisfy curiosity, not derail discussion), the latest code related to state-based project models can be found here, with an overview here and a sample of associated unit tests here.

pradyunsg · 2019-11-02T14:38:08Z

Since no one has holler'ed against this, I'll go ahead and close this since I don't think there's anything actionable here -- just an RFC.

pfmoore · 2020-01-16T09:38:10Z

@chrahunt What's the current situation on the state-based project models you referred to above? Is that going to end up as a PR? Is it something you need help with? I'm looking at what next steps can be picked up on the refactor, and I'm not clear how your work there fits in.

chrahunt · 2020-01-17T05:28:29Z

👀

pfmoore · 2020-01-23T15:35:45Z

Confused by the emoji, sorry - can you clarify?

pradyunsg · 2020-01-23T20:47:36Z

(reopened since there's discussion happening now)

chrahunt · 2020-01-25T02:24:04Z

Sorry, I was indicating that I had seen the comment and intended to respond but then forgot about this issue.

My gut says that its use will be pretty independent of the resolver abstraction we use since Projects internally are prepared as-needed to satisfy accessed properties. So as long as we can adapt the property access to whatever the resolver is expecting it should be a good solution to requirement preparation.

I don't see a way to introduce the subpackage as small usable pieces. It seems like I'd have to introduce all of it at once and then we can use it in the resolver. That doesn't seem like it would be easy to review. If we're on board that this looks good in general then I can start introducing individual parts to get reviewed even if they aren't necessarily used yet. Similar to what I did in #7576, which was followed up by @uranusjr in #7608 and which we could now probably actually hook into our CLI processing.

I think my general concern about introducing a bunch of code is mitigated some since we've factored a lot of the build and metadata preparation logic into reusable functions.

Maybe we could set up a time to go over this in real time?

pfmoore · 2020-01-25T10:10:16Z

Thanks, I see. My instinct here is that the resolver needs to be able to just say to the project/requirement "what are your dependencies" without worrying too much about whether we need to download/build or whatever. That's such a precise match for the "project" model that I'm currently having a hard time even thinking about the resolver without assuming the project model is in place :-)

That doesn't necessarily mean that we need the project model yet - it would be easy enough to write wrapper functions for now. So it's not a hard dependency, just a help for me in thinking about the design.

Maybe we could set up a time to go over this in real time?

That would be good, but I think it would probably be better if I got a clearer feel for the resolver abstractions and how the algorithms use the data first. I was planning on doing that this week, so maybe we could sort something out the week after?

pfmoore · 2020-02-18T17:03:56Z

I've been looking further at pip's abstractions with relation to resolvelib. The biggest issue I see at the moment is that the existing InstallRequirement object is built from a requirement line (something like resolvelib>=0.2.0) and the "get the best candidate" logic is all handled internally to the object. But for resolvelib, we want to invert that logic - get all applicable candidates for a requirement spec, and convert them as needed into InstallRequirement (or project) objects so that we can query them for metadata.

The first part is fine - plenty of refactoring to do, but the machinery is all there in the finder. But the second, not so much, as we get into a loop where we have a candidate, we want to build a Requirement from it, that then searches for the best candidate...

I feel at this point like I'm re-inventing thinking that's probably already been done by @pradyunsg and @chrahunt in designing the "project" model, and @chrahunt already offered to discuss this rel-time. Maybe we can schedule a time for it? @chrahunt and/or @pradyunsg do you want to ping me on Zulip so we can arrange a mutually convenient time?

pradyunsg · 2020-02-18T20:21:54Z

we want to invert that logic - get all applicable candidates for a requirement spec, and convert them as needed into InstallRequirement

We can use PackageFinder.find_best_candidate().applicable_candidates for the "get all applicable candidates" part. It's called indirectly through populate_link() today. I don't think there's any refactoring needed here, since "use the finder method directly, instead of calling populate_link" would probably work.

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

pfmoore · 2020-02-18T21:00:01Z

I don't think there's any refactoring needed here, since "use the finder method directly, instead of calling populate_link" would probably work.

That bit, yes. Using the finder method directly is fine here, as you say. There's a bit of messy back-and-forth between parsed requirements, requirement strings, and things like that, but that's relatively easily tidied up. (At the moment, I'm prototyping code using the existing APIs, so I'm avoiding changes like that - but that's not a long-term constraint).

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

Hmm, maybe? I got lost in the maze of calls that triggered find_best_candidate and couldn't work out how to pull the bits apart (I'm concerned that we want an InstallRequirement for each candidate, not just the best one, when we have the new resolver). But I was tired (still am!) and maybe I'm not seeing something obvious. I'll try again tomorrow after some sleep 🙂

pradyunsg · 2022-03-26T16:24:00Z

IMO it should be feasible to construct an InstallRequirement object that matches a "template" InstallRequirement, with the link set as per the applicable candidate. Basically, a copied object, with a link. :)

Heh, I find it mildly amusing that this is what we've ended up with, in the new resolver. :)

pradyunsg added type: refactor Refactoring code state: needs discussion This needs some more discussion labels Oct 25, 2019

pradyunsg mentioned this issue Oct 25, 2019

Build Logic Refactor #6607

Open

pradyunsg closed this as completed Nov 2, 2019

pradyunsg mentioned this issue Nov 2, 2019

Move generate metadata into operations.build.metadata #7283

Merged

lock bot added the auto-locked Outdated issues that have been locked by automation label Dec 2, 2019

lock bot locked as resolved and limited conversation to collaborators Dec 2, 2019

pfmoore mentioned this issue Jan 16, 2020

Docs: Specify the types of requirement pip supports (internal details) #7589

Open

pradyunsg reopened this Jan 23, 2020

pypa unlocked this conversation Jan 23, 2020

pradyunsg removed the auto-locked Outdated issues that have been locked by automation label Jan 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Designing a "goal" for build logic refactor #7259

Designing a "goal" for build logic refactor #7259

pradyunsg commented Oct 25, 2019

chrahunt commented Oct 25, 2019 •

edited

Loading

pradyunsg commented Nov 2, 2019

pfmoore commented Jan 16, 2020 •

edited by chrahunt

Loading

chrahunt commented Jan 17, 2020

pfmoore commented Jan 23, 2020

pradyunsg commented Jan 23, 2020

chrahunt commented Jan 25, 2020

pfmoore commented Jan 25, 2020

pfmoore commented Feb 18, 2020

pradyunsg commented Feb 18, 2020

pfmoore commented Feb 18, 2020

pradyunsg commented Mar 26, 2022

Designing a "goal" for build logic refactor #7259

Designing a "goal" for build logic refactor #7259

Comments

pradyunsg commented Oct 25, 2019

chrahunt commented Oct 25, 2019 • edited Loading

pradyunsg commented Nov 2, 2019

pfmoore commented Jan 16, 2020 • edited by chrahunt Loading

chrahunt commented Jan 17, 2020

pfmoore commented Jan 23, 2020

pradyunsg commented Jan 23, 2020

chrahunt commented Jan 25, 2020

pfmoore commented Jan 25, 2020

pfmoore commented Feb 18, 2020

pradyunsg commented Feb 18, 2020

pfmoore commented Feb 18, 2020

pradyunsg commented Mar 26, 2022

chrahunt commented Oct 25, 2019 •

edited

Loading

pfmoore commented Jan 16, 2020 •

edited by chrahunt

Loading