Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove propagate_markers #7076

Merged
merged 1 commit into from
Sep 5, 2024
Merged

Remove propagate_markers #7076

merged 1 commit into from
Sep 5, 2024

Conversation

konstin
Copy link
Member

@konstin konstin commented Sep 5, 2024

Follow-up to #6959 and #6961: Use the reachability computation instead of propagate_markers everywhere.

With marker_reachability, we have a function that computes for each node the markers under which it is (requirements.txt, no markers provided on installation) or can be (uv.lock, depending on the markers provided on installation) included in the installation. Put differently: If the marker computed by marker_reachability is not fulfilled for the current platform, the package is never required on the current platform.

We compute the markers for each package in the graph, this includes the virtual extra packages and the base packages. Since we know that each virtual extra package depends on its base package (foo[bar] implied foo), we only retain the base package marker in the requirements.txt graph.

In #6959/#6961 we were only using it for pruning packages in uv.lock, now we're also using it for the markers in requirements.txt.

I think this closes #4645, CC @bluss.

@konstin konstin added the internal A refactor or improvement that is not user-facing label Sep 5, 2024
/// marker), we re-queue the node and update all its children. This implicitly handles cycles,
/// whenever we re-reach a node through a cycle the marker we have is a more
/// specific marker/longer path, so we don't update the node and don't re-queue it.
pub(crate) fn marker_reachability<T>(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function only moved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible I might suggest splitting these moves out into separate commits in the future. Otherwise it can be a little tricky to see which parts of the diff are "uninteresting moves" versus what is an interesting change.

(It can be hard to always do this though, especially if you do the move mid-refactor.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call: #7091

@konstin konstin marked this pull request as draft September 5, 2024 11:05
@konstin konstin marked this pull request as ready for review September 5, 2024 11:15
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me.

Since we know that each virtual extra package depends on its base package (foo[bar] implied foo), we only retain the base package marker in the requirements.txt graph.

Does this lead to any bug fixes? Or is it "just" a simplification?

/// marker), we re-queue the node and update all its children. This implicitly handles cycles,
/// whenever we re-reach a node through a cycle the marker we have is a more
/// specific marker/longer path, so we don't update the node and don't re-queue it.
pub(crate) fn marker_reachability<T>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible I might suggest splitting these moves out into separate commits in the future. Otherwise it can be a little tricky to see which parts of the diff are "uninteresting moves" versus what is an interesting change.

(It can be hard to always do this though, especially if you do the move mid-refactor.)

@@ -201,7 +201,7 @@ impl std::fmt::Display for RequirementsTxtExport<'_> {
}
}

if let Some(contents) = markers.contents() {
if let Some(contents) = self.reachability[&node_index].contents() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

// We use the markers of the base package: We know that each virtual extra package has an
// edge to the base package, so we know that base package markers are more general than the
// extra package markers (the extra package markers are a subset of the base package
// markers).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an interesting insight that might be worth copying over to the docs on PubGrubPackage. Specifically, is it a guaranteed invariant that for any package foo, if there is both a PubGrubPackage::Package and a PubGrubPackage::Marker for it, then the former is always a superset of the latter?

(I'd love to start collecting invariants in the docs for PubGrubPackage because I find that type somewhat difficult to reason about. But I don't have the full shape in my head yet to propose something better.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a hard time documenting this because "the base package is always selected when one of its virtual packages is selected" seems so fundamental to me. Do we have a good place to put overall resolver things? Because I think it would be valuable to write some of them up, we bet we come across some improvements to uv by this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that was always even true, was it? The way virtual packages get added has shifted around a lot I think. Particularly in get_dependencies in the resolver. I don't think anything fundamental has changed in ~months, but I know @charliermarsh did a lot of work there.

I would start by adding these sorts of things to PubGrubPackage I think. It is a very key data type in the resolver and I think there are a lot of interesting invariants around it.

Copy link
Member

@charliermarsh charliermarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with this as long as tests pass. Thanks!

konstin added a commit that referenced this pull request Sep 5, 2024
@konstin
Copy link
Member Author

konstin commented Sep 5, 2024

Since we know that each virtual extra package depends on its base package (foo[bar] implied foo), we only retain the base package marker in the requirements.txt graph.

Does this lead to any bug fixes? Or is it "just" a simplification?

This is intrinsic to how we handle markers and to marker propagation. This property held true before and after, though the algorithm we arrived at it changed and i made it more explicit.

konstin added a commit that referenced this pull request Sep 5, 2024
@konstin konstin enabled auto-merge (squash) September 5, 2024 16:46
@konstin konstin merged commit 750e8b1 into main Sep 5, 2024
57 checks passed
@konstin konstin deleted the konsti/propagate_markers branch September 5, 2024 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal A refactor or improvement that is not user-facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants