Encode mutually-incompatible pairs of markers #9444
7a177a9
to
2b8bec4
Compare
crates/uv-pep508/src/marker/tree.rs
Outdated
value: "Linux".to_string(),
},
),
// os_name == 'posix' and platform_system == 'Windows'
On the bright side, this is highly flexible so we can encode more known incompatibilities, like this.
@@ -494,7 +494,7 @@ impl<'d> Forker<'d> {
 let mut envs = vec![];
 {
 let not_marker = self.marker.negate();
-if !env_marker.is_disjoint(&not_marker) {
+if !env_marker.is_disjoint(&not_marker) && !env_marker.is_conflicting() {
I need to do this in more places.
It might be a pain... Right now, `env_marker.is_disjoint(&not_marker)` returns true if either of the expressions is false, but that doesn't include the `is_conflicting` definition here -- it only looks at the FALSE node.
You could make `is_conflicting` into `simplify_conflicting(...) -> MarkerTree` that returns a FALSE node if it matched one of the impossible cases, that might simplify
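To make the suggestion concrete, here is a minimal sketch of what a `simplify_conflicting`-style function could look like. It assumes a toy `Marker` type (either FALSE or a conjunction of `key == value` atoms) and an illustrative `IMPOSSIBLE` table; neither is uv's actual representation, and the two pairs listed are only the ones mentioned in this thread.

```rust
/// Hypothetical atom: `key == value`.
type Atom = (&'static str, &'static str);

/// Hypothetical, simplified marker: FALSE, or a conjunction of atoms.
#[derive(Clone, Debug, PartialEq)]
enum Marker {
    False,
    All(Vec<Atom>),
}

/// Illustrative table of pairs that can never hold at the same time.
const IMPOSSIBLE: &[(Atom, Atom)] = &[
    (("sys_platform", "win32"), ("platform_system", "Darwin")),
    (("os_name", "posix"), ("platform_system", "Windows")),
];

/// Return FALSE if the conjunction contains a known-impossible pair,
/// otherwise return the marker unchanged.
fn simplify_conflicting(marker: Marker) -> Marker {
    let conflicting = match &marker {
        Marker::False => false,
        Marker::All(atoms) => IMPOSSIBLE
            .iter()
            .any(|(a, b)| atoms.contains(a) && atoms.contains(b)),
    };
    if conflicting {
        Marker::False
    } else {
        marker
    }
}
```

Because the result is an ordinary FALSE marker, existing checks that only look at the FALSE node would pick it up without a separate `is_conflicting` call.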
Do you think the approach makes sense though?
we'd have to try it out, but it does sound more elegant in theory
andrew's idea is even better than that
2b8bec4
to
55539fd
Compare
crates/uv-pep508/src/marker/tree.rs
Outdated
] {
let mut a = MarkerTree::expression(a);
let b = MarkerTree::expression(b);
a.and(b);
We may want to cache those, @ibraheemdev will know the perf characteristics
This is in a `OnceCell` so it should be good.
I think the idea here is sound.
My main concern is probably what you might expect: that you have to remember to call this in a bunch of places around the code. I suppose the mitigation to that is that this isn't a correctness issue but a simplification issue.
Is there any way to do this simplification via `MarkerTree`'s smart constructors? Before @ibraheemdev's work, we had this problem of markers possibly being simplified or not, and we'd call the simplification routine in various spots. But now we have a nice guarantee that markers are always simplified, as guaranteed by the smart constructors. Maybe we can do that in our `MarkerTree::and` and `MarkerTree::or` methods?
Yeah I like that idea.
I wonder if you could even define
55539fd
to
6ed78a7
Compare
Ok, the version I pushed does this in the smart constructors...
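As a rough illustration of the smart-constructor idea, the sketch below uses a toy model rather than uv's interner: a marker is a truth table (one bit per combination) over three hypothetical `sys_platform` values and three hypothetical `platform_system` values, and every constructor masks the table with the combinations assumed to be possible. The names `Marker`, `VALID`, and `bit`, and the choice of which combinations are valid, are all assumptions made for the example.

```rust
/// Toy marker: bit `p * 3 + s` is set if the marker holds for
/// sys_platform index `p` (0 = win32, 1 = darwin, 2 = linux) and
/// platform_system index `s` (0 = Windows, 1 = Darwin, 2 = Linux).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Marker(u16);

const fn bit(platform: usize, system: usize) -> u16 {
    1u16 << ((platform * 3 + system) as u32)
}

/// Assumption for this sketch: only matching pairs can occur in practice.
const VALID: u16 = bit(0, 0) | bit(1, 1) | bit(2, 2);

impl Marker {
    const FALSE: Marker = Marker(0);

    /// The single place where normalization happens: drop impossible combinations.
    fn new(bits: u16) -> Marker {
        Marker(bits & VALID)
    }

    /// `sys_platform == <value at index>`.
    fn sys_platform(index: usize) -> Marker {
        Marker::new((0..3).map(|s| bit(index, s)).fold(0, |acc, b| acc | b))
    }

    /// `platform_system == <value at index>`.
    fn platform_system(index: usize) -> Marker {
        Marker::new((0..3).map(|p| bit(p, index)).fold(0, |acc, b| acc | b))
    }

    fn and(self, other: Marker) -> Marker {
        Marker::new(self.0 & other.0)
    }

    fn or(self, other: Marker) -> Marker {
        Marker::new(self.0 | other.0)
    }

    fn negate(self) -> Marker {
        Marker::new(!self.0)
    }
}
```

Since every value passes through `Marker::new`, callers never have to remember to simplify: in this model `sys_platform == 'win32' and platform_system == 'Darwin'` is already `Marker::FALSE`, and a redundant `platform_system != 'Darwin'` clause disappears on its own.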
@@ -695,12 +697,14 @@ impl MarkerTree {
#[allow(clippy::needless_pass_by_value)]
pub fn and(&mut self, tree: MarkerTree) {
self.0 = INTERNER.lock().and(self.0, tree.0);
Is it better to hold the lock and re-use it in `simplify`, or re-`lock()` as I'm doing here?
Probably re-use it, these are pretty much only used by a single thread.
b2415d6
to
a1357ef
Compare
I think what I have here works, but I'm worried about the performance costs and complexity.
a1357ef
to
20fcfae
Compare
crates/uv-pep508/src/marker/tree.rs
Outdated
fn simplify(&mut self) {
if !self.0.is_false() {
let mutex = &*MUTUAL_EXCLUSIONS;
if INTERNER.lock().is_disjoint(self.0, mutex.0) {
Took me a while to understand that this is `is_disjoint` and not `!is_disjoint` because `MUTUAL_EXCLUSIONS` is the negated exclusions, so it's the valid space.
Plus when reading the code you need to consider we call this every time while building the marker, so when only B from `A or B` (we previously saw in a lockfile) lies in the excluded area, we still get A because B was converted to FALSE upon construction.
Should I be changing something in response to this comment? :)
Could you add an inline comment like:
`MUTUAL_EXCLUSIONS` is the possible space; if the markers are `is_disjoint` with it, they are solely in the impossible space.
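The direction of the check is easier to see in a small brute-force model. The sketch below is not uv's implementation: `Env`, `universe`, `possible_space`, and `is_impossible` are hypothetical names, markers are plain predicates, and disjointness is decided by enumerating a tiny universe of environments. The only exclusion encoded is the `sys_platform == 'win32'` / `platform_system == 'Darwin'` pair from the PR summary.

```rust
/// Hypothetical environment with just the two fields needed here.
#[derive(Clone, Copy)]
struct Env {
    sys_platform: &'static str,
    platform_system: &'static str,
}

/// Every combination of the illustrative values.
fn universe() -> Vec<Env> {
    let mut envs = Vec::new();
    for sys_platform in ["win32", "darwin", "linux"] {
        for platform_system in ["Windows", "Darwin", "Linux"] {
            envs.push(Env { sys_platform, platform_system });
        }
    }
    envs
}

/// Two markers are disjoint if no environment satisfies both.
fn is_disjoint(a: impl Fn(&Env) -> bool, b: impl Fn(&Env) -> bool) -> bool {
    !universe().iter().any(|env| a(env) && b(env))
}

/// The *possible* space: the negation of the known exclusion, i.e.
/// `sys_platform != 'win32' or platform_system != 'Darwin'`.
fn possible_space(env: &Env) -> bool {
    env.sys_platform != "win32" || env.platform_system != "Darwin"
}

/// A marker that never overlaps the possible space lies solely in the
/// impossible space, so it can be treated as FALSE.
fn is_impossible(marker: impl Fn(&Env) -> bool) -> bool {
    is_disjoint(marker, possible_space)
}
```

In this model a marker of the form `A or B` with only `B` in the excluded area is still not impossible, which matches the observation above that `A` survives.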
(I'll leave the smart constructor review to andrew/ibraheem)
20fcfae
to
2929d09
Compare
Ok, I think this version should have a minimal performance hit. I moved the logic into the algebra itself, and we now only perform this check when merging nodes that could yield a conflict (i.e., two marker trees that contain variables that could conflict). Per @ibraheemdev's suggestion, I made those variables "highest-priority", so if they aren't present at the top of the marker tree, we know they aren't present anywhere beneath it.
255fae9
to
7580a50
Compare

// Create the output node.
guard.create_node(func, children)
}
I need the above methods because I need versions that don't recursively call `.exclusions()`. If I just use `and` and `or` here, I create an infinite loop.
Makes sense.

if let Some(exclusions) = self.state.exclusions {
return exclusions;
}
I assume it's fine to just use an `Option` here rather than a `OnceCell`?
Yeah. Rust wouldn't let you mess this up.
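A minimal sketch of why a plain `Option` is enough here, assuming the cache is only reached through `&mut self` (for example via a held lock guard). `State`, its fields, and the placeholder value are hypothetical and stand in for the real interner state.

```rust
/// Hypothetical interner state; `exclusions` caches a lazily computed value.
#[derive(Default)]
struct State {
    exclusions: Option<u64>,
    /// Counts how often the value was actually computed (for illustration).
    computations: u32,
}

impl State {
    /// Compute the exclusions on first use and reuse them afterwards.
    /// `&mut self` gives exclusive access, so no interior mutability
    /// (`OnceCell` and friends) is required.
    fn exclusions(&mut self) -> u64 {
        if let Some(exclusions) = self.exclusions {
            return exclusions;
        }
        self.computations += 1;
        let exclusions = 0b1010; // stand-in for the real, expensive computation
        self.exclusions = Some(exclusions);
        exclusions
    }
}
```

If the method took `&self` instead, assigning to `self.exclusions` would not compile, which is the sense in which the borrow checker rules out the mistake.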
return true;
}

let (x, y) = (self.shared.node(xi), self.shared.node(yi));
Not a huge fan of having this entirely separate method. But it's equivalent to `is_disjoint`, except it doesn't do the mutually-incompatible marker check. If it did, we'd create an infinite loop, since `is_disjoint` calls `and` in that case which then calls `disjointness`.
Yeah I feel like a little copying here is probably the way to go. I don't see a simple way of unifying them without making it more obscure.
Might be worth moving this comment into the code, for future wranglers wondering about it.
7580a50
to
f157444
Compare
// Determine whether the conjunction _could_ contain a conflict. As an optimization, we only
// have to perform this check at the top-level, since these variables are given higher
// priority in the tree. In other words, if they're present, they _must_ be at the top.
let conflicts = x.var.conflicts() && y.var.conflicts();
The idea is to give the "possibly-conflicting" markers highest priority, so they're always at the top of the tree.
As such, when we AND two expressions, we only have to perform the "possibly-conflicting" check if they both have a conflicting variable at the very top.
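The ordering trick can be sketched independently of the real decision-diagram code. In the hypothetical model below, a tree is summarized by the set of variables it mentions, the derived `Ord` on `Variable` plays the role of the variable priority, and the root of an ordered tree is simply the smallest variable. Which variables count as possibly-conflicting is an assumption based on the markers named in this thread.

```rust
/// Hypothetical variable ordering. Possibly-conflicting variables are
/// declared first so that the derived `Ord` gives them the highest priority.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Variable {
    SysPlatform,
    PlatformSystem,
    OsName,
    // Everything below is assumed never to take part in a known conflict.
    PythonVersion,
    Extra,
}

impl Variable {
    /// Whether this variable appears in any known-incompatible pair.
    fn conflicts(self) -> bool {
        matches!(
            self,
            Variable::SysPlatform | Variable::PlatformSystem | Variable::OsName
        )
    }
}

/// In an ordered diagram the root holds the highest-priority (smallest) variable.
fn root(variables: &[Variable]) -> Option<Variable> {
    variables.iter().copied().min()
}

/// Only run the conflict check when both roots are possibly-conflicting.
/// If a root is not, no such variable can occur anywhere beneath it.
fn needs_conflict_check(x: &[Variable], y: &[Variable]) -> bool {
    match (root(x), root(y)) {
        (Some(a), Some(b)) => a.conflicts() && b.conflicts(),
        _ => false,
    }
}
```

This keeps the common case cheap: conjunctions such as `python_version` with `extra` never pay for the extra check.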
13c46e1
to
d8f094a
Compare
Ok, I added some sorting to the DNF to minimize the amount of churn in the expressions.
b42371b
to
7b81730
Compare
I think this makes sense to me. And I like that this doesn't require any sort of knowledge on the caller's part to explicitly simplify the markers. Nice work. :-)
@@ -493,6 +494,19 @@ pub enum MarkerExpression {
},
}

/// The kind of a [`MarkerExpression`].
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, PartialOrd, Ord)]
pub(crate) enum MarkerExpressionKind {
Is this only used for backcompat sorting? If so, it might be useful to add a comment indicating as such. And that the order here matters?
7b81730
to
4dbc57d
Compare
4dbc57d
to
1d75afa
Compare
Mentioned in a Renovate Bot MR that updates astral-sh/uv `0.5.7` -> `0.5.8`; the v0.5.8 release notes list this change under Enhancements: Encode mutually-incompatible pairs of markers (#9444).
Summary
This is an alternative to #9344. If accepted, I need to audit the codebase and call sites to apply it everywhere, but the basic idea is: rather than encoding mutually-incompatible pairs of markers in the representation itself, we have an additional method on `MarkerTree` that expands the false-y definition to take into account assumptions about which markers can be true alongside others. We then check if the current marker implies that at least one of them is true.

So, for example, we know that `sys_platform == 'win32'` and `platform_system == 'Darwin'` are mutually exclusive. When given a marker expression like `python_version >= '3.7'`, we test if `python_version >= '3.7'` and `sys_platform != 'win32' or platform_system != 'Darwin'` are disjoint, i.e., if the following can't be satisfied: `python_version >= '3.7' and (sys_platform != 'win32' or platform_system != 'Darwin')`.

Since, if this can't be satisfied, it implies that the left-hand expression requires `sys_platform == 'win32'` and `platform_system == 'Darwin'` to be true at the same time.

I think the main downsides here are:
`sys_platform == 'win32' and platform_system != 'Darwin'`, even though we know the latter expression is redundant.

Closes #7760.
Closes #9275.