Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple conflicting extras installed when using explicit index assignments #9289

Closed
anaoum opened this issue Nov 20, 2024 · 4 comments · Fixed by #9370
Closed

Multiple conflicting extras installed when using explicit index assignments #9289

anaoum opened this issue Nov 20, 2024 · 4 comments · Fixed by #9370
Assignees
Labels
bug Something isn't working

Comments

@anaoum
Copy link

anaoum commented Nov 20, 2024

When conflicting extra groups share a dependant, both extra groups seem to be installed.

Consider the following example adapted from the docs:

[project]
name = "project"
version = "0.1.0"
requires-python = ">=3.11,<3.12"
dependencies = []

[project.optional-dependencies]
cpu = [
  "torch>=2.5.1",
  "torchvision>=0.20.1",
]
cu124 = [
  "torch>=2.5.1",
  "torchvision>=0.20.1",
]
metrics = [
  "torchmetrics>=1.2.0",
]

[tool.uv]
conflicts = [
  [
    { extra = "cpu" },
    { extra = "cu124" },
  ],
]

[tool.uv.sources]
torch = [
  { index = "pytorch-cpu", extra = "cpu", marker = "platform_system != 'Darwin'" },
  { index = "pytorch-cu124", extra = "cu124" },
]
torchvision = [
  { index = "pytorch-cpu", extra = "cpu", marker = "platform_system != 'Darwin'" },
  { index = "pytorch-cu124", extra = "cu124" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu124"
url = "https://download.pytorch.org/whl/cu124"
explicit = true

The key difference is the additional extra metrics that includes the torchmetrics package that depends on torch.

When I lock and sync both cpu and metrics extras, I get multiple versions of torch installed:

$ uv sync --extra cpu --extra metrics
Resolved 34 packages in 1ms
Uninstalled 3 packages in 199ms
Installed 3 packages in 1.10s
 ~ torch==2.5.1
 ~ torch==2.5.1+cpu
 ~ torch==2.5.1+cu124
@charliermarsh
Copy link
Member

charliermarsh commented Nov 20, 2024

Interesting, thanks. I don't think the extra indexes are relevant, since you can reproduce with:

[project]
name = "project"
version = "0.1.0"
requires-python = ">=3.11,<3.12"
dependencies = []

[project.optional-dependencies]
foo = [
  "idna==3.10",
]
bar = [
  "idna==3.9",
]
baz = [
  "anyio",
]

[tool.uv]
conflicts = [
  [
    { extra = "foo" },
    { extra = "bar" },
  ],
]

From there, uv tree gives you:

project v0.1.0
├── idna v3.9 (extra: bar)
├── anyio v4.6.2.post1 (extra: baz)
│   ├── idna v3.9
│   ├── idna v3.10
│   └── sniffio v1.3.1
└── idna v3.10 (extra: foo)

@charliermarsh charliermarsh added the bug Something isn't working label Nov 20, 2024
@anaoum
Copy link
Author

anaoum commented Nov 20, 2024

Thanks. It also happens if the two conflicting extra groups share the common dependant:

[project]
name = "project"
version = "0.1.0"
requires-python = ">=3.11,<3.12"
dependencies = []

[project.optional-dependencies]
foo = [
  "idna==3.10",
  "anyio",
]
bar = [
  "idna==3.9",
  "anyio",
]

[tool.uv]
conflicts = [
  [
    { extra = "foo" },
    { extra = "bar" },
  ],
]

@BurntSushi
Copy link
Member

I think i see a path to fixing this. I think it has two components:

  • In ResolverOutput::from_state (and possibly also marker_reachability), generate extra = "..." markers from each Resolution's conflicting groups, if they have them. Then apply those markers via intersection to each of the edges in that Resolution. So in Charlie's example above, that means the idna==3.9 dependency would have a extra == "bar" marker and the idna==3.10 dependency would have a extra == "foo" or extra == "baz" marker. Or something similar.
  • Update InstallTarget::to_resolution in uv-resolver/lock to pass in extras when doing marker evaluation.

One hiccup here is that this doesn't take groups into account, so we might not be able to reuse markers for this. We'll instead need an additional field to encode this extra/group inclusion logic.

@BurntSushi
Copy link
Member

BurntSushi commented Nov 21, 2024

Working it out by hand, I think the conditional logic for each idna dependency in each of the three forks present in this example is:

 idna==3.9: extra != "foo" and (extra == "bar" or extra == "baz")
idna==3.10: extra != "bar" and (extra == "foo" or extra == "baz")
idna==3.10: (extra != "foo" and extra != "bar") and extra == "baz"

Which combines to:

 idna==3.9: extra != "foo" and (extra == "bar" or extra == "baz")
idna==3.10: (extra != "bar" and (extra == "foo" or extra == "baz"))
            or ((extra != "foo" and extra != "bar") and extra == "baz")

And simplifies to:

 idna==3.9: extra != "foo" and (extra == "bar" or extra == "baz")
idna==3.10: (extra != "bar" and extra == "baz") or (extra != "bar" and extra == "foo")

Note that because of how extras work, the above notation is quite misleading. It's more clearly written like this:

 idna==3.9: "foo" not in extras and ("bar" in extras or "baz" in extras)
idna==3.10: ("bar" not in extras and "baz" in extras) or ("bar" not in extras and "foo" in extras)

BurntSushi added a commit that referenced this issue Nov 22, 2024
When we generate conflict markers for each resolution after the
resolver runs, it turns out that generating them just from exclusion
rules is not sufficient.

For example, if `foo` and `bar` are declared as conflicting extras, then
we end up with the following forks:

    A: extra != 'foo'
    B: extra != 'bar'
    C: extra != 'foo' and extra != 'bar'

Now let's take an example where these forks don't share the same version
for all packages. Consider a case where `idna==3.9` is in forks A and C,
but where `idna==3.10` is in fork B. If we combine the markers in forks
A and C through disjunction, we get the following:

     idna==3.9: extra != 'foo' or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar'

But these are clearly not disjoint. Both dependencies could be selected,
for example, when neither `foo` nor `bar` are active. We can remedy this
by keeping around the inclusion rules for each fork:

    A: extra != 'foo' and extra == 'bar'
    B: extra != 'bar' and extra == 'foo'
    C: extra != 'foo' and extra != 'bar'

And so for `idna`, we have:

     idna==3.9: (extra != 'foo' and extra == 'bar') or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar' and extra == 'foo'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar' and extra == 'foo'

And these *are* properly disjoint. There is no way for them both to be
active. This also correctly accounts for fork C where neither `foo` nor
`bar` are active, and yet, `idna==3.9` is still enabled but `idna==3.10`
is not. (In the [motivating example], this comes from `baz` being enabled.)
That is, this captures the idea that for `idna==3.10` to be installed,
there must actually be a specific extra that is enabled. That's what
makes it disjoint from `idna==3.9`.

We aren't quite done yet, because this does add *too many* conflict
markers to dependency edges that don't need it. In the next commit,
we'll add in our world knowledge to simplify these conflict markers.

[motivating example]: #9289
BurntSushi added a commit that referenced this issue Nov 23, 2024
When we generate conflict markers for each resolution after the
resolver runs, it turns out that generating them just from exclusion
rules is not sufficient.

For example, if `foo` and `bar` are declared as conflicting extras, then
we end up with the following forks:

    A: extra != 'foo'
    B: extra != 'bar'
    C: extra != 'foo' and extra != 'bar'

Now let's take an example where these forks don't share the same version
for all packages. Consider a case where `idna==3.9` is in forks A and C,
but where `idna==3.10` is in fork B. If we combine the markers in forks
A and C through disjunction, we get the following:

     idna==3.9: extra != 'foo' or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar'

But these are clearly not disjoint. Both dependencies could be selected,
for example, when neither `foo` nor `bar` are active. We can remedy this
by keeping around the inclusion rules for each fork:

    A: extra != 'foo' and extra == 'bar'
    B: extra != 'bar' and extra == 'foo'
    C: extra != 'foo' and extra != 'bar'

And so for `idna`, we have:

     idna==3.9: (extra != 'foo' and extra == 'bar') or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar' and extra == 'foo'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar' and extra == 'foo'

And these *are* properly disjoint. There is no way for them both to be
active. This also correctly accounts for fork C where neither `foo` nor
`bar` are active, and yet, `idna==3.9` is still enabled but `idna==3.10`
is not. (In the [motivating example], this comes from `baz` being enabled.)
That is, this captures the idea that for `idna==3.10` to be installed,
there must actually be a specific extra that is enabled. That's what
makes it disjoint from `idna==3.9`.

We aren't quite done yet, because this does add *too many* conflict
markers to dependency edges that don't need it. In the next commit,
we'll add in our world knowledge to simplify these conflict markers.

[motivating example]: #9289
BurntSushi added a commit that referenced this issue Nov 23, 2024
When we generate conflict markers for each resolution after the
resolver runs, it turns out that generating them just from exclusion
rules is not sufficient.

For example, if `foo` and `bar` are declared as conflicting extras, then
we end up with the following forks:

    A: extra != 'foo'
    B: extra != 'bar'
    C: extra != 'foo' and extra != 'bar'

Now let's take an example where these forks don't share the same version
for all packages. Consider a case where `idna==3.9` is in forks A and C,
but where `idna==3.10` is in fork B. If we combine the markers in forks
A and C through disjunction, we get the following:

     idna==3.9: extra != 'foo' or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar'

But these are clearly not disjoint. Both dependencies could be selected,
for example, when neither `foo` nor `bar` are active. We can remedy this
by keeping around the inclusion rules for each fork:

    A: extra != 'foo' and extra == 'bar'
    B: extra != 'bar' and extra == 'foo'
    C: extra != 'foo' and extra != 'bar'

And so for `idna`, we have:

     idna==3.9: (extra != 'foo' and extra == 'bar') or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar' and extra == 'foo'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar' and extra == 'foo'

And these *are* properly disjoint. There is no way for them both to be
active. This also correctly accounts for fork C where neither `foo` nor
`bar` are active, and yet, `idna==3.9` is still enabled but `idna==3.10`
is not. (In the [motivating example], this comes from `baz` being enabled.)
That is, this captures the idea that for `idna==3.10` to be installed,
there must actually be a specific extra that is enabled. That's what
makes it disjoint from `idna==3.9`.

We aren't quite done yet, because this does add *too many* conflict
markers to dependency edges that don't need it. In the next commit,
we'll add in our world knowledge to simplify these conflict markers.

[motivating example]: #9289
BurntSushi added a commit that referenced this issue Nov 23, 2024
When we generate conflict markers for each resolution after the
resolver runs, it turns out that generating them just from exclusion
rules is not sufficient.

For example, if `foo` and `bar` are declared as conflicting extras, then
we end up with the following forks:

    A: extra != 'foo'
    B: extra != 'bar'
    C: extra != 'foo' and extra != 'bar'

Now let's take an example where these forks don't share the same version
for all packages. Consider a case where `idna==3.9` is in forks A and C,
but where `idna==3.10` is in fork B. If we combine the markers in forks
A and C through disjunction, we get the following:

     idna==3.9: extra != 'foo' or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar'

But these are clearly not disjoint. Both dependencies could be selected,
for example, when neither `foo` nor `bar` are active. We can remedy this
by keeping around the inclusion rules for each fork:

    A: extra != 'foo' and extra == 'bar'
    B: extra != 'bar' and extra == 'foo'
    C: extra != 'foo' and extra != 'bar'

And so for `idna`, we have:

     idna==3.9: (extra != 'foo' and extra == 'bar') or (extra != 'foo' and extra != 'bar')
    idna==3.10: extra != 'bar' and extra == 'foo'

Which simplifies to:

     idna==3.9: extra != 'foo'
    idna==3.10: extra != 'bar' and extra == 'foo'

And these *are* properly disjoint. There is no way for them both to be
active. This also correctly accounts for fork C where neither `foo` nor
`bar` are active, and yet, `idna==3.9` is still enabled but `idna==3.10`
is not. (In the [motivating example], this comes from `baz` being enabled.)
That is, this captures the idea that for `idna==3.10` to be installed,
there must actually be a specific extra that is enabled. That's what
makes it disjoint from `idna==3.9`.

We aren't quite done yet, because this does add *too many* conflict
markers to dependency edges that don't need it. In the next commit,
we'll add in our world knowledge to simplify these conflict markers.

[motivating example]: #9289
BurntSushi added a commit that referenced this issue Nov 27, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Nov 27, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Nov 27, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Dec 2, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Dec 2, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Dec 2, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Dec 2, 2024
In the course of working on #9289, I've had to devise
some additions to our markers. While we are still staying
strictly compatible with the PEP 508 format, we will be
abusing the `extra` expression to carry a lot more
information.

Specifically, we want the following additional
operations:

* Simplify `extra != 'foo'`
* Remove all extra expressions
* Remove everything except extra expressions

My work on #9289 requires all of these (which will be
in a future in PR).
BurntSushi added a commit that referenced this issue Dec 4, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 5, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 6, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 9, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 9, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 9, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 10, 2024
Some of these are regression tests or related regression tests
for #9289. Others are regression tests for bugs I found while working
on #9289.

Most of the outputs for these tests are wrong. For example, some tests
are asserting that multiple versions of the same package are installed.
But we add them now before our fixes so that we can see how they
change.
BurntSushi added a commit that referenced this issue Dec 10, 2024
This fixes a bug found by Konsti[1] where not all extras would propagate
correctly. This could lead to improper simplification steps.

The fix here is to revisit nodes in the graph unless no changes have
been made to the set of enabled extras.

We also add a regression test whose snapshot changes without this fix. I
tried writing a test case by hand but couldn't figure it out. The
original pytorch MRE in #9289 also has a different lock file with this
fix, but we really shouldn't be adding more pytorch tests given how
beefy they are. So I found this case using airflow, which is also beefy,
but hopefully good enough.

[1]: #9370 (comment)
BurntSushi added a commit that referenced this issue Dec 10, 2024
This fixes a bug found by Konsti[1] where not all extras would propagate
correctly. This could lead to improper simplification steps.

The fix here is to revisit nodes in the graph unless no changes have
been made to the set of enabled extras.

We also add a regression test whose snapshot changes without this fix. I
tried writing a test case by hand but couldn't figure it out. The
original pytorch MRE in #9289 also has a different lock file with this
fix, but we really shouldn't be adding more pytorch tests given how
beefy they are. So I found this case using airflow, which is also beefy,
but hopefully good enough.

[1]: #9370 (comment)
BurntSushi added a commit that referenced this issue Dec 10, 2024
This PR adds a notion of "conflict markers" to the lock file as an
attempt to address #9289. The idea is to encode a new kind of boolean
expression indicating how to choose dependencies based on which extras
are activated.

As an example of what conflict markers look like, consider one of the
cases
brought up in #9289, where `anyio` had unconditional dependencies on
two different versions of `idna`. Now, those are gated by markers, like
this:

```toml
        [[package]]
        name = "anyio"
        version = "4.3.0"
        source = { registry = "https://pypi.org/simple" }
        dependencies = [
            { name = "idna", version = "3.5", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-foo'" },
            { name = "idna", version = "3.6", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-bar' or extra != 'extra-7-project-foo'" },
            { name = "sniffio" },
        ]
```

The odd extra values like `extra-7-project-foo` are an encoding of not
just the conflicting extra (`foo`) but also the package it's declared
for (`project`). We need both bits of information because different
packages may have the same extra name, even if they are completely
unrelated. The `extra-` part is a prefix to distinguish it from groups
(which, in this case, would be encoded as `group-7-project-foo` if `foo`
were a dependency group). And the `7` part indicates the length of the
package name which makes it possible to parse out the package and extra
name from this encoding. (We don't actually utilize that property, but
it seems like good sense to do it in case we do need to extra
information from these markers.)

While this preserves PEP 508 compatibility at a surface level, it does
require utilizing this encoding scheme in order
to evaluate them when they're present (which only occurs when
conflicting extras/groups are declared).

My sense is that the most complex part of this change is not just adding
conflict markers, but their simplification. I tried to address this in
the code comments and commit messages.

Reviewers should look at this commit-by-commit.

Fixes #9289, Fixes #9546, Fixes #9640, Fixes #9622, Fixes #9498, Fixes
#9701, Fixes #9734
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants