Add support for pinning a package to a specific index #171
Comments
I don't quite understand why this is so hard (as per the pip issue). I can't tell if it's hard because it's a large conceptual change for pip specifically, because pip is large and complicated and any changes are hard, or because there's inherent complexity. |
Poetry's design is interesting: https://python-poetry.org/docs/repositories/. It feels a bit more complex than is necessary though. |
Please consider dependency confusion attacks: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610 Use of PEP 708 is a yet-to-be-implemented approach to improving the security posture. |
I would love to implement improvements to this (like the ability to pin a dependency to a specific index)… We specifically held off and implemented indexes as-is to be spec-compliant. I’ll implement this as soon as it’s supported and there’s clarity on how installers should handle it! |
Makes sense. I think you may receive a lot of duplicate feature requests from the folks who do misuse We may need to consider offering some help as mentioned in the PEP to move this along.
In the short-term, if you don't want to bug-for-bug implement pip, we may need to point people at alternatives like https://github.com/uranusjr/simpleindex to help them merge indexes behind the scenes on localhost. I don't think many will like it 😂 |
@groodt can you point at official docs that confirm this use of extra index is indeed misuse? To me it fits well with local versions, for example. In any case, thanks for the link to simpleindex; this will be useful in case you're right and uv does not allow this use of extra index :) A bit sad to have to run two local servers instead of one, but that's not that bad. |
From what I recall (I quickly refreshed my memory on the issue but didn’t review the details) it’s a “design decision that was made a long time ago” type of complexity. Pip has built a bunch of choices on top of the “all indexes are equal and get merged” model, and it’s hard to know what we’d need to change if we were to revisit that decision. Add to that the need for backward compatibility and it’s a much bigger question than people saying “just pick the index I ask for” will accept. For uv, I foresee the following issues:
On the plus side, you’ll be offering a solution to something users have been wanting for a long time, and which is often characterised as a security issue. And practicality may well beat purity here.
It doesn’t describe it as “misuse”, but the pip docs are clear that we treat all indexes as equal: “There is no ordering in the locations that are searched” from here. We (pip) could add a guide-type article discussing this in more depth, but we don’t have one currently. |
There’s a small security warning in the pip docs here
There is also an in-progress pip PR to make this more explicit here: pypa/pip#11694. Here's a major recent dependency confusion attack that impacted PyTorch (caused by instructions to use --extra-index-url): https://news.ycombinator.com/item?id=34202662 |
@groodt I'm not sure this is to answer my previous question ("can you point at official docs that confirm this use of extra index is indeed misuse"), but I was especially referring to "it's purpose is to provide a set of fallback mirrors of the primary index". I distinguish three use-cases for extra indexes:
My understanding of your "append additional sources of dependencies" was that it referred to case 2, but now I think you were speaking of case 1. So, to rephrase, and given pip currently considers all indexes to be equal, is case 2 a misuse too? (Anyway, after reading PEP 708, I also agree it's the way forward and not index ordering, as I commented here 👍 ) |
Stepping back from the question of "why is this so hard", @groodt is correct that PEP 708 is the better solution here. Otherwise, does the user need to specify an explicit index for the whole of their internal package's dependency tree? What if someone adds a new internal module, and forgets to add it to the "must come from the internal index" list in all of their install jobs? Are we going to consider that as "user error"? The torchtriton attack took exactly this form. Or an attacker could compromise a public dependency of your internal project. Having an index pin option doesn't prevent you from needing to handle the consequences of an "all distributions with the same name and version are interchangeable" model. It just gives users a manual way of firefighting issues with that model. |
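To make the "forgot to add it to the pin list" scenario concrete, here is a minimal Python sketch. It is not pip's or uv's code: the index contents, the package names (`corp-core`, `corp-utils`), the `PINS` table, and the `pick` helper are all hypothetical, and versions are compared as plain integer tuples. It assumes an installer that honours manual pins and otherwise merges every index and takes the highest version.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn '1.2' into (1, 2). A simplification of real version parsing."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical index contents: index name -> package name -> versions.
INDEXES: Dict[str, Dict[str, List[str]]] = {
    "internal": {"corp-core": ["1.0"], "corp-utils": ["1.2"]},
    # "corp-utils" 99.0 stands in for an attacker's upload to the public index.
    "public": {"requests": ["2.31.0"], "corp-utils": ["99.0"]},
}

# Manual pins. "corp-utils" was added later and nobody updated this table.
PINS: Dict[str, str] = {"corp-core": "internal"}


def pick(package: str) -> Optional[Tuple[str, str]]:
    """Return (index name, version) chosen for a package, or None."""
    pinned = PINS.get(package)
    # Pinned packages only look at their index; everything else is merged.
    names = [pinned] if pinned else list(INDEXES)
    candidates = [
        (parse(version), name, version)
        for name in names
        for version in INDEXES[name].get(package, [])
    ]
    if not candidates:
        return None
    _, name, version = max(candidates)
    return name, version


if __name__ == "__main__":
    for package in ("corp-core", "corp-utils", "requests"):
        print(package, "->", pick(package))
```

In this sketch the pinned package stays on the internal index, while the unpinned internal package silently resolves to the public upload, which is the failure mode described above.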
It's not that "pip considers all indexes to be equal" but rather that "pip considers all distributions with the same name and version to be interchangeable regardless of which index they came from". The difference is subtle but important. Whether case 2 is a problem depends on whether you trust the "main index", just the same as with case 1. The trust issue is what's important here rather than what is "considered equal"1. The To answer your question more explicitly, your case (2) is a "misuse" in the sense that it has risks (the same as case 1). The risks may not be a dependency confusion attack, but they do include compromise of the PyPI account owning the code you're extending. Is that a more acceptable risk? Only you can decide that. The point is that mixing indexes with different trust levels2 is the problem here. Even the "mirror PyPI" case involves a risk if the mirror is compromised. Footnotes |
Thanks @pfmoore! (Side note: I just discovered that PDM supports respecting the order of indexes: https://pdm-project.org/latest/usage/config/#respect-the-order-of-the-sources.) |
Just as an example (and yes, it's a pathological case) if you have ordering, suppose you have index1 and index2 (ordered with 1 having priority over 2). Index 1 contains A 1.0 and B 1.0, with A 1.0 depending on B. Index 2 has B 2.0. If you install A, do you get B 1.0 or B 2.0? If the answer is 1.0, why did you bother specifying index 2? If you get B 2.0, and someone now adds A 2.0 to index 2, and changes B 2.0 to depend on A > 1.0, is the correct thing to upgrade A to 2.0, or downgrade B to 1.0? The point here is that there's multiple possible choices, and if you don't factor index priority into the core of your resolution algorithm, you end up with a system that users won't have a good intuition about, and which might depend on implementation details. Both of which can lead to security issues. I'm not saying that PDM has such a problem - they may well have considered all of this. Just that "which index did this come from" is an extra axis you have to consider as part of resolution, not just something you can keep separate from the resolver. Anyway, the key point for |
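The first half of that example can be written down as a small Python sketch. Everything here is illustrative: the two dictionaries mirror the index1/index2 contents from the comment, and `first_index_wins` and `merge_all` are hypothetical names for two candidate-selection rules, not functions from pip, PDM, or uv. Dependency resolution itself is left out; the sketch only shows which versions of B each rule would offer to a resolver.

```python
from typing import Dict, List, Optional, Tuple

# Package name -> version -> list of dependency names.
Index = Dict[str, Dict[str, List[str]]]

INDEX1: Index = {"A": {"1.0": ["B"]}, "B": {"1.0": []}}
INDEX2: Index = {"B": {"2.0": []}}
ORDERED: List[Index] = [INDEX1, INDEX2]  # index 1 has priority over index 2


def parse(version: str) -> Tuple[int, ...]:
    """Turn '2.0' into (2, 0). A simplification of real version parsing."""
    return tuple(int(part) for part in version.split("."))


def first_index_wins(package: str) -> Optional[str]:
    """Use only the first index that knows the package at all."""
    for index in ORDERED:
        if package in index:
            return max(index[package], key=parse)
    return None


def merge_all(package: str) -> Optional[str]:
    """Pool versions from every index and take the highest."""
    versions = [v for index in ORDERED for v in index.get(package, {})]
    return max(versions, key=parse) if versions else None


if __name__ == "__main__":
    print("first index wins:", first_index_wins("B"))
    print("merge all:", merge_all("B"))
```

Both rules agree on A, but they disagree on B, so the answer to "do you get B 1.0 or B 2.0?" depends entirely on which rule is wired into the resolver.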
To be able to find package C, D, E, etc. 😄 In my case, all packages from index 1 take precedence, even if higher versions are available in index 2. Index 1 contains just a tiny few packages, index 2 contains all the rest (PyPI.org in my case). Well, PEP 708 will bring the same ability with finer-grain control (per-project fallback). I'll see if Could also be interesting to know how PDM handles ordering. If @frostming wants to chime in 😄 |
Cool, so your approach to priority is like pinning, but with the decision on whether to pin being "if package A is in index 1, then pin to index 1, else fall through". That works, but it doesn't support the piwheels case where they supplement PyPI with wheels for the raspberry pi architecture, letting installers fall back to PyPI if the wheel isn't valid for the user's architecture (at least that's how I understand what they do...) Getting into this much detail may be more than the |
@pfmoore Thank you very much for your detailed comments here. They've been super helpful. I'd like to make a small tweak to how So today, our implementation works by giving a preference order to the indexes made available to uv/crates/uv-client/src/registry_client.rs Lines 191 to 210 in 995fba8
That is, given Since This will not match pip's behavior in every case, but I think it does help address some of the common cases and I think also helps to mitigate the dependency confusion concerns. That is, if Otherwise, I do agree that if we can get away with it, we should probably avoid encoding And popping up a level, I do think we'll want to absolutely address the multi-registry issue by giving users more control when we build out more project management features. But I think until then, we'll probably want to avoid adding too many additional abstractions into |
Nice! Worth noting is users who want the flipped behavior can do |
Ah yes! I meant to call that out, but yes indeed. |
Previously, we would prioritize `--index-url` over all `--extra-index-url` values. But now, we prioritize all `--extra-index-url` values over `--index-url`. That is, `--index-url` has gone from the "primary" index to the "fallback" index. In most setups, `--index-url` is left as its default value, which is PyPI. The ordering of `--extra-index-url` with respect to one another remains the same. That is, in `--extra-index-url foo --extra-index-url bar`, `foo` will be tried before `bar`. Finally, note that this specifically does not match `pip`'s behavior. `pip` will attempt to look at versions of a package from all indexes in which it occurs. `uv` will stop looking for versions of a package once it finds it in an index. That is, for any given package, `uv` will only utilize versions of it from a single index. Ref #171, Fixes #1377, Fixes #1451, Fixes #1600
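The ordering described in that commit message can be sketched in a few lines of Python. This is an illustration of the stated behaviour only, not uv's Rust implementation: `search_order`, `versions_for`, the example URLs, and the in-memory `contents` mapping are all assumptions made for the sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for remote indexes: URL -> package -> versions.
Contents = Dict[str, Dict[str, List[str]]]


def search_order(index_url: str, extra_index_urls: List[str]) -> List[str]:
    """Extra indexes first, in the order given; --index-url is the fallback."""
    return [*extra_index_urls, index_url]


def versions_for(
    package: str, order: List[str], contents: Contents
) -> Optional[Tuple[str, List[str]]]:
    """Return (index URL, versions) from the first index that has the package."""
    for url in order:
        versions = contents.get(url, {}).get(package)
        if versions:
            # Stop here: later indexes are never consulted for this package.
            return url, versions
    return None


if __name__ == "__main__":
    pypi = "https://pypi.org/simple"
    foo = "https://foo.example/simple"
    bar = "https://bar.example/simple"
    contents: Contents = {
        foo: {"internal-lib": ["1.0"]},
        bar: {"internal-lib": ["3.0"], "other-lib": ["0.5"]},
        pypi: {"internal-lib": ["9.9"], "requests": ["2.31.0"]},
    }
    order = search_order(pypi, [foo, bar])
    print(versions_for("internal-lib", order, contents))
    print(versions_for("requests", order, contents))
```

Under these assumptions `internal-lib` only ever sees the versions on `foo`, even though higher versions exist on `bar` and on PyPI, which is the difference from pip's merged view that the commit message calls out.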
Removing with |
Current uv docs for dependencies make it sound like this feature is already available. Led to a bit of churn trying to figure out how to use it 😅 https://docs.astral.sh/uv/concepts/dependencies/
Does uv not expand environment variables in the [tool.uv]
# uv doesn't expand ARTIFACTORY_USER and ARTIFACTORY_API_TOKEN?
extra-index-url = ["https://${ARTIFACTORY_USER}:${ARTIFACTORY_API_TOKEN}@pythonpackages.corp.example.com/artifactory/api/pypi/pypi-colorado-tools-snapshot/simple"]
It does work if I set And as a point of reference, we are currently using pdm and this works:
[[tool.pdm.source]]
name = "colorado-tools-snapshot"
url = "https://${ARTIFACTORY_USER}:${ARTIFACTORY_API_TOKEN}@pythonpackages.corp.example.com/artifactory/api/pypi/pypi-colorado-tools-snapshot/simple"
include_packages = ["venice", "venice-*"]
The packages in Great work on uv! |
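For readers who need the `${VAR}` substitution today, one option is to expand the URL in a wrapper script before handing it to the installer. The following is a minimal Python sketch of that idea, not a description of how uv or pdm handle it: the `expand` helper and the example host are hypothetical, and percent-encoding the substituted values is an assumption made so that characters such as `@` in a token do not break the URL.

```python
import os
import re
from typing import Mapping, Optional
from urllib.parse import quote

# Matches ${NAME} placeholders only; bare $NAME is left untouched.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(url: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} placeholders in an index URL with environment values."""
    values = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"environment variable {name!r} is not set")
        # Percent-encode so credentials cannot alter the URL structure.
        return quote(values[name], safe="")

    return _PLACEHOLDER.sub(substitute, url)


if __name__ == "__main__":
    template = "https://${ARTIFACTORY_USER}:${ARTIFACTORY_API_TOKEN}@index.example.com/simple"
    fake_env = {"ARTIFACTORY_USER": "alice", "ARTIFACTORY_API_TOKEN": "p@ss/word"}
    print(expand(template, fake_env))
```

The expanded string could then be passed on the command line via `--extra-index-url`, at the cost of the credentials appearing in the process arguments.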
this is probably something that could be solved via keyring and --keyring-provider=subprocess (and is also unrelated to this issue) |
Please comment on #5734 instead for environment variable expansion. |
This is starting to come together in #7481. |
Here's an interesting pattern w/ poetry I've seen used to allow local dev on osx with pytorch but also enable GPU usage on linux specifically on x86_64.
Good news! 🎉 |
What about CLI? Will uv support adding dependencies with custom registries via I wish I could do uv add --index internal somepackage |
It looks like #7747 sorta does that. It doesn't allow just giving the index name though, you have to give the name and url. |
Discussed this with Armin -- pip doesn't support it, and it seems like a big problem? If you have an internal index, but also want to get some packages from PyPI, there's no way to ensure that your internal packages come from your internal index. Packages on PyPI could even shadow them.