-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn users about dependency conflicts when updating other packages #7744
Comments
My initial thoughts (assuming we're talking about the new resolver); Approach 1 is just wrong. The new resolver should (IMO) never install an incompatible set of packages. Printing a warning is just saying "I messed up". Printing an error and not reverting is unfriendly (it's what the current resolver does, so it's clearly wrong 😉). Approach 2 I agree is reasonable at first, but awkward in practice. I'd think of this as saying that we shouldn't go "up" the dependency tree (from dependency to parent) when upgrading. By saying "upgrade foo", the user is giving implicit permission to upgrade foo's dependencies, but it's not obvious that they have agreed to upgrade bar, even if that's needed to complete the upgrade of foo. Approach 3 I agree is not the right thing to do. We're not doing what the user asked. At a minimum we should say "cannot upgrade without forcing upgrade of unrelated package Approach 4 makes sense, but it's more a case of giving up cleanly than being the right approach. IMO, the best approach would be a combination of 2 and 3. We find that the resolution would require upgrading a package that isn't a dependency of anything the user requested on the command line, and say "upgrading would require also upgrading the following unrelated packages: |
Note that we don't have any wait-for-user-input behaviours, outside of anything we're not overriding in network/VCS code. Introducing something like this would be slightly disruptive for certain users. |
We’d also need to handle non-interactive situations if we implement a prompt. In that case IMO it’s better to error loudly (because that’s the main case for CI). So… a combination of 2, 3, and 4? I wonder if we can use |
Honestly, I suggest we treat 4 as the "fallback" option for us here -- unless we have reasonable precedence and consensus on some other option, let's go with that. Can someone check what other package managers with "proper" resolvers do? |
Most package managers don’t have this problem because they keep a “manifest” describing the user’s original intention (e.g. |
For Python, we have https://www.python.org/dev/peps/pep-0376/#requested that could serve a similar purpose. |
TIL! Yes that sounds like a good idea. pip does not already implement it, does it? (I didn’t read the implementation, but couldn’t find this file in any of my existing environments.) |
There should be a tracking issue for this. pip does not implement this. |
PEP 376 is something of an odd thing - it was the basis of the aborted "distutils2" rewrite, but it had a lot of good ideas, particularly in PEP 376. But even though the PEP as a whole was accepted, it was never all implemented. None of the IMO, it would be a good idea to go through PEP 376 and revise it to document what is implemented, and decide what of the rest should be (and retain that) or should be dropped (and do so). Maybe we could do this incrementally, I don't know? PEP 376 needs a certain critical mass of stuff that is actually going to be implemented to maintain credibility 🙁 |
And I’m not sure |
@pradyunsg @uranusjr One thing I just thought of. With the original example, of The root set of requirements (from the command line) is I think we need to consider the option that even with the new resolver, pip will only solve for requested requirements and their dependencies. This would mean that pip could produce a broken environment (in effect option 1 above, with the message coming from a follow-up automatic I can see ways that we could let the resolver "see" what's on the index for requirements specified by the user and their dependencies, but for other installed packages just see the installed version. But that could be tricky to implement. Actually, it could be tricky to get the resolver to even ask for available candidates for I guess what I'm saying is that if we assume a resolver that follows the dependency graph from the root set provided by the user, but only going in the direction "parent -> dependency" (because that's the only direction the metadata supports), it's not actually clear to me if we even can implement behaviours 2-4... The point @uranusjr made in our meeting is extremely relevant here - pip does not have any sort of information about the global question "what does the user want to have in this environment" (the sort of information that other tools maintain via files like With that in mind, behaviour (1) becomes much more reasonable as simply being "upgrade what the user asked for". |
I do have a feeling this is less a problem in practice since people having this kind of problems are likely already using a higher-level tool (e.g. Pipenv, Poetry, pip-tools). The resolver does not really need to know |
To be abundantly clear, I think this is going to be one of the issues that can cause "oh the new resolver actually breaks my setup, leaving a sour taste in my mouth" situation for our users, during the new resolver's rollout. I think the resolver should consider the existing packages in the environment (otherwise, there's very little benefit to this whole exercise for many users), and we should have our default "strategy" be:
I imagine this would be the equivalent of the "only-if-needed" strategy that we have. For "eager", we can skip the "if directly depended upon" to skip the "prefer existing installed version" behavior. |
I believe this would “simply work”[*] in practice. Assuming the user already has a “working” environment, a package being not installed means it’s not depended by any existing distributions. The resolver can choose any version it needs to depending on the newly requested packages. [*]: A working environment does not necessarily mean the environment does not contain any conflicts or broken dependencies, but there’s nothing wrong as far as the user concerns. pip does not need to actively fix the environment unless the user specifies so.
With It would be way too restrictive to always error out, of course. My feeling is this should be where This leaves only one question, whether we should error out, or simply warn if an incompatibility would occur (for a package in the environment, but not a dependency of the given packages in the |
My feeling is that this is something where any decision we make will have to be reviewed in the light of actual user feedback. So (a) we shouldn't get too concerned with working out the "perfect" answer (I think @uranusjr's summary above seems about right), and (b) it would be good if the UX work could get user feedback on this question. (Can someone ping Bernard on this issue? I can't remember or find his github username 🙁) |
@ei8fdb ^ |
Given the sheer number of users, it's possible that we don't surface folks who do care about the failure case here, as part of our UX study. I personally feel that we should error out vs warn in most scenarios w/ the new resolver, since (1) it's stricter, and (2) it'll help eagerly surface any users who'd care about the stricter behavior here, and we can work with them to understand why the new/stricter behavior doesn't work for them. Plus, we've said:
If you do separate "pip install" runs and pip won't refuse to install a broken combination... That's gonna be tricky to explain. :P |
Well you see we said “ask pip to install two packages” but here you’re only asking it to install one so it does not know about the other you installed previously… Yeah I get what you mean. I don’t mind always error out, but we’ll need to offer some escape mechanism so a user can say “hey I actually don’t care about these breakages” without needing to uninstall-try again-uninstall stuff until pip feels satisfied. We can probably put this on hold, and get a better grasp to the problem after we actually ship the resolver. |
ACTUALLY! One of the things I just remembered is that if/when pip does install conflicting packages, it'll print the warnings about them: pip/src/pip/_internal/commands/install.py Lines 560 to 561 in a0ec4be
This "FYI: I detected conflicts in the final set of packages you'll have when I'm done" logic in pip, is also run with the new resolver, which means we're already printing some sort of relevant information toward helping the user understand the situation. I'm 100% OK to add the textual context to that error message. I'm imagining it'd look something like:
(IDK what the best phrasing might be) |
this looks great @pradyunsg - I am a big +1 on only showing the message when pip does install conflicting packages. |
Discussed with @pradyunsg For 20.3, we will change the error to: Old resolver:
New resolver:
For after 20.3 we will write some docs and link to it from the error message (in a future PR) |
As a quick note, I think the message presented to the users of the legacy resolver should be:
I don't think we need to nudge those users to the new resolver -- they're on Python 2 or opting-into the legacy resolver. |
My two cents Option 2 mentioned in the original post looks like the behavior of You simply cannot modify the existing packages unless user explicitly tell you to, otherwise you will very likely break somebody's code that depends on an API that only existed in certain package version of their choice. So option 2 cannot be made default, but can be Personally, I most favorite option 4, I'd rather break my builds than not being able to control what my environment has. But considering not everyone think as I do, this option should also be made |
FYI Edit: It seems like There is still a slight difference, Conda would update a package if that update would not break dependencies elsewhere, and you don’t want even that to happen. This is a reasonable scenario, but I believe Conda’s stance is you should spell out exact versions in environment.yml in that case instead. pip’s is similar (but with requirements files), and most people likely also expect that, so disallowing any package updates may be too drastic a change for pip to implement. |
You are mostly right about conda's behavior, but in practice it often behave in a way confusing to users. The tricky part about conda's update is that there can be difference between the metadata from the moment you create environment to the point you update the environment using the same spec. When one try to update an environment that has a good amount of packages created months ago, they often cannot do that because base on the new metadata, the environment has already broken although the user changed nothing. This issue was discussed here.
I agree, and I think my previous comment was confusing. The concern I had was when someone tries to update/install a package in an existing environment, by for example simply type __
Thanks for the link, I followed to this one through here |
Yeah, this is the problem pip needs to deal with as well, plus the additional hurdle most environments pip deals with weren’t even populated by an Personally I think this is unsolvable for tools like Conda and pip, because they allow users too much freedom manually tinkering with the environment without expressing intent. So there is not a theoratical correct answer here, and all we can do is to pick the behaviour that makes the least people unhappy the least of the time. |
What is the practical solution to #9482 though? If I'm running |
I see it the same way. The default behavior should be that the whole installation process aborts with an error message, telling the user that their python environment could get broken if they force the installation and maybe also give an overview of which part of the depenency graph will get broken, if so. However, I don't know how far this is possible to implement. I just don't think that it is smart to believe that the user knows what they are doing. That is kinda like offering someone a car key without telling them that - if they accept - the money for the car is withdrawn from their bank account which may break their financial situation. Never assume that users know the exact consequences of their actions. |
This is a mental note on a topic I realise needing a discussion while working on another issue.
Say we have package
foo
andbar
with the following dependencies:Given an environment with the followings installed:
and the user runs
pip install --upgrade foo
. What should we do? If we upgradefoo
to 2.0.0,six
needs to be upgraded as well (as an intrinsic requirement), but now it would conflict withbar
. I can think of three possibile approaches:foo
andsix
, and print an error/warning telling the userbar
now has unsatisfied requirements.bar
automatically to 2.0.0.foo
1.0.0 is the latest version without conflicts.Approach 1 is the simplest, but might be too difficult for the user to notice (especially on CI). This is probably not a good idea if we can avoid it.
Approach 2 looks like a good idea at first glance, but IMO may be confusing to the user. The dependency graph would be much less complex in more than one way in practice, and it would be difficult for the user to notice, or understand why a seemingly unrelated package got upgraded.
Approach 3 is “correct” in thoery, but is as unuseful to the user as pip’s famous “No matching distributions found for” error. There is clearly a newer version to upgrade to from the user’s perspective. Why is pip not finding it? Open GitHub and file a bug report.
Approach 4 is the most reasonable to me. In the above example, pip would emit something like
six>=1.12 (required by foo) would cause incompatibility in bar (requires six<1.12)
. The downside is pip would need to do more work to interpret the resolution result (this does not fit into the resolution process IMO).The text was updated successfully, but these errors were encountered: