Warn users about dependency conflicts when updating other packages #7744

Closed
uranusjr opened this issue Feb 13, 2020 · 61 comments · Fixed by #9124
Labels:
  • C: dependency resolution (About choosing which dependencies to install)
  • state: needs discussion (This needs some more discussion)
  • type: maintenance (Related to Development and Maintenance Processes)
  • UX (User experience related)

Comments

@uranusjr
Member

uranusjr commented Feb 13, 2020

This is a mental note on a topic I realised needs discussion while working on another issue.

Say we have package foo and bar with the following dependencies:

foo 1.0.0
    six<1.12

foo 2.0.0
    six>=1.12

bar 1.0.0
    six<1.12

bar 2.0.0
    six>=1.12

Given an environment with the following installed:

foo 1.0.0
bar 1.0.0
six 1.11.0

and the user runs pip install --upgrade foo. What should we do? If we upgrade foo to 2.0.0, six needs to be upgraded as well (as an intrinsic requirement), but now it would conflict with bar. I can think of four possible approaches:

  1. Upgrade foo and six, and print an error/warning telling the user bar now has unsatisfied requirements.
  2. Upgrade bar automatically to 2.0.0.
  3. Tell the user everything is up-to-date, since the installed foo 1.0.0 is the latest version without conflicts.
  4. Error out without modifying the environment, saying the upgrade would introduce incompatibilities.

Approach 1 is the simplest, but might be too difficult for the user to notice (especially on CI). This is probably not a good idea if we can avoid it.

Approach 2 looks like a good idea at first glance, but IMO may be confusing to the user. The dependency graph would be much more complex in practice, and it would be difficult for the user to notice, or understand why, a seemingly unrelated package got upgraded.

Approach 3 is “correct” in theory, but is as unhelpful to the user as pip’s famous “No matching distributions found for” error. There is clearly a newer version to upgrade to from the user’s perspective. Why is pip not finding it? Open GitHub and file a bug report.

Approach 4 is the most reasonable to me. In the above example, pip would emit something like six>=1.12 (required by foo) would cause incompatibility in bar (requires six<1.12). The downside is pip would need to do more work to interpret the resolution result (this does not fit into the resolution process IMO).
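
To make the example concrete, here is a minimal sketch of the check that Approach 4 implies. Everything in it is an assumption for illustration: versions are plain tuples, requirements are (operator, version) pairs limited to `<` and `>=`, and `REQUIRES` and `find_breakages` are hypothetical names, not pip code.

```python
import operator

# Only the two operators used in the example; real version specifiers are richer.
OPS = {"<": operator.lt, ">=": operator.ge}

# Hypothetical metadata for the example: {(name, version): {dependency: (op, bound)}}
REQUIRES = {
    ("foo", (1, 0, 0)): {"six": ("<", (1, 12))},
    ("foo", (2, 0, 0)): {"six": (">=", (1, 12))},
    ("bar", (1, 0, 0)): {"six": ("<", (1, 12))},
    ("bar", (2, 0, 0)): {"six": (">=", (1, 12))},
}


def satisfies(version, requirement):
    """Check one version tuple against one (operator, bound) requirement."""
    op, bound = requirement
    return OPS[op](version, bound)


def find_breakages(planned):
    """Return (package, dependency) pairs whose requirement the planned set violates."""
    broken = []
    for name, version in planned.items():
        for dep, requirement in REQUIRES.get((name, version), {}).items():
            if dep in planned and not satisfies(planned[dep], requirement):
                broken.append((name, dep))
    return broken


installed = {"foo": (1, 0, 0), "bar": (1, 0, 0), "six": (1, 11, 0)}
# Upgrading foo pulls six up to some version >= 1.12 (1.12.0 is assumed here).
planned = {**installed, "foo": (2, 0, 0), "six": (1, 12, 0)}

breakages = find_breakages(planned)
if breakages:
    # Approach 4: report and stop, leaving `installed` untouched.
    print("upgrade would break:", breakages)
```

The same check run after installation instead of before it would correspond to Approach 1; the difference is only whether the environment has already been modified when the message appears.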

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Feb 13, 2020
@pfmoore
Member

pfmoore commented Feb 13, 2020

My initial thoughts (assuming we're talking about the new resolver):

Approach 1 is just wrong. The new resolver should (IMO) never install an incompatible set of packages. Printing a warning is just saying "I messed up". Printing an error and not reverting is unfriendly (it's what the current resolver does, so it's clearly wrong 😉).

Approach 2 I agree is reasonable at first, but awkward in practice. I'd think of this as saying that we shouldn't go "up" the dependency tree (from dependency to parent) when upgrading. By saying "upgrade foo", the user is giving implicit permission to upgrade foo's dependencies, but it's not obvious that they have agreed to upgrade bar, even if that's needed to complete the upgrade of foo.

Approach 3 I agree is not the right thing to do. We're not doing what the user asked. At a minimum we should say "cannot upgrade without forcing upgrade of unrelated package bar".

Approach 4 makes sense, but it's more a case of giving up cleanly than being the right approach.

IMO, the best approach would be a combination of 2 and 3. We find that the resolution would require upgrading a package that isn't a dependency of anything the user requested on the command line, and say "upgrading would require also upgrading the following unrelated packages: bar - continue?" and then proceed if the user agrees, otherwise stop and say the system is as up to date as we can make it.

@pradyunsg
Member

Note that we don't have any wait-for-user-input behaviours, outside of anything we're not overriding in network/VCS code.

Introducing something like this would be slightly disruptive for certain users.

@uranusjr
Member Author

We’d also need to handle non-interactive situations if we implement a prompt. In that case IMO it’s better to error loudly (because that’s the main case for CI). So… a combination of 2, 3, and 4?

I wonder if we can use --upgrade-strategy for this. If that’s the case—Approach 2 sounds like always to me. What should only-if-needed do, and which approach should be default?
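
A rough sketch of the "prompt when interactive, error loudly otherwise" combination discussed above. The names `UpgradeBlocked` and `confirm_unrelated_upgrades` are hypothetical, and treating a non-TTY stdin as "non-interactive" is an assumption; pip has no such prompt today, as noted earlier in the thread.

```python
import sys


class UpgradeBlocked(Exception):
    """Raised when the upgrade needs packages the user did not ask to change."""


def confirm_unrelated_upgrades(unrelated, interactive=None, ask=input):
    """Return True to proceed with the upgrade, False to leave things as they are."""
    if not unrelated:
        return True
    if interactive is None:
        # Assumption: a non-TTY stdin (for example on CI) means nobody can answer.
        interactive = sys.stdin.isatty()
    names = ", ".join(sorted(unrelated))
    if not interactive:
        # Approach 4: fail loudly instead of silently changing unrelated packages.
        raise UpgradeBlocked(
            f"upgrading would require also upgrading unrelated packages: {names}"
        )
    answer = ask(
        f"Upgrading would require also upgrading unrelated packages: {names} - continue? [y/N] "
    )
    # Approach 2 if the user agrees, Approach 3 otherwise.
    return answer.strip().lower() in {"y", "yes"}
```

Passing `ask` and `interactive` as parameters is only there to keep the sketch testable without a terminal.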

@pradyunsg
Member

Honestly, I suggest we treat 4 as the "fallback" option for us here -- unless we have reasonable precedent and consensus on some other option, let's go with that.

Can someone check what other package managers with "proper" resolvers do?

@uranusjr
Member Author

Most package managers don’t have this problem because they keep a “manifest” describing the user’s original intention (e.g. package.json, Cargo.toml, Gemfile). From my understanding they’d go approach 2 if the user allows bar to be upgraded, otherwise approach 3.

@xavfernandez
Member

Most package managers don’t have this problem because they keep a “manifest” describing the user’s original intention (e.g. package.json, Cargo.toml, Gemfile).

For Python, we have https://www.python.org/dev/peps/pep-0376/#requested that could serve a similar purpose.

@xavfernandez xavfernandez reopened this Feb 14, 2020
@pfmoore pfmoore changed the title Hadle conflicts from installed packages when updating other packages Handle conflicts from installed packages when updating other packages Feb 14, 2020
@uranusjr
Member Author

For Python, we have https://www.python.org/dev/peps/pep-0376/#requested that could serve a similar purpose.

TIL! Yes that sounds like a good idea. pip does not already implement it, does it? (I didn’t read the implementation, but couldn’t find this file in any of my existing environments.)
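
For reference, PEP 376 describes REQUESTED as a marker file placed in a distribution's metadata directory when the user explicitly asked for that distribution. Below is a minimal sketch of how a tool could read that marker with the standard library's `importlib.metadata`. The function name `user_requested_names` is hypothetical, and whether any installer actually wrote the file in a given environment is an assumption the sketch cannot check.

```python
import importlib.metadata


def user_requested_names(search_path=None):
    """List installed distributions that carry a REQUESTED marker file.

    `search_path` is an optional list of directories to scan; by default
    the current interpreter's import path is used.
    """
    kwargs = {} if search_path is None else {"path": list(search_path)}
    names = []
    for dist in importlib.metadata.distributions(**kwargs):
        # read_text() returns None when the file is absent from the metadata directory.
        if dist.read_text("REQUESTED") is not None:
            names.append(dist.metadata["Name"])
    return sorted(names)
```

An empty result does not mean nothing was requested; it may simply mean the installer never wrote the marker, which matches the observation above about existing environments.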

@pradyunsg
Member

There should be a tracking issue for this. pip does not implement this.

@pfmoore
Member

pfmoore commented Feb 15, 2020

PEP 376 is something of an odd thing. It was the basis of the aborted "distutils2" rewrite, which had a lot of good ideas, particularly in this PEP. But even though the PEP as a whole was accepted, it was never all implemented. None of the pkgutil functions were ever implemented, for example.

IMO, it would be a good idea to go through PEP 376 and revise it to document what is implemented, and decide what of the rest should be (and retain that) or should be dropped (and do so). Maybe we could do this incrementally, I don't know? PEP 376 needs a certain critical mass of stuff that is actually going to be implemented to maintain credibility 🙁

@uranusjr
Member Author

And I’m not sure REQUESTED is the best place to store the information TBH (although storing it is definitely better than not). Let’s have more chat on this… (and other things in PEP 376 as a whole)

@pfmoore
Member

pfmoore commented Feb 15, 2020

https://discuss.python.org/t/does-pep-376-need-a-review/3269

@pradyunsg pradyunsg added state: needs discussion This needs some more discussion type: maintenance Related to Development and Maintenance Processes and removed S: needs triage Issues/PRs that need to be triaged labels Feb 17, 2020
@pfmoore
Member

pfmoore commented Feb 17, 2020

@pradyunsg @uranusjr One thing I just thought of. With the original example, of foo and bar, when the user types pip install foo, how will the resolver even know that bar exists?

The root set of requirements (from the command line) is foo. The dependencies that get discovered will include six, but why will the resolver ever add bar to the dependency graph? It will only do that if we seed the graph with "everything the user currently has installed" - and that's potentially a large set and could slow down resolve times substantially.

I think we need to consider the option that even with the new resolver, pip will only solve for requested requirements and their dependencies. This would mean that pip could produce a broken environment (in effect option 1 above, with the message coming from a follow-up automatic pip check, just as it does now).

I can see ways that we could let the resolver "see" what's on the index for requirements specified by the user and their dependencies, but for other installed packages just see the installed version. But that could be tricky to implement. Actually, it could be tricky to get the resolver to even ask for available candidates for bar above.

I guess what I'm saying is that if we assume a resolver that follows the dependency graph from the root set provided by the user, but only going in the direction "parent -> dependency" (because that's the only direction the metadata supports), it's not actually clear to me if we even can implement behaviours 2-4...

The point @uranusjr made in our meeting is extremely relevant here - pip does not have any sort of information about the global question "what does the user want to have in this environment" (the sort of information that other tools maintain via files like Pipfile or Cargo.toml), other than in the form "this is a list of exact project/version pairs that are currently installed". That's a fundamental limitation of pip's design as a low-level tool, and we should be very careful about trying to design features that assume otherwise.

With that in mind, behaviour (1) becomes much more reasonable as simply being "upgrade what the user asked for".
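
The reachability problem described above can be sketched in a few lines. `DEPENDS_ON` is hypothetical forward-only metadata for the foo/bar example, and `reachable_from` is an illustrative helper, not the resolver's actual traversal.

```python
from collections import deque

# Hypothetical forward metadata: each package only lists what it depends on.
# Nothing records that bar is a *dependent* of six.
DEPENDS_ON = {
    "foo": ["six"],
    "bar": ["six"],
    "six": [],
}


def reachable_from(roots):
    """Breadth-first walk in the only direction the metadata supports."""
    seen = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(DEPENDS_ON.get(name, []))
    return seen


# The root set from `pip install foo` is just foo.
visited = reachable_from(["foo"])
print(sorted(visited))
```

Starting from foo, the walk reaches six but never bar, so bar's requirement on six is invisible unless the graph is seeded with the installed packages as well.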

@uranusjr
Member Author

I do have a feeling this is less of a problem in practice since people having this kind of problem are likely already using a higher-level tool (e.g. Pipenv, Poetry, pip-tools).

The resolver does not really need to know bar exists in the simplistic example, only that there’s a six<1.12 contributed by someone (i.e. find_matches() should only return candidates that don’t cause conflicts with any currently installed packages). Conflicts are still possible if a new requirement is introduced deeper down the tree, and pip would need to choose whether to go behaviour 1, or error out after the resolution finishes (by checking the result against installed packages again). This would be a strong indication the user should switch to a more capable setup, and I feel approach 1 would be acceptable (maybe with a suggestion message) at this point.
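
A minimal sketch of that filtering idea, under stated assumptions: versions are tuples, requirements are (operator, bound) pairs limited to `<` and `>=`, and `installed_constraints` and `filter_candidates` are hypothetical helpers rather than the actual find_matches() implementation.

```python
import operator

# Only the operators needed for the example; real specifiers are richer.
OPS = {"<": operator.lt, ">=": operator.ge}


def installed_constraints(name, installed_requirements):
    """Collect every requirement that installed packages place on `name`."""
    return [
        requirement
        for requirements in installed_requirements.values()
        for dep, requirement in requirements.items()
        if dep == name
    ]


def filter_candidates(name, available, installed_requirements):
    """Keep only versions of `name` that every installed package accepts, newest first."""
    constraints = installed_constraints(name, installed_requirements)
    return sorted(
        (v for v in available if all(OPS[op](v, bound) for op, bound in constraints)),
        reverse=True,
    )


# bar 1.0.0 is installed and requires six<1.12; only the constraint matters here.
installed_requirements = {"bar": {"six": ("<", (1, 12))}}
six_versions = [(1, 10, 0), (1, 11, 0), (1, 12, 0), (1, 13, 0)]

print(filter_candidates("six", six_versions, installed_requirements))
```

With this filter, foo 2.0.0 (which needs six>=1.12) would find no acceptable six and the resolver would have to fall back to foo 1.0.0 or fail. A real implementation would presumably also need to leave out constraints contributed by packages that are themselves being replaced.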

@pradyunsg pradyunsg added the C: dependency resolution About choosing which dependencies to install label Feb 26, 2020
@pradyunsg
Member

To be abundantly clear, I think this is going to be one of the issues that can cause "oh the new resolver actually breaks my setup, leaving a sour taste in my mouth" situation for our users, during the new resolver's rollout.

I think the resolver should consider the existing packages in the environment (otherwise, there's very little benefit to this whole exercise for many users), and we should have our default "strategy" be:

  • if not installed:
    • install whatever works "best" -> best == newest compatible version.
  • if already installed:
    • if not directly depended on by the package:
      • "don't touch it": error out when it's incompatible with all the potential choices.
      • Q: do we care about telling the user "hey, pip picked an older version than it would've if you didn't have X in your environment", in cases where stuff worked out?
    • if directly depended upon by the package:
      • prefer existing installed version when possible, and allowed to change the version.
      • TODO: do we allow downgrades and upgrades? do we want to treat those differently?

I imagine this would be the equivalent of the "only-if-needed" strategy that we have. For "eager", we can skip the "if directly depended upon" to skip the "prefer existing installed version" behavior.
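
One way to read the "prefer the existing installed version" part of this strategy is as a candidate ordering. The sketch below is an interpretation, not pip's behaviour: `order_candidates` is a hypothetical helper, versions are plain tuples, and the strategy strings simply reuse the names mentioned above.

```python
def order_candidates(compatible, installed_version=None, strategy="only-if-needed"):
    """Order compatible versions in the sequence a resolver would try them."""
    newest_first = sorted(compatible, reverse=True)
    if strategy == "eager" or installed_version not in newest_first:
        # Not installed, installed version no longer compatible, or eager:
        # "best" is simply the newest compatible version.
        return newest_first
    # only-if-needed: try the installed version first, then fall back to newest.
    rest = [v for v in newest_first if v != installed_version]
    return [installed_version] + rest


six_versions = [(1, 11, 0), (1, 12, 0), (1, 13, 0)]
print(order_candidates(six_versions, installed_version=(1, 11, 0)))
print(order_candidates(six_versions, installed_version=(1, 11, 0), strategy="eager"))
```

The "don't touch it" branch for packages that are not dependencies of the request is not covered here; it would amount to offering the installed version as the only candidate and erroring out when that fails.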

@uranusjr
Member Author

uranusjr commented Apr 7, 2020

  • if not installed:
    • install whatever works "best" -> best == newest compatible version.

I believe this would “simply work”[*] in practice. Assuming the user already has a “working” environment, a package not being installed means it’s not depended on by any existing distributions. The resolver can choose any version it needs to depending on the newly requested packages.

[*]: A working environment does not necessarily mean the environment does not contain any conflicts or broken dependencies, but there’s nothing wrong as far as the user is concerned. pip does not need to actively fix the environment unless the user specifies so.

  • if already installed:
    • if not directly depended on by the package:
      • "don't touch it": error out when it's incompatible with all the potential choices.
      • Q: do we care about telling the user "hey, pip picked an older version than it would've if you didn't have X in your environment", in cases where stuff worked out?
    • if directly depended upon by the package:
      • prefer existing installed version when possible, and allowed to change the version.
      • TODO: do we allow downgrades and upgrades? do we want to treat those differently?

With REQUESTED out of the consideration for the short term, I feel pip is not in the position to switch behaviour based on reverse dependency information. It is still very often wrong to automatically upgrade a package, even if it is depended on by another. Django packages, for example, usually depend on django, but I’d be very annoyed if installing the latest django-debug-toolbar automatically upgrades my Django installation from 2.2 to 3.0 because djangorestframework also depends on django. The better behaviour is to never touch an already-installed package, and always error out if that does not work. It would also be extremely useful to tell the user to either upgrade or uninstall the package, and what other packages depend on the conflicting package; otherwise the user would be hard-pressed to decide what action is best.

It would be way too restrictive to always error out, of course. My feeling is this should be where --upgrade comes into play. Upgrading (or downgrading; any version-changing operations should be treated the same in this context IMO) can happen if the user supplies this flag, and --upgrade-strategy affects whether to prefer upgrading the world, or the smallest possible set.

This leaves only one question, whether we should error out, or simply warn if an incompatibility would occur (for a package in the environment, but not a dependency of the given packages in the install --upgrade call) after resolution. The checks would be exactly the same (pip already implements it at install time); the only difference would be whether we install or not. I feel the current behaviour is actually useful sometimes, but personally don’t really care either way.

@pfmoore
Member

pfmoore commented Apr 7, 2020

My feeling is that this is something where any decision we make will have to be reviewed in the light of actual user feedback. So (a) we shouldn't get too concerned with working out the "perfect" answer (I think @uranusjr's summary above seems about right), and (b) it would be good if the UX work could get user feedback on this question. (Can someone ping Bernard on this issue? I can't remember or find his github username 🙁)

@uranusjr
Member Author

uranusjr commented Apr 7, 2020

@ei8fdb ^

@pradyunsg
Member

Given the sheer number of users, it's possible that we don't surface folks who do care about the failure case here, as part of our UX study.

I personally feel that we should error out vs warn in most scenarios w/ the new resolver, since (1) it's stricter, and (2) it'll help eagerly surface any users who'd care about the stricter behavior here, and we can work with them to understand why the new/stricter behavior doesn't work for them.

Plus, we've said:

if you ask pip to install two packages with incompatible requirements, it will refuse (rather than installing a broken combination, like it does now).

If you do separate "pip install" runs and pip won't refuse to install a broken combination... That's gonna be tricky to explain. :P

@uranusjr
Member Author

uranusjr commented Apr 8, 2020

Well you see we said “ask pip to install two packages” but here you’re only asking it to install one so it does not know about the other you installed previously…

Yeah I get what you mean. I don’t mind always erroring out, but we’ll need to offer some escape mechanism so a user can say “hey I actually don’t care about these breakages” without needing to uninstall-try again-uninstall stuff until pip feels satisfied. We can probably put this on hold, and get a better grasp of the problem after we actually ship the resolver.

@pradyunsg
Member

My instinct is we should only display this if pip actually upgraded a dependency.

ACTUALLY!

One of the things I just remembered is that if/when pip does install conflicting packages, it'll print the warnings about them:

"{name} {version} requires {requirement}, but you'll have "
"{dep_name} {dep_version} which is incompatible."

This "FYI: I detected conflicts in the final set of packages you'll have when I'm done" logic in pip, is also run with the new resolver, which means we're already printing some sort of relevant information toward helping the user understand the situation. I'm 100% OK to add the textual context to that error message. I'm imagining it'd look something like:

WARNING: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts.
amazing-package 3.0 requires the-other-awesome-thing<2.0, but you'll have the-other-awesome-thing 3.0 which is incompatible.

(IDK what the best phrasing might be)
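
As a sketch of how the proposed text could be assembled, the snippet below reuses the format string quoted above and the proposed preamble. `render_conflicts` and the shape of the conflict dictionaries are assumptions for illustration; this is not how pip's own check is written.

```python
# Format string quoted earlier in this comment.
TEMPLATE = (
    "{name} {version} requires {requirement}, but you'll have "
    "{dep_name} {dep_version} which is incompatible."
)

# Proposed preamble from this comment.
HEADER = (
    "WARNING: pip's dependency resolver does not currently take into account "
    "all the packages that are installed. This behavior is the source of the "
    "following dependency conflicts."
)


def render_conflicts(conflicts):
    """Build the warning text, or an empty string when nothing conflicts."""
    if not conflicts:
        return ""
    lines = [HEADER]
    lines.extend(TEMPLATE.format(**conflict) for conflict in conflicts)
    return "\n".join(lines)


conflicts = [
    {
        "name": "amazing-package",
        "version": "3.0",
        "requirement": "the-other-awesome-thing<2.0",
        "dep_name": "the-other-awesome-thing",
        "dep_version": "3.0",
    }
]
print(render_conflicts(conflicts))
```

Returning an empty string for an empty conflict list keeps the preamble from being printed when everything resolved cleanly.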

@pradyunsg pradyunsg added this to the 20.3 milestone Oct 28, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
@nlhkabu
Member

nlhkabu commented Oct 29, 2020

this looks great @pradyunsg - I am a big +1 on only showing the message when pip does install conflicting packages.
Your error message is good - however, it does not help the user actually fix the issue.
What would you recommend a user do to resolve the problem? Uninstall amazing-package 3.0 and the-other-awesome-thing, then try again, specifying a different version for amazing-package?

brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
@nlhkabu nlhkabu changed the title Handle conflicts from installed packages when updating other packages Warn users about dependency conflicts when updating other packages Nov 3, 2020
@pradyunsg pradyunsg assigned pradyunsg and nlhkabu and unassigned nlhkabu Nov 4, 2020
@nlhkabu
Member

nlhkabu commented Nov 10, 2020

Discussed with @pradyunsg

For 20.3, we will change the error to:

Old resolver:

WARNING: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts.
amazing-package 3.0 requires the-other-awesome-thing<2.0, but you'll have the-other-awesome-thing 3.0, which is incompatible.

New resolver:

WARNING: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts.
amazing-package 3.0 requires the-other-awesome-thing<2.0, but you have the-other-awesome-thing 3.0, which is incompatible.

For after 20.3 we will write some docs and link to it from the error message (in a future PR)

@pradyunsg
Member

As a quick note, I think the message presented to the users of the legacy resolver should be:

pip's legacy dependency resolver does not consider dependency conflicts when selecting packages. This behaviour is the source of the following dependency conflicts.

I don't think we need to nudge those users to the new resolver -- they're on Python 2 or opting-into the legacy resolver.

@junpuf

junpuf commented Nov 17, 2020

My two cents

Option 2 mentioned in the original post looks like the behavior of conda where it tries to identify inconsistencies and automatically resolve them, by upgrading or downgrading existing packages in the environment. Based on my own experience as well as folks that I have worked with, this behavior is the least favorable.

You simply cannot modify the existing packages unless the user explicitly tells you to, otherwise you will very likely break somebody's code that depends on an API that only existed in a certain package version of their choice. So option 2 cannot be made default, but can be opt-in.

Personally, I favor option 4 the most; I'd rather break my builds than not be able to control what my environment has. But considering not everyone thinks as I do, this option should also be made opt-in.

@uranusjr
Member Author

uranusjr commented Nov 18, 2020

FYI conda update has --update-dependencies and --no-update-dependencies, and you can set either in configuration as default. I’m not sure which one is the default, however.


Edit: It seems like update_dependencies = False is the default, so Conda is doing almost exactly what you want.

There is still a slight difference, Conda would update a package if that update would not break dependencies elsewhere, and you don’t want even that to happen. This is a reasonable scenario, but I believe Conda’s stance is you should spell out exact versions in environment.yml in that case instead. pip’s is similar (but with requirements files), and most people likely also expect that, so disallowing any package updates may be too drastic a change for pip to implement.

@pradyunsg
Member

As a general note -- I do think #9094 goes into what users are expecting based on a single-question survey's results.

AFAICT, we're basically planning on doing what @uranusjr just described above, with that issue being the tracking issue for it.

@junpuf

junpuf commented Nov 18, 2020

There is still a slight difference, Conda would update a package if that update would not break dependencies elsewhere, and you don’t want even that to happen. This is a reasonable scenario, but I believe Conda’s stance is you should spell out exact versions in environment.yml in that case instead. pip’s is similar (but with requirements files), and most people likely also expect that, so disallowing any package updates may be too drastic a change for pip to implement.

You are mostly right about conda's behavior, but in practice it often behaves in a way that is confusing to users. The tricky part about conda's update is that there can be a difference between the metadata from the moment you create the environment to the point you update the environment using the same spec. When one tries to update an environment that has a good number of packages created months ago, they often cannot do that because, based on the new metadata, the environment is already broken although the user changed nothing. This issue was discussed here.

disallowing any package updates may be too drastic a change for pip to implement.

I agree, and I think my previous comment was confusing. The concern I had was when someone tries to update/install a package in an existing environment, for example by simply typing pip install foo. In order to satisfy foo, an existing dependency might be changed, which can result in a conflict (now it looks like the original case you had at the beginning).

__

As a general note -- I do think #9094 goes into what users are expecting based on a single-question survey's results.
AFAICT, we're basically planning on doing what @uranusjr just described above, with that issue being the tracking issue for it.

Thanks for the link, I followed to this one through here

@uranusjr
Member Author

uranusjr commented Nov 19, 2020

The tricky part about conda's update is that there can be a difference between the metadata from the moment you create the environment to the point you update the environment using the same spec. When one tries to update an environment that has a good number of packages created months ago, they often cannot do that because, based on the new metadata, the environment is already broken although the user changed nothing.

Yeah, this is the problem pip needs to deal with as well, plus the additional hurdle that most environments pip deals with weren’t even populated by an environment.yml and tend to be even more “broken.”

Personally I think this is unsolvable for tools like Conda and pip, because they allow users too much freedom manually tinkering with the environment without expressing intent. So there is not a theoretically correct answer here, and all we can do is to pick the behaviour that makes the fewest people unhappy the least of the time.

@alexreg

alexreg commented Feb 17, 2021

What is the practical solution to #9482 though? If I'm running pip3 install -U foo, then really I want it to stop and warn me before upgrading if the upgrade would break a dependency. Only if I pass an --ignore-breakages option or something would I want it to go ahead. This is highly undesirable behaviour at present.

@elsamuray7

elsamuray7 commented Feb 22, 2021

What is the practical solution to #9482 though? If I'm running pip3 install -U foo, then really I want it to stop and warn me before upgrading if the upgrade would break a dependency. Only if I pass an --ignore-breakages option or something would I want it to go ahead. This is highly undesirable behaviour at present.

I see it the same way. The default behavior should be that the whole installation process aborts with an error message, telling the user that their Python environment could get broken if they force the installation, and maybe also giving an overview of which part of the dependency graph would get broken. However, I don't know how far this is possible to implement. I just don't think that it is smart to believe that the user knows what they are doing. That is kinda like offering someone a car key without telling them that - if they accept - the money for the car is withdrawn from their bank account, which may break their financial situation. Never assume that users know the exact consequences of their actions.
