Add support for opportunistic dependencies #214
This would only solve the problem because opencv-python doesn't upload their sdists to PyPI, right? That's not something we want to encourage, is it?

Prior art note: Debian has three levels of dependency relationship, `Depends`, `Recommends`, and `Suggests`:

- If X depends on Y, that's mandatory and enforced: if Y can't be installed then X can't either, and if Y is removed then X must be removed too.
- If X recommends Y, then by default installing X will trigger the installation of Y, but you can toggle this with a config option, or remove Y afterwards.
- If X suggests Y, then nothing happens by default, but I guess the data might be shown in package management UIs, like "if you liked X, check out Y" or something?

There's also …
@njsmith Thanks for the prior art note, that's a very good taxonomy for it.
That may be, but I think the use case is still valid. One can imagine a dependency that cannot be built on a given platform because it has no native support on that platform, in which case an sdist wouldn't help anyway.

One can also imagine that this could be useful in corporate or other locked-down environments where it's not possible to use certain packages for licensing reasons or because they have not yet gone through compliance. In that event, you could safely block the package in a caching proxy and anything with a "recommends" dependency would be satisfied to use the default case of "not included".

Another possible use case (and one I haven't really thought through yet) would be in resolving cycles or package conflicts. You could, for example, do something like this: …

That would check if anything requires …
Agreed, I'm not happy with encouraging binary-only uploads. Maybe an option would be a variant of the … option? That would put the decision in the user's hands rather than the packager's, which seems more reasonable to me.

If the package wants to only have binaries for certain platforms, they could always upload a pure-python "dummy" wheel that would be installed when there's no platform-specific binary. That dummy wheel could simply include a flag so that the parent package can tell it is the no-op fallback.
One thing to note is that right now the use cases are all or mostly about resolving "illusory conflicts" that are created by packages being forced to express stronger requirements than they actually have. I think by allowing packages to express dependencies in a more nuanced way, you could also allow consumers to more easily express their preferences with regards to dependency installation. For example, if we adopted both …

I think we probably need to spend some time thinking about how much complexity we actually want to expose in the dependency resolution system, but I know I've been chafing at the inability to express the various fallback mechanisms I've designed in my packages that would allow people to easily opt for a different balance of features to "installation weight", as it were.
I think the problem of binary compatibility is a bit of a red herring. There are other reasons why you may have incompatibilities - for example, one of your "recommended" dependencies may be slow to add support for Python 3.7, or may only be available on Python 2 or 3 and you are supporting both. This would free you up to forge ahead and let your recommended dependencies upgrade at their own pace.
Adding …
It is and it isn't. If all you're suggesting is that an installer tries building from sdist and ignores any error in the build, then that's probably OK (I say "probably" because there are all sorts of caveats over the practicalities of cleaning up after a failed build that would need reviewing and possibly addressing). Also, I'm not convinced that having a load of build errors, then a successful install, is a particularly nice UX (nor is hiding the build errors - what if the errors were unexpected and the user thought the dependency would install?)

But you then go on to say "dependencies may be slow to add support for Python 3.7, or may only be available on Python 2 or 3", and I don't know how you expect that to work in practice (given that you're saying that not uploading sdists is not the mechanism you're intending). So your use case is a bit muddled here.
I'm not sure this is Python related. From a quick Google search it seems that opencv is a C library (with Python bindings)? So it's not possible to express a dependency on opencv in package metadata. Again, it's not entirely clear how what you're proposing would work in practice.
That's definitely a case where I'd expect to have a universal backend that basically does nothing, and platform-specific backends that have the speedup code (if only because of the same problem of "we don't want to encourage binary-only packages"). The core dateutil code then checks the actually installed backend module to see if it's the dummy or not before calling the speedups.
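A minimal sketch of that dummy-versus-real backend check (the module name `dateutil_backend`, the `IS_DUMMY` flag, and `fast_parse` are hypothetical illustrations, not an actual dateutil API):

```python
# The project would publish a "dateutil_backend" distribution: platform-specific
# wheels ship compiled speedups, while a universal wheel is a do-nothing dummy
# that just sets IS_DUMMY = True.  The core package then checks that flag.
try:
    import dateutil_backend  # hypothetical backend distribution
except ImportError:          # not even the dummy wheel is installed
    dateutil_backend = None


def parse(timestr):
    """Use the compiled speedup only when a real (non-dummy) backend is present."""
    if dateutil_backend is None or getattr(dateutil_backend, "IS_DUMMY", True):
        return _parse_pure_python(timestr)
    return dateutil_backend.fast_parse(timestr)


def _parse_pure_python(timestr):
    # Always-available pure-Python fallback implementation.
    return timestr.strip()


print(parse("  2018-06-01  "))  # works regardless of which backend is installed
```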
Point taken - although what I assume you mean by "Recommends data" feels somewhat different from what you were proposing originally. Maybe I'm misunderstanding, though, and the two cases are more similar than I imagine.

As far as "recommends" metadata is concerned, AIUI that's typically done using extras right now ("install `package[extra]` to get the optional features").

Overall, I'm not against the idea here in principle, but I think it needs to be a lot more clearly specified before it's viable as a proposal.
The most obvious mechanism is a package that has …
I recommend ignoring the specific …
I cannot read your mind, but the core concept of "recommends" has not changed. It is a set of dependencies that a package uses and would like installed by default, but that is not necessarily required. It essentially means "Please install this, but if for any reason you can't, that's fine, using this library without such and such a dependency is still mostly supported".
This does not solve the problem, because extras are "opt in", while both this issue and pypa/setuptools#1139 are asking for slightly different flavors of "opt out" dependencies. Having the ability to opt out of un-required dependencies is in no way less control than having the ability to opt in to them, and in fact not having it will lead people to just make everything a hard dependency rather than bother with extras. I envisioned that people would do things like this:
```
...
install_requires=[
    'somepkg; recommends',
],
extras_require={
    'somepkg': ['somepkg'],
},
```

Such that `somepkg` would be installed by default, but anyone who needs to depend on it unconditionally could still require the `somepkg` extra.

My general principle when designing interfaces is to have the default be the thing most people want, but you should provide "escape hatches" for people who may want something different. That is defeated by not having a way to express the fact that some dependency of my package is something that most people will want, but some people may not want it, and not having it is a supported workflow.

Right now the only options are that dependencies can be required or they can be opt-in. I want some dependencies to be opt-out.
Which is generally what issues like this are for. I hope that I have established that this is a real kind of dependency that we currently have no way to declare, and we can now move on to designing what support for such a thing would look like.

The question of how best to do dependency resolution without the dependency resolution syntax becoming a full-fledged language with a package manager of its own is a tricky one. It is likely that we cannot consider this proposal in isolation and we may need a meeting to discuss it, or a small working group that designs a proposal. I suspect that it will be hard to design using only GH issues and/or mailing lists.

I was hoping to start documenting the various dependency-specification related issues that come up here and elsewhere (again, see pypa/setuptools#1139 as one example), so that we have the data needed to come up with a solution that takes into account the various problems that have been cropping up.
@pfmoore This is the opencv package that we're talking about on pypi, that has only wheels and no sdist: https://pypi.org/project/opencv-python/

@pganssle Allowing packages to describe different profiles of features vs. installation weight is exactly what extras do, right? We describe different options just fine; the problem is that pip always defaults to installing the most pared-down profile, which isn't a great default. So the question is exactly: under what circumstances should pip install these not-exactly-required packages by default, and how do we let users and packagers control that?

The simplest mechanism would be like Debian, where packagers can say "install this by default", but the end-user can override it, and …

You could have a "binary-only dependency" ("only install this if you can find a wheel"), but that feels really weird to me -- normally we try to decouple "what we need" from "how we get it".
This is a pretty common situation, yeah. The way people normally handle it currently is that they include the backend code inside their regular sdist, and when the package is built from source they try to compile the extension, falling back to the pure-Python code path if that fails.
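A rough sketch of that optional-extension pattern with setuptools (the `OptionalBuildExt` class name and the exact error handling are illustrative assumptions, not code from any particular project):

```python
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext
from distutils.errors import CCompilerError, DistutilsExecError, DistutilsPlatformError


class OptionalBuildExt(build_ext):
    """Build the C speedups if possible; fall back to pure Python otherwise."""

    def run(self):
        try:
            super().run()
        except (CCompilerError, DistutilsExecError, DistutilsPlatformError):
            print("C extension build failed; using pure-Python implementation")

    def build_extension(self, ext):
        try:
            super().build_extension(ext)
        except (CCompilerError, DistutilsExecError, DistutilsPlatformError):
            print(f"skipping optional extension {ext.name}")


setup(
    name="mypkg",
    version="1.0",
    packages=["mypkg"],
    ext_modules=[Extension("mypkg._speedups", ["src/speedups.c"])],
    cmdclass={"build_ext": OptionalBuildExt},
)
```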
Isn't this the case where people usually just... have two packages? If someone needs timezone functionality, they depend on the package that provides timezone functionality; if they don't, they don't?

There is a challenge in splitting a package in two without breaking users -- that's not something we have great tools for right now. The usual way Debian handles something like this is to create two new packages (dateutil-core and dateutil-zoneinfo, say), and then make 'dateutil' into a trivial package that just depends on those two new packages. I guess that's something we already support? I'm not sure if there's any way to do better.
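A sketch of what that Debian-style split could look like on the Python side, reusing the hypothetical names from the comment above (`dateutil-core` and `dateutil-zoneinfo` don't exist; this is not how dateutil is actually packaged):

```python
# setup.py for the trivial "dateutil" metapackage: it ships no code of its
# own and exists only to pull in the two real packages, so that existing
# `pip install dateutil` users keep getting both halves.
from setuptools import setup

setup(
    name="dateutil",
    version="3.0",
    description="Metapackage pulling in dateutil-core and dateutil-zoneinfo",
    install_requires=[
        "dateutil-core>=3.0",
        "dateutil-zoneinfo>=3.0",
    ],
    packages=[],  # no modules shipped here
)
```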
The problem is, packaging is extremely complex and has endless edge cases and unusual situations. We can't make decisions on the basis of "can we imagine a situation where this might be useful". If we want our system to be useful to actual people in common situations, we need to actually look at those cases, to make sure that what we add solves real problems.
Most of these problems do not come up because responsible maintainers come up with weird workarounds. In the original thread I provided 4 separate workarounds, and did not suggest that the original reporter somehow attempt to convince OpenCV to release an sdist.

And in general these issues are all symptoms of a larger problem, which is that dependency declarations are not sufficiently expressive. Here's another example: what if you want to depend on Tensorflow but want the ability to fall back to pytorch or some alternative library if you need Python 3.7 support or something of that nature?

Sure, we "don't want to encourage binary-only uploads", but people are not going to stop using Tensorflow and Google is not going to play nice with the rest of the world because we make it hard for people to declare their dependencies on Tensorflow correctly. What is much more likely to happen is that people will not provide wheels, or do some other nonsense, because dependency declarations are needlessly restrictive and they can always just write a `setup.py` that sorts it out at install time.
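To illustrate the kind of `setup.py` workaround being alluded to, here is a sketch (my own illustration, not code from this thread; the platform check is a made-up heuristic). It only stays dynamic if the project ships sdists rather than wheels, which is exactly the incentive problem described above:

```python
# setup.py that decides its own dependencies at install time.
import platform

from setuptools import setup

install_requires = ["numpy"]

# Hand-rolled "opportunistic" dependency: only require opencv-python on
# platforms where we guess that wheels exist.  This is the kind of guess
# that a proper "recommends" concept would make unnecessary.
if platform.machine() not in ("armv7l", "aarch64"):
    install_requires.append("opencv-python")

setup(
    name="mypkg",
    version="1.0",
    packages=["mypkg"],
    install_requires=install_requires,
)
```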
No, this is something different entirely, but the specifics don't matter. It is very much an opportunistic dependency because the … Ideally I would have something like this: …
I'm not even entirely sure that …
I am not interested in further justifying the use of this functionality at this point. Anyone else can feel free to try to make the case for it if there are further objections. I think it is very obvious that there is a problem here, and we all know what at least one of the problems is - it is not possible for packages to declare that some dependency is not required but it should be installed by default. If we can agree on that, then we can start to focus on the solution.

The major realistic problems I see are: …
There are probably several other "big questions" to be answered, and we definitely need to consider the other dependency-declaration problems people have as part of a more general solution. There are other "small questions" to be addressed as well, like how exactly this information gets encoded - is it a new syntax? Is it one or more environment markers? I would probably also classify some of the "how will pip behave" questions as "small" in the sense that you just have to pick a behavior and run with it.
Weak dependencies are definitely useful. RPM-based distros adopted a model similar to the Debian one a few years back: http://rpm.org/user_doc/dependencies.html (see the "Weak dependencies" section at the end)

However, they also pretty much require a proper resolver to handle, since they create many more potential sets of compatible packages given an initial package listing (if you try to install an optional dependency and one of its mandatory dependencies is unavailable or otherwise fails to install, you need a resolver that's clever enough to back out and break off that entire sub-branch). It isn't a coincidence that Fedora et al didn't get weak dependencies until after the original yum implementation was superseded by dnf (which is backed by libsolv).

These kinds of "use if available" links also introduce additional complexity into version pinning, since you need to decide how to handle them at both pinning time (…
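A toy illustration of that back-out behaviour, as Python pseudocode (the data model and names are invented; a real solver such as libsolv is far more sophisticated):

```python
# Each package maps to (hard_requirements, weak_requirements).
PACKAGES = {
    "app":      ({"libcore"}, {"speedups"}),
    "libcore":  (set(), set()),
    "speedups": ({"libnative"}, set()),   # weak branch with a hard sub-dependency
    # "libnative" is intentionally absent: unavailable on this platform.
}


def resolve(name, selected):
    """Try to add `name` and its hard deps to `selected`; return success."""
    if name not in PACKAGES:
        return False
    trial = set(selected) | {name}
    hard, weak = PACKAGES[name]
    for dep in hard:
        if not resolve(dep, trial):
            return False          # a hard dependency failed: whole branch fails
    selected |= trial
    for dep in weak:
        branch = set(selected)
        if resolve(dep, branch):  # weak branch kept only if fully satisfiable
            selected |= branch
        # else: silently back out the entire weak sub-branch
    return True


installed = set()
resolve("app", installed)
print(sorted(installed))  # ['app', 'libcore'] -- the 'speedups' branch was dropped
```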
@ncoghlan What if you just call "pip install" for each weak dependency after (or before) installing the primary package? If it fails, it fails; if it installs, it installs. The point is that the main manually selected package is always installed. I see no problem...
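That best-effort behaviour can be approximated today with a small wrapper script, for example (a sketch; the package names are placeholders and pip has no built-in mode that does this):

```python
import subprocess
import sys

PRIMARY = "mypkg"
WEAK_DEPS = ["opencv-python", "some-speedups"]  # placeholder names

# The primary, manually selected package must install successfully.
subprocess.run([sys.executable, "-m", "pip", "install", PRIMARY], check=True)

# Weak dependencies are attempted one by one; failures are reported but ignored.
for dep in WEAK_DEPS:
    result = subprocess.run([sys.executable, "-m", "pip", "install", dep])
    if result.returncode != 0:
        print(f"optional dependency {dep!r} could not be installed; continuing")
```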
This would be a useful thing for many scientific packages that have functionality supported by an external (C/C++/Fortran with Python wrappers) library, but which is not the core of the package. This would also enable them to be installed on platforms that cannot install the binary dependencies.
If I understand correctly, this issue may depend on pypa/pip#988, the new pip resolver. The Python Software Foundation's Packaging Working Group has secured funding to help finish the new dependency resolver, and is seeking two contract developers to aid the existing maintainers for several months.

Folks in this thread: please take a look at the request for proposals and, if you're interested, apply by 22 November 2019. And please spread the word to freelance developers and consulting firms.
I just opened #432 but I now see that this thread is discussing very similar ideas. Let me know if that issue should be closed in favour of this one.
…al dependencies

Summary: The main changes are to `requirements.txt` and `setup.py`. After much discussion, we prefer that `pip install kats` installs everything, but that power users have a way to opt out of some of the dependencies required in only some parts of Kats (and then manually opt into them as desired). `pip` doesn't directly support this use case (see [setuptools issue](pypa/setuptools#1139), [packaging-problems issue](pypa/packaging-problems#214)), so we had to roll our own solution:

```
MINIMAL=1 pip install kats
```

Optional dependencies moved to `test_requirements.txt`. Two imports (`LunarCalendar`, `convertdate`) were no longer used anywhere in Kats and were removed. The rest of the changes wrap these now-optional imports in `try/except` blocks. In two cases, the changes are made to `__init__.py` files and log a warning that the affected functionality is not available at all if the required dependencies aren't there. Otherwise, the modules can be imported but some methods cannot work without the optional dependencies (for example, some plotting methods require `plotly`).

Reviewed By: rohanfb

Differential Revision: D30172812

fbshipit-source-id: 2c7d8072e72cbdd2ac9960fe953cabe15db63cc9
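For context, a rolled-your-own mechanism of that sort typically looks something like this in `setup.py` (an illustrative sketch, not the actual Kats implementation; the dependency names are placeholders):

```python
import os

from setuptools import setup

CORE_REQUIRES = ["numpy", "pandas"]
HEAVY_REQUIRES = ["plotly", "torch"]  # placeholder "optional by default" deps

# MINIMAL=1 pip install .  ->  core dependencies only
# pip install .            ->  everything (the default most users want)
# Note: this only works when metadata is generated at install time (i.e. from
# an sdist), which is one reason this pattern is considered a workaround.
minimal = os.environ.get("MINIMAL", "0") == "1"

setup(
    name="mypkg",
    version="1.0",
    packages=["mypkg"],
    install_requires=CORE_REQUIRES if minimal else CORE_REQUIRES + HEAVY_REQUIRES,
    extras_require={"all": HEAVY_REQUIRES},  # explicit way to opt back in
)
```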
Recently the issue pypa/setuptools#1599 was raised, hoping that it would be possible to specify "soft-failing" extra dependencies. This is currently not possible, but I think it's reasonable and would help me in some of the dependency-resolution problems that have been worrying me in `dateutil`.

The original use case from @Harvie was for a package that works better with `opencv`, but can operate without it. Their main concern was that `opencv` is not available on ARM - it's possible that this can be handled via environment markers, but even if it works, that is essentially a hack around what they want to express, which is "if you can install this, you should, but if you can't, that doesn't need to block the installation of my package". Using environment markers, you are hard-coding that "arm doesn't need `opencv`" when in reality ARM would benefit just as much from `opencv` as anyone else, and if `opencv` were to release an ARM-compatible package, you'd want your users to pick that up.

I have a similar use case in `dateutil` - I would like to write a compiled backend, but `dateutil` is very widely used, and I am not confident that I can make releases for every platform. Ideally, I would declare an optional dependency on a backend so that people are opportunistically upgraded as their platform becomes available. Obviously there are workarounds in this case since I control both packages, but as with pypa/setuptools#1139, it would be much better if we had a way to explicitly specify the nature of the dependencies.

Possibly the easiest thing to do would be to implement this as an environment marker, maybe something like `soft_dependency`? e.g.: …

We may also need to get into the possibility of fallback dependencies, like: …
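Purely as a sketch of the idea, such declarations might be written along these lines (neither a `soft_dependency` marker nor a fallback operator exists in the current dependency-specifier grammar; this is hypothetical syntax):

```python
# Hypothetical setup() fragment -- not valid with today's pip/setuptools.
from setuptools import setup

setup(
    name="mypkg",
    version="1.0",
    packages=["mypkg"],
    install_requires=[
        # install opencv-python if a compatible distribution can be found,
        # otherwise proceed without it instead of failing the whole install
        'opencv-python; soft_dependency',
        # fallback chain: prefer tensorflow, fall back to torch if it
        # cannot be installed on this interpreter/platform
        'tensorflow || torch',
    ],
)
```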