Lots of dependencies are pinned to an exact (patch) version #3547

Closed · csadorf opened this issue Nov 14, 2019 · 12 comments · Fixed by #3597
csadorf (Contributor) commented Nov 14, 2019

Looking at the long list of requirements, it seems that the vast majority of dependencies for aiida-core are pinned to a specific version, usually down to the patch version. This makes it very difficult to set up a Python environment that is actually compliant with all of these dependencies.

I am wondering whether many of these requirements could be relaxed, since the strict pins make it very difficult to manage a Python environment that depends on aiida-core.

For example, the numpy package is currently pinned to version 1.16.4. Does that mean that aiida-core will not work with version 1.16.3 or 1.16.5? Or with version 1.17.4 (currently the latest release) for that matter?

I propose a strong relaxation of these requirements with liberal use of the compatible release version specifier. If pinning versions for testing is required/desired, then these dependencies should be managed separately.
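
For illustration, a sketch of what this could look like in the requirements, using numpy as an example (the exact versions are just placeholders):

```
# Exact pin (current style): only version 1.16.4 satisfies the requirement.
numpy==1.16.4

# Compatible release at the patch level: allows any 1.16.x with x >= 4,
# i.e. equivalent to >=1.16.4, ==1.16.*, but excludes 1.17.
numpy~=1.16.4

# Compatible release at the minor level: allows any 1.x with x >= 16,
# i.e. equivalent to >=1.16, ==1.*, but excludes 2.0.
numpy~=1.16
```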

sphuber (Contributor) commented Nov 14, 2019

We did this out of necessity, really. Unfortunately we have a lot of dependencies, which we have been trying to reduce, and they would often break. When a dependency released a minor version with breaking changes, our release on PyPI would also be broken and we would have to release a patch pinning that particular dependency. The pinning of numpy is a perfect example: moving to 1.17 does indeed break our builds, even though it shouldn't, as it is a minor version change.

Clearly all of this is not ideal, but since we are mostly an "end-application" and not a library, and we recommend running in a virtual environment anyway, this still seemed like the preferable option. I think starting to use the compatible release specifier at the patch level, as you propose, would however already be a good change.

ramirezfranciscof (Member) commented:
> The pinning of numpy is a perfect example: moving to 1.17 does indeed break our builds, even though it shouldn't, as it is a minor version change.
> (...)
> I think starting to use the compatible release specifier at the patch level, as you propose, would however already be a good change.

Probably I'm missing something, but from what I understood of compatible releases from @csadorf's link, if we had been using them, the numpy update would have broken our builds anyway and we would still have needed to patch it, correct? If that is the case, do you still think it would be a net positive to implement?

sphuber (Contributor) commented Nov 15, 2019

You can use the compatible release notation to specify that all patch releases should be considered. For example, numpy~=1.16.1 will allow any 1.16.* version (from 1.16.1 onwards) but not 1.17 and later. Technically it is still possible that a patch release breaks our code, but that is a lot rarer.
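
To spell out the semantics (as defined in PEP 440), the compatible release clause is shorthand for a pair of comparisons:

```
numpy~=1.16.1    # equivalent to: numpy>=1.16.1, numpy==1.16.*
```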

csadorf (Contributor, Author) commented Nov 15, 2019

It is of course possible that even a minor or patch release breaks our builds due to poorly implemented upstream changes; however, I'd argue that we should absolutely make it a priority to fix those builds (including patching upstream dependencies) instead of pinning to a particular patch version, unless there is absolutely no alternative.

I understand that you consider aiida-core to be an end-user application rather than a library; however, I'd argue that since the package is supposed to support a multitude of workflows in combination with other applications, we should make it as easy as possible to install in conjunction with them.

So, I propose that we review the requirements and relax them wherever possible, using exact version pinning only as a last resort, with a fast turnaround on relaxing these requirements once incompatibilities have been resolved upstream.

ltalirz (Member) commented Nov 15, 2019

Just my 2 cents on this:

I think we all agree on the goal - the problem is that we have been bitten so many times by broken dependencies.
Since you mention numpy: actually, not too long ago, scipy released a version 1.0 that was not installable and broke a huge number of projects (including AiiDA). If you can't trust scipy, then whom can you trust? (To be fair, it was a new major release.)

In my opinion, there are two directions from which one can tackle this.

One is the technical direction, where we could use tools like pipenv that allow us to specify the dependency ranges we would be happy with, but also to concretize them down to the very last transitive dependency, in order to provide a complete list of dependencies for which we can guarantee that AiiDA will work.
Unfortunately, because of our dependency on circus, aiida-core[notebook] has conflicting dependencies, and overriding dependencies in pipenv is still an open issue.
Progress in this direction would be greatly appreciated.
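
As a rough sketch of that two-layer approach (loose ranges in a Pipfile, exact versions resolved into Pipfile.lock); the packages and ranges below are illustrative, not a vetted set:

```toml
# Pipfile -- loose ranges expressing what we believe is compatible.
[packages]
numpy = "~=1.16.1"
django = "~=1.11"

# Running `pipenv lock` then resolves these ranges -- including
# dependencies of dependencies -- into exact, reproducible versions
# recorded in Pipfile.lock.
```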

Whatever technical solution we choose, the more fundamental direction is to reduce the number of dependencies in general.
If AiiDA depends on projects that break their API in minor/patch releases, then we should probably reconsider whether depending on those is a good idea.
I've been making small steps in this direction for some time, but I believe much more is needed.

csadorf (Contributor, Author) commented Nov 15, 2019

While it is true that even major projects like numpy, scipy, or even Python itself sometimes make bad releases and all downstream projects need to deal with the fallout, I don't think that pinning all requirements down to the patch level is the appropriate response to that. Maintainers should temporarily pin requirements, report unexpected incompatibility issues back to the upstream projects, and then either wait for a resolution of the compatibility upstream or resolve it downstream, at which point the requirement should be relaxed again.

I would also like to point out that numpy and Python do not follow semantic versioning, and django and scipy do not follow it strictly. It can therefore not be assumed that breaking changes can be detected from the version number alone. Release notes for all dependencies must be followed and manually reviewed regardless, which is of course another good reason to aggressively reduce the number of dependencies. Finally, one benefit of breaking CI pipelines -- even if annoying -- is that incompatibilities are at least detected.

muhrin (Contributor) commented Nov 18, 2019

Just to chime in here. Having been along for the journey of what led us to the current 'pin everything' solution, I can say that I'm also not that happy with it, but found the advantage of being able to pip install and have it work to be significantly preferable to dealing with the issues when it does not. The other advantage is one of perception: for inexperienced users it's really important that aiida 'just works', whilst for more experienced developers it's not so bad if they're asked to manually upgrade a package and then maybe deal with (e.g. post a bug report for) an incompatibility they find.

I am, however, persuaded by @csadorf's assertion that we're not purely an application, as we want people to incorporate AiiDA into their scientific workflows, which may mean different library versions.

To balance these two sides, my suggestion would be to adopt @csadorf's strategy from the last post, at least for libraries of which we judge users in our community may have different versions (numpy, ase, scipy, pyspglib, etc.), whilst being more open to fully pinning more 'utility' libraries. That said, I'm mindful of the fact that the ones we don't pin may, in fact, be the harder ones to deal with, and perhaps a strategy here could be to try to automate some of the workload through special unit test/CI scripts.

csadorf (Contributor, Author) commented Nov 18, 2019

@muhrin I think it should absolutely be the goal that aiida just works, but for the reasons that I laid out, I don't think that pinning dependencies is the optimal way to achieve that, especially when end users run into conflicts.

Maybe a good compromise would be to still have strict constraints for tested versions, but manage these requirements more actively. For this we could institute a dependency manager, who follows dependency releases, upgrades the requirements, and checks whether those updates break our tests. We can then do a patch release for AiiDA that explicitly supports the new version.

I'm proposing that the responsibilities of the dependency manager be to watch for new releases of dependencies and to manage our requirements according to the following protocol:

  1. Pin all versions to compatible release versions on a per-project basis (illustrated in the sketch below the list). For projects that follow semantic versioning this means pinning to a major release; for others it would be decided on a case-by-case basis.
  2. For every potentially breaking release, the dependency manager upgrades the requirement, executes the CI pipeline, and then either a) recommends a patch release with the updated requirements if CI passes, b) fixes any trivial incompatibilities and recommends a patch release, or c) pins the requirement for non-trivial incompatibilities and places the dependency on a problematic-dependency watch list.
  3. For dependencies that break within the compatibility constraints, the dependency manager immediately pins the version and recommends a patch release, places the dependency on the problematic-dependency watch list, and attempts to resolve the incompatibility as in step 2.

For all dependencies on the problematic-dependency watch list, the dependency manager should report problems upstream and potentially recommend that alternatives be sought if upstream issues are not resolved.
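
As a hypothetical illustration of step 1, the specifiers would vary with each project's versioning practice (the packages and versions here are examples, not a vetted list):

```
# Follows semantic versioning: pin to the major release.
click~=7.0        # equivalent to >=7.0, ==7.*

# Does not follow semver strictly: constrain to the tested minor series.
numpy~=1.16.1     # equivalent to >=1.16.1, ==1.16.*

# On the problematic-dependency watch list (step 3): keep the exact pin
# until the incompatibility is resolved upstream.
somepackage==2.3.1
```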

And yes, I'm volunteering for that role. I could write up an AEP detailing the role and the procedures and then we can work on improving this process.

muhrin (Contributor) commented Nov 18, 2019

Sounds good @csadorf, and thanks for volunteering. One thing I think should be noted in the AEP is that some care will have to be taken in releasing a patch version and updating develop, which may have diverged since the last release. It's not difficult, it just needs to be done methodically.

I would propose that if we start this policy, we run it for a while (1-2 months) on develop before deploying it in the wild as a release, to see how likely problem cases are and which packages seem to cause issues. Alternatively, we could 'simulate' it by adopting the convention and trying to install with old (but compatible) package versions.

ltalirz (Member) commented Nov 18, 2019

Just to mention one more point: relaxing dependency constraints will make "historic" AiiDA versions (that are no longer supported by patch releases) more difficult to install. So if we go down this route, it would be great if, for every AiiDA release, we could provide alongside the "lax" requirements (which may only work in today's Python package environment) a complete set of concretized requirements that is guaranteed to work and future-proof.
This essentially comes for free with pipenv.

csadorf (Contributor, Author) commented Nov 18, 2019

@ltalirz I'm not quite sure how pipenv would help with this, but I'll check it out and maybe you can help me in formalizing that as part of the AEP. I'll create a draft and then maybe you can co-author it?

ltalirz (Member) commented Nov 18, 2019

> @ltalirz I'm not quite sure how pipenv would help with this, but I'll check it out

See the docs or e.g. what we do here.

> I'll create a draft and then maybe you can co-author it?

Sure. So, one possibility would be to take the original "pip" requirements, generate a Pipfile from them, and then concretize the Pipfile to get a fully locked-down set of requirements.
This could run as part of the deploy job for a new release.
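
A minimal sketch of that deploy step, assuming the loose requirements live in a requirements.txt file (the file names and exact CI wiring are assumptions):

```sh
# Import the loose pip requirements into a Pipfile.
pipenv install -r requirements.txt

# Resolve the full dependency graph to exact versions in Pipfile.lock.
pipenv lock

# Export the fully concretized set, e.g. to ship alongside the release.
pipenv lock --requirements > requirements-locked.txt
```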
