This issue was moved to a discussion.

Addressing Dependency Challenges / Future of Sage-the-Distribution #39192

Closed
tobiasdiez opened this issue Dec 23, 2024 · 9 comments

Comments

@tobiasdiez
Contributor

tobiasdiez commented Dec 23, 2024

One of the major pain points for users and developers of Sage is the need to install a wide range of dependencies before getting started with actually building sagelib. The idea of this issue is to discuss potential solutions and hopefully converge to one long-term vision, which can then be presented/discussed further on the mailing list.


Problem Overview

  • Installing Sage often requires handling numerous dependencies, some of which can be tricky to install due to version conflicts, compatibility issues, or the lack of packaged versions in common package managers.
  • This complexity acts as a barrier for users, particularly those who are not experienced with package management or building software from source.

Potential Solutions

1. Provide Build and Install Scripts for All Dependencies

This is the current status quo for sage-the-distribution. Based on a minimal set of dependencies, all Sage dependencies are installed via a manually maintained set of installation scripts and a dependency tree.

Advantages:

  • Convenient for users.
  • Ensures a consistent set of dependencies, providing a stable basis for testing.

Disadvantages:

  • Installing these dependencies is a common source of build errors, requiring developer time to assist users with troubleshooting.
  • Scripts require constant updates as dependencies evolve.
  • There is no automated mechanism for updating external dependencies in sage-the-distribution.
  • Extensive CI testing is required to maintain the scripts, adding to the maintenance burden.
  • Lack of Windows support (currently).
  • The setup is not reusable in other projects.

2. Provide Build and Install Scripts for a Minimal Set of Dependencies

This approach involves selecting a set of supported operating systems and maintaining scripts only for the minimal set of dependencies still required to install sagelib after using the OS-provided package manager. There are different flavors; for example, Python could be managed via external tools like pyenv or uv.

Advantages:

  • Reduces the scope of dependencies Sage-the-Distribution maintains directly.
  • Easier to manage and test a minimal set of scripts compared to the full dependency tree.

Disadvantages:

  • Users may still face challenges, especially with dependencies not included in the minimal set.
  • Some dependencies would rely on external tools, requiring additional user knowledge or documentation.
  • Limited to supported operating systems, reducing flexibility.

3. Conda

Use Conda as the primary package and environment manager for Sage, either directly using conda/mamba or via pixi.

Advantages:

  • Conda provides prebuilt binaries, allowing users to install SageMath with minimal effort and minimal chance for installation problems.
  • Conda creates isolated environments, making it possible to easily experiment with different versions of dependencies.
  • Works seamlessly across Windows, macOS, and Linux.
  • Familiarity within the scientific ecosystem: Many contributors to projects like NumPy and SciPy are already familiar with Conda workflows.

Disadvantages:

  • Users unfamiliar with Conda need to install it first and learn its basics.
  • Keeping Conda recipes updated and compatible across platforms requires ongoing effort.

4. Meson Wrap-DB

Explore using Meson Wrap-DB to manage dependencies. Meson Wrap-DB allows dependencies to be included as subprojects, simplifying their integration into the build system. In essence it is similar to the current sage-the-distro setup, but the build scripts of dependencies are written in Meson.

Advantages:

  • Meson is fast, modular, and integrates well with subprojects through Wrap-DB.
  • Wrap-DB simplifies dependency integration, reducing manual maintenance.
  • Other projects could benefit from the work done, e.g. python-flint could use the same scripts to install flint in their project.
  • It is perhaps the only approach that easily allows building dependencies on systems that they don't officially support (e.g. Windows).

Disadvantages:

  • Transitioning to Meson would require significant effort and might disrupt the current build system.
  • Contributors need to learn Meson and Wrap-DB concepts, which could initially slow development.

5. Only Rely on Package Managers to Install Dependencies

This approach would remove the need for custom build and install scripts and instead rely entirely on popular package managers like apt, yum, brew, pacman for dependency management. Users would be instructed to install dependencies using these package managers, and SageMath would ensure compatibility by specifying the required versions.
If a required version is not available, Sage would simply not install the part of sagelib that relies on this dependency.
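
To make the "skip what cannot be satisfied" idea concrete, here is a minimal sketch in Python. It is not how Sage or Meson actually decides this; the component names, the version numbers, and the helpers `parse_version`, `at_least`, and `select_components` are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '3.1.2' into (3, 1, 2); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(found: str, minimum: str) -> bool:
    """Compare versions after padding with zeros, so '3' counts as '3.0'."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def select_components(
    requirements: Dict[str, Dict[str, str]],
    installed: Dict[str, str],
) -> Tuple[List[str], Dict[str, str]]:
    """Split components into those to build and those to skip (with a reason).

    requirements: component name -> {dependency name: minimum version}
    installed:    dependency name -> version reported by the system
    """
    enabled: List[str] = []
    skipped: Dict[str, str] = {}
    for component, deps in requirements.items():
        reason = None
        for dep, minimum in deps.items():
            found = installed.get(dep)
            if found is None:
                reason = f"{dep} not found"
                break
            if not at_least(found, minimum):
                reason = f"{dep} {found} is older than required {minimum}"
                break
        if reason is None:
            enabled.append(component)
        else:
            skipped[component] = reason
    return enabled, skipped


if __name__ == "__main__":
    # Hypothetical components and versions, for illustration only.
    wanted = {"component_a": {"flint": "3.0"}, "component_b": {"gap": "4.12"}}
    system = {"flint": "3.1.2", "gap": "4.11.1"}
    print(select_components(wanted, system))
```

Recording a reason for each skipped component matters for this option: users would otherwise have no way to tell why part of sagelib is missing from their installation.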

Advantages:

  • This approach leverages the existing ecosystem of package managers, eliminating the need for custom scripts.
  • Package managers ensure that dependencies are installed in a consistent, well-tested way.

Disadvantages:

  • Many dependencies may not be available, potentially reducing the functionality of the installed sage.
  • Relying on package managers may lead to conflicts between system-level packages and the versions required by SageMath, especially for highly specific or custom libraries.
  • This method provides less control over the installation process, and custom patches or configurations for dependencies may be harder to manage.

Mixed Strategies

These potential solutions should not necessarily be seen as mutually exclusive. A mixed approach could combine the strengths of different strategies to balance simplicity, flexibility, and maintainability. For example: Conda could be the recommended installation procedure (3.) and otherwise Sage would simply not install parts of sagelib if a dependency is not installed via the system's package manager (5.).

Request for Community Input

I invite the SageMath community to share their experiences and thoughts about these solutions. Are there additional tools or approaches we should consider to reduce the dependency burden?
Looking forward to your feedback and ideas!

Pinging a few people that showed interest in discussing these topics in the past: @kwankyu @dimpase @orlitzky @tornaria @antonio-rojas @saraedum

@kwankyu
Collaborator

kwankyu commented Dec 23, 2024

This is a good topic to open in Discussions but inappropriate as an issue, which should focus narrowly.

@saraedum
Member

@tobiasdiez let me know if you want me to convert this to a discussion.

@saraedum
Member

saraedum commented Dec 23, 2024

I would like to add pixi to the list. I think it solves many of the usability problems that a plain conda setup has when providing developer dependencies.

@dimpase
Member

dimpase commented Dec 23, 2024

One immediate improvement to the current sage-the-distro model would be to make the minimal requirements less needlessly minimal. For example, on Linux systems they don't include g++ and gfortran, which makes no sense, as all the Linux systems we support provide these, ready to build Sage.

These are set in build/pkgs/_prereq/distros/.
You can read more on this in #36726, where a fight erupted over the refusal to add a package providing pkill to the set of minimum requirements. And yes, we still contribute to global warming by building, in CI, gcc on systems which have perfectly OK gcc's. Can #36726 be revisited, or do we need to wait (hopefully not too long?) for a Sage Technical Committee to take charge of reducing the technical debt?
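
As a rough illustration of what a prerequisite list amounts to, the following Python sketch turns a package list into an install command. It assumes a list file made of whitespace-separated package names with `#` comments; the sample content, the `INSTALL_COMMANDS` mapping, and the helper names are hypothetical and not taken from the Sage repository.

```python
import shlex
from typing import Dict, List

# Illustrative mapping only; real systems may need different commands or flags.
INSTALL_COMMANDS: Dict[str, List[str]] = {
    "debian": ["sudo", "apt-get", "install"],
    "fedora": ["sudo", "dnf", "install"],
    "arch": ["sudo", "pacman", "-S"],
}


def parse_package_list(text: str) -> List[str]:
    """Collect package names, ignoring blank lines and '#' comments."""
    packages: List[str] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop a trailing comment, if any
        packages.extend(line.split())
    return packages


def install_command(distro: str, text: str) -> str:
    """Build a shell-quoted install command for the given distro."""
    return shlex.join(INSTALL_COMMANDS[distro] + parse_package_list(text))


if __name__ == "__main__":
    # Hypothetical prerequisite list that includes the compilers mentioned above.
    sample = "# compilers\ngcc g++ gfortran\nmake  # build tool\n"
    print(install_command("debian", sample))
```

Under this reading, adding g++ and gfortran to the minimal requirements is a one-line change to the list, and the install command shown to users follows from it.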

@orlitzky
Contributor

Now that we can build sagelib independently using meson, I'm pretty happy.

We have at least five people packaging the dependencies of sage in various ecosystems. I'm quite content to build sagelib and ignore everything under build/ from now on. If someone wants to maintain that, to each his own, but I've been saying it's a waste of time for like 15 years now.

I'll be focusing on the few remaining issues (database packages) needed to get sage into Gentoo. When that's done, if you want to use sage, install the Gentoo package. If you want to develop sage... install the Gentoo package, and then clone sage.git and build sagelib using meson.

That doesn't work for non-Gentoo users, but I think that taking a similar stance with (say) conda would be a lot more scalable than what we are doing now.

@PredictiveManish

Based on above discussions, I think:

  1. Adopt Conda as the primary solution, since it suits all kinds of users: it provides an easier setup and a cross-platform experience, and it is widely used in the scientific community and in various other organizations.
  2. For users who prefer not to use Conda or don't have it in their environments, SageMath can allow the use of package managers, but with a clear set of supported versions and a less complete setup in case dependencies are unavailable.
  3. We can implement tools or scripts that help resolve dependencies automatically (similar to Conda's approach), which can reduce the need for manual maintenance.
  4. Lastly, if Meson's Wrap-DB proves effective, it could help extend support to Windows users, resolving long-standing compatibility issues on that platform.

Please let me know whether this can be an effective solution. A mixed approach may be the better option, since beginners are covered (via Conda) while advanced users keep flexibility (via package managers or custom scripts). This approach also remains maintainable and adaptable to evolving needs.

@dimpase
Member

dimpase commented Dec 28, 2024

It seems natural at this point to decouple, as much as possible, sagelib and its requirements/dependencies (a.k.a. sage the distro).

This will simplify packaging sagelib for the linux distros and for conda, as well as acknowledge the reality of how sagelib is used. People who are not interested in maintaining sage the distro won't be distracted from work on sagelib, too.

This will also make it easier to design, if desired, a slimmer alternative to the current sage the distro, which is a product of many years of incremental updates, and as such is way too complicated and suboptimal.

@tobiasdiez
Contributor Author

> This is a good topic to open in Discussions but inappropriate as an issue, which should focus narrowly.

I was not aware that we use Discussions actively. Feel free to move it there if you think it's a more appropriate category.

> I would like to add pixi to the list. I think it solves many of the usability problems that a plain conda setup has when providing developer dependencies.

Thanks for the suggestion. I agree pixi looks great (so does uv if they would support conda). I've added it as a suboption of conda in the list above since they are quite similar in many regards. Okay?

So is your preference a pixi-only setup, or how would you envision the setup?

> I'm quite content to build sagelib and ignore everything under build/ from now on. If someone wants to maintain that, to each his own, but I've been saying it's a waste of time for like 15 years now.

> It seems natural at this point to decouple, as much as possible, sagelib and its requirements/dependencies (a.k.a. sage the distro).

While I agree with the general sentiment here, I was hoping to find a solution that would fit most people's needs and wishes. I think we have also seen that the each-developer-their-own-corner approach only works as long as the corners don't intersect. But at some point you are bound to run into problems because one person needs some change that conflicts with someone else's approach. To give some concrete examples:

  • Sage-the-lib cannot just require numpy 2.0 because our current testing system relies largely on sage-the-distro, which brings its own version of numpy.
  • There is the plan to move some of the runtime dependency checks on programs to the build stage using meson. Easy to do in sage-the-lib; but as long as sage-the-distro is not using meson this is not possible.
  • In many places, the docs refer to sage -install, which doesn't make sense for conda / pure-meson. How to improve this?
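
The second example, moving program checks from runtime to the build stage, can be sketched as follows. In Meson itself this would typically be done with `find_program`; the Python below only illustrates the idea of checking once at configure time and reporting everything that is missing. The program names and the helpers `check_programs` and `missing_programs` are hypothetical.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional


def check_programs(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Look up each program once and record its path (None if absent)."""
    return {name: which(name) for name in names}


def missing_programs(found: Dict[str, Optional[str]]) -> List[str]:
    """Names of programs that were not found, in sorted order."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    # Hypothetical list of external programs; the result depends on the machine.
    found = check_programs(["gap", "maxima"])
    absent = missing_programs(found)
    if absent:
        print("Missing at configure time:", ", ".join(absent))
    else:
        print("All external programs found.")
```

Doing the lookup once at build time means a missing program is reported before anything is installed, rather than surfacing later as a runtime error in whichever function happens to need it.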

Sure, in each instance you could say, I don't care about sage-the-distro and just move ahead with the necessary changes in sage-the-lib; but then you are bound to get very angry looks from the people that would like to keep sage-the-distro as the main installation mechanism.

So, in summary, my hope was to find a way that would unite most developers to work in the same direction.

For what it's worth, I personally essentially agree with the approach sketched above by @PredictiveManish, except that I don't think we need point 3 ("installer scripts"): Conda as the primary installation method and for setting up the dev environment, complemented by sufficiently strong dependency checks in meson to make it possible to also install all dependencies manually (say via package managers), and meson-wraps for some important dependencies that are hard to install (on a given system).

@PredictiveManish

Yes, you're right, point 3 is just an additional point and can easily be dropped. The remaining points cover the whole range of tasks, for beginners and advanced developers alike.

@sagemath sagemath locked and limited conversation to collaborators Jan 4, 2025
@saraedum saraedum converted this issue into discussion #39272 Jan 4, 2025

