Preliminary steps to save the CI infrastructure #39009
Documentation preview for this PR (built with commit 527a5ae; changes) is ready! 🎉 |
Before proceeding, it would be good to decide whether we would rather go with the other PR.
I don't think it is doing
I agree with you that your diagram is the most complete and most logical one; I just very much doubt that we have the resources to have all these tests pass. For this reason, I would start with "standard" and the old "optional", and once they are green most of the time and someone still has resources, we can activate "minimal" and your "optional".
In my view, "minimal" is the most important job because in it, all standard sage packages are built. How can they be tested on a platform without actually installing them on the platform? A reasonable way to proceed is this (sorry for repeating):
Some reasonable ways (I may accept) to reduce resource usage include:
Putting more packages into |
No, it does not work as you think. In "maximal pre", all system packages corresponding to optional packages are installed. For example, find
in https://github.com/sagemath/sage/actions/runs/13596179432/job/38030950449, and in old "optional", find
in https://github.com/sagemath/sage/actions/runs/13596179432/job/38036933209. "bliss" is an optional package. |
There are plenty of packages in standard which are never built on most platforms. If you build gcc on a platform where another version of gcc is always used, your test is meaningless and might be misleading. When we at last adjust _prereqs to meaningful contents, it might start making sense to talk about running "minimal" tests, although they won't be much different from standard then.
Please separate the issues for constructive discussion. This PR fixes the CI so that "switches and knobs" work for the components of the CI machine. We may discuss in the other PR about which switches to turn on and off, considering resources ("time" and "electricity") and utility. The situation (about resources and utility) may change in time. I created this PR because removing those "switches and knobs" is not a proper way to maintain the CI machine. |
Is it? Then what does that part mean?
That shows the decision of |
And then you never really know which bliss is used in this test. If you must test bliss as installed by sage, then you must make sure there is no other bliss around. Note that even if configure rejected the system bliss, it would still be possible, in principle, for some other optional package to use the system bliss.
Note that this is the situation of old "optional" based on "maximal-pre".
Then the new "optional" based on "standard" is even better in that regard since there is no other bliss around. |
I removed some linux releases (ubuntu and fedora releases) from CI, according to
I suggest that as a guideline to decide which Linux releases we should run tests for in CI. I don't know if there is already a similar guideline in our documentation (or in sage-devel). |
I don't think we necessarily need to test this. In case of errors during the compilation of sage packages, people seem to be pragmatic and recommend installing the system package (latest example).
I agree we should not test systems past their EOL (but this should not be the only criterion for dropping support).
We are testing sage packages on multiple levels:
for the best stability of the release, when built on the user's machine.
This is a failure of the multiple-level testing, since the testing is not perfect. Reducing such instances of build failure is exactly the purpose of our CI infrastructure. That helps people, including Dima, live a better life.
Do we still support python 3.9? Where is the oldest python we support documented? We may remove more platforms if we do not support python 3.9...
We still support python 3.9 according to https://doc-release--sagemath.netlify.app/html/en/reference/spkg/python3 |
I think we went through this argument: yes, the CI did its job (with an occasional fix needed, such as this one?), but we don't have enough resources to fix failures anyway, and/or the people who could fix them don't see the necessity because a workaround is available. (As for your counterargument that people who could fix it don't because testing is not perfect: yes, it's true that testing is not perfect, but the implication wouldn't hold if people hitting build issues are still redirected to install the system package even before the CI fails.) Yes, I also agree that it's better to disable than to delete the code, so that (hypothetically) if we get more resources in the future it can simply be re-enabled.
has been dropped since #39251 (unfortunately there aren't enough links between relevant parts of the source code and documentation, so people can forget to update one when another is updated)
Exactly. Each vendored package is more work down the road, not less. |
I think you mean people or developers by "resources". We cannot say definitely "people will fix it", "people won't fix it", "we have enough resources", or "we don't have enough resources". We don't know who will be interested in fixing some sage package for some platform. So what is the implication of your argument? Stop running CI (testing sage packages)?
I didn't argue that "people could fix it doesn't, because testing is not perfect". I said "build failures on user machines happen because testing (through CI) is not perfect".
OK. Then I will just leave the task of dropping other old platforms to the other PR. |
People should not be interested in fixing vendored packages which can be perfectly replaced by what's provided by systems/distros. We have a lot of really messy old broken code in sagelib, and big chunks should be redone, for a variety of reasons. This is where the time should go, not elsewhere.
off topic: then put them into |
To improve the situation with the CI infrastructure, this PR:
added comments untangling obscure code in CI-related files, for those poor guys who ever attempt to read the files for whatever reasons.
while doing the cosmetic changes, a bug (about -uninstall targets) was found in build/make/Makefile.in, which is fixed here.
to test, do
fixed some jobs in the CI-linux workflow that fail because of duplicate artifact names.
removed ubuntu-lunar, ubuntu-mantic, conda-forge-python3.11, ubuntu-bionic-gcc_8-i386, debian-bullseye-i386 from the list of the default systems that CI runs for. This is how to properly modify the list:
tox.ini (find DEFAULT_SYSTEM_FACTORS)
tox -e update_docker_platforms
removed old versions of linuxmint and added new versions.
removed old versions of fedora distributions.
"optional" and "experimental" jobs now run upon "standard" docker images, instead of "maximal" ones, to avoid "out of runner space" error.
renamed "Reusable workflow for Docker-based portability CI" to "Workflow for Linux portability CI" for a shorter name, and made it runnable through the GitHub interface, to facilitate testing a specific platform, by adding "workflow-dispatch" calling docker.yml.
test: https://github.com/kwankyu/sage/actions/workflows/docker.yml
added helpful comments and updated the developer doc
reimplemented .ci/write-dockerfile.sh so that simplified Dockerfile is generated for present and future stability
turned off failing jobs in "CI Linux incremental"
removed seemingly useless subprojects/factory directory to eliminate certain git warnings.
turned off "standard-sitepackages" and "standard-constraints_pkgs-norequirements" jobs as they fail on (almost) all platforms.
test CI run (as of 10.6.beta8): https://github.com/kwankyu/sage/actions/runs/13676372856
compare with the status quo: https://github.com/sagemath/sage/actions/runs/13596179432
test CI with a PR: kwankyu#82
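The duplicate-artifact-name failure mentioned in the list above can be illustrated with a minimal sketch. It assumes that artifact names must be unique within a workflow run and that each name is built from job matrix factors; the factor names, values, and the "logs-..." templates below are hypothetical and are not taken from the actual workflow files.

```python
from collections import Counter
from itertools import product


def artifact_names(template, **factors):
    """Expand a name template over every combination of the given factors."""
    keys = list(factors)
    return [
        template.format(**dict(zip(keys, combo)))
        for combo in product(*(factors[k] for k in keys))
    ]


def duplicates(names):
    """Return the sorted names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical factors; the real workflow matrix may differ.
factors = {
    "system": ["ubuntu-jammy", "debian-bookworm"],
    "packages": ["standard", "minimal"],
    "stage": ["pre", "full"],
}

# A template that ignores "stage" maps two jobs to the same artifact name.
clashing = artifact_names("logs-{system}-{packages}", **factors)
# Including every factor in the template keeps the names distinct.
unique = artifact_names("logs-{system}-{packages}-{stage}", **factors)

print(duplicates(clashing))
print(duplicates(unique))
```

A check of this kind could be run over the generated job matrix before the jobs start, so that a name clash shows up as a configuration error rather than as a failed upload.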
The main objective of this PR is to solve issues with the workflow "CI Linux" such that a failure on a platform reveals solely some problem of sage built on the platform, and not a problem of the CI infrastructure. After this PR, hopefully, each of the failing platforms can be tackled individually. If a platform fails, perhaps we should
I suggest discontinuing support (at least in CI) for Linux releases that have been past their EOL (end of life or end of support by the distributor) for more than 2 years.
Only decent platforms according to the CI results should be listed in https://github.com/sagemath/sage/wiki/Sage-10.6-Release-Tour#availability-and-installation-help.
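The EOL guideline suggested above (drop a release once it has been past its EOL for more than 2 years) can be sketched as a small filter. This is only an illustration of the rule: the platform names and EOL dates below are placeholders, not real distribution data, and the helper names are hypothetical.

```python
from datetime import date


def past_grace_period(eol, today, grace_years=2):
    """True if `today` is more than `grace_years` years after `eol`."""
    try:
        cutoff = eol.replace(year=eol.year + grace_years)
    except ValueError:
        # eol is 29 February and the target year is not a leap year
        cutoff = eol.replace(year=eol.year + grace_years, day=28)
    return today > cutoff


def platforms_to_keep(eol_dates, today, grace_years=2):
    """Return the sorted platform names still within the grace period."""
    return sorted(
        name
        for name, eol in eol_dates.items()
        if not past_grace_period(eol, today, grace_years)
    )


# Placeholder data for illustration only; these are not real EOL dates.
eol_dates = {
    "distro-a": date(2020, 1, 31),
    "distro-b": date(2024, 6, 30),
    "distro-c": date(2030, 1, 1),
}

kept = platforms_to_keep(eol_dates, today=date(2025, 3, 1))
print(kept)
```

With real EOL dates filled in, the output of such a filter could serve as a starting point for the list of default systems, subject to the other criteria raised in the discussion.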
The following diagram shows how packages are installed for each of CI jobs:
where "S" represents system package and dash "-" represents Sage package. Hence