nightly + 1.10-alpha1: GC marking segfault during GAP tests #2543
Comments
Sigh. OK, in this PR we are in Julia code invoked from GAP, which worried me; but the other two are in pure GAP, it seems. (Not that this is great, but at least it's not yet another completely unrelated can of worms to be opened.) So I wonder if there is yet another GC change on the Julia side behind this... |
I am able to reproduce this with a loop:

    for i in 1:100
        println(i)
        include("test/GAP/runtests.jl")
        println("-----")
    end

It crashed in iteration 13 for me in Julia 1.10.0-alpha1 on an M1 Mac. So I'll see if I can bisect using this... |
Unfortunately it seems that the crash does not reproduce reliably (in subsequent tests with versions that exhibited the crash), so my hopes of being able to bisect don't seem to pan out :-( |
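Since the crash is intermittent, one way to make bisecting less of a gamble is to estimate a crash rate per Julia build by running the reproducer several times in fresh processes. The following is only a sketch under assumptions: `crash_rate` is a hypothetical helper (not part of Oscar or Julia), and the `--project=.` flag in the commented example assumes the command is run from an Oscar checkout.

```julia
# Sketch: estimate how often a flaky crash occurs.
# `crash_rate` and its keyword `trials` are hypothetical names for illustration.

"""
Run `cmd` `trials` times in fresh processes and return the fraction of runs
that did not exit cleanly (nonzero exit code or killed by a signal).
"""
function crash_rate(cmd::Cmd; trials::Integer = 20)
    failures = 0
    for _ in 1:trials
        # `ignorestatus` keeps `run` from throwing when the child process fails;
        # output is discarded so only the exit status is inspected.
        proc = run(pipeline(ignorestatus(cmd); stdout = devnull, stderr = devnull))
        success(proc) || (failures += 1)
    end
    return failures / trials
end

# Harmless demonstration with a child process that always exits cleanly.
julia = Base.julia_cmd()
println(crash_rate(`$julia -e 'exit(0)'`; trials = 2))

# For the loop from this thread one might pass (assumption: run from the Oscar
# checkout, with the loop kept inside the child so all includes share a session):
#   `$julia --project=. -e 'for i in 1:100; include("test/GAP/runtests.jl"); end'`
```

Each trial starts a new Julia process, so a segfault in the child does not take down the driver. A build would only be marked "good" during bisection if the rate stays at zero over enough trials, which is a judgment call given how rare the crash is.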
There is a slightly longer error message on a recent failure on macos here https://github.com/oscar-system/Oscar.jl/actions/runs/5541671673/jobs/10115375924#step:6:489. Maybe the ptr queue that is printed there can help? To make sure it doesn't disappear I copied it here. |
I discovered JuliaLang/julia#50434 last night, which was a gcext error that was fixed a few days ago in JuliaLang/julia#50533. Of course now we have even more nightly crashes for other reasons (???), so I am not sure whether or not this error here persists... |
In the 32 CI jobs (here + here, all on Linux to avoid occupying the macos runners) that I ran last night on nightly there were 6 failures but none like the one discussed here.
Another round of tests is still running. But there is one new issue: There are very long GC pauses now and some of the tests take extremely long.
I tried bisecting this locally which is always a bit complicated with timing issues as they don't always show up. But this seems to have started with JuliaLang/julia#50533 which added a new loop for the freelist to
|
To trigger these long pauses it suffices to run some of our test files in the same session, e.g.:
Other testgroups should work as well, some more extreme, some less, e.g. for
PS: This needs a current Oscar master, older releases are not working with julia nightly. |
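To put a number on the long pauses, the share of wall time spent in GC can be measured with `@timed`, which reports `time` and `gctime` in seconds. This is a minimal sketch: `gc_share` is a hypothetical helper, and the allocating workload below merely stands in for the Oscar test files, which would be substituted in practice.

```julia
# Sketch: report how much of a workload's wall time was spent in garbage collection.
# `gc_share` is a hypothetical helper, not part of Oscar or Julia.
function gc_share(f)
    stats = @timed f()   # NamedTuple with fields value, time, bytes, gctime, gcstats
    share = stats.time > 0 ? stats.gctime / stats.time : 0.0
    return (time = stats.time, gctime = stats.gctime, share = share)
end

# Stand-in workload that allocates many short-lived arrays.
res = gc_share() do
    for _ in 1:10_000
        zeros(1000)
    end
end
println(res)
```

Calling `gc_share(() -> include(path))` for each test file in the same session would show whether the GC share grows from file to file. On Julia 1.8 and later, `GC.enable_logging(true)` additionally prints a line per collection, which helps to spot individual long pauses.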
Yeah, if you can get an MWE of this, it would be great to get it into the GCBenchmarks. |
The offending loop over the freelist was removed again in JuliaLang/julia#50599 so hopefully the long GC pauses are gone again... |
Should we close this in favor of #2441? Or maybe open a new issue? The CI situation with 1.10+? and nightly is absurd. |
I think the issue discussed here is resolved (the crash and the pauses). |
Another rare crash, this time during the GAP OscarInterface tests, log:
The log is from this run:
https://github.com/oscar-system/Oscar.jl/actions/runs/5516967428/jobs/10058976212?pr=2542#step:6:441 (julia commit 680e3b3320f)
More logs:
https://github.com/oscar-system/Oscar.jl/actions/runs/5513751902/jobs/10052247798#step:6:811 (julia commit e2e34f6987d)
https://github.com/oscar-system/Oscar.jl/actions/runs/5511670816/jobs/10047525726#step:6:813 (julia 1.10.0-alpha1)
cc: @fingolfin