crashers/mutation_inside_cyclegc.py hangs and sometimes segfaults on the free-threading build #126365

Open
tomasr8 opened this issue Nov 3, 2024 · 5 comments
Labels
3.14 new features, bugs and security fixes topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@tomasr8
Member

tomasr8 commented Nov 3, 2024

Crash report

What happened?

I ran into this while working on #126360

Running ./python Lib/test/crashers/mutation_inside_cyclegc.py on a build configured with --with-pydebug --disable-gil
hangs and occasionally segfaults. I managed to get a backtrace from gdb:

gdb output
(gdb) run ./Lib/test/crashers/mutation_inside_cyclegc.py 
Starting program: /home/tomas/dev/cpython/python ./Lib/test/crashers/mutation_inside_cyclegc.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
visit_decref (op=<unknown at remote 0x28080fbd450>, arg=0x0) at Python/gc_free_threading.c:442
442         if (_PyObject_GC_IS_TRACKED(op) && !_Py_IsImmortal(op)) {
(gdb) bt
#0  visit_decref (op=<unknown at remote 0x28080fbd450>, arg=0x0) at Python/gc_free_threading.c:442
#1  0x00005555556bfad9 in list_traverse (
    self=[<weakref.ReferenceType at remote 0x20000573e90>, ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], [...(truncated), visit=0x5555558a2b58 <visit_decref>, arg=0x0) at Objects/listobject.c:3286
#2  0x00005555558a5f2d in update_refs (heap=<optimized out>, area=<optimized out>, block=<optimized out>, block_size=<optimized out>, args=<optimized out>) at Python/gc_free_threading.c:509
#3  0x0000555555715d7a in _mi_heap_area_visit_blocks (area=area@entry=0x7fffffffcff0, page=0x200000018a8, visitor=0x5555558a5e62 <update_refs>, arg=0x7fffffffd200) at Objects/mimalloc/heap.c:566
#4  0x000055555571650d in mi_heap_area_visitor (heap=heap@entry=0x555555c95140 <_PyRuntime+366400>, xarea=xarea@entry=0x7fffffffcff0, arg=arg@entry=0x7fffffffd0d0) at Objects/mimalloc/heap.c:681
#5  0x0000555555715a93 in mi_heap_visit_areas_page (heap=0x555555c95140 <_PyRuntime+366400>, pq=<optimized out>, page=<optimized out>, vfun=0x5555557164c9 <mi_heap_area_visitor>, arg=0x7fffffffd0d0) at Objects/mimalloc/heap.c:661
#6  0x000055555570bd28 in mi_heap_visit_pages (heap=heap@entry=0x555555c95140 <_PyRuntime+366400>, fn=fn@entry=0x555555715a4a <mi_heap_visit_areas_page>, arg1=arg1@entry=0x5555557164c9 <mi_heap_area_visitor>, arg2=arg2@entry=0x7fffffffd0d0)
    at Objects/mimalloc/heap.c:46
#7  0x000055555570bd93 in mi_heap_visit_areas (heap=heap@entry=0x555555c95140 <_PyRuntime+366400>, visitor=visitor@entry=0x5555557164c9 <mi_heap_area_visitor>, arg=arg@entry=0x7fffffffd0d0) at Objects/mimalloc/heap.c:667
#8  0x000055555571be62 in mi_heap_visit_blocks (heap=heap@entry=0x555555c95140 <_PyRuntime+366400>, visit_blocks=visit_blocks@entry=true, visitor=visitor@entry=0x5555558a5e62 <update_refs>, arg=arg@entry=0x7fffffffd200) at Objects/mimalloc/heap.c:692
#9  0x00005555558a2d8e in gc_visit_heaps_lock_held (interp=interp@entry=0x555555c5b280 <_PyRuntime+129152>, visitor=visitor@entry=0x5555558a5e62 <update_refs>, arg=arg@entry=0x7fffffffd200) at Python/gc_free_threading.c:309
#10 0x00005555558a2e71 in gc_visit_heaps (interp=interp@entry=0x555555c5b280 <_PyRuntime+129152>, visitor=visitor@entry=0x5555558a5e62 <update_refs>, arg=arg@entry=0x7fffffffd200) at Python/gc_free_threading.c:348
#11 0x00005555558a36da in deduce_unreachable_heap (interp=interp@entry=0x555555c5b280 <_PyRuntime+129152>, state=state@entry=0x7fffffffd200) at Python/gc_free_threading.c:688
#12 0x00005555558a6c88 in gc_collect_internal (interp=0x555555c5b280 <_PyRuntime+129152>, state=state@entry=0x7fffffffd200, generation=generation@entry=0) at Python/gc_free_threading.c:1235
#13 0x00005555558a6e7a in gc_collect_main (tstate=tstate@entry=0x555555c937e0 <_PyRuntime+359904>, generation=generation@entry=0, reason=reason@entry=_Py_GC_REASON_HEAP) at Python/gc_free_threading.c:1349
#14 0x00005555558a7228 in _Py_RunGC (tstate=tstate@entry=0x555555c937e0 <_PyRuntime+359904>) at Python/gc_free_threading.c:1799
#15 0x00005555558b133e in _Py_HandlePending (tstate=tstate@entry=0x555555c937e0 <_PyRuntime+359904>) at Python/ceval_gil.c:1297
#16 0x000055555583e6cc in _PyEval_EvalFrameDefault (tstate=tstate@entry=0x555555c937e0 <_PyRuntime+359904>, frame=<optimized out>, throwflag=throwflag@entry=0) at Python/generated_cases.c.h:1003
#17 0x000055555586b756 in _PyEval_EvalFrame (throwflag=0, frame=<optimized out>, tstate=0x555555c937e0 <_PyRuntime+359904>) at ./Include/internal/pycore_ceval.h:116
#18 _PyEval_Vector (tstate=tstate@entry=0x555555c937e0 <_PyRuntime+359904>, func=func@entry=0x20000a52cb0, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/tomas/dev/cpython/./Lib/test/crashers/mutation_inside_cyclegc.py') at remote 0x20000330a20>, '__spec__': None, '__builtins__': <module at remote 0x2000025c1e0>, '__file__': '/home/tomas/dev/cpython/./Lib/test/crashers/mutation_inside_cyclegc.py', '__cached__': None, 'weakref': <module at remote 0x20000798190>, 'A': <type at remote 0x20000c44b10>, 'callback': <function at remote 0x20000a5a030>, 'keepalive': [<weakref.ReferenceType at remote 0x20000573e90>, ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'], ['0'],
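
Since the script sometimes hangs and sometimes segfaults, it can help to tally the outcomes over several runs. The following is a minimal sketch for that, assuming a POSIX system (where a negative return code means the child was killed by a signal). The helper names `classify_run` and `tally_runs` and the timeout value are hypothetical and not part of CPython's test infrastructure.

```python
import collections
import signal
import subprocess
import sys


def classify_run(script_args, python=sys.executable, timeout=10.0):
    """Run `python script_args...` once and return a short outcome label.

    Returns "hang" if the child exceeds `timeout` seconds,
    "signal:<NAME>" if it was killed by a signal (POSIX only),
    or "exit:<code>" otherwise.
    """
    try:
        proc = subprocess.run(
            [python, *script_args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # subprocess.run kills the child on timeout
        )
    except subprocess.TimeoutExpired:
        return "hang"
    if proc.returncode < 0:
        # On POSIX, -N means "terminated by signal N".
        return "signal:" + signal.Signals(-proc.returncode).name
    return f"exit:{proc.returncode}"


def tally_runs(script_args, runs=5, **kwargs):
    """Repeat classify_run `runs` times and count each outcome label."""
    return collections.Counter(
        classify_run(script_args, **kwargs) for _ in range(runs)
    )
```

For the report above, a call could look like `tally_runs(["Lib/test/crashers/mutation_inside_cyclegc.py"], python="./python", runs=20)`, run from the root of the checkout.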

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

No response

@tomasr8 tomasr8 added type-crash A hard crash of the interpreter, possibly with a core dump topic-free-threading 3.14 new features, bugs and security fixes labels Nov 3, 2024
@ZeroIntensity
Member

Well, isn't the point of anything inside crashers/ that it crashes? I find it odd that this doesn't crash on the GIL-ful builds.

@tomasr8
Member Author

tomasr8 commented Nov 3, 2024

To clarify: this example no longer crashes on the normal build, and probably hasn't for a while. The crasher also predates free-threading, and the crash seems to be related to the way GC works in free-threading. I don't know enough about that topic, hence this issue 🙂
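
For context, here is a minimal sketch of the general mechanism the crasher's name points at: the cycle collector can run arbitrary Python code through a weakref callback. This is not the crasher script itself. `Node`, `on_dead` and `collect_cycle_with_callback` are hypothetical names, and the snippet only shows the callback firing during a collection, without mutating anything the collector is traversing.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a class whose instances form a cycle."""


def collect_cycle_with_callback():
    """Build a self-referencing object and let the cycle GC reclaim it.

    Returns (events, is_dead): the messages recorded by the weakref
    callback and whether the weak reference is dead afterwards.
    """
    events = []

    def on_dead(ref):
        # Runs while the object is being reclaimed; arbitrary Python code
        # (including mutation of other containers) could execute here.
        events.append("callback ran")

    node = Node()
    node.cycle = node          # reference cycle: refcounting alone cannot free it
    ref = weakref.ref(node, on_dead)
    del node                   # now only the cycle keeps the object alive
    gc.collect()               # the cycle collector finds and clears the cycle
    return events, ref() is None
```

The point of the sketch is only that user code can execute as a side effect of a collection, which is the kind of situation the crasher exercises.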

@ZeroIntensity
Member

I've been doing some work on the free-threaded GC recently, so I can investigate. Though, we probably should add tests to make sure that the crashers actually crash.

@tomasr8
Member Author

tomasr8 commented Nov 3, 2024

That'd be great, thanks!

> Though, we probably should add tests to make sure that the crashers actually crash.

See the original issue for more context: #121921. I started by moving non-crashers out of crashers/ and into unit tests to ensure we don't regress, but having tests for the crashers themselves might also make sense.

@zware
Member

zware commented Nov 5, 2024

There's some history around running the crashers that should be dug up and understood before re-enabling them in automated testing. I particularly don't want them running on buildbots unless we can guarantee (better than we used to :)) that they're not going to leave things in a mess, but they could possibly make sense in GHA.
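
One possible way to limit the mess a crasher leaves behind is to run it in a throwaway working directory with core dumps disabled and a hard timeout. The following is only a sketch under those assumptions, for POSIX systems. `run_isolated` is a hypothetical helper, not an existing CPython test utility, and it does not address whatever went wrong on the buildbots in the past.

```python
import resource
import subprocess
import sys
import tempfile


def _no_core_dumps():
    # Runs in the child just before exec: ask the kernel not to write a core file.
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def run_isolated(script_args, python=sys.executable, timeout=30.0):
    """Run `python script_args...` in a scratch directory.

    Returns the child's return code (negative if killed by a signal),
    or None if it had to be killed after `timeout` seconds.
    Script paths must be absolute, because the child's cwd is changed.
    """
    with tempfile.TemporaryDirectory() as scratch:
        try:
            proc = subprocess.run(
                [python, *script_args],
                cwd=scratch,                # stray files land in the scratch dir
                preexec_fn=_no_core_dumps,  # POSIX only
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None
        return proc.returncode
    # The scratch directory and its contents are removed on exit from the with-block.
```

A test for a crasher could then assert that the result is either `None` or a negative return code, depending on what counts as "still crashing" for that script.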
