PyCode_GetCode is not thread-safe and causes assertion fail with Python 3.13td #127020

Closed
XuehaiPan opened this issue Nov 19, 2024 · 3 comments
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@XuehaiPan
Contributor

XuehaiPan commented Nov 19, 2024

Crash report

What happened?

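There is a race condition in the lazy initialization of `co->_co_cached->_co_code` in `_PyCode_GetCode`. Under the free-threaded build, two threads can both reach the assertion below, and the second one finds the cache already filled: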

cpython/Objects/codeobject.c

Lines 1663 to 1665 in 60403a5

deopt_code(co, (_Py_CODEUNIT *)PyBytes_AS_STRING(code));
assert(co->_co_cached->_co_code == NULL);
co->_co_cached->_co_code = Py_NewRef(code);
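
A minimal sketch of how this path can be exercised from Python, assuming that reading the `co_code` attribute ends up in the same lazy initialization as `PyCode_GetCode`. The helper names (`make_code`, `read_co_code_concurrently`) are hypothetical and this is not the test from the linked PR. On a build with the GIL, or on a fixed free-threaded build, every thread should simply get the same bytes.

```python
import threading


def make_code():
    # Compile a fresh code object so its cached co_code has not been built yet.
    return compile("x = 1\ny = x + 1\n", "<sketch>", "exec")


def read_co_code_concurrently(code, num_threads=8):
    barrier = threading.Barrier(num_threads)
    results = [None] * num_threads

    def worker(index):
        # Release all threads together to widen the race window.
        barrier.wait()
        results[index] = code.co_code

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


results = read_co_code_concurrently(make_code())
# Every thread should observe an identical bytes object's contents.
print(all(r == results[0] for r in results))
```

Whether an affected build actually aborts depends on timing, so a single run that passes does not show the race is absent.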

Core dump and backtrace:

https://github.com/metaopt/optree/actions/runs/11913729282/job/33200071659#step:15:172

Core was generated by `python -X dev -m pytest --verbose --color=yes --durations=10 --showlocals --cov'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140335707584064) at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x7fa273fff640 (LWP 2491))]
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140335707584064) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140335707584064) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140335707584064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fa27b642476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fa27be3f0f0 in faulthandler_fatal_error (signum=6) at ./Modules/faulthandler.c:338
#5  <signal handler called>
#6  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140335707584064) at ./nptl/pthread_kill.c:44
#7  __pthread_kill_internal (signo=6, threadid=140335707584064) at ./nptl/pthread_kill.c:78
#8  __GI___pthread_kill (threadid=140335707584064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#9  0x00007fa27b642476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#10 0x00007fa27b6287f3 in __GI_abort () at ./stdlib/abort.c:79
#11 0x00007fa27b62871b in __assert_fail_base (fmt=0x7fa27b7dd130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x7fa27bf0b730 "co->_co_cached->_co_code == NULL", file=0x7fa27bf0b174 "Objects/codeobject.c", line=1664, 
    function=<optimized out>) at ./assert/assert.c:92
#12 0x00007fa27b639e96 in __GI___assert_fail (assertion=0x7fa27bf0b730 "co->_co_cached->_co_code == NULL", 
    file=0x7fa27bf0b174 "Objects/codeobject.c", line=1664, function=0x7fa27bf0c0e0 <__PRETTY_FUNCTION__.13> "_PyCode_GetCode")
    at ./assert/assert.c:101
#13 0x00007fa27bb7f165 in _PyCode_GetCode (co=0x200023271d0) at Objects/codeobject.c:1664
#14 0x00007fa27bb7f1a5 in PyCode_GetCode (co=0x200023271d0) at Objects/codeobject.c:1672
#15 0x00007fa27b1b2547 in CTracer_handle_call (frame=0x20026051010, self=0x20026070110) at coverage/ctracer/tracer.c:557
#16 CTracer_trace (self=0x20026070110, frame=0x20026051010, what=0, arg_unused=<optimized out>)
    at coverage/ctracer/tracer.c:844
#17 0x00007fa27bdd738d in call_trace_func (self=0x20002152ea0, arg=0x7fa27c14fc60 <_Py_NoneStruct>)
    at Python/legacy_tracing.c:189
#18 0x00007fa27bdd75fa in sys_trace_start (self=0x20002152ea0, args=0x7fa273ffcdf8, nargsf=9223372036854775810, kwnames=0x0)
    at Python/legacy_tracing.c:229
#19 0x00007fa27bdcc0a5 in _PyObject_VectorcallTstate (tstate=0x5620713a56e0, callable=0x20002152ea0, args=0x7fa273ffcdf8, 
    nargsf=9223372036854775810, kwnames=0x0) at ./Include/internal/pycore_call.h:168
#20 0x00007fa27bdce182 in call_one_instrument (interp=0x7fa27c1975c0 <_PyRuntime+128640>, tstate=0x5620713a56e0, 
    args=0x7fa273ffcdf8, nargsf=9223372036854775810, tool=7 '\a', event=0) at Python/instrumentation.c:907
#21 0x00007fa27bdcea63 in call_instrumentation_vector (tstate=0x5620713a56e0, event=0, frame=0x7fa27b0d8350, 
    instr=0x200023272aa, nargs=2, args=0x7fa273ffcdf0) at Python/instrumentation.c:1095
#22 0x00007fa27bdcec5d in _Py_call_instrumentation (tstate=0x5620713a56e0, event=0, frame=0x7fa27b0d8350, instr=0x200023272aa)
    at Python/instrumentation.c:1132
#23 0x00007fa27bd46096 in _PyEval_EvalFrameDefault (tstate=0x5620713a56e0, frame=0x7fa27b0d8350, throwflag=0)
    at Python/generated_cases.c.h:3474
#24 0x00007fa27bd32d31 in _PyEval_EvalFrame (tstate=0x5620713a56e0, frame=0x7fa27b0d8020, throwflag=0)
    at ./Include/internal/pycore_ceval.h:119
#25 0x00007fa27bd56877 in _PyEval_Vector (tstate=0x5620713a56e0, func=0x200012a69d0, locals=0x0, args=0x7fa273ffec40, 
    argcount=1, kwnames=0x0) at Python/ceval.c:1806
#26 0x00007fa27bb7271e in _PyFunction_Vectorcall (func=0x200012a69d0, stack=0x7fa273ffec40, nargsf=1, kwnames=0x0)
    at Objects/call.c:413
#27 0x00007fa27bb765f3 in _PyObject_VectorcallTstate (tstate=0x5620713a56e0, callable=0x200012a69d0, args=0x7fa273ffec40, 
    nargsf=1, kwnames=0x0) at ./Include/internal/pycore_call.h:168
#28 0x00007fa27bb76c80 in method_vectorcall (method=0x200248b8890, args=0x7fa27c197548 <_PyRuntime+128520>, nargsf=0, 
    kwnames=0x0) at Objects/classobject.c:70
#29 0x00007fa27bb7208f in _PyVectorcall_Call (tstate=0x5620713a56e0, func=0x7fa27bb76a66 <method_vectorcall>, 
    callable=0x200248b8890, tuple=0x7fa27c197520 <_PyRuntime+128480>, kwargs=0x0) at Objects/call.c:273
#30 0x00007fa27bb7243c in _PyObject_Call (tstate=0x5620713a56e0, callable=0x200248b8890, 
    args=0x7fa27c197520 <_PyRuntime+128480>, kwargs=0x0) at Objects/call.c:348
#31 0x00007fa27bb72517 in PyObject_Call (callable=0x200248b8890, args=0x7fa27c197520 <_PyRuntime+128480>, kwargs=0x0)
    at Objects/call.c:373
#32 0x00007fa27bec8297 in thread_run (boot_raw=0x5620712f0d80) at ./Modules/_threadmodule.c:337
#33 0x00007fa27be210f7 in pythread_wrapper (arg=0x5620712f0f30) at Python/thread_pthread.h:243
#34 0x00007fa27b694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#35 0x00007fa27b726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.0 experimental free-threading build
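
For anyone trying to reproduce this, a small sketch for checking whether the running interpreter is a free-threaded build. It assumes `sys._is_gil_enabled()` is available (3.13 and later) and falls back to "GIL enabled" when it is not; `free_threading_status` is a hypothetical helper name.

```python
import sys
import sysconfig


def free_threading_status():
    # Py_GIL_DISABLED is set to 1 when CPython was configured with --disable-gil.
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; assume the GIL is on when it is missing.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return built_free_threaded, bool(is_gil_enabled())


built, gil = free_threading_status()
print(f"free-threaded build: {built}, GIL enabled at runtime: {gil}")
```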

Linked PRs

@XuehaiPan XuehaiPan added the type-crash A hard crash of the interpreter, possibly with a core dump label Nov 19, 2024
@colesbury colesbury added topic-free-threading 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Nov 19, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Nov 19, 2024
Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initialization and access thread-safe.
colesbury added a commit to colesbury/cpython that referenced this issue Nov 19, 2024
Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
@colesbury
Contributor

Thanks for the bug report @XuehaiPan. I think the linked PR should fix the crash. I verified it with the following command when running optree tests:

python -X dev -m pytest --verbose --color=yes --durations=10 --showlocals --cov="optree" --cov-config=.coveragerc --cov-report=xml --cov-report=term-missing --exitfirst --cov-report=xml:coverage-cp313td-Linux.xml --junit-xml=junit-cp313td-Linux.xml .

@XuehaiPan
Contributor Author

XuehaiPan commented Nov 20, 2024

> I think the linked PR should fix the crash. I verified it with the following command when running optree tests.

Thanks for the fix. I can confirm the unit test in #127043 fails on the main branch and passes with the patch in #127043.

$ python3 -VV
Python 3.14.0a2+ experimental free-threading build (heads/main:c9b399fbdb0, Nov 20 2024, 15:18:27) [Clang 16.0.0 (clang-1600.0.26.4)]
$ python3 test_code.py
Assertion failed: (co->_co_cached->_co_code == NULL), function _PyCode_GetCode, file codeobject.c, line 1686.
[1]    37607 abort      python3 test_code.py
$ python3 -VV
Python 3.14.0a2+ experimental free-threading build (heads/pr/colesbury/127043:9242a83444c, Nov 20 2024, 15:13:34) [Clang 16.0.0 (clang-1600.0.26.4)]
$ python3 test_code.py
.
----------------------------------------------------------------------
Ran 1 test in 0.019s

OK

colesbury added a commit that referenced this issue Nov 21, 2024
)

Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
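
As a rough Python analogy of the pattern the commit message describes (not the actual C change), the sketch below guards a lazily initialized field with a lock and re-checks it inside the lock, so only one thread performs the initialization. `LazyCache` and its attributes are hypothetical names; the lock stands in for the critical section.

```python
import threading


class LazyCache:
    """Hypothetical stand-in for a lazily initialized field such as _co_code."""

    def __init__(self, compute):
        self._compute = compute
        self._lock = threading.Lock()  # plays the role of the critical section
        self._value = None
        self.init_count = 0

    def get(self):
        value = self._value  # fast path: already initialized
        if value is not None:
            return value
        with self._lock:
            # Re-check under the lock: another thread may have won the race.
            if self._value is None:
                self._value = self._compute()
                self.init_count += 1
            return self._value


cache = LazyCache(lambda: bytes(range(4)))
threads = [threading.Thread(target=cache.get) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(cache.init_count)
```

Without the second check inside the lock, two threads that both saw an empty field could each initialize it, which is the situation the failing assertion detects.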
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 21, 2024
…pythonGH-127043)

Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
(cherry picked from commit 3926842117feffe5d2c9727e1899bea5ae2adb28)

Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury added a commit that referenced this issue Nov 21, 2024
GH-127043) (GH-127107)

Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
(cherry picked from commit 3926842)

Co-authored-by: Sam Gross <colesbury@gmail.com>
@Eclips4
Member

Eclips4 commented Nov 21, 2024

I guess it can be closed now?
