[docker-wait-any]: Exit worker thread if main thread is expected to exit #12255

saiarcot895 · 2022-10-03T03:38:43Z

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Why I did it

There's an odd crash that intermittently happens during config reload or config load_minigraph. The swss container has a Python script at /usr/bin/docker-wait-any that waits for any of the swss, syncd, or teamd containers to exit, with one thread monitoring each container. When a container exits, its thread signals the main thread to exit the Python script, which in turn tells systemd to stop the container.
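The structure described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual /usr/bin/docker-wait-any source: `wait_for_exit` is a hypothetical stand-in for the blocking Docker wait call, and a `threading.Event` plus daemon threads are assumed as the signalling mechanism.

```python
import threading

def watch(name, wait_for_exit, exit_event, exited):
    """Worker: block until one container exits, then wake the main thread."""
    wait_for_exit(name)        # hypothetical stand-in for the Docker wait call
    exited.append(name)
    exit_event.set()           # tell the main thread it is time to exit

def wait_any(names, wait_for_exit):
    """Start one watcher thread per container; return once any has exited."""
    exit_event = threading.Event()
    exited = []
    for name in names:
        threading.Thread(
            target=watch,
            args=(name, wait_for_exit, exit_event, exited),
            daemon=True,       # assumption: watchers must not keep the process alive
        ).start()
    exit_event.wait()          # main thread sleeps until a watcher signals
    return exited[0]

# Toy stand-in: only "teamd" ever exits; the other waits block indefinitely.
never = threading.Event()

def fake_wait(name):
    if name != "teamd":
        never.wait()

first = wait_any(["swss", "syncd", "teamd"], fake_wait)
```

In this toy run `first` is the name of the container whose watcher signalled, and the other two watcher threads are still alive when the main thread moves on, which is the situation the rest of this description is about.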

Snippet of the backtrace:

#0  0x00007f2a01296ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f2a01280537 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f29ff3147ec in __gnu_cxx::__verbose_terminate_handler () at ../../../../src/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007f29ff31f966 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:48
#4  0x00007f29ff31f9d1 in std::terminate () at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:58
#5  0x00007f29ff31f3cc in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=10, exception_class=0, ue_header=0x7f29fdc9cd70, context=<optimized out>)
    at ../../../../src/libstdc++-v3/libsupc++/eh_personality.cc:673
#6  0x00007f29ff2708a4 in _Unwind_ForcedUnwind_Phase2 (exc=0x7f29fdc9cd70, context=0x7f29fdc9b0c0, frames_p=0x7f29fdc9afc8) at ../../../src/libgcc/unwind.inc:182
#7  0x00007f29ff270f4e in _Unwind_ForcedUnwind (exc=0x7f29fdc9cd70, stop=<optimized out>, stop_argument=0x7f29fdc9bf10) at ../../../src/libgcc/unwind.inc:217
#8  0x00007f2a015e1c30 in __pthread_unwind () from /lib/x86_64-linux-gnu/libpthread.so.0
#9  0x00007f2a015d918c in pthread_exit () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x0000000000645df5 in PyThread_exit_thread () at ../Python/thread_pthread.h:373
#11 0x00000000004262ae in take_gil (tstate=0x1144670) at ../Python/ceval_gil.h:224
#12 0x00000000005327b2 in PyEval_RestoreThread (tstate=tstate@entry=0x1144670) at ../Python/ceval.c:467
#13 0x00007f29ffb6ec1a in PyThreadStateGuard::~PyThreadStateGuard (this=<synthetic pointer>, __in_chrg=<optimized out>) at swsscommon_wrap.cpp:24543
#14 _wrap_SonicV2Connector_Native_connect (args=<optimized out>, kwargs=<optimized out>) at swsscommon_wrap.cpp:24545
#15 0x000000000053f350 in cfunction_call (func=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, args=<optimized out>, kwargs=<optimized out>)
    at ../Objects/methodobject.c:539
#16 0x000000000051d89b in _PyObject_MakeTpCall (tstate=0x1144670, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, args=<optimized out>,
    nargs=<optimized out>, keywords=<optimized out>) at ../Objects/call.c:191
#17 0x00000000005175ba in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10,
    callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, tstate=0x1144670) at ../Include/cpython/abstract.h:116
#18 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>,
    tstate=0x1144670) at ../Include/cpython/abstract.h:103
#19 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>)
    at ../Include/cpython/abstract.h:127
#20 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1144670) at ../Python/ceval.c:5072
#21 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:3487
#22 0x00000000005106ed in _PyEval_EvalFrame (throwflag=0,
    f=Frame 0x7f29fecb8c80, for file /usr/lib/python3/dist-packages/swsscommon/swsscommon.py, line 1651, in connect (self=<SonicV2Connector(this=<SwigPyObject at remote 0x7f29fecb7cf0>, STATE_DB='STATE_DB', APPL_DB='APPL_DB', GB_FLEX_COUNTER_DB='GB_FLEX_COUNTER_DB', APPL_STATE_DB='APPL_STATE_DB', ASIC_DB='ASIC_DB', CONFIG_DB='CONFIG_DB', COUNTERS_DB='COUNTERS_DB', LOGLEVEL_DB='LOGLEVEL_DB', GB_ASIC_DB='GB_ASIC_DB', GB_COUNTERS_DB='GB_COUNTERS_DB', PFC_WD_DB='PFC_WD_DB', FLEX_COUNTER_DB='FLEX_COUNTER_DB', RESTAPI_DB='RESTAPI_DB', SNMP_OVERLAY_DB='SNMP_OVERLAY_DB') at remote 0x7f2a0051d040>, db_name='STATE_DB', retry_on=False), tstate=0x1144670) at ../Include/internal/pycore_ceval.h:40

What's happening is that after the teamd container exits, the thread watching it signals the main thread, but because nothing returns or breaks out of the while True loop that the watcher thread is in, it goes around the loop again and waits for the teamd container to exit. Because the container isn't running, this wait is effectively a no-op, and execution moves on to the device_info.is_warm_restart_enabled(container_name) and device_info.is_fast_reboot_enabled() function calls (which call into C++ code). Meanwhile, the main thread has called sys.exit(0), and Python is tearing down its references and data structures and (more importantly here) telling the other threads to exit.

For the teamd thread, when it returns from the C++ function, the wrapper code generated by SWIG destroys a C++ object that it created for the purpose of saving and restoring the Python thread state. The destructor calls PyEval_RestoreThread(), which sees that the thread is supposed to exit and proceeds to call pthread_exit(). This is shown in frames 9 through 13 above.

pthread_exit() then calls a function that unwinds the stack so that any cleanup or other handler functions can run, giving the thread a graceful exit. However, unwinding works by throwing a special exception (abi::__forced_unwind or __cxxabiv1::__forced_unwind) that is expected to propagate all the way to the outermost frame. As the unwinder works frame by frame, one of these things happens for each frame on the stack:

- If there are no cleanup handlers or exception handlers registered for that frame, then it just moves on.
- If there's a cleanup handler registered, then that gets called.
- If there's an exception handler registered (i.e. a try/catch block), and there's a matching catch block for this exception, then that will get called.

For most frames, probably nothing happens. However, one of the frames on the stack here is a C++ destructor (the one that called PyEval_RestoreThread() earlier). In C++11 and newer, destructors are noexcept by default, so an exception is not allowed to propagate out of one; if it does, std::terminate() gets called. In other words, any exception raised by functions that the destructor calls must be handled within the destructor and must not propagate up the stack. If the destructor is declared noexcept(false) to signal that exceptions may propagate, then maybe it's fine (I'm not entirely certain about this). Because the unwinder essentially uses a special exception to go up the stack, std::terminate() gets called, which then results in a SIGABRT for the process. Because of this SIGABRT, systemd appears to treat the service as stopped and doesn't run the ExecStop= command, which means the containers don't actually go down.
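To make the frame-by-frame walk easier to follow, here is a toy model of it in Python. It is only an illustration of the reasoning above, not how libgcc or glibc implement forced unwinding: the frame "kinds" are hypothetical labels, the catch-block case is left out for brevity, and `Terminate` merely stands in for std::terminate(), which in reality aborts the process.

```python
class Terminate(Exception):
    """Stands in for std::terminate() (the real one aborts the process)."""

def forced_unwind(frames):
    """Walk frames innermost-first, the way a forced unwind would.

    Each frame is a (name, kind) pair, where kind is one of:
      "plain"    - no handlers registered, the unwinder moves on
      "cleanup"  - a cleanup handler runs, then unwinding continues
      "noexcept" - a frame exceptions may not leave, e.g. a C++11 destructor
    """
    log = []
    for name, kind in frames:
        if kind == "cleanup":
            log.append(f"cleanup:{name}")
        elif kind == "noexcept":
            log.append(f"terminate:{name}")
            raise Terminate(log)
        # "plain" frames are simply skipped
    log.append("thread exited")
    return log

# Frame names borrowed from the backtrace; the kinds are illustrative guesses.
stack = [
    ("pthread_exit", "plain"),
    ("PyThread_exit_thread", "plain"),
    ("take_gil", "plain"),
    ("PyEval_RestoreThread", "plain"),
    ("~PyThreadStateGuard", "noexcept"),
    ("_PyEval_EvalFrameDefault", "plain"),
]

try:
    forced_unwind(stack)
    outcome = "clean exit"
except Terminate as exc:
    outcome = exc.args[0][-1]
```

In this model the walk never reaches the outermost frame: it stops at the destructor frame, which mirrors frames 2 through 7 of the backtrace, where the unwinder ends up in std::terminate().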

All of this is a timing issue: if the thread-exit check happens to run around the call into C++ code, a SIGABRT can happen. Unfortunately, this appears to happen fairly often in some setups, and it can also be forced (see below).

How I did it

A quick workaround is that if we know the main thread needs to exit, the worker thread just returns after signaling the main thread and doesn't continue execution. This at least tries to keep it out of the problematic code path. However, it's still possible to get a SIGABRT for the reason above, depending on thread/process timings (e.g. teamd exits and signals the main thread to exit, then syncd exits and its thread calls one of the two C++ functions, potentially hitting the issue).
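The difference the workaround makes can be sketched as below. This is a hedged illustration rather than the actual diff: `wait_for_exit` and `check_restart_flags` are hypothetical stand-ins for the Docker wait call and the two C++-backed device_info calls, the order of steps inside the loop is an assumption, and `max_rounds` exists only so that the unfixed variant terminates in the sketch.

```python
import threading

def watch(name, wait_for_exit, check_restart_flags, exit_event,
          return_after_signal, max_rounds=3):
    """One watcher thread's loop; returns the calls it made, in order.

    max_rounds is only here so the sketch terminates; the loop described
    in this PR is `while True`.
    """
    calls = []
    for _ in range(max_rounds):
        wait_for_exit(name)            # returns at once if already exited
        calls.append("wait")
        check_restart_flags(name)      # stand-in for the C++-backed calls
        calls.append("check_flags")
        exit_event.set()               # ask the main thread to exit
        calls.append("signal")
        if return_after_signal:
            return calls               # the workaround: stop right here
    return calls

def noop(name):
    """Stand-in that returns immediately, like waiting on a stopped container."""
    return None

before = watch("teamd", noop, noop, threading.Event(), return_after_signal=False)
after = watch("teamd", noop, noop, threading.Event(), return_after_signal=True)
```

Without the early return, `before` shows further "wait" and "check_flags" calls after the first "signal", which is the window in which the crash can occur. With it, `after` ends at the first "signal".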

A proper fix would likely be to make sure PyEval_RestoreThread() (and, in turn, pthread_exit()) gets called from a regular C++ function and not from a destructor. The SWIG wrapper code generated with the -threads option does this, but there's still a gap where it might get called from the destructor, so it's not immune. (Currently, the swsscommon wrapper code does not use the -threads option and adds multithreading support manually.)

How to verify it

This was tested with the following Bash script. On my dev VM, at least, this script reproduced the core file within 1-2 iterations. With my fix, 90+ iterations completed with no core file:

#!/bin/bash

set -euo pipefail

ITERATION=0

# Loop while /var/core is an empty directory, i.e. until a core file appears.
while [ -n "$(find "/var/core" -maxdepth 0 -type d -empty 2>/dev/null)" ]; do
        ITERATION=$(( ITERATION + 1 ))
        echo "Starting iteration ${ITERATION}"
        # Launch four watcher instances in the background.
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        sleep 3
        # Reload the configuration, which restarts the watched containers.
        config load_minigraph -y
        sleep 90
done

echo "Core file found on iteration ${ITERATION}!"

There's an odd crash that intermittently happens after the teamd container exits, and a signal is raised to the main thread to exit. This thread (watching teamd) continues execution because it's in a `while True`. The subsequent wait call on the teamd container very likely returns immediately, and it calls `is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these cases, sometimes, there is a crash in the transition from C code to Python code (after the function gets executed).

Python sees that this thread got a signal to exit, because the main thread is exiting, and tells pthread to exit the thread. However, during the stack unwinding, _something_ is telling the unwinder to call `std::terminate`. The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call the stop script to actually stop the container (possibly because the main process exited with a SIGABRT, so it's a hard crash). This means that the container doesn't actually get stopped or restarted, resulting in an inconsistent state afterwards.

The workaround appears to be that if we know the main thread needs to exit, just return here, and don't continue execution. This at least tries to avoid it from getting into the problematic code path. However, it's still feasible to get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals the main thread to exit, and then syncd exits, and syncd calls one of the two C functions, potentially hitting the issue).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

saiarcot895 · 2022-10-05T16:27:15Z

/azp run Azure.sonic-buildimage

azure-pipelines · 2022-10-05T16:28:16Z

Azure Pipelines successfully started running 1 pipeline(s).

saiarcot895 marked this pull request as ready for review October 6, 2022 01:13

saiarcot895 requested a review from lguohan as a code owner October 6, 2022 01:13

yxieca approved these changes Oct 6, 2022

yxieca added the Request for 202205 Branch label Oct 6, 2022

yxieca merged commit 9251d4b into sonic-net:master Oct 6, 2022

saiarcot895 deleted the fix-python-crash branch October 6, 2022 01:14

yxieca added the Included in 202205 Branch label Oct 6, 2022

saiarcot895 mentioned this pull request Oct 12, 2022

Potential Python crash with -threads option and Python multithreading swig/swig#2396

Closed
