Crash during shutdown with patch #8

Description
Originally reported by: RMTEW FULL NAME (Bitbucket: rmtew, GitHub: rmtew)
(originally reported in Trac by @akruis on 2013-01-24 16:13:06)
Setup
I'm working with Stackless Python version 2.7, compiled from a Mercurial sandbox. Current changeset: f34947c81d3e+ (2.7-slp)
OS: Windows 7, Compiler VS 2008 professional, build target is x86 release with optimisation turned off.
Testcase
My test code is huge and confidential, and the crash disappears if I make small modifications. The crash happens in about 1 in 5 test runs.
Details
- The Windows error message is always: "Unhandled exception at 0x77df15de (ntdll.dll) in _fg2python.exe: 0xC0000005: Access violation reading location 0x00000018."
- The crash does not occur with versions before changeset 74135:ac70790fa499
- The crash occurs with every version that includes 74135:ac70790fa499
- The code does not involve tasklet switching.
- The crash does not occur if I make one of the following modifications:
  - disable atexit processing
  - call gc.collect() within atexit
  - call stackless.enable_softswitch(False) during program startup
- I can't reproduce the crash on Linux 64bit
I'm fairly confident that I can explain and fix the problem. Look at the call stack:
Call Stack (innermost frame first)
ntdll.dll!_ZwRaiseException@12() + 0x12 bytes
ntdll.dll!_ZwRaiseException@12() + 0x12 bytes
python27.dll!_string_tailmatch() Line 2900 + 0x1c bytes C
This frame varies between test runs. The arguments on the stack usually don't match the code location.
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x0318d8c0, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
frame: co_filename "f:\fg2\eclipsews\fg2py\arch\win32\bin..\libexec\lib\linecache.py", co_name "updatecache" f_lasti=72
python27.dll!PyEval_EvalFrame_value(_frame * f=0x0328a040, int throwflag=0, _object * retval=0x1e320030) Line 3271 + 0x1a bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x0328a040, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
python27.dll!PyEval_EvalFrame_value(_frame * f=0x03289ec8, int throwflag=0, _object * retval=0x1e320030) Line 3271 + 0x1a bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x03289ec8, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
python27.dll!PyEval_EvalFrame_value(_frame * f=0x031c74e8, int throwflag=0, _object * retval=0x1e320030) Line 3271 + 0x1a bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x031c74e8, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
python27.dll!PyEval_EvalFrame_value(_frame * f=0x03289d50, int throwflag=0, _object * retval=0x1e320030) Line 3271 + 0x1a bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x03289d50, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
python27.dll!PyEval_EvalFrame_value(_frame * f=0x03289bd8, int throwflag=0, _object * retval=0x1e320030) Line 3271 + 0x1a bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x03289bd8, int throwflag=0, _object * retval=0x1e320030) Line 964 + 0x11 bytes C
python27.dll!slp_eval_frame_newstack(_frame * f=0x03289bd8, int exc=0, _object * retval=0x1e320030) Line 470 + 0x11 bytes C
python27.dll!PyEval_EvalFrameEx_slp(_frame * f=0x03289bd8, int throwflag=0, _object * retval=0x1e320030) Line 910 + 0x11 bytes C
python27.dll!slp_frame_dispatch(_frame * f=0x03289bd8, _frame * stopframe=0x031d7f48, int exc=0, _object * retval=0x1e320030) Line 737 + 0x16 bytes C
python27.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x023d23c8, _object * globals=0x02435a50, _object * locals=0x00000000, _object * * args=0x03249adc, int argcount=1, _object * * kws=0x00000000, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000) Line 3561 + 0x16 bytes C
python27.dll!function_call(_object * func=0x024429b0, _object * arg=0x03249ad0, _object * kw=0x00000000) Line 542 + 0x3a bytes C
python27.dll!PyObject_Call(_object * func=0x024429b0, _object * arg=0x03249ad0, _object * kw=0x00000000) Line 2539 + 0x3e bytes C
python27.dll!PyObject_CallFunctionObjArgs(_object * callable=0x024429b0, ...) Line 2786 + 0xf bytes C
python27.dll!handle_weakrefs(_gc_head * unreachable=0x1e34c9f0, _gc_head * old=0x1e2f0d38) Line 752 + 0xf bytes C
python27.dll!collect(int generation=0) Line 1025 + 0xe bytes C
python27.dll!collect_generations() Line 1097 + 0x9 bytes C
python27.dll!_PyObject_GC_Malloc(unsigned int basicsize=44) Line 1559 C
python27.dll!PyType_GenericAlloc(_typeobject * type=0x00397f70, int nitems=0) Line 754 + 0x9 bytes C
python27.dll!PyTasklet_New(_typeobject * type=0x00397f70, _object * func=0x00000000) Line 218 + 0x13 bytes C
python27.dll!tasklet_new(_typeobject * type=0x00397f70, _object * args=0x01d58030, _object * kwds=0x00000000) Line 282 + 0xd bytes C
python27.dll!initialize_main_and_current() Line 1028 + 0x1c bytes C
python27.dll!slp_run_tasklet() Line 1231 + 0xe bytes C
python27.dll!slp_eval_frame(_frame * f=0x031d7f48) Line 313 + 0x5 bytes C
python27.dll!climb_stack_and_eval_frame(_frame * f=0x031d7f48) Line 274 + 0x9 bytes C
python27.dll!slp_eval_frame(_frame * f=0x031d7f48) Line 303 + 0x9 bytes C
python27.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x023a4848, _object * globals=0x01fddae0, _object * locals=0x00000000, _object * * args=0x01d5803c, int argcount=0, _object * * kws=0x00000000, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000) Line 3564 + 0x9 bytes C
python27.dll!function_call(_object * func=0x023b20b0, _object * arg=0x01d58030, _object * kw=0x00000000) Line 542 + 0x3a bytes C
python27.dll!PyObject_Call(_object * func=0x023b20b0, _object * arg=0x01d58030, _object * kw=0x00000000) Line 2539 + 0x3e bytes C
python27.dll!PyEval_CallObjectWithKeywords(_object * func=0x023b20b0, _object * arg=0x01d58030, _object * kw=0x00000000) Line 4219 + 0x11 bytes C
func: co_filename "f:\fg2\eclipsews\fg2py\arch\win32\libexec\lib\atexit.py", co_name "_run_exitfuncs"
python27.dll!call_sys_exitfunc() Line 1778 + 0xd bytes C
python27.dll!Py_Finalize() Line 433 C
python27.dll!Py_Main(int argc=3, char * * argv=0x00391b10) Line 683 C
_fg2python.exe!__tmainCRTStartup() Line 586 + 0x17 bytes C
kernel32.dll!76b233aa()
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
IMHO the crash is caused by the following interpreter recursion (outermost call first):
slp_run_tasklet() -> initialize_main_and_current() -> tasklet_new() -> PyTasklet_New() -> PyType_GenericAlloc() -> _PyObject_GC_Malloc() -> collect_generations() -> collect() -> handle_weakrefs() -> PyObject_CallFunctionObjArgs() -> ...
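The last steps of this chain (collect() -> handle_weakrefs() -> Python callback) can be reproduced at Python level. The following is only a minimal sketch, not the failing test case: `Node` and `collect_with_callback` are hypothetical names, and the collection is triggered explicitly with gc.collect() instead of by an allocation. It shows that a weakref callback, i.e. arbitrary Python code, runs inside the collector whenever a collection happens.

```python
import gc
import weakref


class Node(object):
    """Plain object that can take part in a reference cycle (hypothetical)."""


def collect_with_callback():
    """Return the events recorded by a weakref callback during gc.collect()."""
    events = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the way for the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # reference cycle: only the GC can free it
        # The weakref object stays alive, so its callback is eligible to run.
        ref = weakref.ref(a, lambda _ref: events.append("callback ran"))
        del a, b  # the cycle is now unreachable garbage
        assert events == []  # nothing has been collected yet
        gc.collect()  # the callback is invoked from inside this call
        assert ref() is None  # the referent is gone
    finally:
        if was_enabled:
            gc.enable()  # restore the previous collector state
    return events


print(collect_with_callback())
```

In the crash above the same kind of callback is entered from _PyObject_GC_Malloc() while initialize_main_and_current() is still running.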
If I disable the garbage collector in initialize_main_and_current() during the execution of tasklet_new(), the crash does not occur (see attached patch).
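The idea behind the workaround, expressed with the Python-level gc functions, might look like the sketch below. This is not the attached patch (which changes C code); `gc_paused` and `allocate_without_gc` are hypothetical helper names used only for illustration. The point is to remember the previous collector state and restore it afterwards, so that a collector the application had already disabled stays disabled.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic GC for the duration of a block, then restore it."""
    was_enabled = gc.isenabled()  # remember the state we found
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:  # re-enable only if it was enabled before
            gc.enable()


def allocate_without_gc(factory):
    """Call factory() while no automatic collection can be triggered."""
    with gc_paused():
        return factory()
```

For example, `allocate_without_gc(list)` creates a container object without giving the collector a chance to run callbacks during the allocation.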
Open questions:
- Why did the bug not occur with builds prior to changeset 74135:ac70790fa499?
- What is the exact mechanism of the access violation? Where does the location 0x00000018 come from?
- Why didn't I observe similar issues on Linux 64bit?
I can't answer these questions, because my understanding of the internal workings of ceval.c is limited.
Could anybody please review the patch? Is there a better way to disable the GC? Unfortunately, there is no C-API for gc.isenabled(), gc.disable() and gc.enable().