
bpo-46210: Fix deadlock in print. #30310


Open · wants to merge 15 commits into main
4 changes: 4 additions & 0 deletions Include/internal/pycore_sysmodule.h
@@ -8,6 +8,10 @@ extern "C" {
# error "this header requires Py_BUILD_CORE define"
#endif

#ifdef HAVE_FORK
extern PyStatus _PySys_ReInitStdio(void);
#endif

PyAPI_FUNC(int) _PySys_Audit(
    PyThreadState *tstate,
    const char *event,
104 changes: 104 additions & 0 deletions Lib/test/test_thread.py
@@ -1,4 +1,6 @@
import io
import os
import sys
import unittest
import random
from test import support
@@ -265,5 +267,107 @@ def tearDown(self):
        pass


class TestStdioAtForkReInit(unittest.TestCase):
Member (vstinner):
I dislike such a functional test; it's fragile and may break tomorrow. I would prefer to remove it.

If you want to keep it, you should:

  • create a new script and run it from the test, so that you have better control over the Python state when the test runs: which modules are imported, signal handlers, etc. (a sketch of this approach follows the list)
  • move it to test_io: the tested code lives in the _io module
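(Editorial sketch, not part of the original review.) One way to realize the first suggestion above is to run the stress test in a fresh interpreter via `test.support.script_helper`; the reproducer script, class name, and constants below are illustrative assumptions, not the PR's code:

```python
# Hypothetical sketch: run the fork/print stress test in a separate
# interpreter, so the parent test process controls which modules are
# imported and which signal handlers and threads exist.
import os
import unittest
from test.support import script_helper

REPRO_SCRIPT = """
import os, sys, _thread

def writer():
    for _ in range(100):
        print('hello' * 10)
        sys.stdout.flush()

_thread.start_new_thread(writer, ())

for _ in range(100):
    pid = os.fork()
    if pid == 0:
        # Without reinitializing the stdout buffer lock after fork,
        # this print() can block forever in the child.
        print('child')
        sys.stdout.flush()
        os._exit(0)
    os.waitpid(pid, 0)

print('done')
"""

class TestStdioAtForkReInitScript(unittest.TestCase):
    @unittest.skipUnless(hasattr(os, 'fork'), 'need os.fork')
    def test_no_deadlock_after_fork(self):
        # assert_python_ok() fails the test if the child interpreter
        # exits with a nonzero status; a deadlock shows up as a hang
        # caught by the test suite's overall timeout.
        rc, out, err = script_helper.assert_python_ok('-c', REPRO_SCRIPT)
        self.assertIn(b'done', out)
```

Dropping something like this into test_io (as suggested above) would let the regular test runner pick it up while keeping the forking and printing isolated from the main test process.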

Member (JelleZijlstra):
Can you clarify why you think it's fragile? It seems useful as a regression test. I do agree that it would be more stable in a separate script.

Contributor Author (@notarealdeveloper), Feb 14, 2022:
This is the only part I haven't changed yet, because I wanted to check with you all on the details.

@vstinner All the at_fork_reinit handlers in the codebase have your name on them, so I'll trust your judgment here. (Though I'm also curious for the same reasons @JelleZijlstra mentioned).

On the topic of the current test, you said "If you want to keep it, you should..."

What's the "else" condition for that sentence? I would prefer to do whatever you think makes the most sense. I don't have any strong opinions here, aside from a desire to have that one line in there that reinitializes the lock.

Test-wise, would you prefer (1) just moving the current test to a separate script? (2) removing the current test entirely? (3) a different test with certain properties that this one doesn't have?

I'm equally happy to do any of the above. :)


    class MockStdio(io.TextIOWrapper):
        def __init__(self):
            import _io, tempfile
            self._file = tempfile.mktemp()
            raw = _io.FileIO(self._file, mode='wb')
            buffer = _io.BufferedWriter(raw)
            super().__init__(buffer)

        def __del__(self):
            try:
                self.close()
                os.remove(self._file)
            except:
                pass

        def isatty(self):
            # pretend to be a tty, so _PySys_ReInitStdio
            # will attempt to reinitialize our buffer lock.
            return True

    def setUp(self):
        # Replace stdout & stderr with mock objects with the
        # same underlying buffer type to avoid polluting the
        # test output stream.
        self._saved_stdout = sys.stdout
        self._saved_stderr = sys.stderr
        sys.stdout = self.MockStdio()
        sys.stderr = self.MockStdio()

    def tearDown(self):
        sys.stdout = self._saved_stdout
        sys.stderr = self._saved_stderr

    @unittest.skipUnless(hasattr(os, 'fork'), 'need os.fork')
    @threading_helper.reap_threads
    def test_stdio_at_fork_reinit(self):

        # bpo-46210: Added _PySys_ReInitStdio to prevent children
        # from deadlocking here if printer_thread held the stdout
        # (or stderr) buffer's lock when one of the children forked.

        num_prints = 100
        num_forks = 100
        msg = 'hello'*10

        def printer_thread(stdio):
            for n in range(num_prints):
                print(msg, file=stdio)
                stdio.flush()

        def fork_processes(stdio):
            pids = []
            for n in range(num_forks):
                if pid := os.fork():
                    pids.append(pid)
                else:
                    print(msg, file=stdio)
                    stdio.flush()
                    os._exit(0)

            return pids

        def main(stdio):
            thread.start_new_thread(printer_thread, (stdio,))
            pids = fork_processes(stdio)
            for pid in pids:
                support.wait_process(pid, exitcode=0)
            return

        # Forking below is not necessary to illustrate the bug
        # in bpo-46210 and its fix. The issue in bpo-46210 is
        # that calling main(sys.stdout) or main(sys.stderr)
        # is sufficient to cause a deadlock in print. We fork
        # here only to allow us to give a single timeout to the
        # main() call, since its failure mode (absent the fix)
        # is for some subset of the forked child processes to
        # deadlock at the moment when they try to print, rather
        # than to raise an exception. Therefore, simply looping
        # over the child pids and calling support.wait_process
        # with a separate nonzero timeout for each child leads
        # to a rather unpredictable total wait time, whereas
        # forking again here at the top (though not necessary
        # to illustrate the bug) allows us to give a predictable
        # timeout to the process of waiting for the children.
        #
        # bpo-46210 is present if and only if one or more of the
        # children forked by main() deadlock when they call print.
        #
        # pr-30310 proposes a fix following the example of the
        # import lock, by providing a function _PySys_ReInitStdio
        # that is called alongside the other preexisting lock
        # reinitialization functions in PyOS_AfterFork_Child.
        for stdio in (sys.stdout, sys.stderr):
            with threading_helper.wait_threads_exit():
                if main_pid := os.fork():
                    support.wait_process(main_pid, exitcode=0, timeout=5)
                else:
                    main(stdio)
                    os._exit(0)

if __name__ == "__main__":
    unittest.main()
@@ -0,0 +1,2 @@
Add an ``_at_fork_reinit`` handler for the ``buffer->lock`` of :data:`sys.stdout` and
:data:`sys.stderr`, to fix a deadlock in :func:`print`.
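(Editorial note, not part of the diff.) A pure-Python analogue of what this handler does, using `os.register_at_fork`, may help illustrate the idea; the lock name below is illustrative only:

```python
# Hypothetical Python-level analogue of the fix: a lock that another thread
# may hold at fork time must be replaced with a fresh, unlocked lock in the
# child, otherwise the child can block on it forever.
import os
import threading

stdio_buffer_lock = threading.Lock()   # stands in for buffer->lock

def _reinit_locks_in_child():
    global stdio_buffer_lock
    # Immediately after fork the child is single-threaded, so the
    # (possibly held) inherited lock can simply be discarded.
    stdio_buffer_lock = threading.Lock()

if hasattr(os, 'register_at_fork'):
    os.register_at_fork(after_in_child=_reinit_locks_in_child)
```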
14 changes: 14 additions & 0 deletions Modules/_io/bufferedio.c
@@ -414,6 +414,20 @@ buffered_clear(buffered *self)
    return 0;
}

#ifdef HAVE_FORK
int
_PyIO_buffered_at_fork_reinit(PyObject *self)
{
    if (!Py_IS_TYPE(self, &PyBufferedWriter_Type) &&
        !Py_IS_TYPE(self, &PyBufferedReader_Type)) {
        return 0;
    }
    buffered *buffer = (buffered *)self;
    buffer->owner = 0;
    return _PyThread_at_fork_reinit(&buffer->lock);
}
#endif

/* Because this can call arbitrary code, it shouldn't be called when
   the refcount is 0 (that is, not directly from tp_dealloc unless
   the refcount has been temporarily re-incremented). */
6 changes: 6 additions & 0 deletions Modules/posixmodule.c
@@ -25,6 +25,7 @@
#include "pycore_ceval.h" // _PyEval_ReInitThreads()
#include "pycore_fileutils.h" // _Py_closerange()
#include "pycore_import.h" // _PyImport_ReInitLock()
#include "pycore_sysmodule.h" // _PySys_ReInitStdio()
#include "pycore_initconfig.h" // _PyStatus_EXCEPTION()
#include "pycore_moduleobject.h" // _PyModule_GetState()
#include "pycore_object.h" // _PyObject_LookupSpecial()
@@ -592,6 +593,11 @@ PyOS_AfterFork_Child(void)
        goto fatal_error;
    }

    status = _PySys_ReInitStdio();
    if (_PyStatus_EXCEPTION(status)) {
        goto fatal_error;
    }

    _PySignal_AfterFork();

    status = _PyRuntimeState_ReInitThreads(runtime);
47 changes: 47 additions & 0 deletions Python/sysmodule.c
@@ -2895,6 +2895,53 @@ _PySys_InitCore(PyThreadState *tstate, PyObject *sysdict)
    return _PyStatus_ERR("can't initialize sys module");
}

#ifdef HAVE_FORK
extern int _PyIO_buffered_at_fork_reinit(PyObject *);

static int
stream_at_fork_reinit(PyObject *stream)
{
    if (stream == NULL || Py_IsNone(stream)) {
        return 0;
    }

    /* The buffer attribute is not part of the TextIOBase API
     * and may not exist in some implementations. If not present,
     * we have no locks to reinitialize. */
    PyObject *buffer = PyObject_GetAttr(stream, &_Py_ID(buffer));
    if (buffer == NULL) {
        PyErr_Clear();
        return 0;
    }

    /* Reinitialize buffer->lock */
    int ret = _PyIO_buffered_at_fork_reinit(buffer);
    Py_DECREF(buffer);
    return ret;
}

/* This function is called from PyOS_AfterFork_Child() to ensure that newly
created child processes do not share locks with the parent. */
PyStatus
_PySys_ReInitStdio(void)
{
Member:
sys.stdin also has a buffer, which has a lock that must be reinitialized at fork.

Contributor Author:
Cool, now PyBufferedWriter_Type and PyBufferedReader_Type both get their locks reinitialized.
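(Editorial aside, not part of the diff.) The point about `sys.stdin` is easy to check interactively; the output shown in the comment is what a typical build prints and is stated as an assumption about the default stream configuration:

```python
# All three standard streams normally wrap a buffered object, and it is that
# buffered object's lock which needs reinitializing after fork.
import sys

for stream in (sys.stdin, sys.stdout, sys.stderr):
    buf = getattr(stream, 'buffer', None)
    print(type(stream).__name__, '->',
          type(buf).__name__ if buf is not None else None)

# Typical output:
#   TextIOWrapper -> BufferedReader
#   TextIOWrapper -> BufferedWriter
#   TextIOWrapper -> BufferedWriter
```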

    PyThreadState *tstate = _PyThreadState_GET();

    PyObject *sys_stdin = _PySys_GetAttr(tstate, &_Py_ID(stdin));
    PyObject *sys_stdout = _PySys_GetAttr(tstate, &_Py_ID(stdout));
    PyObject *sys_stderr = _PySys_GetAttr(tstate, &_Py_ID(stderr));

    int reinit_stdin = stream_at_fork_reinit(sys_stdin);
    int reinit_stdout = stream_at_fork_reinit(sys_stdout);
    int reinit_stderr = stream_at_fork_reinit(sys_stderr);

    if (reinit_stdin < 0 || reinit_stdout < 0 || reinit_stderr < 0) {
        return _PyStatus_ERR("Failed to reinitialize standard streams");
    }
    return _PyStatus_OK();
}
#endif

static int
sys_add_xoption(PyObject *opts, const wchar_t *s)
{