-
-
Notifications
You must be signed in to change notification settings - Fork 31.6k
bpo-46210: Fix deadlock in print. #30310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA MissingOur records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some superficial things, I'm not that knowledgable about the whole threading/forking situation.
Python/pystate.c
Outdated
/* error */ | ||
ret = -1; | ||
end: | ||
return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are a few more leaks here. Perhaps it would be simpler to initialize the variables to NULL, and then use Py_XDECREF in the end:/error:
label.
Python/pystate.c
Outdated
if (_PyObject_GetAttrId(stdio, &PyId_isatty) == NULL) { | ||
PyErr_Clear(); | ||
goto end; | ||
} | ||
|
||
isatty = _PyObject_CallMethodIdNoArgs(stdio, &PyId_isatty); | ||
if (Py_IsFalse(isatty)) { | ||
goto end; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be just one call to _PyObject_CallMethodIdNoArgs
with a check for a NULL return value.
It would be nice to see a test case added for these methods. If a threading/forking case can be made to pass consistently, that would be nice, but even if not, just a simple exercise of the |
Thanks for the tips! I know exactly what to learn about next. Will push some updates after I learn a bit more. I think the general "unlock the thing" idea probably makes sense, but I'm sure I tripped over myself in 17 different ways in the initial patch. Curious to hear from Victor about whether the overall goal is a sane one. Either way, more coming soon. Happy New Year. :) |
Ok, re: Just updated the pull request with what (I think?) is correct reference counting. All the tests pass, but I wanted to check for reference leaks like you suggested. So I ran several of the relevant tests with Supposing there's a reference leak, would it be noticed by Is there a standard way to check for leaks in situations like that, or is it better to just use a brute force strategy like I did in the comment above (i.e., calling every function twice and checking |
Hi there. :) After spending some time thinking about how to exercise the lock re-initialization function in a test, I realized that the best way to overcome my noobish-ness here is to follow the example of The "meat" of the implementation we've already discussed doesn't change (thanks again for the tips @sweeneyde). I've just moved things to different files, in order to maximize the similarity with what Here's what a grep of the codebase for the two functions looks like:
And here's the
The goal was to get line-by-line correspondence with what @vstinner does with the Now, the only difference arises inside the implementation, because the import lock is a global variable inside As for everything else: (1) the PEP 7 formatting should be good now, (2) @the-knights-who-say-ni seem to have given me the "CLA signed" badge just now, (3) in playing with all this, I stumbled on another small unrelated bug in the test suite and fixed it, so I'll submit that as a separate bpo & pull request. Let me know if there's anything else you'd like me to do. Thanks again for all the help. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this needs tests. It also adds a new Python-accessible method. What happens if Python code calls it?
Re: Helpful comments from @JelleZijlstra
100% agreed, the Python-accessible method wasn't necessary. I needed to learn a bit more before I could do this correctly. Just pushed some changes to address this. The updated approach:
The test's behavior is to cause the deadlock described in bpo-46210 if and only if the lock is not being reinitialized correctly. Just like the other C-level locks in the codebase, this is something that should be reinitialized on every fork, so you (@JelleZijlstra) were absolutely right to point out that a Python-accessible interface was an awkward fit. Further, I want to make it clear to other devs precisely what situation the bugfix is targeting, so the test contains a description of what the bug looks like without the fix, and which specific C function was added to fix it. Summary: the test can be run with |
In addition to the above: In the course of adding the test, I found a small bug in the That won't be relevant to anyone who runs the test as is, since the PR fixes the deadlock. However, if anyone feels like running the above test after removing the bugfix to verify that the test fails, as it should (if we remove That's due to the above mentioned bug in Should I submit another bpo for this small bug, or just send a PR directly? Should be a one-line fix. Just need to change |
Thanks, can you open another issue for the test.support fix? (I don't have time right now to look at this PR in detail, but I'll review it again.) |
Will do! :) Thanks for the help. |
Modules/_io/bufferedio.c
Outdated
int | ||
_PyIO_buffered_at_fork_reinit(PyObject *self) | ||
{ | ||
Py_INCREF(self); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just do the INCREF after the if statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, of course. Still getting a hang of reference counting. Thanks for the tips! (This one should have been obvious, now that I look at it again).
Python/sysmodule.c
Outdated
PyObject *buffer = NULL; | ||
PyObject *result = NULL; | ||
|
||
PyObject *stdio = _PySys_GetObjectId(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this returns a borrowed reference, so there's no need to DECREF it: the code is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I suggest the name stream
, but stdio works too ;-)
Python/sysmodule.c
Outdated
|
||
PyObject *isatty = NULL; | ||
PyObject *buffer = NULL; | ||
PyObject *result = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is never used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Thanks again.
Ok, all the comments above from @JelleZijlstra should be addressed (thanks again for the help!). Also submitted PRs for two minor bugs I found in the test suite while working on this PR (those are #31204 and #31205). Neither of those is a requirement for this PR to work. They're just trivial fixes to a couple edge cases in the test suite. However, in a hypothetical scenario where:
then #31205 ensures that the timeout in our new test is respected when waiting for a deadlocked child process. That's just a minor bit of polish, but I'm trying to think about possible future scenarios, not just the code as it exists now. Let me know if there's anything else I can improve. :) |
Python/sysmodule.c
Outdated
PyObject *stdio = _PySys_GetObjectId(key); | ||
|
||
_Py_IDENTIFIER(isatty); | ||
isatty = _PyObject_CallMethodIdNoArgs(stdio, &PyId_isatty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the purpose of testing isatty().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. :)
Python/sysmodule.c
Outdated
PyObject *buffer = NULL; | ||
|
||
PyObject *stdio = _PySys_GetObjectId(key); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdio can be NULL or Py_None. You should do nothing in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
Python/sysmodule.c
Outdated
if (_PyObject_LookupAttrId(stdio, &PyId_buffer, &buffer) < 0) { | ||
/* stdout.buffer and stderr.buffer are not part of the | ||
* TextIOBase API and may not exist in some implementations. | ||
* If not present, no need to reinitialize their locks. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function can raises an AttributeError, no? If yes, you must clear it: PyErr_Clear().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! (The function is MUCH simpler now. Thanks for the tips.)
Python/sysmodule.c
Outdated
int ret = 0; | ||
|
||
PyObject *isatty = NULL; | ||
PyObject *buffer = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to move variable declaration where the variable is set for the first value, rather than declarating most of them at the top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to this style throughout.
Python/sysmodule.c
Outdated
PyObject *buffer = NULL; | ||
PyObject *result = NULL; | ||
|
||
PyObject *stdio = _PySys_GetObjectId(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I suggest the name stream
, but stdio works too ;-)
Modules/_io/bufferedio.c
Outdated
if (!Py_IS_TYPE(self, &PyBufferedWriter_Type)) { | ||
return 0; | ||
} | ||
Py_INCREF(self); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you hold a string pointer? _PyThread_at_fork_reinit() doesn't release the GIL, it only allocates memory on the heap. You can remove the INCREF/DECREF dance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Re:
stream
overstdio
. Changed. - Re: INCREF/DECREF dance. Done. (Still getting a feel for refcounting, thanks for the tips.)
@@ -265,5 +267,107 @@ def tearDown(self): | |||
pass | |||
|
|||
|
|||
class TestStdioAtForkReInit(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike such function test, it's fragile and may break tomorrow. I would prefer to remove it.
If you want to keep it, you should:
- create a new script and run the script from the test. So you better control the Python state when the test is run: which modules are imported, signals, etc.
- move it to test_io: the tested code lives in the _io module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why you think it's fragile? It seems useful as a regression test. I do agree that it would be more stable in a separate script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only part I didn't change yet, b/c I wanted to check with you guys on the details.
@vstinner All the at_fork_reinit
handlers in the codebase have your name on them, so I'll trust your judgment here. (Though I'm also curious for the same reasons @JelleZijlstra mentioned).
On the topic of the current test, you said "If you want to keep it, you should..."
What's the "else" condition for that sentence? I would prefer to do whatever you think makes the most sense. I don't have any strong opinions here, aside from a desire to have that one line in there that reinitializes the lock.
Test-wise, would you prefer (1) just moving the current test to a separate script? (2) removing the current test entirely? (3) a different test with certain properties that this one doesn't have?
I'm equally happy to do any of the above. :)
created child processes do not share locks with the parent. */ | ||
PyStatus | ||
_PySys_ReInitStdio(void) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sys.stdin has also a buffer which has a lock which must be reinitialized at fork.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, now PyBufferedWriter_Type
and PyBufferedReader_Type
both get their locks reinitialized.
Python/sysmodule.c
Outdated
int reinit_stderr = stdio_at_fork_reinit(&PyId_stderr); | ||
|
||
if (reinit_stdout < 0 || reinit_stderr < 0) { | ||
return _PyStatus_ERR("Failed to reinitialize stdout and stderr"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return _PyStatus_ERR("Failed to reinitialize stdout and stderr"); | |
return _PyStatus_ERR("Failed to reinitialize standard streams"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
standard streams
it is! :)
This commit: * Properly identifies whether we need to reinit a stream's lock. * Changes 'stdio' to 'stream' in several functions for clarity. * Also reinitialize stdin's buffer->lock at fork.
Mind if I rebase this on main and force push a small update? I fixed the PR to be consistent with #30928 (bpo-46541). i.e., (P.S. Am I imagining things, or do recent changes in main feel very Sam Gross-y? Trying not to get too excited, but I hope that's what's going on. 🎉) |
Generally better to merge main, so that the review history gets preserved. |
Cool, so this pattern?
Apologies for the dumb question. I've used git socially and github personally, but I'm still fairly new to using github socially. :) |
Something like this:
GitHub also lets you do this in the web interface if there's a conflict (the "Resolve conflicts" button at the bottom of the thread). |
Update for consistency with pythongh-30928 (bpo-46541).
Thanks! |
Add
_at_fork_reinit
handler tobuffer->lock
ofstdout
andstderr
, to fix a deadlock in print.https://bugs.python.org/issue46210