
gh-126434: Use multiprocessing.Value for multiprocessing.Event to avoid deadlock when there is reentrant usage of set from is_set, e.g. when handling system signals #126437

Draft
wants to merge 14 commits into base: main

Conversation


@ivarref ivarref commented Nov 5, 2024

Please see issue #126434 for details about the bug and code to reproduce the deadlock.

Kind regards.

@ivarref ivarref requested a review from gpshead as a code owner November 5, 2024 11:03

cpython-cla-bot bot commented Nov 5, 2024

All commit authors signed the Contributor License Agreement.
CLA signed


bedevere-app bot commented Nov 5, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Contributor

@Zheaoli Zheaoli left a comment


Mostly LGTM

BTW, this PR title needs to be changed.

@ivarref ivarref changed the title gh-126434: Fix multiprocessing.Event to use reentrant lock to support .set()… gh-126434: Use RLock for multiprocessing Event to avoid deadlock when handle system signal (#126437) Nov 5, 2024
@ivarref ivarref changed the title gh-126434: Use RLock for multiprocessing Event to avoid deadlock when handle system signal (#126437) gh-126434: Use RLock for multiprocessing Event to avoid deadlock when handle system signal Nov 5, 2024
Author

ivarref commented Nov 5, 2024

Mostly LGTM

BTW, this PR title needs to be changed.

I've changed the title @Zheaoli . What do you think? Thanks again.

Contributor

@Zheaoli Zheaoli left a comment


LGTM

Contributor

Zheaoli commented Nov 5, 2024

This patch fails the free-threading TSAN test; we need to figure out the root cause.

Member

tomasr8 commented Nov 5, 2024

This test seems to fail in other PRs as well; I restarted the job to see whether it fails consistently.

Contributor

@colesbury colesbury left a comment


The Event implementation is not correct with an RLock because that allows reentrancy (from the same thread) during operations that are not written to be reentrant. For example:

    def is_set(self):
        with self._cond:
            if self._flag.acquire(False):
                self._flag.release()
                return True
            return False

Between the self._flag.acquire(False) and the self._flag.release(), reentrant is_set() calls will incorrectly report False.

Event is not written to be safely called from a Python signal handler.


bedevere-app bot commented Nov 5, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Contributor

Zheaoli commented Nov 5, 2024

Event is not written to be safely called from a Python signal handler.

If this is the original design behavior, should we update the docs?

Member

@ZeroIntensity ZeroIntensity left a comment


This needs a test, anyway.

Member


This needs rewriting to describe the current approach. Also, please update the PR title and description.

Author


Thanks. I've updated the description now.

What do you think @gpshead ?

@ivarref ivarref changed the title gh-126434: Use RLock for multiprocessing Event to avoid deadlock when handle system signal gh-126434: Use Value for multiprocessing Event to avoid deadlock when handling system signals Nov 11, 2024
Author

ivarref commented Nov 11, 2024

Thanks for the feedback from all of you: @colesbury, @gpshead, @Zheaoli, @tomasr8 and @ZeroIntensity.

I agree with the criticism that my initial suggestion for is_set had a bug.

I've written a new implementation. I believe this new approach solves the deadlock described in the original issue: a reentrant use of set() in combination with is_set().

I also added some asserts: instead of a deadlock, an assertion error will now be raised. What do you think about this? Personally I'd rather have an exception than a deadlock.

@ZeroIntensity How would you write a test for such a concurrency issue? I can only think of using some sort of metaprogramming, but at the moment I don't know enough Python to do it.

…to avoid deadlock when there is reentrant usage of `set` from `is_set`, e.g. when handling system signals
@ivarref ivarref changed the title gh-126434: Use Value for multiprocessing Event to avoid deadlock when handling system signals gh-126434: Use multiprocessing.Value for multiprocessing.Event to avoid deadlock when there is reentrant usage of set from is_set, e.g. when handling system signals Nov 11, 2024
Member

@ZeroIntensity How would you write a test for such a concurrency issue? I can only think of using some sort of metaprogramming, but at the moment I don't know enough Python to do it.

Here, you would run the reproducer from gh-126434 in a subprocess and check the return code + stdout/stderr. You can do this pretty easily with subprocess.
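
A minimal sketch of that suggestion (the file name and layout are hypothetical, and this uses plain subprocess rather than the test.support helpers suggested further down):

    import subprocess
    import sys
    import unittest

    class TestEventReproducer(unittest.TestCase):
        def test_reproducer_exits_cleanly(self):
            # Hypothetical: the reproducer from gh-126434 saved next to the test.
            proc = subprocess.run(
                [sys.executable, "is_set_set.py"],
                capture_output=True, text=True, timeout=60,
            )
            # A deadlock surfaces as a TimeoutExpired; a crash as a nonzero exit.
            self.assertEqual(proc.returncode, 0, proc.stderr)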

As a side note, don't force push--we squash everything down to one commit at the end anyway. Force pushing just makes it harder to keep track of what's going on.

Author

ivarref commented Nov 13, 2024

The key word here is stable. I'm not sure switching to Value is worth it for an edge case. If we do end up taking the change, I'm against backporting it for now.

What is the cost of not switching to Value (and this implementation)? Is it better to keep a deadlock just so that the code can keep using semaphores? I don't think keeping a deadlock is a good trade-off for keeping the internal implementation semaphore-only.

The newly pushed code will detect the situation and raise an exception when .set() is called from a thread that is already .wait()-ing. It uses a thread-local lock to accomplish this. I know that is not very "pretty", but it's better (IMO) than a deadlock, and also the only way to solve this (short of spawning a new thread).
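
A minimal sketch of that detection idea, using threading primitives for brevity (illustrative, not the PR's actual code):

    import threading

    class Event:
        def __init__(self):
            self._cond = threading.Condition()
            self._flag = False
            self._local = threading.local()  # per-thread "am I waiting?" marker

        def wait(self, timeout=None):
            self._local.waiting = True
            try:
                with self._cond:
                    return self._cond.wait_for(lambda: self._flag, timeout)
            finally:
                self._local.waiting = False

        def set(self):
            # Raise instead of deadlocking when called (e.g. from a signal
            # handler) on a thread that is currently inside wait().
            if getattr(self._local, "waiting", False):
                raise RuntimeError("set() called from a wait()-ing thread")
            with self._cond:
                self._flag = True
                self._cond.notify_all()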

Here is an issue describing this wait->set problem: #85772. The issue is from 2020 and links to a Stack Overflow question from 2014, so I think it's safe to say that these deadlocks have been around and affecting developers and programs for some time.

The newly pushed code also solves #95826. The root cause there was that multiprocessing.Event.wait() re-checked the internal flag after returning from the condition wait, incorrectly ignoring the return value of Condition.wait. If another thread executed multiprocessing.Event.clear() after the try block in Condition.wait had completed, but before its finally block re-acquired the locks, Event.wait() would re-check the internal flag and incorrectly return False, indicating that a timeout had occurred.
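
Sketched in threading terms (illustrative, not the actual multiprocessing code), the buggy pattern versus the fix, as drop-in variants of wait() for the Event sketch above:

    # Buggy (gh-95826): ignores the result of Condition.wait and re-checks
    # the flag, racing with a concurrent clear() that runs after the notify
    # but before the lock is re-acquired.
    def wait_buggy(self, timeout=None):
        with self._cond:
            if not self._flag:
                self._cond.wait(timeout)
            return self._flag  # may be False again -> bogus "timeout"

    # Fixed: report what actually happened during the wait.
    def wait_fixed(self, timeout=None):
        with self._cond:
            if self._flag:
                return True
            return self._cond.wait(timeout)  # True iff notified in time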

I've added tests so that these three issues are covered.

CC'ing users from the referenced issues, should they still be interested in this topic: @salvatoreingala @marcos4758 @mcclurem @mdboom

Thanks and kind regards.

Member

We're breaking users relying on the private API (which is their fault for doing so, but it's not exactly encouraging...), and this implementation doesn't seem correct, because multiprocessing.Value is not thread-safe. You'll have to put a lock over it, but won't that result in the same deadlock that we see currently?

Author

ivarref commented Nov 14, 2024

Thanks again for your time and comments @ZeroIntensity .

We're breaking users relying on the private API (which is their fault for doing, but it's not exactly encouraging...)

Right, that is fair enough.

My understanding was that plain writes and reads are atomic, while operations involving both a read and a write, e.g. an increment, are not; please correct me if I'm wrong (source for this claim). Anyhow: the locks are now taken explicitly.
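
For reference, a minimal example of explicit locking around multiprocessing.Value (a read-modify-write such as += is not atomic, while the per-Value lock from get_lock() makes it safe):

    from multiprocessing import Process, Value

    def bump(counter, n):
        for _ in range(n):
            with counter.get_lock():  # bare "counter.value += 1" can lose updates
                counter.value += 1

    if __name__ == "__main__":
        counter = Value("i", 0)
        procs = [Process(target=bump, args=(counter, 10_000)) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(counter.value)  # 40000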

I rewrote the implementation now. My claim is that the new implementation is completely reentrant, i.e. it is safe to call from signal handlers. It should thus fix #85772 and #95826, as well as this issue (#126434). No exceptions are raised and no deadlocks should occur. There is a highly theoretical chance of a race condition that is described inside the code. I believe the chance is so low that it is safe to ignore. My guess is that an out of memory exception would occur long before the race condition could occur.

The big change is that instead of waiting on a condition, it now busy-waits on a flag and a set_id. Using a set_id avoids a race condition between set and clear: set will also work when invoked from a signal handler, and the wait()-ing thread will continue correctly. See the implementation for details.

The implementation sleeps for 10 milliseconds when busy-waiting. On my machine, a MacBook Air (15-inch, M3, 2024), CPU usage while wait()-ing is at most 0.5%. (Use top -p <pid> on Linux and top -pid <pid> on OS X to show CPU usage.) The increased CPU usage is the downside of this implementation (I haven't actually measured the CPU usage of the old implementation, but I assume it is much lower). But somewhat increased CPU usage is much better than a deadlock or an exception, in my opinion.
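
A condensed sketch of the busy-wait described above (illustrative, not the PR's exact code; flag and set_id stand for the shared Values):

    import time

    SPIN_SLEEP = 0.010  # the 10 ms sleep mentioned above

    def wait(flag, set_id, timeout=None):
        # Remember the "set generation" we started in: a set() followed by a
        # quick clear() still bumps set_id, so the wake-up is not lost.
        start_id = set_id.value
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            if flag.value or set_id.value != start_id:
                return True
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(SPIN_SLEEP)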

What do you think?

Update: I've also updated the blurb.

Thanks and kind regards.


Hello, @ivarref!

A while ago, I decided to be happy and stopped using wait() 😁

Looking at your strategy I noticed that it is similar to what I did:

[screenshot of code omitted]

Before that, I tried using a wait on an event, which cost me hours of debugging and drained my patience.

I know it's not elegant, but it works.

Member

I'll look closer at this later today.

…set() call must be made to reach race condition possibility.
@ZeroIntensity ZeroIntensity added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 14, 2024

🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit bec9070 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 14, 2024
Member

ZeroIntensity commented Nov 14, 2024

Yeah, the failing buildbots for test_multiprocessing_event make me very uneasy about the stability of the change.

script = support.findfile("is_set_set.py", subdir="multiprocessingdata")
for x in range(10):
    try:
        assert 0 == subprocess.call([sys.executable, script], timeout=60)
Member


Use test.support.assert_python_ok or assert_python_failure rather than launching sys.executable yourself; child interpreters may need flags passed, and those helpers take care of that.
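
A sketch of that suggestion, reusing the script variable from the snippet above (assert_python_ok lives in test.support.script_helper):

    from test.support.script_helper import assert_python_ok

    # Runs `script` in a fresh interpreter with the appropriate flags and
    # fails unless it exits with status 0.
    rc, out, err = assert_python_ok(script)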

@support.requires_subprocess()
class TestEventSignalHandling(unittest.TestCase):
    def test_no_race_for_is_set_set(self):
        import subprocess
Member


Put this import at the top; the module is always importable regardless of whether it actually works on the platform, which is what requires_subprocess tells you.

elif timeout is not None and (time.monotonic() - start_time) > timeout:
    return False
else:
    # Fixes https://github.com/python/cpython/issues/85772 by spinning and sleeping.
Member


Iff (let's not assume we should go this route - see my comment on the issue) we're going to abandon use of the native OS platform APIs which properly implement timeouts on inter-process semaphores without a busy loop via _multiprocessing.SemLock in Modules/_multiprocessing/semaphore.c, sleeping in the loop should be done in an exponential back-off fashion, as was the case in threading itself before we started using proper OS APIs there. See https://github.com/python/cpython/blob/v3.1.3/Lib/threading.py#L227 for the old exponential back-off delay example.
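
A sketch of that back-off pattern, modeled on the pre-3.2 threading.py loop linked above (illustrative only):

    import time

    def wait_with_backoff(predicate, timeout=None):
        delay = 0.0005  # start at 0.5 ms, as the old threading.py did
        endtime = None if timeout is None else time.monotonic() + timeout
        while not predicate():
            if endtime is not None:
                remaining = endtime - time.monotonic()
                if remaining <= 0:
                    return False
                delay = min(delay * 2, remaining, 0.05)  # cap at 50 ms
            else:
                delay = min(delay * 2, 0.05)
            time.sleep(delay)
        return True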

Doing a busy loop with sleeps as a low-level primitive in 2024 feels very wrong to me. Such loops have always been really unfriendly to both latency (unnecessary delays) and power usage (frequent unnecessary wakes).

I suggested an alternate idea in #126434.

History

Prior to CPython 3.2 threading.Condition was implemented with a back-off in a similar manner. 3.2 improved on that old hack by using the OS APIs for lock timeouts in 7c3e577

Author


Thanks @gpshead and @ZeroIntensity

I will read your comments and suggestions thoroughly tomorrow, and come back to you.

Thanks and kind regards.

@gpshead gpshead marked this pull request as draft November 14, 2024 16:52
Member

gpshead commented Nov 14, 2024

I marked this as Draft just to make it clear that the direction to go has not been decided upon yet.

Member

If we do go with this, might it be better to put it under a new feature rather than Event (perhaps something like multiprocessing.SignalSafeEvent)?

Author

ivarref commented Nov 28, 2024

I rewrote the approach to use a dedicated thread for signal handling. The initial signal handling is still done on the main thread, but it merely pushes the signal to a dedicated thread. This is done using queue.SimpleQueue, whose .put is reentrant. See #126434 (comment) for an example program that uses the new signal handler.
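
A condensed sketch of that mechanism (names are hypothetical; handle_signal stands in for the user-specified handler):

    import queue
    import signal
    import threading

    _signal_queue = queue.SimpleQueue()

    def _trampoline(signum, frame):
        # Runs in the main thread's signal context; SimpleQueue.put is
        # reentrant, so this is safe even if it interrupts other code.
        _signal_queue.put(signum)

    def _worker():
        while True:
            signum = _signal_queue.get()
            handle_signal(signum)  # hypothetical user-specified handler

    threading.Thread(target=_worker, daemon=True).start()
    signal.signal(signal.SIGINT, _trampoline)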

As asked in the issue comment above: is this a viable path forward, implementing this in Python alone and leaving the current C code as is? Are there other considerations that must be taken into account?

Author

ivarref commented Dec 3, 2024

The current approach is now:

  • Run the signal handler on a dedicated daemon thread.
  • Send signals from the main thread to the dedicated signal thread using SimpleQueue.put, which supports reentrant usage.
  • If an unexpected exception occurs inside the signal handler, it will be logged using traceback.print_exc().

Example flow:

sequenceDiagram
    participant OS
    participant MainThread
    participant SignalHandlerThread
    OS->>MainThread: SIGINT
    MainThread->>SignalHandlerThread: send SIGINT using SimpleQueue.put
    SignalHandlerThread->>SignalHandlerThread: invoke user specified handler
    SignalHandlerThread->>SignalHandlerThread: handler calls event.set()

Discussion: what should be done in the event of a signal handler calling sys.exit()? If nothing is done when sys.exit is invoked from a signal thread, I suppose this will break a decent amount of code?

Current fix: send an exit event to the main thread and send the same signal to interrupt the main thread again, i.e. the reverse flow of the diagram above:

sequenceDiagram
    participant OS
    participant MainThread
    participant SignalHandlerThread
    SignalHandlerThread->>SignalHandlerThread: handler calls sys.exit(0)
    SignalHandlerThread->>MainThread: send SYSEXIT event using SimpleQueue.put
    SignalHandlerThread->>SignalHandlerThread: raise_signal(SIGINT)
    OS->>MainThread: SIGINT
    MainThread->>MainThread: checks SYSEXIT queue
    MainThread->>MainThread: Calls sys.exit(0)
    MainThread->>MainThread: `finally` blocks will be executed.

This fix is the best I could think of. What do you think of this solution?

Not all tests pass at the moment.

CC @ZeroIntensity @gpshead Thanks for your time and patience.

Member

21 files changed

First of all, I'm impressed by your dedication!

Unfortunately, and I hate to say this, but judging by the complexity of this change, and by the fact that we need whole diagrams to explain it, I think we need a PEP. There's a lot that we need to think about here, and we just can't get enough community input from a PR.

Author

ivarref commented Dec 4, 2024

Thanks for your kind words @ZeroIntensity

Re 21 files: most of that is adding a line for the signal thread shutdown for a bunch of tests.

A PEP: I am not sure what to say about that. In my opinion this is not a big/huge change, and the diagram was just to illustrate how it works. I'm sure a ton of work has already been put into the signal module and into handling signals (as of now) only on the main thread. A ton of work must also have been done with respect to the different platforms (I read about how to do os.kill on Windows, for example; it was not easy). This makes me sceptical of changing any of that.

To me it seems much easier to build this feature on top of the existing Python infrastructure, i.e. in a way that does not involve changing the underlying "mechanics" in C, the OS layer, etc. That makes it a moderate change IMO. And since the old behavior is kept, it is possible to opt out if needed.

Perhaps it could break something in unexpected ways. I did not think about (or know any details of) sys.exit before trying to fix a test. Are there more cases that would require special handling, or cause breakage for users?

What do you think, @gpshead and @colesbury: does this warrant a PEP?

If you all "vote" for a PEP, that is fine by me, and I could try to draft something.

Thanks and kind regards.

Member

FWIW, we'll likely need changes to the C code anyway. Global state in modules that affect the whole process doesn't play well with subinterpreters, so I suspect that this will have to be implemented on the runtime structure.

Author

ivarref commented Dec 5, 2024

Thanks @ZeroIntensity

Unlike sub-interpreters, the main interpreter has unique process-global responsibilities like signal handling.

from https://docs.python.org/3/c-api/init.html#sub-interpreter-support

I'm reading this as: the main interpreter will be the only interpreter responsible for signal handling. The subinterpreters should not need to access the signal module, nor the global state in that module. I suppose the signal functions can assert that they are being called on the main interpreter (or perhaps they already do so at the C level?).

That said: I know next to nothing about subinterpreters. Is there some other way that the global state can be corrupted? Or: how does global module state corrupt subinterpreters? (Is there existing global module state? I suppose there is?)

Thanks and kind regards.

Member

Ah, by that it means that the main interpreter is responsible for dealing with signals sent to the Python process, so SIGINT. The signal module can still be imported and used in a subinterpreter (it's re-initialized for every interpreter). This will cause an extra, useless thread for every new interpreter, and that won't scale well. (It's not an impossible problem to solve--you can check the interpreter via _interpreters.get_current--but thinking about these kinds of things is exactly why I'm in favor of PEPing this.)

I suppose the signal functions can assert that they are being called on the main interpreter (or perhaps does so already in C-level?).

I don't think so, but that's probably a bug with subinterpreters themselves. I'll look into that separately.

Anyways, the experimental, buggy subinterpreters aside, I'm not sure that the implementation even works, because CI is failing across the board. I'd recommend getting tests green before discussing the philosophical side of it.
