Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_concurrent_futures.test_shutdown: test_interpreter_shutdown() fails randomly (race condition) #109047

Closed
vstinner opened this issue Sep 7, 2023 · 8 comments
Labels
tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

vstinner commented Sep 7, 2023

The test_interpreter_shutdown() test of test_concurrent_futures.test_shutdown has a race condition. On purpose, the test doesn't wait until the executor completes (!). Moreover, it expects the executor to always be able to submit its job, and the job to complete successfully! It's a very optimistic bet.

See also issue #107219: test_concurrent_futures: test_crash_big_data() hangs randomly on Windows.


When Python is shutting down, Py_Finalize() quickly blocks the creation of new threads in _thread.start_new_thread():

if (interp->finalizing) {
PyErr_SetString(PyExc_RuntimeError,
"can't create new thread at interpreter shutdown");
return NULL;
}

This exception was added recently (last June) by commit ce558e6: see issue gh-104690 for the rationale.

The multiprocessing executor spawns _ExecutorManagerThread thread which runs its "main loop" in its run() method:

def run(self):
# Main loop for the executor manager thread.
while True:
self.add_call_item_to_queue()
result_item, is_broken, cause = self.wait_result_broken_or_wakeup()
if is_broken:
self.terminate_broken(cause)
return
if result_item is not None:
self.process_result_item(result_item)
process_exited = result_item.exit_pid is not None
if process_exited:
p = self.processes.pop(result_item.exit_pid)
p.join()
# Delete reference to result_item to avoid keeping references
# while waiting on new results.
del result_item
if executor := self.executor_reference():
if process_exited:
with self.shutdown_lock:
executor._adjust_process_count()
else:
executor._idle_worker_semaphore.release()
del executor
if self.is_shutting_down():
self.flag_executor_shutting_down()
# When only canceled futures remain in pending_work_items, our
# next call to wait_result_broken_or_wakeup would hang forever.
# This makes sure we have some running futures or none at all.
self.add_call_item_to_queue()
# Since no new work items can be added, it is safe to shutdown
# this thread if there are no pending work items.
if not self.pending_work_items:
self.join_executor_internals()
return

It tries to submit new jobs to the worker process through a queue, but oops, the Python main thread is finalizing (called Py_Finalizing())! There is not notification system to notify threads that Python is being finalized.


Moreover, there are 3 "finalization" states:

  • interp->finalizing -- used by _thread.start_new_thread() to block thread creation during Python finazlization
  • runtime->_finalizing -- used by sys.is_finalizing(), Py_IsFinalizing() and _PyRuntimeState_GetFinalizing(runtime)
  • interp->_finalizing -- used by ceval.c to decide if a Python thread "must exit" or not, as soon as it's set, all Python threads must exit as soon as they attempt to acquire the GIL

These 3 states at not set at the same time.

  1. Calling Py_Finalize() sets interp->finalizing to 1 as soon as possible: so spawning new threads is immediately blocked (which is a good thing to get a reliable finalization!)
  2. Py_Finalize() calls threading._shutdown() which blocks until all non-daemon threads completes
  3. Py_Finalize() calls atexit callbacks
  4. And only then, Py_Finalize() sets runtime->_finalizing and interp->_finalizing to the Python thread state (tstate) which calls Py_Finalize()

The delay between (1) and (4) can be quite long, a thread can take several milliseconds, if not seconds, to complete.


Can multiprocessing or concurrent.futures check if Python is finalizing or be notified? Well, did you hear about Time-of-check to time-of-use race conditions? Even if it would be possible, I don't think that we can "check" if it's safe to spawn a thread just before spawning a thread, since the main thread can decide to finalize Python "at any time". It will become even more tricky with Python nogil ;-)

So what's left? Well, multiprocessing and concurrent.futures should be optimistic, call Python functions and only then check for exceptions. Depending on the exceptions, they can decide how to handle it. I would suggest to exit as soon as possible, and try to cleanup resources if possible.

Another option would be to make multiprocessing and concurrent.futures more determistic. Rather than spawning threads and processes in the background "on demand" and hope that everything will be fine, add more synchronization to "wait" until everything is ready to submit jobs. I think that I already tried this approach in the past, but @pitrou didn't like it since it made some workloads slower. You may not always need to actually submits jobs. You may not always need all threads and processes.

Well, I don't know even these complex modules to tell which option is the least bad :-)

Finally, as usually, I beg you to make these APIs less magical, and enforce more explicit resources management! It shouldn't even be possible to not wait until an executor complete. It should be enforced by emitting loudly ResourceWarning warnings :-) Well, that's my opinion. I know that it's not shared by @pitrou :-)

Linked PRs

@vstinner
Copy link
Member Author

vstinner commented Sep 7, 2023

This issue is made of multiple sub-issues.

Here is a script to reproduce the RuntimeError: can't create new thread at interpreter shutdown error:

from concurrent.futures import ProcessPoolExecutor
from multiprocessing import get_context
import os
import atexit
import faulthandler
import random
import time

def sleep_and_print():
    delay = random.random() * 0.4
    print(f"sleep {delay*1e3:.1f} ms", flush=True)
    sleep(delay)
    print("apple", flush=True)

def print_time(start_time):
    dt = time.perf_counter() - start_time
    print(f"exit in {dt*1e3:.1f} ms")

def main():
    print(f"test pid={os.getpid()}")
    start_time = time.perf_counter()
    atexit.register(print_time, start_time)

    faulthandler.dump_traceback_later(5, exit=True)
    context = get_context('spawn')
    executor = ProcessPoolExecutor(5, mp_context=context)
    executor.submit(sleep_and_print)
    #faulthandler.cancel_dump_traceback_later()
    #executor.shutdown(wait=True)
    #print("exit")

if __name__ == "__main__":
    main()

The script is based on test_interpreter_shutdown() of Lib/test/test_concurrent_futures/test_shutdown.py.

To trigger the issue, I run these two commands in parallel in two terminals:

  1. while true; do ./python script.py || break; done
  2. ./python -m test -j4 -r -- Increase -j4 if you cannot reproduce the issue. In fact, any command to stress t the system works.

I easily reproduce the issue on FreeBSD. So far I failed reproduce the bug with this method on Linux.

Logs:

test pid=29139
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/home/vstinner/python/main/Lib/threading.py", line 1059, in _bootstrap_inner
    self.run()
  File "/usr/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 339, in run
    self.add_call_item_to_queue()
  File "/usr/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 394, in add_call_item_to_queue
    self.call_queue.put(_CallItem(work_id,
  File "/usr/home/vstinner/python/main/Lib/multiprocessing/queues.py", line 94, in put
    self._start_thread()
  File "/usr/home/vstinner/python/main/Lib/multiprocessing/queues.py", line 177, in _start_thread
    self._thread.start()
  File "/usr/home/vstinner/python/main/Lib/threading.py", line 978, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't create new thread at interpreter shutdown
exit in 122.7 ms
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/home/vstinner/python/main/Lib/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/home/vstinner/python/main/Lib/multiprocessing/spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/home/vstinner/python/main/Lib/multiprocessing/synchronize.py", line 115, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory

@vstinner
Copy link
Member Author

vstinner commented Sep 7, 2023

I saw two kinds of behavior:

  • test_interpreter_shutdown() hangs on Windows
  • test_interpreter_shutdown() fails with RuntimeError: can't create new thread at interpreter shutdown on Linux on FreeBSD

Errors:

@vstinner
Copy link
Member Author

vstinner commented Sep 7, 2023

I confirm that the commit ce558e6 introduced the regression.

@gpshead
Copy link
Member

gpshead commented Sep 7, 2023

I'm not surprised about the "cause" commit that exposed these problems. I agree with your overall analysis theme: We need to figure out how to make this situation more clear and handle the errors appropriately so that things that blindly thought they could keep running forever get a concrete exception that they can catch and cleanup and shutdown gracefully with. In some cases this probably also means adjusting the expectation of some tests.

(I have noticed a few CI runs and buildbots with these errors popping up)

@vstinner
Copy link
Member Author

Trying to check if Python is being finalized before running an action is unsafe, there is still a risk that Python finalization starts just after the check, and so the action (like creating a thread) will fail.

The safe pattern is to raise an exception and catch the exception. My problem here is that RuntimeError is quite generic.

@ericsnowcurrently @gpshead: What would you think of introducing a new PythonFinalizationError exception uses by the few spots which blocks actions during Python finalization? I'm open to other name suggestions :-)

multiprocessing, threading and other modules could expect such error and cleans everything properly to handle Python finalization smoothly. For example, if a multiprocessing manager gets it, it would be a signal that ah, it's now time to stop everything.

multiprocessing already has special code path for BrokenPipeError which is treated in similar way: see commit a9b1f84 which raises BrokenPipeError if a pipe is closed, so the caller gets BrokenPipeError and can exit properly. See gh-107219 and PR #109244.

@gpshead
Copy link
Member

gpshead commented Sep 23, 2023

PythonFinalizationError is a good a name as any. This is the kind of exception we don't expect anyone (well, users anyways) to ever catch as there really isn't anything they can likely meaningfully do if the did. (like SystemError?)

vstinner added a commit to vstinner/cpython that referenced this issue Sep 24, 2023
Add PythonFinalizationError. This exception derived from RuntimeError
is raised when an operation is blocked during the Python
finalization.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 24, 2023
Add PythonFinalizationError. This exception derived from RuntimeError
is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
Add PythonFinalizationError. This exception derived from RuntimeError
is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
…thonFinalizationError

_ExecutorManagerThread of concurrent.futures now catches
PythonFinalizationError: call terminate_broken(). The exception
occurs when adding an item to the "call_queue" creates a new thread
while Python is being finalized.

Add test_python_finalization_error() to test_concurrent_futures.

Changes:

* _ExecutorManagerThread.terminate_broken() no longer calls
  shutdown_workers() since the queue is no longer working anymore
  (read and write ends of the queue pipe are closed).
* multiprocessing.Queue:

  * Add _terminate_broken() method.
  * _start_thread() starts _thread to None on exception to prevent
    leaking "dangling threads" even if the thread was not started
    yet.

* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
_ExecutorManagerThread of concurrent.futures now catches
PythonFinalizationError: call terminate_broken(). The exception
occurs when adding an item to the "call_queue" creates a new thread
while Python is being finalized.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() starts _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
concurrent.futures: The 'executor manager thread' now catches
PythonFinalizationError, it calls terminate_broken(). The exception
occurs while Python is being finalized when adding an item to the
'call queue' tries to create a new 'queue feeder' thread.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() starts _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
concurrent.futures: The 'executor manager thread' now catches
PythonFinalizationError, it calls terminate_broken(). The exception
occurs while Python is being finalized when adding an item to the
'call queue' tries to create a new 'queue feeder' thread.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() starts _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 25, 2023
concurrent.futures: The 'executor manager thread' now catches
PythonFinalizationError, it calls terminate_broken(). The exception
occurs while Python is being finalized when adding an item to the
'call queue' tries to create a new 'queue feeder' thread.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
@vstinner
Copy link
Member Author

PythonFinalizationError is a good a name as any.

I created PR #109809 to add the exception.

This is the kind of exception we don't expect anyone (well, users anyways) to ever catch as there really isn't anything they can likely meaningfully do if the did. (like SystemError?)

I'm not sure that nobody will ever use it. We expose sys.is_finalizing() since there are projects which need to know when Python is being finalized. Recently, I added Py_IsFinalizing() to the C API for a similar reason:

vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ueue()

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form of
  traceback.format_exception().
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

ProcessPoolExecutor changes:

* ProcessPoolExecutor.submit() now starts by checking if the executor
  is broken.

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ueue()

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form (1 argument,
  not 3) of traceback.format_exception().
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ueue()

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form (1 argument,
  not 3) of traceback.format_exception().
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ueue()

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working anymore (read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form (1 argument,
  not 3) of traceback.format_exception().
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit that referenced this issue Sep 29, 2023
concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working anymore (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes, not only
  wait until they complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ython#109810)

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working anymore (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes, not only
  wait until they complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.

(cherry picked from commit 6351842)
vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
…ython#109810)

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working anymore (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes, not only
  wait until they complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.

(cherry picked from commit 6351842)
@vstinner
Copy link
Member Author

Fixed by commit 6351842

I abandoned my idea of adding PythonFinalizationError. It's not strictly needed: #109809 (comment)

Yhg1s pushed a commit that referenced this issue Oct 2, 2023
…110126)

gh-109047: concurrent.futures catches PythonFinalizationError (#109810)

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working anymore (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes, not only
  wait until they complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.

(cherry picked from commit 6351842)
encukou added a commit to encukou/cpython that referenced this issue Jan 23, 2024
…ot concurrent.futures

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by:	Serhiy Storchaka <storchaka@gmail.com>
encukou added a commit that referenced this issue Jan 24, 2024
…current.futures (GH-114489)

This was left out of the 3.12 backport for three related issues:
- gh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- gh-109370 (which changes this to be only called on Windows)
- gh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Feb 19, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Feb 19, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Feb 19, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Feb 19, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Feb 21, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Jul 11, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Jul 11, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Jul 11, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
naveen521kk pushed a commit to msys2-contrib/cpython-mingw that referenced this issue Aug 5, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
…ython#109810)

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working anymore (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes, not only
  wait until they complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditons with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
naveen521kk pushed a commit to naveen521kk/cpython that referenced this issue Sep 4, 2024
…ot concurrent.futures (pythonGH-114489)

This was left out of the 3.12 backport for three related issues:
- pythongh-107219 (which adds `self.call_queue._writer.close()` to `_ExecutorManagerThread` in `concurrent.futures`)
- pythongh-109370 (which changes this to be only called on Windows)
- pythongh-109047 (which moves the call to `multiprocessing.Queue`'s `_terminate_broken`)

Without this change, ProcessPoolExecutor sometimes hangs on Windows
when a worker process is terminated.

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Tests in the Lib/test dir
Projects
Development

No branches or pull requests

2 participants