
Worker processes #1781

Closed

Conversation

@richardsheridan (Contributor) commented Oct 31, 2020

This PR provides the missing piece to close #5. This solution is based on multiprocessing and mimics the modern Trio thread cache. (You will recognize a lot of re-used code!)

My motivation for writing this is that most of my projects end up CPU-bound. I iterated on a number of ways to interface with multiprocessing and concurrent.futures, and found some nearly trivial solutions that met my current needs. But I could see some gaps that I thought really needed filling, and just couldn't let go. So, here we are!

The workers have the following features "requested" in #5:

  • Bypass the GIL for CPU-bound work (see the usage sketch below these lists)
  • Cache processes LIFO
  • Minimal internal complexity (modulo multiprocessing)

And some extras I feel are very important:

  • Cross platform
  • Cancel seriously misbehaving code via SIGKILL/TerminateProcess
  • Convert segfaults and other scary things to catchable errors
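
For concreteness, here is a minimal usage sketch. The public API isn't wired up yet, so the trio.to_process.run_sync spelling and signature below are assumptions that mirror trio.to_thread.run_sync; the heartbeat task is just there to show the event loop staying responsive while the CPU-bound call runs in a worker process:

```python
import trio


def fib(n):
    # Deliberately CPU-bound, synchronous work that would pin the GIL in a thread.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


async def heartbeat():
    while True:
        print("still responsive")
        await trio.sleep(0.5)


async def main():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(heartbeat)
        # Runs in a cached worker process; other tasks keep running meanwhile.
        print(await trio.to_process.run_sync(fib, 32))
        nursery.cancel_scope.cancel()


if __name__ == "__main__":
    trio.run(main)
```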

If those are the good, then the bad and ugly are:

  • Uses threads to make multiprocessing async
  • Has all the usual multiprocessing caveats (freeze_support, pickleable objects only)
  • Daemon processes can't have children, but they can talk to each other with e.g. multiprocessing.Manager (see the sketch below this list)
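
And a sketch of those caveats in one place, again with the run_sync spelling assumed as above: the function and its arguments must pickle, freeze_support handles frozen Windows executables, and a multiprocessing.Manager proxy is one way daemonic workers can still share state without spawning children:

```python
import multiprocessing

import trio


def record(shared, key, value):
    # Must live at module level so it pickles; Manager proxies also pickle
    # cleanly, so a daemonic worker process can write to shared state.
    shared[key] = value


async def main(shared):
    await trio.to_process.run_sync(record, shared, "answer", 42)  # spelling assumed


if __name__ == "__main__":
    multiprocessing.freeze_support()  # no-op except in frozen Windows executables
    with multiprocessing.Manager() as manager:
        shared = manager.dict()
        trio.run(main, shared)
        print(dict(shared))  # {'answer': 42}
```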

In a sentence, the case for basing these workers on multiprocessing is that it keeps a lot of complexity outside of Trio while offering a set of quirks that users are likely already familiar with.

From #5 I understand that enthusiasm for (1) including a to_process.run_sync API in Trio and (2) basing it on multiprocessing has dampened over the years, but I wanted to offer you the first bite at the apple (and use your CI!) before I go off and try to learn all the bits to make a trio-cookiecutter project work.

TODO:

  • Write tests
  • Expose public API
  • Write docs
  • Write newsfragment

@codecov (bot) commented Oct 31, 2020

Codecov Report

Merging #1781 (21d802f) into master (f751897) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           master    #1781    +/-   ##
========================================
  Coverage   99.64%   99.64%            
========================================
  Files         114      117     +3     
  Lines       14503    14759   +256     
  Branches     1105     1112     +7     
========================================
+ Hits        14451    14707   +256     
  Misses         37       37            
  Partials       15       15            
Impacted Files                       Coverage Δ
trio/__init__.py                     100.00% <100.00%> (ø)
trio/_worker_processes.py            100.00% <100.00%> (ø)
trio/tests/test_worker_process.py    100.00% <100.00%> (ø)
trio/to_process.py                   100.00% <100.00%> (ø)

@richardsheridan (Contributor, Author) commented:

I would really appreciate some advice on how to make the segfault test not flaky! 😅

@belm0 (Member) commented Nov 23, 2020

interesting PR!

In our Trio experience, the use cases we've had for subprocesses have been 1) some synchronous code, like a heavy scipy call, that can't be divided, where otherwise we're playing games with "manual scheduling", i.e. trio.sleep() calls to avoid large step sizes disrupting the responsiveness of other tasks; and 2) needing a heterogeneous app to expand beyond one CPU core.

I once did make a primitive subprocess worker for the app, simply using the subprocess module and communicating pickled objects manually over a websocket. It worked well, though it was eventually retired due to broader changes in the system. If the case comes up again I may give this a try.

@auvipy left a comment:

this would be great stuff

@richardsheridan (Contributor, Author) commented:

Closing in favor of trio-parallel. Feel free to reopen someday though!
