Make get_event_loop() return the current loop if called from coroutines/callbacks #452
Conversation
Looks good!
```diff
@@ -400,6 +400,8 @@ def run_forever(self):
         old_agen_hooks = sys.get_asyncgen_hooks()
         sys.set_asyncgen_hooks(firstiter=self._asyncgen_firstiter_hook,
                                finalizer=self._asyncgen_finalizer_hook)
+        old_loop = events._get_current_loop()
+        events._set_current_loop(self)
```
This should go inside the `try` block, so that if it executes partially the `finally` will still restore the old loop.
Sure, will fix this.
```diff
@@ -607,6 +607,27 @@ def new_event_loop(self):
 # Lock for protecting the on-the-fly creation of the event loop policy.
 _lock = threading.Lock()
 
+# A TLS for the current event loop, used by get_current_loop.
+_current_loop = threading.local()
```
Why not use a subclass (like in BaseDefaultEventLoopPolicy)?
OK, will add a subclass. I guess avoiding a `getattr` call makes the code cleaner.
```diff
@@ -449,7 +449,13 @@ def new_test_loop(self, gen=None):
         self.set_event_loop(loop)
         return loop
 
+    def setUp(self):
+        self._get_current_loop = events._get_current_loop
+        events._get_current_loop = lambda: None
```
Why not use mock.patch() as a class decorator? If I understand the docs correctly then you don't need to change everything to call super().setUp().
`mock.patch` doesn't apply to subclasses of TestCase. So we'd need to decorate all TestCases, and we have a lot of them. IMO it's cleaner to fix the `setUp` calls.
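The contrast being argued here can be sketched as follows. A patch installed in a shared base class's `setUp()` applies to every subclass automatically, whereas a `@mock.patch` class decorator would have to be repeated on each TestCase subclass. The `events` object below is a stand-in, not the real `asyncio.events` module:

```python
import unittest
from unittest import mock

# Stand-in for the asyncio.events module being patched.
class events:
    @staticmethod
    def _get_current_loop():
        return "real-loop"

class BaseTestCase(unittest.TestCase):
    def setUp(self):
        # Patch in setUp so every subclass that calls super().setUp()
        # gets the patched behaviour; addCleanup restores it per test.
        patcher = mock.patch.object(events, "_get_current_loop",
                                    return_value=None)
        patcher.start()
        self.addCleanup(patcher.stop)

class SomeAsyncioTest(BaseTestCase):
    def test_loop_is_patched(self):
        self.assertIsNone(events._get_current_loop())
```

The trade-off mentioned in the thread: this requires all existing test classes to call `super().setUp()`, which is the bulk of the patch.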
```diff
 loop = None
 
+old_policy = asyncio.get_event_loop_policy()
+asyncio.set_event_loop_policy(Policy())
```
Again, this ought to be inside the `try`.
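The structure the review asks for can be sketched like this; `Policy` here is a stand-in for the test policy in the diff above, and the shape mirrors the earlier comment about `run_forever`:

```python
import asyncio

# Stand-in test policy (the real one comes from the test suite).
class Policy(asyncio.DefaultEventLoopPolicy):
    pass

# Swap the policy inside the try block so the finally clause always
# restores the original, even if the guarded section raises.
old_policy = asyncio.get_event_loop_policy()
try:
    asyncio.set_event_loop_policy(Policy())
    policy_during = asyncio.get_event_loop_policy()
finally:
    asyncio.set_event_loop_policy(old_policy)
```

Reading `old_policy` before the `try` is fine; it's the mutation (`set_event_loop_policy`) that must be covered by the `finally`.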
OK, nm on the mock.patch() thing then.
@gvanrossum I've updated the PR.
What about my suggestion (in the previous PR, sorry) of using "active" or "running" instead of "current"? (I am coming back to this because in my head I keep missing the distinction between get_event_loop() and get_current_loop(). :-)
I like
Make it so!
Done :) Please see the latest patch.
Looks good, better than the previous PR. I have a question though. Why does
"""Equivalent to calling get_event_loop_policy().get_event_loop().""" | ||
"""Return an asyncio event loop. | ||
|
||
When called from coroutines, this function will always return the |
I'd say "When called from a coroutine or a callback (e.g. scheduled with `call_soon` or a similar API), this function ..."
The more I think about it the more I agree. This would be one scenario where we have a chance to detect accidental blocking inside a coroutine or callback. @1st1 what do you think?
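The behaviour being documented can be demonstrated with a short self-contained example (my sketch, using only the public API): with this change, calling `get_event_loop()` from inside a coroutine returns the loop that is actually running it, regardless of which loop the policy has stored.

```python
import asyncio

async def which_loop():
    # Inside a running coroutine, get_event_loop() returns the
    # currently running loop rather than the policy's loop.
    return asyncio.get_event_loop()

loop = asyncio.new_event_loop()
try:
    inner = loop.run_until_complete(which_loop())
finally:
    loop.close()
```

Here `inner is loop` holds even though `loop` was never installed via `set_event_loop()`, which is exactly the new behaviour this PR introduces.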
Agree. Stack Overflow has several examples where authors try to hide a high-level API under
(off-topic) We should have examples of a better recommended practice. IMO such high-level APIs should fork a background thread to run the event loop and use concurrent.futures.Future instances to schedule calls and be able to wait for them.
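The practice described above can be sketched as follows. `LoopThread` is an illustrative name of my own, not an asyncio API; the key building block, `asyncio.run_coroutine_threadsafe`, does return a `concurrent.futures.Future` that synchronous callers can wait on.

```python
import asyncio
import threading

class LoopThread:
    """Run an event loop in a background thread; schedule work from
    synchronous code via concurrent.futures.Future objects."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def submit(self, coro):
        # Returns a concurrent.futures.Future tied to the coroutine.
        return asyncio.run_coroutine_threadsafe(coro, self.loop)

    def close(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()

async def add(a, b):
    return a + b
```

A caller then does `LoopThread().submit(add(1, 2)).result()` from plain synchronous code, without ever touching the event loop directly.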
I have an asyncio executor example that does pretty much that. I don't think it would fit in the official documentation, but maybe on asyncio-doc?
Yeah, let's do it. I'll update the PR.
Updated.
```diff
@@ -394,13 +394,15 @@ def run_forever(self):
         self._check_closed()
         if self.is_running():
             raise RuntimeError('Event loop is running.')
```
Would it be easy to change this error to "This event loop is already running." ?
> Would it be easy to change this error to "This event loop is already running." ?

Done. Now it's clearer.
LGTM! Go ahead and merge.
Also:
* Raise when a user attempts to run a loop within another loop;
* Clarify error message when the loop is already running.
Thank you for the reviews @gvanrossum @vxgmichel @asvetlov!
PS. Why did the final squashed commit claim failing tests? I didn't see a Travis CI link.
Short answer: race between Travis & GH (at least that's what it looks like). Long answer: I squashed everything and force pushed it to the PR. When I merged it to the
The master looks green: https://travis-ci.org/python/asyncio/builds/173340363
Oh, your workflow is so different from mine... I use the web squash flow.

--Guido (mobile)
this broke being able to use asyncio from forked processes created from a run loop :( testcase:

```python
import asyncio
import multiprocessing

def sub_proc():
    async def doit():
        print("hello")
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(doit())

async def main():
    proc = multiprocessing.Process(target=sub_proc)
    proc.start()
    proc.join()

if __name__ == '__main__':
    _loop = asyncio.get_event_loop()
    _loop.run_until_complete(main())
```

results in:
this works in Python 3.5.2 but fails in 3.5.3. I'm guessing this exacerbated https://bugs.python.org/issue22087
Forking from a running event loop triggers many undefined behaviours. Just don't do it.
this is a really unfortunate response, as this is a common workflow for async unittests (https://github.com/Martiusweb/asynctest) and would require ALL workers to be created before your first async method, which for things like pools cannot be guaranteed. Further, it would break scenarios where you have sub-processes re-spawned after serving x number of requests. I've been using this workflow for over a year in a production environment without issue.

Unless the point of this PR was to break this workflow, I think it should be addressed. What this PR is saying is that async processes cannot launch async sub-processes, which was never previously stated or enforced. Further, in general, async forking support has been mothballed for more than a year and really needs to be addressed per the issue I referenced above.
If one package does something in some particular way, it doesn't mean that it's a correct way. It means that we need to fix that one package.
No, it doesn't say that. It just happened to expose a bug (forking from a running event loop was never officially supported). Forking with a running event loop is discouraged in virtually all async frameworks. There are dozens of side effects that can't be reliably resolved in a cross-platform case. Alternative asyncio event loop implementations such as uvloop will simply segfault if you do this.

There is a way to fork reliably, it just requires some boilerplate code. The key is to fork with the loop being paused. For example (untested):

```python
import asyncio
import os

fork_cb = None

def schedule_fork(new_proc):
    global fork_cb
    fork_cb = new_proc
    asyncio.get_event_loop().stop()

async def coro():
    # some asyncio code...
    # some asyncio code...
    # some asyncio code...

    def new_proc():
        print('forked process', os.getpid())

    print('forking from', os.getpid())
    schedule_fork(new_proc)

loop = asyncio.get_event_loop()
loop.create_task(coro())
# Create more tasks...

while True:
    loop.run_forever()
    if fork_cb is not None:
        cb, fork_cb = fork_cb, None
        if not os.fork():
            try:
                cb()
            finally:
                os._exit(0)
    else:
        break
```

A workaround would be to store PID in the threadlocal object that
I think we actually can add built-in support for forking to asyncio: a coroutine

```python
async def some_coro():
    await asyncio.fork(code_for_child_process)
```

I'll experiment with the idea and maybe propose this in 3.7.
@Martiusweb Would you be able to fix forking in asynctest by using the approach outlined in the above ^^ comment?
FYI asynctest doesn't directly fork, it's just users of asynctest like myself who happen to fork (launch an aiohttp server in a sub-process and then run tests).

IMHO the workaround you mentioned is not robust, in that one may not control the main run loop (like when running an aiohttp server), and I also worry about the heavy hand of stopping the event loop for complicated applications which have many tasks blocked waiting for input, timers, and tasks waiting on various events. In my example I run an aiohttp server that dynamically launches processes from a worker pool to process tens of thousands of S3 requests. IMO if uvloop crashes, that would be a problem in its run-loop impl and should be addressed separately, not affect the BaseEventLoop impl.

You also say forking from a run loop was never officially supported, but by the same argument it was never officially not-supported either, from what I gather from the docs (https://docs.python.org/3/library/asyncio-eventloop.html), and it worked. You say there are many problems with this workflow, but there's been a proposed patch for over a year that has sat idle. IMHO this is being punted for "being too hard" without justification, given how easily it was worked around in the past.

I would imagine the easiest solution is to have forked processes NOT inherit the event loop from the parent... anything else I can see being uber complicated. I realize I have not dug into the innards of the BaseEventLoop impl, but I'm trying to fight for what I think is a very useful workflow that should not be trivially waved off as unimportant. The argument given against this, to me, feels like a reason for people to version lock to minor releases. It changes something from being trivial to use to very complicated and cumbersome without a very icky hack for sub-processes I had to come up with:
FYI another example that just uses asyncio to fork:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def sub_proc():
    async def doit():
        print("hello")
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(doit())

async def main():
    ppool = ProcessPoolExecutor()
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(ppool, sub_proc)

if __name__ == '__main__':
    _loop = asyncio.get_event_loop()
    _loop.run_until_complete(main())
```

I think the justification needs to be made whether Python's base modules are fork-safe or not. If asyncio is not fork-safe, this seems to go against the other Python base modules. What work is required to be able to subclass the BaseEventLoop to make it fork-safe?
Forking from within a running HTTP server coroutine is a very bad idea. It's a blocking operation that might take a non-trivial amount of time. You'd better pre-fork, or have a master process that controls its child processes and creates new ones when needed. You can also use
The event loop can safely be stopped and resumed, no matter how many timers/tasks you have. If you care about timeouts being triggered because you pause the loop to fork -- the same will happen when you use
True (and that segfault will soon be fixed). But there are problems even with forking pure-Python asyncio programs. For instance,

One solution is to use the
You can also use

Calling bare

gevent, for instance, monkey patches
Strictly speaking any network application (blocking or non-blocking) is not
True, this is something we will hopefully fix very soon.
thanks for the insight, but I think our streams aren't quite crossing yet :) I never stated the workflows involved directly forking; I gave two examples that this PR broke without using fork directly. Also you can create a sub-process without blocking the parent process by using
Right. I tried your script from #452 (comment) and indeed what should work doesn't work! Seems like it's a bug after all, I'll open a PR.
yay =) ya it's the same problem as the first example.
See PR python#452 for context.
As discussed in [1] and [2], this PR makes `asyncio.get_event_loop()` always return the currently running event loop when it is called from a coroutine.

The patch adds two new functions: `asyncio._set_current_loop()` and `asyncio._get_current_loop()`. Both functions are thread-specific and are considered low-level APIs, intended to be used by third-party loop implementations only. Even though they have a leading underscore, they will be documented as public asyncio API.

`loop.run_forever()` is modified to call `asyncio._set_current_loop()` and `asyncio._get_current_loop()`.

`asyncio.get_event_loop()` is modified to check if there is a current loop set by calling `asyncio._get_current_loop()`. If there is no current event loop, it returns the result of `get_event_loop_policy().get_event_loop()` (old behaviour).

The biggest part of the patch is fixing unit tests: all test classes call `super().setUp()`. `test_utils.TestCase.setUp` patches `asyncio._get_current_loop()` to always return `None`. This change is to make sure that the loop is still being passed explicitly within asyncio.

[1] PR: #355
[2] https://groups.google.com/d/msg/python-tulip/yF9C-rFpiKk/tk5oA3GLHAAJ