This repository was archived by the owner on Nov 23, 2017. It is now read-only.

Ensure get_event_loop returns the running loop when called in a coroutine #355

Closed
wants to merge 18 commits into from


@vxgmichel vxgmichel commented May 25, 2016

This PR follows up on the issue 26969.

It is meant to ensure that asyncio.get_event_loop returns the running loop when called in a coroutine (or in a loop callback). This could simplify every coroutine that takes an optional loop argument, since the argument could be removed.

This is done by modifying the interface of AbstractEventLoopPolicy:

  • get/set_event_loop is renamed to get/set_default_loop
  • get/set_running_loop is added
  • get_event_loop now uses the running loop if available, and the default loop otherwise

BaseEventLoopPolicy is updated, and adds a few features:

  • it does not allow setting a running loop if another one is already set
  • it issues warnings if the running loop doesn't correspond to the default loop

A context manager is also added to AbstractEventLoopPolicy and AbstractEventLoop to set and clear the running loop in a safe manner. This might help for other event loop implementations.
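
The lookup order described above can be sketched in a few lines. This is a minimal, self-contained illustration rather than the code in this PR: `SketchPolicy` and its placeholder string "loops" are hypothetical, and only the method names (get/set_default_loop, get/set_running_loop, get_event_loop) come from the description.

```python
class SketchPolicy:
    """Hypothetical policy that tracks a default loop and a running loop."""

    def __init__(self):
        self._default_loop = None
        self._running_loop = None

    def get_default_loop(self):
        if self._default_loop is None:
            raise RuntimeError("no default event loop is set")
        return self._default_loop

    def set_default_loop(self, loop):
        self._default_loop = loop

    def get_running_loop(self):
        return self._running_loop

    def set_running_loop(self, loop):
        # Refuse to replace a running loop; passing None clears it.
        if loop is not None and self._running_loop is not None:
            raise RuntimeError("another loop is already running")
        self._running_loop = loop

    def get_event_loop(self):
        # Prefer the running loop, fall back to the default loop.
        running = self.get_running_loop()
        if running is not None:
            return running
        return self.get_default_loop()


policy = SketchPolicy()
policy.set_default_loop("default")
before = policy.get_event_loop()   # no loop is running yet
policy.set_running_loop("running")
during = policy.get_event_loop()   # the running loop takes precedence
policy.set_running_loop(None)
after = policy.get_event_loop()    # back to the default loop
```

In a real loop implementation, the set/clear pair would presumably wrap run_forever, which is what the context manager mentioned above is for.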

Vincent Michel added 4 commits May 25, 2016 18:11
@@ -507,28 +509,52 @@ def get_debug(self):
    def set_debug(self, enabled):
        raise NotImplementedError

    # Running context

    def _running_context(self):
Member

Don't think we need this as a separate method, this is something that users shouldn't ever call directly (even though it's a "private" name they will).

Author

Right, it is probably safer!

@1st1
Member

1st1 commented Jun 3, 2016

@vxgmichel Thanks for working on this! There is a chance we can merge this before 3.5.2. Please update the PR.

@vxgmichel
Author

@1st1 I just removed the context manager and the warnings. Should I also revert the simplification for SleepTests and TimeoutTests?

    # Non-abstract methods

    def get_event_loop(self):
        """Return the running event loop if any, and the default event
Member

The first docstring line should be one-line sentence under 72 characters long (see https://www.python.org/dev/peps/pep-0257/#multi-line-docstrings). Please fix all docstrings that this PR adds/modifies.

@1st1
Member

1st1 commented Jun 3, 2016

@1st1 I just removed the context manager and the warnings. Should I also revert the simplification for SleepTests and TimeoutTests?

Yes, absolutely. That code is there for a reason.

I also don't think we need set_default_loop and get_default_loop methods. Again, good ideas, but not for 3.5.2.

@1st1
Member

1st1 commented Jun 3, 2016

@1st1 I just removed the context manager and the warnings. Should I also revert the simplification for SleepTests and TimeoutTests?

Yes, absolutely. That code is there for a reason.

Just realized that with get_event_loop using get_running_loop, that code does indeed become meaningless. The problem is that I still want the unittests to actually check that the event loop is passed explicitly everywhere in asyncio...

@1st1
Member

1st1 commented Jun 3, 2016

I think we need to add a new policy specifically for unittests (the majority of them), that would simply raise an exception in get_event_loop and get_running_loop.

@vxgmichel
Author

vxgmichel commented Jun 3, 2016

Yes, absolutely. That code is there for a reason.

It seems like it used to be there for a reason, but not anymore since TestCase.tearDown basically does the same thing. Anyway, it has nothing to do with this PR (the modified tests pass on the master branch as well), so I'll just revert it.

@1st1
Member

1st1 commented Jun 3, 2016

Yes, absolutely. That code is there for a reason.

It seems like it used to be there for a reason, but not anymore since TestCase.tearDown basically does the same thing. Anyway, it has nothing to do with this PR (the modified tests pass on the master branch as well), so I'll just revert it.

TestCase.setUp sets the current loop to None, so that get_event_loop fails with a RuntimeError if the loop isn't passed explicitly somewhere in the codebase.

TestCase.tearDown explicitly closes the event loop that was used in the tests.

Both should stay the same way.

Now, the way the updated get_event_loop works, is that it won't fail if set_event_loop(None) is called. We need a new event policy specifically for tests, that has to be a part of this PR.

@1st1
Member

1st1 commented Jun 3, 2016

In addition to the above: tests on Travis spit out this:

sys:1: ResourceWarning: gc: 12 uncollectable objects at shutdown; 
use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them

@vxgmichel
Author

vxgmichel commented Jun 3, 2016

TestCase.setUp sets the current loop to None, so that get_event_loop fails with a RuntimeError if the loop isn't passed explicitly somewhere in the codebase.

OK I finally got it, we want to make sure the optional loop argument is never silently ignored. It makes sense, but it might be a problem in the long run. For instance, coroutines such as asyncio.sleep will probably drop their loop argument at some point:

@coroutine
def sleep(delay, result=None):
    loop = get_event_loop()
    [...]

So get_event_loop() is expected to return the running loop here, even during the tests. However, the RuntimeError is still very useful for objects that are not coroutines, such as locks or queues. One way to deal with this issue is to instantiate such objects outside the running loop to perform this specific test. It turns out it is already the case for some of them (maybe all of them?), see the locks for instance:

    def test_ctor_loop(self):
        loop = mock.Mock()
        lock = asyncio.Lock(loop=loop)
        self.assertIs(lock._loop, loop)

        lock = asyncio.Lock(loop=self.loop)
        self.assertIs(lock._loop, self.loop)

    def test_ctor_noloop(self):
        asyncio.set_event_loop(self.loop)
        lock = asyncio.Lock()
        self.assertIs(lock._loop, self.loop)

Is that a workable solution or am I missing something?

@1st1
Member

1st1 commented Jun 3, 2016

For instance, coroutines such as asyncio.sleep will probably drop their loop argument at some point:

That's debatable and out of the scope of this PR ;)

Is that a workable solution or am I missing something?

The correct solution is to create a new event loop policy for tests and update the tests to use it:

class Policy:
    def get_event_loop(self):
        raise RuntimeError

@vxgmichel
Author

That's debatable and out of the scope of this PR ;)

All right!

The correct solution is to create a new event loop policy for tests and update the tests to use it

It turns out that it's quite painful to do because of cyclic imports (a TestEventLoopPolicy would inherit from DefaultEventLoopPolicy, which is imported by __init__.py from either unix_events or windows_events). I came up with another solution, dirty but well contained:

 class TestCase(unittest.TestCase):
     def set_event_loop(self, loop, *, cleanup=True):
         assert loop is not None
+        # patch policy so get_event_loop doesn't return the running loop
+        policy = events.get_event_loop_policy()
+        policy.get_event_loop = policy.get_default_loop
+        def clean_policy():
+            try:
+                del policy.get_event_loop
+            except AttributeError:
+                pass
+        self.addCleanup(clean_policy)
         # ensure that the event loop is passed explicitly in asyncio
         events.set_event_loop(None)
         if cleanup:
             self.addCleanup(loop.close)

Is it acceptable? Can you think of a better solution?

@1st1
Member

1st1 commented Jun 3, 2016

I'd add a function like this to test_utils (untested code):

def disable_get_event_loop(test_case):
    assert isinstance(test_case, unittest.TestCase)

    policy = events.get_event_loop_policy()
    if hasattr(policy, '_patched_get_event_loop'):
        return

    def get_event_loop():
        raise RuntimeError(
            'asyncio.get_event_loop() is disabled in asyncio tests')

    def reset_event_loop_method():
        policy.get_event_loop = old_get_event_loop
        del policy._patched_get_event_loop

    old_get_event_loop = policy.get_event_loop
    policy.get_event_loop = get_event_loop
    policy._patched_get_event_loop = True

    test_case.addCleanup(reset_event_loop_method)

And then I'd add a call to it to test_utils.TestCase.new_test_loop, test_utils.TestCase.set_test_loop, and all other test cases in asyncio tests that aren't inherited from test_utils.TestCase.
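
A self-contained variant of this idea, for illustration only. `FakePolicy` is a hypothetical stand-in for the real policy object so that the sketch runs without touching asyncio, and `ExampleTest` and `run_example` are made-up names; the patch is undone through `addCleanup`, as in the snippet above.

```python
import unittest


class FakePolicy:
    """Hypothetical stand-in for the object returned by get_event_loop_policy()."""

    def get_event_loop(self):
        return "default-loop"


POLICY = FakePolicy()


def disable_get_event_loop(test_case, policy=POLICY):
    """Make policy.get_event_loop() raise until the test's cleanups run."""
    assert isinstance(test_case, unittest.TestCase)
    if "get_event_loop" in vars(policy):
        return  # already patched on this instance

    def get_event_loop():
        raise RuntimeError("get_event_loop() is disabled in this test")

    def restore():
        # Deleting the instance attribute re-exposes the class method.
        del policy.get_event_loop

    policy.get_event_loop = get_event_loop
    test_case.addCleanup(restore)


class ExampleTest(unittest.TestCase):
    def setUp(self):
        disable_get_event_loop(self)

    def test_implicit_lookup_fails(self):
        # Any code relying on the implicit loop would hit this error.
        with self.assertRaises(RuntimeError):
            POLICY.get_event_loop()


def run_example():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Patching the instance rather than the class keeps the change local to one policy object, and the cleanup restores the original behavior even if the test fails.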

Also, in one of the earlier messages, I asked you to remove get_default_loop and set_default_loop from this PR.

@vxgmichel
Author

vxgmichel commented Jun 3, 2016

I just removed get/set_default_loop, and added the policy patch. I changed your example though: I patched get_running_loop instead of get_event_loop so that the behavior of get_event_loop is the same as before this PR.

@gvanrossum
Member

There's no way I can follow all the conversation that happened here. Why are we now collapsing get_event_loop() and get_running_loop() into one function? This seems to defeat the purpose of verifying in the unittests that there are no implicit dependencies on get_event_loop(). Should we just change all code that needs an event loop to assume get_event_loop() never raises? (E.g. sleep().)

It's possible there's a good reason, but it better be explained well, not hidden in a PR. I think if we're really changing this we should have a more public debate on the tulip list. There are a lot of words devoted to this topic in PEP 3156, and I'd hate for the PEP to become out of date like this.

@vxgmichel
Author

vxgmichel commented Jun 5, 2016

@gvanrossum

Why are we now collapsing get_event_loop() and get_running_loop() into one function?

Let's assume get_event_loop and get_running_loop are differentiated. Now, let's write a coroutine:

async def coro():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue(loop=loop)
    [...]

This is good, because coro doesn't need an optional loop=None argument, and loop is guaranteed to be the loop in which coro() is running. However, we have to pass the loop to queue explicitly, because loop might be different from get_event_loop(). It'd be much nicer to write:

async def coro():
    queue = asyncio.Queue()
    [...]

and rely on the fact that using loop=None (i.e. get_event_loop()) ensures that queue uses the one running loop.
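
For reference, the shorter form can be tried out as follows. This is a minimal sketch, assuming Python 3.10 or later (where asyncio.Queue takes no loop argument) and using asyncio.get_running_loop(), which was added to asyncio after this discussion; `main` is just a hypothetical driver.

```python
import asyncio


async def coro():
    # No loop argument anywhere: the queue is used from the running loop.
    queue = asyncio.Queue()
    await queue.put("item")
    item = await queue.get()
    return item, asyncio.get_running_loop()


def main():
    loop = asyncio.new_event_loop()
    try:
        item, seen_loop = loop.run_until_complete(coro())
    finally:
        loop.close()
    # Report the item and whether the coroutine saw the loop that ran it.
    return item, seen_loop is loop
```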

There are a lot of words devoted to this topic in PEP 3156, and I'd hate for the PEP to become out of date like this.

It does change a few things in PEP 3156. What is said in Passing an Event Loop Around Explicitly still holds, but only outside the running loop. Inside, loop=None and get_event_loop simply use the running loop.

I think if we're really changing this we should have a more public debate on the tulip list.

It is indeed a big change that needs to be discussed, but I think the idea is worth considering. Here's the list of the coroutines in asyncio that could benefit from this:

  • sleep
  • wait
  • wait_for
  • create_subprocess_shell
  • create_subprocess_exec
  • open_connection
  • start_server
  • open_unix_connection
  • start_unix_server

        self._set_coroutine_wrapper(self._debug)
        self._thread_id = threading.get_ident()
        try:
            policy.set_running_loop(self)
Member

This has to be outside of the try statement.

Author

If we do that, this test (not committed yet) would fail:

    def test_run_loop_inside_loop(self):

        @asyncio.coroutine
        def coro():
            loop2 = asyncio.new_event_loop()
            self.assertRaises(RuntimeError, loop2.run_forever)
            self.assertFalse(loop2.is_running())  # AssertionError: True is not false
            loop2.close()

        loop = asyncio.new_event_loop()
        loop.run_until_complete(coro())
        loop.close()

Member

I'm beginning to think that this is all one big misunderstanding. I never
meant get_event_loop() to return a loop that's different from the running
loop. If you have multiple loops associated with the same thread you need
to implement a new EventloopPolicy that keeps track of which one is running
(e.g. via an explicit stack, or perhaps an implicit one using a local
variable to save the event loop and restoring from that in a finally
clause).

The only thing I meant to be possible is for get_event_loop() to raise an
exception, under special circumstances (especially unit tests that are
checking that no code accidentally relies on the default loop).

For library code I think there are two cases it should support:

  • No default event loop, loop must always be passed in
  • Default event loop == current event loop, used when no loop is passed in

For application code I think it is totally fine to assume that
get_event_loop() returns the current, running loop. An application written
with this assumption never needs to pass a loop to something it calls
(since library code should always default to using get_event_loop()).

Note all the words devoted in the PEP to "context".

In any case I think we should stop coding and start discussing the use case
you are trying to address on the tulip list, to see whether it's real or
whether you are worrying too much (or whether maybe you should just write a
custom EventloopPolicy for your use case).
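
For illustration, here is a rough sketch of the "explicit stack" idea mentioned in this comment. `StackPolicy`, `running()` and `demo()` are hypothetical names, the loops are plain placeholder strings, and none of this is an existing asyncio API.

```python
import contextlib


class StackPolicy:
    """Hypothetical policy that tracks nested running loops on a stack."""

    def __init__(self):
        self._default = None
        self._stack = []

    def set_event_loop(self, loop):
        self._default = loop

    def get_event_loop(self):
        # The innermost running loop wins over the default loop.
        if self._stack:
            return self._stack[-1]
        if self._default is None:
            raise RuntimeError("no event loop is set")
        return self._default

    @contextlib.contextmanager
    def running(self, loop):
        # Push on entry, pop in a finally clause so errors cannot leak state.
        self._stack.append(loop)
        try:
            yield loop
        finally:
            self._stack.pop()


def demo():
    policy = StackPolicy()
    policy.set_event_loop("default")
    seen = [policy.get_event_loop()]
    with policy.running("outer"):
        seen.append(policy.get_event_loop())
        with policy.running("inner"):
            seen.append(policy.get_event_loop())
        seen.append(policy.get_event_loop())
    seen.append(policy.get_event_loop())
    return seen
```

A loop implementation would have to enter `running()` around its own run_forever for this to work, which is the kind of notification the policy API does not currently receive.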

Author

@vxgmichel vxgmichel Jun 6, 2016

@gvanrossum
I actually don't have a use case. I simply noticed that asyncio, third-party libraries and users tend to pass the running loop explicitly to their coroutines, because "explicit is better than implicit". That, in my opinion, is unnecessary and could be simplified.

Sadly, such a simplification conflicts with the PEP and with the way asyncio unit testing currently works. I understand that those constraints make it hard, and maybe impossible, to implement. I originally meant this PR as a proposal and a starting point towards this simplification.

However, this idea didn't seem to get a lot of support and I don't feel like trying to push it forward anymore. Maybe more people will come up with the same rationale later on, and I'll be happy to help at that time.

Member

@gvanrossum

I'm beginning to think that this is all one big misunderstanding. I never
meant get_event_loop() to return a loop that's different from the running
loop. If you have multiple loops associated with the same thread you need
to implement a new EventloopPolicy that keeps track of which one is running
(e.g. via an explicit stack, or perhaps an implicit one using a local
variable to save the event loop and restoring from that in a finally
clause).

But here's the problem: if you have multiple loops associated with the same thread, policy doesn't know which one is currently running.

That's why I like the idea of adding policy.set_running_loop and policy.get_running_loop functions, because it makes policies more capable and allows you to worry less about multiple running loops even with the default policy.

Member

if you have multiple loops associated with the same thread, policy doesn't know which one is currently running.

Then write a new policy. The whole point of having policies is that you can write different ones. The default policy does not really cater to this use case, but a policy is a really simple object.

Member

Without set_running_loop and get_running_loop how will the policy know precisely what's going on? All I'm saying is that get_event_loop and set_event_loop are kind of detached from the loop's lifecycle; policy doesn't receive any notification if a loop was stopped or run again.


@ambv

ambv commented Jun 6, 2016

Correct me if I'm wrong, but it seems to me the reason for this entire debate is as simple as:

Having to pass loop= around everywhere is brittle boilerplate.

We see this pattern emerging internally at Facebook because:

  • we want to control the event loop during unit tests
  • we sometimes run asyncio threads in a bigger threaded application

Sadly, as I said, having to pass the loop around everywhere is tiringly verbose and brittle (all it takes is for me to forget to do it once and I'm back on the default loop again). The current state of things reminds me of logging frameworks that require you to pass the logger around everywhere. Correct but brittle and inconvenient.

get_running_loop() is a possible way out of this. But so is Ben's "what is the current coroutine runner" proposal on async-sig, which I think is more general. I don't think there's a solid proposal for this yet but I imagine it would return an instance of asyncio.Task for asyncio, which is tied to the current running event loop.

@1st1
Member

1st1 commented Jun 6, 2016

get_running_loop() is a possible way out of this. But so is Ben's "what is the current coroutine runner" proposal on async-sig, which I think is more general.

I'm not so sure about the "what is the current coroutine runner" proposal. It seems that Ben wants the user to call it and depending on the result decide what to use -- tornado.gen.multi or asyncio.gather.

I don't think there's a solid proposal for this yet but I imagine it would return an instance of asyncio.Task for asyncio, which is tied to the current running event loop.

If the task is created manually, asyncio.Task may get its loop reference through get_event_loop too.

@ambv

ambv commented Jun 6, 2016

I'm not worried about asyncio.Task instances created by hand. If you use the "low-level" approach, you're responsible for it. I'm a little worried about code like:

async def some_coro():
    asyncio.ensure_future(other_coro())
    ...

It'd be helpful if ensure_future defaulted to the currently running loop of some_coro() and not to get_event_loop(). However, I understand that by Guido's design, the two were never meant to be different. Maybe we should revisit if this is actually enough?

For unit tests, for example, I think it would be enough if setUp() called set_event_loop() every time and tearDown() ensured that the event loop has nothing left pending on it. For threaded applications, I can see having a custom loop policy that creates the event loop just-in-time and warns, effectively removing the possibility of get_event_loop() returning None.

@ambv

ambv commented Jun 6, 2016

One case where we found the current get_event_loop() behavior for threading to be tricky is when you have secondary threads that schedule things on the main event loop using call_soon_threadsafe(). But we understood since that we STILL want get_event_loop() to return None in this case to ensure the user is not scheduling things on a loop that will never get run. Instead, we should only pass the bound method loop.call_soon_threadsafe to the secondary threads, which is both cleaner (states intent better) and makes it harder to do the wrong thing.

So... I'm warming up to the idea that we should stop worrying (and passing loop= around) and just embrace get_event_loop().
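
A small sketch of that approach, where the worker thread is handed only the bound call_soon_threadsafe method and never sees the loop itself. The names `worker`, `on_result` and `main` are made up for the example.

```python
import asyncio
import threading


def worker(schedule, callback):
    # The thread can only schedule work on the loop; it cannot run the
    # loop or look it up, so it cannot end up on the wrong one.
    schedule(callback, "done")


def main():
    loop = asyncio.new_event_loop()
    results = []

    def on_result(value):
        # Runs in the loop's thread, so touching `results` here is safe.
        results.append(value)
        loop.stop()

    thread = threading.Thread(
        target=worker, args=(loop.call_soon_threadsafe, on_result))
    try:
        thread.start()
        loop.run_forever()  # returns once on_result has called stop()
        thread.join()
    finally:
        loop.close()
    return results
```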

@gvanrossum
Member

On Mon, Jun 6, 2016 at 12:21 PM, Łukasz Langa notifications@github.com
wrote:

One case where we found the current get_event_loop() behavior for
threading to be tricky is when you have secondary threads that schedule
things on the main event loop using call_soon_threadsafe(). But we
understood since that we STILL want get_event_loop() to return None in
this case to ensure the user is not scheduling things on a loop that will
never get run. Instead, we should only pass the bound method
loop.call_soon_threadsafe to the secondary threads, which is both cleaner
(states intent better) and makes it harder to do the wrong thing.

So... I'm warming up to the idea that we should stop worrying (and passing
loop= around) and just embrace get_event_loop().

Yes!

Or write your own event loop policy (easier than you think, it's a really
simple API).

--Guido van Rossum (python.org/~guido)

@vxgmichel
Author

vxgmichel commented Jun 9, 2016

I never meant get_event_loop() to return a loop that's different from the running
loop. [...] The policy should also be in charge of deciding which loop runs [...]

This is something I did not realize about get_event_loop and it does make a lot of sense. In fact, it makes the whole get_running_loop idea unnecessary. Maybe those two statements could be made more obvious by comparing the loop against the policy before running it:

    def run_forever(self):
        try:
            if asyncio.get_event_loop() != self:
                raise RuntimeError('Loop not approved by the policy')
        except asyncio.NoLoopSetError:
            # Now running in 'explicit' mode, used for unit-testing
            pass
        else:
            # The loop has been approved by the policy
            pass
        [...]

The following example lists the different use cases:

# Typical use case
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.sleep(1)) # OK!

# Special use case
loop = CustomLoop()
asyncio.set_event_loop(loop)
loop.run_until_complete(asyncio.sleep(1)) # OK!

# Unit-testing
loop = asyncio.test_utils.TestLoop()
asyncio.set_event_loop(None)
loop.run_until_complete(asyncio.sleep(1, loop=loop)) # OK!

# Invalid use case
loop = asyncio.new_event_loop()
loop.run_until_complete(asyncio.sleep(1, loop=loop)) # Raise!

@gvanrossum
Member

gvanrossum commented Jun 9, 2016 via email

@vxgmichel
Author

@1st1 I guess I can close this PR right?

By the way, a somewhat related question has been asked on Stack Overflow.

@1st1
Member

1st1 commented Jun 13, 2016

@vxgmichel Yes, let's close this PR for now. We should indeed focus on the docs; want to work on that? ;)

@1st1 1st1 closed this Jun 13, 2016
@vxgmichel
Author

vxgmichel commented Jun 13, 2016

@1st1 I'd be happy to help, but I don't really know what the best approach is. For instance, how do we address this comment:

I'm wondering though, since explicit is better than implicit, should explicitly passing the event loop be the preferred style?

@1st1
Member

1st1 commented Nov 3, 2016

@gvanrossum Guido, are you OK if I merge this PR? (as per https://groups.google.com/d/msg/python-tulip/yF9C-rFpiKk/tk5oA3GLHAAJ)

@1st1 1st1 reopened this Nov 3, 2016
@gvanrossum
Member

If you're asking do I want this feature? Yes. If you're asking me is this PR perfect? I have no idea, but I trust your review. (I don't want to have to re-understand the long discussion we had here earlier.)

@1st1
Member

1st1 commented Nov 3, 2016

If you're asking me is this PR perfect? I have no idea, but I trust your review.

Thank you. I'll spend some time testing the patch (with uvloop tests too) and will merge it soon.

@vxgmichel
Author

@1st1
Before you decide to merge this PR, I have to say I don't really like it anymore.

A few months ago, @gvanrossum wrote:

I never meant get_event_loop() to return a loop that's different from the running
loop. [...] The policy should also be in charge of deciding which loop runs [...]

This made me realize the whole get_running_loop idea is unnecessary. Instead of modifying the policy interface, we could simply compare the loop against the policy before running it:

    def run_forever(self):
        try:
            if asyncio.get_event_loop() != self:
                raise RuntimeError('Loop not approved by the policy')
        except asyncio.NoLoopSetError:
            # Now running in 'explicit' mode, used for unit-testing
            pass
        else:
            # The loop has been approved by the policy
            pass

Then we can advertise in the docs that get_event_loop() is safe to use inside coroutines.

Would that work, or am I missing something?

@asvetlov

asvetlov commented Nov 3, 2016

People may write

loop = asyncio.new_event_loop()
loop.run_forever()

without changing the default loop, or even after disabling it with an asyncio.set_event_loop(None) call.
At least, I do this constantly in unit tests.

@1st1
Member

1st1 commented Nov 3, 2016

@vxgmichel Please read the thread: https://groups.google.com/d/msg/python-tulip/yF9C-rFpiKk/tk5oA3GLHAAJ.

The point is to make get_event_loop() always return the correct loop when it is called from a coroutine. I'm working on an overhaul of the asyncio documentation right now, to deemphasize the importance of the event loop in asyncio programs: don't pass it around, design APIs that don't even accept it, etc.

So we want to modify one single aspect of get_event_loop: when called from a coroutine, it always returns the current event loop and never raises an exception. The call must always succeed.

The point of modifying get_event_loop is to make all current asyncio code benefit from the change.

Since asyncio.get_event_loop is ultimately controlled by the policy, there should be a mechanism for event loops to notify the policy of what loop is currently running. To me, the most obvious way is to add policy.set_running_loop and policy.get_running_loop. These must only be used by loop implementations and will be documented as such.

I like the design of the current PR.
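
The requirement stated above can be checked with a short script. This is only a sketch, and it assumes a CPython release in which get_event_loop() already behaves as requested here (current releases do); `which_loop` and `main` are hypothetical names. Note that no default loop is ever set.

```python
import asyncio


async def which_loop():
    # Called from a coroutine: expected to return the loop running it,
    # even though set_event_loop() was never called.
    return asyncio.get_event_loop()


def main():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(which_loop()) is loop
    finally:
        loop.close()
```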

@gvanrossum
Member

gvanrossum commented Nov 3, 2016 via email

@1st1
Member

1st1 commented Nov 3, 2016

There could be a different way to implement this, making the "running" loop
a thread-local that is returned by the global get_event_loop() function in
preference over calling the policy's event loop.

I like this approach more. Does this make sense to you:

  1. We add asyncio.set_current_loop() and asyncio.get_current_loop() (will use a global asyncio TLS object).
  2. We modify asyncio.get_event_loop() to first call asyncio.get_current_loop(). If there's no current loop, we default to policy.get_event_loop().

This way there is no API change for policies, and it's easy to add support for this to third-party loop implementations.
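
The two steps above might look roughly like this. It is a sketch under stated assumptions, not the eventual implementation: `FallbackPolicy` is a hypothetical stand-in for the real policy, the functions are plain module-level helpers rather than asyncio attributes, and loops are placeholder strings.

```python
import threading

_tls = threading.local()  # holds the current loop, one slot per thread


class FallbackPolicy:
    """Hypothetical stand-in for the policy consulted as a fallback."""

    def __init__(self, default=None):
        self.default = default

    def get_event_loop(self):
        if self.default is None:
            raise RuntimeError("no default event loop")
        return self.default


_policy = FallbackPolicy()


def set_current_loop(loop):
    # Meant to be called by loop implementations when they start or stop.
    _tls.loop = loop


def get_current_loop():
    return getattr(_tls, "loop", None)


def get_event_loop():
    # Step 2: prefer the thread's current loop, else ask the policy.
    current = get_current_loop()
    if current is not None:
        return current
    return _policy.get_event_loop()


def demo():
    _policy.default = "default"
    set_current_loop("running")
    here = get_event_loop()
    elsewhere = []
    # Another thread has no current loop, so it falls back to the policy.
    thread = threading.Thread(target=lambda: elsewhere.append(get_event_loop()))
    thread.start()
    thread.join()
    set_current_loop(None)
    return here, elsewhere[0], get_event_loop()
```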

I think I'll close this PR and make a new one implementing this strategy.

@1st1 1st1 closed this Nov 3, 2016
@gvanrossum
Member

gvanrossum commented Nov 3, 2016 via email

@asvetlov

asvetlov commented Nov 3, 2016

I vote for underscores on both the getter and the setter -- this API is really private.

5 participants