-
-
Notifications
You must be signed in to change notification settings - Fork 81
Document tradeoffs #36
Comments
Yeah, I'd second that. This library is very widely referenced across stackoverflow, @erdewit After using lua coroutines, I have to say the asyncio folks really need to rethink things. |
Interesting rebuttal to dynamic coroutines: https://mail.python.org/pipermail/python-dev/2015-April/139695.html The issue boils down to the fact that when you allow the event_loop to be accessed wherever, the author feels you start to lose grasp of how execution control is being passed. But I'm not entirely sure the control flow is always obvious with lexical coroutines either. Coroutines, at least for me, enable the benefits of concurrency (mostly with IO, but there are others) while reducing race conditions and increasing performance. Like threads, I never expected to know exactly what code is executing at any particular time. For those that can follow their coroutine code flow perfectly, that's great, but I severely doubt that is the common case among those responsible for doing actual real world work, especially as third party libraries are integrated into larger projects. So, really I think the issue is less about being able step through all the code, and more about expectations of safety, fairness and starvation. This library doesn't use threads, so I don't see how safety could be impacted. However, with coroutine the issue of fairness can be tricky, especially when its usage becomes dynamic like lua. And on the issue of fairness, I don't think this patch quite works. But let me know if my code is making the wrong assumptions about what re-entrant means. Try these two out:
The latter unpatched approach runs as expected. Unfortunately, something very unexpected is going on with the former. Looking at the patch code, run is supposed to use the already created event_loop rather making a new one. But the scheduled tasks are clearly not being executed concurrently. |
That all said, I would really like dynamic / re-entrant coroutines. Without it, supporting asyncio will require significant refactoring of a lot of libraries, and for those trying to just get things done, that seems more important than philosophical discussions on what levels of theoretical code complexity are acceptable. ..which looks like gevent fits the bill. |
The performance impact depends entirely on the situation. It's negligible for what I use it for. With PyPy the nested variant can actually be quite a bit faster than the unnested variant for some reason. Best thing is to measure it for your own code.
I don't know really. This library gets used in ways that I'd never expected and if that leads to any issues then they would supposedly be filed here in the issue tracker. |
@erdewit I think most people are using as a hack to get things to work in jupyter and aren't measuring the fairness / performance. Is there something wrong with the code above? As far as I can tell the hack doesn't work. Which isn't surprising, as I dug into the code asyncio is really not meant to be nested. I also looked at another repo that you were using asyncio in nested manner, and there are bugs, quite possibly because of this contradicting use. I'd suggest recommending that users look at gevent. It is meant to be nested and is similar to the lua approach to coroutines. Also, I wouldn't just go by lack of reports. It's like the sherlock holmes mystery when the dogs didn't bark. |
@blazespinnaker Or in other words, to get cooperative concurrency to work, the coroutines must be cooperative. |
@erdewit Thanks. How would you write the code then? With a custom run routine? I was going to try that, but as I looked into the asyncio code itself I saw too many stumbling blocks. Many which would likely increase in number as asyncio is uprgaded in the stdlib. My sleep above uses run, which I assumed (wrongly I guess) that it would turn control over to the event_loop similar to what happens in lua / gevent. I get what you're trying to do, but I think the forces of asyncio are going to work against this. I'd use gevent which is specifically designed to make this use case work. |
The The coroutines can be made cooperative by using To be clear, this library does not solve Python's colored function problem in the way gevent (or lua) has it solved. |
Well, there are no other tasks only if we're not using the same event_loop where the previous ones were created. I guess there is a new event loop in there somewhere I missed. If so, that can lead to serious starvation. |
No, there's only one event loop. There's also only one queue for the scheduled callbacks. It has time/insert-order priority, so every task gets its turn - no starvation there. |
Hmm, we seem to be going back and forth a bit here. If you look at the code above, create_task will schedule a task to run soon. Before sleep is executed in inner(), there should be about 10 tasks scheduled for execution on the event loop. Is the problem get_event_loop() call in asyncio.get_event_loop().create_task()? If, as you say above: "The sleep does actually let the event loop run, which can then run other cooperative tasks. In your example there are no other cooperative tasks (there's nobody awaiting) so it appears nothing happens." Then the already scheduled tasks should run. I think the confusion here is that you are saying "there are no other cooperative tasks" and I'm not sure what that means, because asyncio.get_event_loop().create_task() should have scheduled 10 on the thread local event loop. And indeed, when you run the code above, it runs the first 10 tasks very immediately on the first sleep but then starvation occurs when the internal nested sleeps in each task is hit, because they are not giving time / returning control to the thread local event loop during their sleeping period. Note that I pasted in the patch code when I posted the bug, you can run it after the import asyncio. |
It's mentioned a few times already, but it means that your coroutines contain no
|
Ah, ok I see the confusion: "My sleep above uses run, which I assumed (wrongly I guess) that it would turn control over to the event_loop similar to what happens in lua / gevent." The sleep does actually let the event loop run, which can then run other cooperative tasks. In your example there are no other cooperative tasks (there's nobody awaiting) so it appears nothing happens. Rather, I think what you meant to say was sleep would have worked if I used await instead of run. The bit about there not being other cooperative tasks threw me as there are tasks scheduled on the event_loop. As I mentioned, I wrongly assumed that the patch library would allow for nested calls to the async library and use the same event loop. And of course I can't use await in a nested call. Using run, however, leads to the starvation I observed above. Hopefully this makes sense to us both now. |
fwiw, this causes starvation as well:
As I mentioned above, I believe that the nested run is not using the same global event loop or at at the very least it's not slicing out time to tasks already scheduled on it. When I say starvation, you just need to imagine any long running blocking IO being called on a nested run. Any tasks scheduled outside the run won't be fairly executed because the nested run doesn't give them any time. |
The problem that I see with this library is your run into things like this: Better would be just checking for event_loop running, and if so use await, otherwise run: |
I think most people are using it upgrade legacy projects, and avoid having to inject async keywords through their entire codebase. The idea that we can:
... and not have to deal with a massive (backwards compatibility break) refactor is very appealing. That Python asyncio is was an exclusive-or proposition was a bad design choice. It forces projects into two camps - those that use it, and use it exclusively, and those that do not. I like the middle-ground here. |
Perhaps, I think the behavior you see is what I would expect, unless I knew sleep was monkey-patched. |
Some people might, but I fail to why'd they want to use the library except in narrow circumstances. Any long running nested async task will deadlock their code. eg - https://github.com/tensorflow/federated/issues/869 |
Yeah -- clarity about what's monkey patched, and what is not seems essential. |
Just like plain asyncio, gevent, lua, or any other type of cooperative concurrency. It seems you're confused with preemptive concurrency (as used in multithreading for example). |
It might be less confusing for all involved if we are both more specific. I am claiming that nested run calls with this library will not execute tasks scheduled before the nested run call. So, if you have async library X which schedules async tasks in its event loop (after calling asyncio.run()) and then calls Library Y, and Library Y calls asyncio.run in a nested manner, it will block any scheduled tasks in library X until it completes in Library Y. Agree or disagree? |
Btw, I am confused by this statement: ...or at at the very least it's not slicing out time to tasks already scheduled on it. Just like plain asyncio, gevent, lua, or any other type of cooperative concurrency. Gevent will absolutely execute the event_loop for tasks that are scheduled on it! This can be done from anywhere in your application, there is no notion of what is an async function. All you have to do is call gevent.sleep() and the event loop and attached tasks will execute. If my phrasing "slicing out time" is what you are being semantic about, please don't :) I only use that term because many of my IO tasks are written in a way that they operate quickly and yield back control, and so only 'slices' of their over all execution lifetime occur before the next task in the event loop is executed - somewhat similar to the example I gave above, where each iteration of the for loop is a slice. Giving each other the benefit of the doubt and focusing on the specific concerns will likely resolve this discussion much faster. I have written event loop execution engines for lua coroutines and I'm fairly familiar with the issues involved here. In particular, starvation and fairness has always been the trickiest and often overlooked part. |
If true, can you show failure case. The previous example you posted was ambiguous. Are you calling time.sleep? if so it will block all cooperative concurrency libraries? |
@stuz5000 sleep(z) is calling asyncio.run(asyncio.sleep(z)) asyncio.sleep() can be seen as a proxy for a long running blocking io task. Normally when the call isn't in a nested run, it will turn control over to the event_loop and run all scheduled async tasks. However, with this patch, the nested run doesn't execute any scheduled tasks that were created before on the 'outer loop'. You can just copy/paste this code into python I think @erdewit understands this, but he has yet to confirm clearly that he does. That said, I don't think he appreciates the full implications of this problem and the subtle bugs that it is causing throughout the python ecosystem. As a hack for jupyter, I suppose it has its uses, but much better would be to recommend users simply call asyncio routines like await instead of run. |
Here's another example - pyppeteer/pyppeteer#99 |
Going to submit a bug on python.org, but I thought it'd make sense to give @erdewit the opportunity to address it here first. Contents as follows: "I'm seeing a high degree of confusion around nested asyncio calls in the ecosystem on both github and stackoverflow, where even large code base maintainers are not providing the correct advice Eg, the error here - https://stackoverflow.com/questions/57639751/how-to-fix-runtime-error-cannot-close-a-running-event-loop-python-discord-bot This has lead to a less optimal approach to addressing the error, nest_asyncio, rather than the correct approach of simply calling async routines via await. The patch, which sort of works in narrow circumstances, has only exacerbated the community confusion and left code bases with subtle and hidden bugs which frequently appear on any library upgrades. Also, few of the patch users seem to appreciate that nest_asyncio blocks the outer event_loop when a nested run is called and is not re-entrant. The patch maintainer is so far unwilling to document and make clear this behavior. Note also that that there are some members in the community that have a misguided desire to undermine asyncio because of a philosophical disagreement with how it was designed. This is unfortunate and advocating a sub optimal monkey patch which reduces the value of asyncio is not the correct way to address these theoretical differences of opinion. I would recommend tweaking the nesting errors in asyncio and then documenting the new version early so search engines will pick it up. Having an early and well placed stackoverflow question would help significantly as well. Ideally the official documentation/answers will explain succinctly and simply the logic and philosophy behind the so called "colored functions" constraint of asyncio to reduce code complexity as well as genuinely discuss alternatives (such as gevent) for those who wish to try a different approach. This would be a mature and transparent solution which would not only educate the community but also allow for the best approach to succeed. Differences of opinions can be healthy when addressed openly and head on rather than via methods that obfuscate their intentions. Listing alternatives provide an important safety valve that reduces heat and increases the light brought to a technical debate." |
If that starves a loop, that does smell like a bug. |
The same example seems to have been re-posted, making this this whole thread a bit circular. There's still only synchronous tasks, meaning with no await in them. It has been mentioned about four times already now to used Now these tasks, while being executed synchronously, will each spin the event loop. This is easily proven by adding a proper asynchronous task, one that actually awaits. Lets add a task called import asyncio
import time
import nest_asyncio
nest_asyncio.apply()
async def coro(x):
for i in range(0, 5):
print(f'coro{x}: {i}')
sleep(0.1)
async def ping():
t0 = time.time()
while True:
await asyncio.sleep(0.1) # look I'm awaiting!
print('PING', time.time() - t0)
def sleep(z):
asyncio.run(asyncio.sleep(z))
asyncio.ensure_future(ping())
for i in range(0, 10):
asyncio.ensure_future(coro(i))
sleep(5) The example is also rewritten a bit for better clarity. It gives the following output:
|
|
Please provide complete examples. Which module is sleep(20) from here?
If time.sleep, it fails as expected.
…On Sun, Nov 1, 2020 at 7:15 AM blazespinnaker ***@***.***> wrote:
Hmm, I had tried something like that but I had got the same results with
my previous code that didn't use await.
import asyncio
import nest_asyncio
nest_asyncio.apply()
async def echo(x):
for i in range(0,5):
print(x,i)
await asyncio.sleep(1)
def testNested():
for i in range(0,5):
asyncio.ensure_future(echo(i))
inner()
def inner():
for i in range(5,10):
asyncio.ensure_future(echo(i))
sleep(20)
def sleep(z):
asyncio.run(asyncio.sleep(z))
testNested()
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#36 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB3QJLC4GZ2W4DNIJSLJOOTSNV3QBANCNFSM4SHIYT2A>
.
|
That's great that works, however you are not nesting asyncio in that case, or at least you're doing it in a cherry picked manner. Sorry, I had tried using 'await' but it didn't work. Starvation still occurs in the way I describe above (several times) It completes in about 50 seconds, which makes sense because there are 50 sleeps of 1 seconds, all executed serially, blocking everything else as they go. For example:
@stuz5000 just copy/paste code into latest python (3.8?) an it worked for me. |
This does work however, but I'm not nesting the sleep. It completes in about 5 seconds. Which makes sense, as echo takes 5 seconds to execute (5 sleeps), and all the echos are executed in parallel fashion.
|
Actually, your code exhibits the same behavior as mine and performs the nested run starvation I mentioned. Try increasing the sleep delay, it's more obvious. |
async def echo(x):
for i in range(0,5):
print(x,i)
# ....
mySleep(1) Apart from a bit of renaming and obfuscation, it's still the same as the first example, still no
|
@erdewit Sigh. Your code does the same thing as my code. Have we just been saying the same thing all along? Probably, but.. I am saying that this is undesired behavior. Nested calls which starve any previous tasks created on the thread global event loop is not what people would expect and would only be useful in very narrow circumstances. I would say documenting this and making it clear would be a very good idea. |
If people want to make nested calls, a much better library for that is gevent. Which doesn't starve previously created tasks |
Gevent has the exact same type of behavior for non-cooperative (non-yielding) tasks. Good luck with it. |
!? Absolutely it does not have the same type of behavior:
I don't think you understand coroutines or how they are being used, and this has lead to a monkey patch which is causing a lot of chaos in the ecosystem. |
Now try it with import gevent
import time
def echo(x):
for i in range(0, 5):
print(x, i)
time.sleep(1)
t0 = time.time()
for i in range(0, 10):
gevent.spawn(echo, i)
gevent.sleep(0)
print("Time: ", time.time() - t0) Output:
It is seen that the tasks are executing serially, exactly as asyncio did with non-cooperative tasks. |
You would never use time.sleep(1) in a cooperative concurrency program, for the very obvious reason that it would block the thread. It would be almost as bad as doing a long running spin lock. You use gevent.sleep or asyncio.sleep to sleep, as it will pass control to the event_loop handler and coroutines will be given execution time. The whole reason people use nest_asyncio is because programs like jupyter (or spyder or tensorflow or etc) are running potentially sync functions in the context of thread which is currency running async. Async functions can, however, call sync functions. Once that happens, the wise lords of asyncio have decided to not allow you to run async. Rather they throw a runtime error. nest_asyncio gets rid of the runtime error and allows you to do this. However, if you try to use nest_asyncio and run a coroutine from a sync function with either asyncio.run or loop.run_until_complete, it will block or starve every other coroutine created before the run call, even if you call asyncio.sleep() within the coroutine being ran. Gevent doesn't do this to you. You can absolutely run a sleep coroutine regardless of where you are in the code and it will pass control to the event_loop handler. Every spawned coroutine will get a chance to execute. Gevent does not have the notion of run() that asyncio has. |
That is obvious, but why then expect anything different from
The |
Yeah, I think I said that about 4 or 5 times above. This is not very useful, as run is the only way to do nested calls. Looking at the patch code, run is supposed to use the already created event_loop rather making a new one. But the scheduled tasks are clearly not being executed concurrently. Well, there are no other tasks only if we're not using the same event_loop where the previous ones were created. I guess there is a new event loop in there somewhere I missed. If so, that can lead to serious starvation. .. When I say starvation, you just need to imagine any long running blocking IO being called on a nested run. Any tasks scheduled outside the run won't be fairly executed because the nested run doesn't give them any time. .. Also, few of the patch users seem to appreciate that nest_asyncio blocks the outer event_loop when a nested run is called and is not re-entrant. The patch maintainer is so far unwilling to document and make clear this behavior. .. Actually, your code exhibits the same behavior as mine and performs the nested run starvation I mentioned. .. I am saying that this is undesired behavior. Nested calls which starve any previous tasks created on the thread global event loop is not what people would expect and would only be useful in very narrow circumstances. Gevent, however, let's you nest calls to coroutines and it doesn't spin a new event loop or starve any previously spawned coroutines. |
Perhaps related: |
I had the same concern that nested Maybe I missed something as I have not tried to run other examples of this discussion, or maybe I'm in an edge case that happens to work for "wrong" reasons. Tested on Python 3.7, 3.8, 3.9 and 3.10
|
Yes, absolutely true. This also holds for "live patching", that is applying nest_asyncio after a task has already started. |
Why would you bother using nest patching? Your code is all async already - just use asyncio. The point of using something like nest async is not to have async everywhere, but rather being able to return control to event loop handler from synchronous code. This is what things like gevent provide. Also, who would write code with asyncio.run in an async function? That makes zero sense. That should error out, IMHO. The fact that patching gets rid of that error seems more confusing than helpful. |
Ok, so hopefully final version. This is async:
This is sync / blocking (even after monkey patching):
To meet folks half way, agreed, doesn't starve previous, but because it blocks it can't nest libraries that use asyncio.run and get cooperative behavior. The patch is great if you have to just make something work and there is no other way, but folks should be aware of the limitations as compared to other cooperative concurrency approaches. Sadly, the repo owner refuses to alert people to this. |
Literally the first line of my post:
One can be interested of gradually converting blocking code to async code. In that context, an coroutine function could well still call the blocking API, regardless of how the blocking API is implemented (i.e. actually blocking or as a blocking shim on top of a coroutine function). Anyhow, all I wanted is to clear out for myself and for others wondering the same question whether this package was starving other tasks. Turns out it does not. Since the package does not pretend to provide a way to spawn tasks concurrently other than the standard
One might think that Thanks @erdewit for uploading this package on PyPI and maintaining it. |
Agreed, as a hack to make things work, it's a useful tool. It is however
absolutely not a solution for concurrency.
Anyone who uses it will block on the nested calls to asyncio.run(). Most people who are doing this, are likely uninformed about these consequences and don't realize that the so called patched async library they are using is NOT going to cooperate with other async code they might have.
The async folks specifically saw these extremely difficult to debug issues which is why they made the whole onerous colored functions regime a requirement. While I still much prefer gevent, this library seems to be confirming their fears.
Whether it 'pretends' to be something or not, we'll just have to agree to disagree.
|
Hi @erdewit I'm considering using this library to help adapt the Sparkmagic Jupyter Kernel to support Are there any obvious issues I should be aware of? What are the trade-offs between this library and https://pypi.org/project/asgiref/? Any help is appreciated! |
asgiref's AsyncToSync is probably not suitable for your use case, since it can't run in one thread that already has a running event loop. For nest_asyncio I do not see any obvious issues. Perhaps it can be done without nesting, using straight asyncio. |
Thanks for the quick reply @erdewit! How would you suggest using straight asyncio without nesting? |
@devstein I don't know, perhaps look how other kernel magic projects do it. |
SqlAlchemy does something similar with greenlets, allowing you to run a coroutine without starting a new thread. |
Relevant commit to a jupyter/notebook fork: penguinolog/notebook@11615d2 |
Please add some explanation on what tradeoffs one takes on when using this library.
For example from very briefly looking at the code it seems that using this forces pure Python asyncio code to used.
Is there a performance (or other) impact?
How can things break with nested loops?
Thanks.
The text was updated successfully, but these errors were encountered: