[rfc] formalizing a split between "synchronous" traps and "blocking" traps #101
Can you say a bit more about what you're thinking about? One thing that's been on my mind generally is just the interaction between running tasks and the Curio kernel itself. At the moment, I have taken a very strict approach by which all Task->Kernel interaction occurs via traps (and hence the await/async mechanism). There is a certain consistency to doing it that way. It makes it more like an operating system. It also solves the problem of having to carry a reference to the kernel around (like needing the loop in asyncio). Are you thinking that synchronous operations would occur in some other way? |
Oh, sorry, no, I'm not suggesting changing the mechanics of the On Nov 11, 2016 8:19 AM, "David Beazley" notifications@github.com wrote:
|
I say "go for it!" |
I was just thinking about the problem of knowing which Curio calls are indeed synchronous and which are not, wearing the first hat @njsmith mentioned. I think formalizing the traps the way it was done helps, but it does not help with curio's end-user-facing coroutines, for which the traps used are an implementation detail. For instance, nowhere does it say in the docs that The fact that there's no difference in calling convention between potentially blocking and synchronous coroutines bothers me a bit, as I would say it partly negates the advantages of using async/await (as compared to threads) that are underlined in Unyielding. I feel that, ideally, it should be obvious which end-user-facing coroutines only use synchronous traps and are guaranteed to keep doing so in the future. At a minimum, I guess those for which it is guaranteed should be documented as such, which they are not right now. I also have a crazier idea, which seems unlikely to be good, but I'm sharing it anyway. Here is a list of coroutines that should be guaranteed to be synchronous, and that would need to be converted, as far as I can tell:
But a solution should be found for the async context managers that do block in aenter:
This seems to be especially problematic for Locks and Semaphores, where context managers are the preferred interface, and there does not seem to be any good solution there. |
I think this is being overthought. I don't want to make any guarantee one way or the other about any operation involving async/await other than any such operation might block (or context switch). The operative word here is "might." The fact that certain low-level operations don't block is purely an implementation detail--and one that could change at any moment. It could be different using different kernel implementations or different scheduling priorities. Most of these "synchronous traps" are low-level operations that are used to assist in the implementation of some higher-level feature. They are not something that an end-user would typically be using directly in their own code. Honestly, I could probably just make them methods on the curio If some particular feature is formally specified to be non-blocking, then it can be documented as such. I'm fine with that. |
...I made a long post about this on the curio forum last night, and now it seems to have disappeared. Or maybe it's trapped in some sort of moderation hades?
Agreed -- my end goal is to figure out which guarantees end-user APIs do or don't provide, but figuring out the right system for traps seems like a good first step :-).
This is... really weird :-). Context managers already have a clear and well-motivated use case that has nothing to do with synchronous/non-synchronous, and like you note, some of the most important use cases for async context managers are ones that involve blocking in What might make sense to me is to have a rule that whenever we can make a function non-async, we should, because that is a clear signal to the user about what it can and can't do. So for example, maybe
As a general principle, I strongly disagree. I started writing an Erlang-style task supervisor, and it turns out that this is really annoying if you aren't allowed to even know whether Obviously there's a whole substantive conversation to have about exactly which operations should make which guarantees, though. I just think the rule should be more nuanced than "any operation could do anything". That's nice for devs, but not so nice for users. Good APIs have to balance those competing interests, plus things like usability, teachability, etc. Another general rule for API design is that if the code makes a guarantee but the docs don't, then this is often the worst of all worlds, because users can't count on it, but they end up depending on it anyway, so that the devs lose the freedom to change it. So for example, I like this note from Linus mentioning how some quirks of how Linux schedules processes after Also also, even if we, say, go ahead and document that |
Trying to decide where to respond... here or the forum. I guess I'll do it here for now. In the big picture, I agree that it will be useful to specify more precisely which operations are synchronous and which aren't. So, any patch that addresses that will be welcome. Certain details such as the behavior of With regard to Just on this note: One of the things I'm exploring (experimentally) is how one bridges the divide between async and sync code. I have found that it is very easy for asynchronous code to escape that world and start calling lots of synchronous functions where you no longer have the ability to use Some further things to think about: Right now
One other note on cancel: It is an error to cancel any task that happens to be sitting on the curio ready queue. If a task is sitting on the ready queue, it means that something happened to make it go there. It needs to run again to fully finish whatever that might have been. A particularly nasty situation concerns locks and synchronization primitives. One reason a task might be on the ready queue is that it has just been granted access to a lock. When the task runs again, it's assumed that it now has the lock. You can't just brute-force cancel a task like that. If you do, it won't get a chance to release the lock and everything will deadlock. Yes, this means that a cancellation request might be deferred until the task makes a future request to the kernel. Those are the breaks--I just don't know any other way around this. |
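The deferred-cancellation behavior described in the comment above can be illustrated with a toy scheduler. This is only a sketch under stated assumptions, not curio's implementation: `ToyKernel`, `Task`, `Cancelled`, and `checkpoint` are hypothetical names invented for the example, and the "lock" is just a list. Cancelling a task that sits on the ready queue only records the request; the exception is delivered the next time the task makes a request to the kernel, so the task still gets to run its cleanup.

```python
from collections import deque
from types import coroutine


class Cancelled(Exception):
    """Hypothetical cancellation exception for this sketch."""


@coroutine
def checkpoint():
    # Hypothetical blocking trap: the only place cancellation is delivered.
    yield "checkpoint"


class Task:
    def __init__(self, coro):
        self.coro = coro
        self.cancel_pending = False
        self.outcome = None


class ToyKernel:
    def __init__(self):
        self.ready = deque()

    def spawn(self, coro):
        task = Task(coro)
        self.ready.append(task)
        return task

    def cancel(self, task):
        # The task may be on the ready queue in the middle of a handoff
        # (for example, it was just granted a lock), so only record the request.
        task.cancel_pending = True

    def _resume(self, task, method, arg):
        try:
            method(arg)
        except StopIteration:
            task.outcome = "finished"
            return
        except Cancelled:
            task.outcome = "cancelled"
            return
        # The task has made a new request to the kernel.
        if task.cancel_pending:
            task.cancel_pending = False
            self._resume(task, task.coro.throw, Cancelled())
        else:
            self.ready.append(task)

    def run_one(self):
        task = self.ready.popleft()
        self._resume(task, task.coro.send, None)

    def run(self):
        while self.ready:
            self.run_one()


events = []
held = []  # stands in for the owner list of a lock


async def worker():
    await checkpoint()        # imagine: waiting for the lock
    held.append("lock")       # woken up, so the lock is assumed to be ours
    events.append("got lock")
    try:
        await checkpoint()    # next kernel request: cancellation lands here
        events.append("not reached")
    finally:
        held.remove("lock")
        events.append("released lock")


kernel = ToyKernel()
task = kernel.spawn(worker())
kernel.run_one()     # worker runs to its first checkpoint and is requeued
kernel.cancel(task)  # the task is now sitting on the ready queue
kernel.run()
print(events, held, task.outcome)
```

In this sketch the worker resumes normally, records that it holds the lock, and only receives `Cancelled` at its second `checkpoint()`, which lets the `finally` block release the lock.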
@dabeaz: when you have a chance could you check if there's some way to rescue my forum post? Looking at the forum it still doesn't appear to be visible, and while I can probably reconstruct it I'd rather not retype all that again :-) |
For some reason that message got flagged as spam. I've restored it (I think). Will need to fiddle with forum settings perhaps. Sorry about that. |
@njsmith: Can you provide a 'compact' use case where the currently established semantics are not enough? And/or any similar semantics that currently exist in alternative async libs? |
My main message on the forum still seems to be hidden, and I have a message from the forum software claiming that "Your post was flagged as spam: the community feels it is an advertisement [...] Multiple community members flagged this post before it was hidden". Which seems... unlikely? But it does look like I can at least retrieve the content, so I'll repost here (with some small edits): Hi all, This is more of a high-level design discussion, so I guess I'll try putting it here instead of github issues and see what happens! I've recently been trying to understand the exact semantics of curio's primitive operations around rescheduling/blocking/cancellation. Like, when I do
In general these questions apply to all of curio's public API, but as a starting point I've just been looking at the low-level traps. It turns out that currently, traps fall roughly into three categories:
There are also some weirdos, like the
I'd like to propose the following API design principle: every trap should be either "synchronous" or "blocking", with the semantics described above. And, traps should have the "synchronous" semantics whenever possible. (Rationale: synchronous traps are much easier to reason about; yield points and cancellation points are both brain-melting, especially when you have too many of them. So better to leave them out by default when possible, and then let users add them back in if necessary.) If this is accepted, then here are the traps that would need to change (they don't currently fit into either category), and some discussion of the trade-offs:
Making Thoughts? |
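The rationale in the proposal (code without yield points is easier to reason about, and users can add yield points back explicitly) can be seen in a small experiment. This sketch uses the standard library's asyncio as a stand-in for curio, which is an assumption made only to keep the example self-contained; `asyncio.sleep(0)` plays the role that `curio.sleep(0)` plays in the discussion, and `bump` and `run_pair` are hypothetical helper names.

```python
import asyncio


async def bump(label, log, yield_between):
    log.append(f"{label}1")
    if yield_between:
        # Explicit yield point: other tasks may run here.
        await asyncio.sleep(0)
    log.append(f"{label}2")


async def run_pair(yield_between):
    log = []
    await asyncio.gather(
        bump("A", log, yield_between),
        bump("B", log, yield_between),
    )
    return log


no_yield = asyncio.run(run_pair(False))
with_yield = asyncio.run(run_pair(True))
print(no_yield)
print(with_yield)
```

Without the explicit yield, each task's two appends stay adjacent, because nothing else can run between them. With the yield added back, the two tasks interleave.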
See #116 for how the sync-vs-blocking approach could work. |
Closing this. However, might modify behavior of spawn() back to a child-first arrangement. Ran into something recently where that would have been advantageous. Also, it's documented as doing this. |
I'd like to modify the trap handling to formalize a distinction between:
synchronous traps: traps which always return immediately, which never trigger a reschedule (so you are guaranteed that other tasks haven't touched any shared state), and are not cancellation points.
(potentially) blocking traps: traps which might block, or might trigger a reschedule (curio.sleep(0)), and are cancellation points.
The two motivations are:
(a) With my user trying to write cancellation-safe code hat on, it's very helpful if some kinds of calls are documented to be synchronous, and to know which ones those are and to have a uniform set of guarantees around them. This has even bitten curio itself recently -- curio.Event was assuming that _queue_reschedule_function was synchronous, but it wasn't; the fix was to make it synchronous. This strongly suggests that synchronicity is and should be a public API guarantee for some operations.
(b) With my poking around in curio's internals hat on, I think this would let us simplify some of the code. In particular, synchronous traps could be implemented by calling a function to do the trap-specific work and then having some standard code to take that function's return value/exception, inject that into the invoking task, and reschedule it; this would be an improvement over the current situation where every trap implements that logic in an ad hoc way with ready_appendleft calls scattered around everywhere. It might also make things more efficient in that if we know that we're going to reinvoke the same task, we can skip enqueueing/looping/dequeueing entirely, and it potentially simplifies things like alternative task scheduling policies if the queues only have to handle real rescheduling, not trivial reinvoke-the-same-task "rescheduling".
Posting this here first because I want to get @dabeaz's opinion before I go refactoring a big chunk of curio's internals :-)
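As a rough illustration of motivation (b), here is a minimal toy kernel in which synchronous traps are plain functions and one shared piece of code injects their return value or exception back into the same task without going through the ready queue. This is a sketch, not curio's code: `ToyKernel`, `Task`, `trap`, and the trap names `get_id` and `yield_now` are all hypothetical, and a task that raises an unhandled exception simply propagates it out of `run()`.

```python
from collections import deque
from types import coroutine


@coroutine
def trap(name, *args):
    # Suspend the calling task and hand (name, args) to the kernel;
    # whatever the kernel sends back becomes the result of the await.
    return (yield (name, args))


class Task:
    def __init__(self, task_id, coro):
        self.id = task_id
        self.coro = coro
        self.result = None


class ToyKernel:
    def __init__(self):
        self.ready = deque()
        self.enqueues = 0  # counts trips through the ready queue
        self.next_id = 1
        # Hypothetical trap tables; the names are made up for this sketch.
        self.sync_traps = {"get_id": lambda task: task.id}
        self.blocking_traps = {"yield_now": self._enqueue}

    def _enqueue(self, task):
        self.enqueues += 1
        self.ready.append(task)

    def spawn(self, coro):
        task = Task(self.next_id, coro)
        self.next_id += 1
        self._enqueue(task)
        return task

    def run(self):
        while self.ready:
            task = self.ready.popleft()
            value, exc = None, None
            while True:
                try:
                    if exc is not None:
                        name, args = task.coro.throw(exc)
                    else:
                        name, args = task.coro.send(value)
                except StopIteration as done:
                    task.result = done.value
                    break
                handler = self.sync_traps.get(name)
                if handler is None:
                    # Blocking trap: the handler decides where the task waits.
                    self.blocking_traps[name](task, *args)
                    break
                # Synchronous trap: shared code captures the outcome and
                # re-enters the same task without touching the ready queue.
                try:
                    value, exc = handler(task, *args), None
                except Exception as err:
                    value, exc = None, err


log = []


async def worker(label):
    tid = await trap("get_id")   # synchronous: no other task can run here
    log.append((label, "id", tid))
    await trap("yield_now")      # blocking: goes through the ready queue
    log.append((label, "resumed"))
    return tid


kernel = ToyKernel()
kernel.spawn(worker("a"))
kernel.spawn(worker("b"))
kernel.run()
print(log, kernel.enqueues)
```

In this sketch the only ready-queue traffic comes from `spawn` and `yield_now`; the `get_id` calls never enqueue anything, which is the saving the proposal describes for reinvoking the same task.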