-
-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to start up a complex task tree in an orderly way, given spawn's non-determinism? #284
Comments
Right, what It looks like #136 will lead to some simplification in the nursery API, which may also make this a little easier to explain, because the body of the In any case, your problem – wanting to run initialization code in a deterministic order – is an important one; it comes up all over the place. (Another common example: in a test suite, if you start a server and then start a client that connects to that server, you want to know that the server is fully started and ready to accept connections before you start the client.) Here's one possible pattern: define a So in your case, if I'm understanding your pseudo-code right, this might look something like: class Child:
def start(self, nursery): # or 'async def', if the initialization might be async
self.initialize_this_object()
nursery.spawn(self._main)
async def _main(self):
async with trio.open_nursery() as nursery:
for child in self.children:
child.start(nursery)
async with trio.open_nursery() as nursery:
root.start(nursery) The requirement to have an extra single-child nursery at the top level is a bit unfortunate. I guess it could be wrapped into a Does that make sense? Does it actually address your problem? I'm not sure I fully understood the constraints. |
(@mathan: not 100% sure github's notifications are working properly after my shenanigans -- I did reply to your comment here, in case you didn't see it.) |
I did see, just didn't get a chance to reply yet :)
…On Aug 14, 2017 9:56 PM, "Nathaniel J. Smith" ***@***.***> wrote:
***@***.*** <https://github.com/mathan>: not 100% sure github's
notifications are working properly after my shenanigans -- I did reply to
your comment here, in case you didn't see it.)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#284 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABkW_n8GIaH82ORQ-ml7Y1a2CT3DiGDdks5sYPq2gaJpZM4O1AF4>
.
|
Cool, no worries then :-) |
I had a better idea last night! This is pretty exciting, I've been frustrated about the waiting-for-task-to-start problem since well before trio even existed. Idea:
That's... basically it. It's very simple, actually. But if it's baked in as a convention then people will use it, and it's super easy to use. And much easier to understand than some mess like I thought for a bit about whether we could make this automatic – like if you want to spawn some function that doesn't notify readiness, it won't wait, but if you do then it does. (Unless you explicitly opt out of waiting somehow.) But on further thought I think this is both impossible and undesirable. The "impossible" part is that to do this we need to somehow immediately detect when a function is not going to provide any kind of readiness notification. This requires the ability to see into the future. To make this work we'd need to base our decision off things we can immediately observe, i.e. the signature or attributes of the function being called, and in a world full of decorators and lambdas and stuff then that just can't be done reliably. And the undesireable part is, well, generally you want to know whether something is going to block and what ordering guarantees your code is making. Plus the version that waits is async and the version that doesn't wait is sync, and we would want to have an explicitly no-wait version anyway, so... yeah, not a big loss. It has to be an explicitly passed token instead of something ambient ( Should it be Cute enhancement: if the new task raises an error before notifying, then we should raise that in the waiting spawner instead of propagating it into the nursery. What should we do if the spawner is waiting for the child to declare readiness, and instead the child exits without raising an exception? Obviously the spawner should stop waiting. But there are a few options:
If our general metaphor is that the new task is like a subroutine of the spawner until it calls In general names are really important here. Things that need names are:
If you accidentally use the no-wait version when you wanted the wait version, then it will seem to work, but you have unexpected concurrency. Unexpected concurrency is a pernicious bug; we don't want that. So if we pick bad names there's a danger of the no-wait version becoming a footgun (attractive nuisance variety). Plus the two methods are pretty similar, so we need to make sure it's easy to remember which is which or we'll get confused. For example, We can't do Probably worth considering this and #136 together since they both involve overhauling nursery semantics. |
Idea for naming: maybe make some connection between the spawn-and-don't-wait and the notification method, like |
Some ideas for names: I like how No idea whether other people would read |
So I'm leaning towards Regarding the API for tasks reporting back: I did a little more research on the status interfaces that service managers provide. Systemd has statuses "READY", "RELOADING" (followed by "READY"), "STOPPING", plus you can set a free-form status string with "STATUS=". Windows services have statuses I think this suggests that even if we want to start with just a "yeah I'm started" method, we should leave the door open to adding more methods like this in the future, both because this is evidence that they might be useful in general (the relationship between a task and its parent is fairly analogous to the relationship between a service and a service manager), and because in particular we might want to pun this interface to directly talk to service managers. So I'm leaning towards making the kwarg async def awesome_server(*, task_status=trio.STATUS_IGNORED):
...
task_status.started() (and maybe in the future we will add |
Oh hmm. The interaction between cancellation and blocking- The simple case is simple: if we're just doing async with open_nursery() as nursery:
await nursery.start(...) then any cancellation that affects the call to What if the call to Also, what if the nursery gets cancelled? What affect should this have on the spawner blocked in Maybe one way to do it would be to literally have an @attr.s
class _TaskStatus:
_final_nursery = attr.ib()
_child = attr.ib(default=None)
_spawner = attr.ib(default=None)
async def start(self, async_fn):
# remember that 'self' here is the eventual nursery
task_status = _TaskStatus(self)
async with open_nursery() as nursery:
task = nursery.start_soon(partial(async_fn, task_status=task_status))
task_status._child = task
return await task_status._wait() where Would this This approach would also fix something that's a little tricky with the current nursery interface: if the task exits/crashes before calling This proposed semantics also makes it even more unclear what And it does seem a bit odd to allow tasks to be blithely spun all the way up to "started" and report success to their spawner and then immediately be cancelled. Hmm. |
There's another issue to worry about with the "adoption" approach: before the adoption, only the spawner's cancellations can affect the child task, and after the adoption, only the new nursery's cancellations can affect the child task, so that's good. But at the moment of the adoption, we could potentially switch over a task that has a cancellation in flight. I think the way this would need to work is: The other approach would be to do something complicated with a special cancel scope inside the new task whose only purpose is to let us manually propagate a cancel from For "real" erlang-style supervisors that have restart policies etc., on further thought I think we're actually OK. First, there's nothing that says they need to have the same API as nurseries... I was thinking that at first, but then I realized that was partly a hold-over from when I was thinking that we needed a more elaborate way of getting serialization of task startup, where each task that supported this would need some custom code that took a duck-nursery and spawned stuff into it. But that constraint doesn't exist in this new design. Also, given that supervisors are just a different thing than regular nurseries, I'd feel pretty comfortable saying something like "the |
More edge cases to worry about: Edge case 1: I've gone back and forth on what to do when the child exits cleanly before calling async def child(*, task_status=STATUS_IGNORED):
return "hi"
async with trio.open_nursery() as nursery:
await nursery.start(child) The two obvious options are to either act like they called Edge case 2: normally, a nursery stays open until all of its tasks have exited, and then it becomes "closed", and it's impossible to spawn new tasks in it. So this sets up a weird possibility: what if the nursery is open when we call |
start_soon is just a new name for spawn, except it doesn't return the new task (in preparation for python-triogh-136, where we're going to stop emphasizing task objects in the main api) start is a major new feature: it provides a very simple way to start up a long running task, while blocking until it's finished whatever initialization it wants to do. At least... it's simple from the user's point of view. Internally it's quite tricky indeed. The whole _run.py file probably needs some refactoring and splitting up, but this is one of those cases where I think it's best to first get the new functionality working and nailed down, and then we can see what shape the new abstractions should be. Fixes python-triogh-284.
start_soon is just a new name for spawn, except it doesn't return the new task (in preparation for python-triogh-136, where we're going to stop emphasizing task objects in the main api) start is a major new feature: it provides a very simple way to start up a long running task, while blocking until it's finished whatever initialization it wants to do. At least... it's simple from the user's point of view. Internally it's quite tricky indeed. The whole _run.py file probably needs some refactoring and splitting up, but this is one of those cases where I think it's best to first get the new functionality working and nailed down, and then we can see what shape the new abstractions should be. Fixes python-triogh-284.
Getting this to work required a slightly annoying hack: before, the nursery monitor queue received each task as it died; now, it still receives those, but it also receives a |
This is prefect!!! Sorry I didn't get back until now, but I really like this approach as it makes things much easier to reason about and leads to cleaner code. I'm not sure what I could have added anyway as you seem to understand the issues involved like 100x more than I do :) In my original issue I wanted to be able to start a tree of async tasks and a do on them either in depth-first pre order or depth-first post order. Following is an example of how it works using your newly added changes and it's super simple to switch between pre and post. Using spawn by comparison, it'd need to add a lot more complexity to make it work. import trio
class Node(object):
children = []
parent = None
depth = 'A'
index = 0
init_order = []
compute_order = []
def __init__(self, parent=None, index=0):
super(Node, self).__init__()
self.parent = parent
self.children = []
if parent is not None:
self.depth = chr(ord(parent.depth) + 1)
self.index = 2 * self.parent.index + index
@property
def name(self):
return '{}{}'.format(self.depth, self.index)
async def init(self, task_status=trio.STATUS_IGNORED):
await trio.sleep(0)
self.init_order.append(self.name)
task_status.started()
async def compute(self):
await trio.sleep(0)
self.compute_order.append(self.name)
async def start_node_spawn_dfo_pre(self):
await self.init()
async with trio.open_nursery() as nursery:
# problem here is that the children will init in random order
# but our compute may happen before children all init
for child in self.children:
nursery.spawn(child.start_node_spawn_dfo_pre)
nursery.spawn(self.compute)
async def start_node_dfo_pre(self, task_status=trio.STATUS_IGNORED):
await self.init()
async with trio.open_nursery() as nursery:
for child in self.children:
await nursery.start(child.start_node_dfo_pre)
task_status.started()
nursery.start_soon(self.compute)
async def start_node_dfo(self, task_status=trio.STATUS_IGNORED):
async with trio.open_nursery() as nursery:
for child in self.children:
await nursery.start(child.start_node_dfo)
await nursery.start(self.init)
task_status.started()
nursery.start_soon(self.compute)
if __name__ == '__main__':
def create_children(root, n, root_depth):
dfo_pre.append(root.name)
if root_depth >= max_depth:
dfo.append(root.name)
return
for i in range(n):
node = Node(parent=root, index=i)
root.children.append(node)
create_children(node, n, root_depth + 1)
dfo.append(root.name)
max_depth = 4
dfo_pre = []
dfo = []
root = Node()
create_children(root, 2, 1)
print('Depth first-post tree ordering is')
print(dfo)
print('Depth first-pre tree ordering is')
print(dfo_pre)
trio.run(root.start_node_spawn_dfo_pre)
print('spawn dfs-pre init')
print(root.init_order)
Node.init_order = []
trio.run(root.start_node_dfo_pre)
print('start dfs-pre init')
print(root.init_order)
Node.init_order = []
trio.run(root.start_node_dfo)
print('start dfs-post init')
print(root.init_order)
|
Oh good! I meant to post a message here to check if this ended up being useful to you or if I'd just wandered off in some other direction entirely, but then I got distracted :-). |
(Text below was posted by @matham as #32 (comment); I'm splitting it off into a new issue to keep the threads tidy)
So I kinda just ran into this. I have a parent and a bunch of children (ordered).
Naively I tried:
I was hoping that the children would spawn deterministically in order and at least run until the first await in
wait_to_finish_work
. This would start the tree in depth first order. However, as I noticed in the result and in the design above, the initial spawning is also not in order.Specifically, you mention that newly spawned tasks may inherit their parent's
vtime
. I wonder if it may not be better that the nursery initially birth its children and start their life until the first suspension in the deterministic order. Spawn looks deterministic and the docs don't really say much either way.The alternative for a case like mine where you want to start off each spawnee in order in current trio would be to re-write my example as:
The second version is not that different, but it does make it more complex, especially for cleanup as the parent starts it in two methods. But also, this is now breadth first order, where at each level a node is picked randomly to start off its children.
Edit: I'm thinking that maybe my model of spawn is wrong. E.g.
Looks to me like
Where the magic collects the tasks at the
with
exit and runs them side by side. Perhaps I'm wrong and the better analogy is:because with threads order is completely undefined.
The text was updated successfully, but these errors were encountered: