Improve interaction between task state, output and prerequisite #2329
Comments
Initial thoughts:
|
See also #1689. The bulk of the content of a task proxy object is only needed at job submit time, to generate the job script. Aside from that, in memory we only need prerequisites/outputs (for scheduling) and submit number (and timers...). Your proposal seems to be going some way toward this? |
Yes, I am hoping that the task proxy object will become as lightweight as possible. |
Note #2600 (comment) through #2600 (comment) - regarding persistence of prerequisite state and whether or not task and prerequisite state should be allowed to diverge. |
@matthewrmshin - I really like the simplicity of your proposed implementation above. But we need to consider the implications, e.g. for task state, in light of manual triggering requirements etc. I'll come back with some thoughts on this... |
This - by definition - makes task prerequisites and outputs absolutely consistent. (There's still a housekeeping problem: how to determine when each output can be forgotten and dropped from the dict, but presumably that's solvable.) But I think we still need task state and prerequisite/output state to be "divorceable", because we cannot assume that if a task is in a post-triggered state then its prerequisites are/were satisfied. E.g. on manually triggering a task, or resetting it to a post-triggered state, forcing its prerequisites to be artificially completed would (under this proposal) force the outputs of the tasks it depends on to be artificially completed, which might have unintended consequences for other tasks that also depend on those. [Or would that be OK?! ... or it could be optional.] So, it seems to me, if a task gets manually triggered or reset, we should automatically complete its outputs (which will complete the prerequisites of downstream tasks that depend on it - but that is what's needed) but not its prerequisites [unless optionally, as per the previous paragraph]. |
On housekeep. My most naive assumption is that we can do what we do now with task proxies, i.e. we'll housekeep any outputs that can no longer be prerequisites of any downstream tasks. On prerequisites. I think the key concept of the proposal here is the separation of task states and prerequisite-output objects. An action/event on a task can only affect its outputs, but should have no effect on its prerequisites. See also #1314. |
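A minimal sketch of that separation, assuming a hypothetical shared `completed_outputs` set keyed by (task name, output) and a hypothetical `Task` class. None of these names come from the Cylc code base. Prerequisites only read the shared set, and an action on a task only writes that task's own outputs.

```python
class Task:
    """Hypothetical lightweight task: a state plus keys into a shared output store."""

    def __init__(self, name, prerequisites=(), outputs=("succeeded",)):
        self.name = name
        self.state = "waiting"
        # Prerequisites are (upstream task, output) keys, not stored booleans.
        self.prerequisites = tuple(prerequisites)
        self.outputs = tuple(outputs)

    def prerequisites_satisfied(self, completed_outputs):
        """Read upstream output state from the shared store; never write it."""
        return all(key in completed_outputs for key in self.prerequisites)

    def force_complete(self, completed_outputs):
        """Manual reset to a post-triggered state: complete own outputs only."""
        self.state = "succeeded"
        for output in self.outputs:
            completed_outputs.add((self.name, output))
```

With this shape, forcing a task `b` to succeeded satisfies tasks downstream of `b`, but leaves the outputs of `b`'s upstream tasks alone, so siblings that share those upstream dependencies are unaffected.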
On housekeep: agreed, the same logic should work. On prerequisites: also agreed - I just mentioned this in light of the related discussion on #2600 about whether or not prerequisite state can also be consistent with, or inferred from, task state. The conclusion would appear to be no, it can't. |
I might be missing something, but why would the prerequisites ever need to be artificially set to completed? If the task gets manually triggered, you could equally well have some flag to signify this fact rather than changing the state of its prerequisites. Isn't the very purpose of the manual trigger to run a task regardless of the state of its prerequisites? Regarding output housekeeping: to avoid writing your own garbage collector for that, you could keep the outputs in a WeakValueDictionary. As soon as all task proxies that hold or reference (via one of the prerequisites) a given output are housekept, Python would be free to garbage collect that dictionary entry. |
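A rough sketch of the `WeakValueDictionary` idea, using Python's standard `weakref` module. The `Output` and `TaskProxy` classes and the `get_output` helper are hypothetical, invented for illustration. Note that the dictionary values must be weakly referenceable, so each output is a small class instance rather than a bare `bool` or `str`.

```python
import gc
import weakref


class Output:
    """One task output; a plain class instance so it can be weakly referenced."""

    def __init__(self, task_id, name):
        self.task_id = task_id
        self.name = name
        self.completed = False


def get_output(registry, task_id, name):
    """Return the shared Output for (task_id, name), creating it if absent."""
    key = (task_id, name)
    output = registry.get(key)
    if output is None:
        output = Output(task_id, name)
        registry[key] = output
    return output


class TaskProxy:
    """Hypothetical pared-down task proxy holding strong refs to Output objects."""

    def __init__(self, task_id, registry, output_names=(), prerequisite_keys=()):
        self.task_id = task_id
        # Strong references to this task's own outputs.
        self.outputs = {
            name: get_output(registry, task_id, name) for name in output_names
        }
        # Strong references to the upstream outputs this task depends on.
        self.prerequisites = [
            get_output(registry, up_id, up_name)
            for up_id, up_name in prerequisite_keys
        ]

    def prerequisites_satisfied(self):
        return all(output.completed for output in self.prerequisites)


def live_output_keys(registry):
    """Keys still present after a collection pass (the pass is a no-op need in
    CPython, where reference counting frees unreferenced values immediately)."""
    gc.collect()
    return set(registry.keys())
```

The registry would be created once as `weakref.WeakValueDictionary()`. An entry stays alive while any proxy, upstream or downstream, still holds the output, and disappears once the last such proxy is dropped.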
@TomekTrzeciak - well then, apologies - I guess I misunderstood your comments here: #2600 (comment). I thought you were suggesting that (a) task state and prerequisite states should not diverge, and (b) prerequisites could be reconstructed from outputs - of the same task (like what we do already from task state on restart). However, on (b) at least I now see you probably did not mean "of the same task"! Interesting idea on the housekeeping, sounds promising. |
Just to note, we do currently have one important use-case for artificially setting prerequisites: reset to waiting sets the task state to waiting and its prerequisites to not-satisfied (to force them to get satisfied again, which might involve waiting on re-running upstream tasks) - under this proposal setting the prerequisites to not-satisfied would set upstream task outputs to not-completed. However, I think just resetting the task state to waiting will do. Then the task will either trigger again immediately (if its prerequisites are still satisfied) or wait (if the upstream tasks have been retriggered or reset, which will unset their outputs). In fact, that seems more sensible than what we're currently doing. |
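The reset-to-waiting behaviour described above could look roughly like this. It is only a sketch: tasks are plain dicts, `completed_outputs` is a hypothetical shared set of (task name, output) keys, and the function names are made up for illustration.

```python
def unset_outputs(task, completed_outputs):
    """Remove this task's own outputs from the shared store."""
    for output in task["outputs"]:
        completed_outputs.discard((task["name"], output))


def reset_to_waiting(task, completed_outputs):
    """Reset only the task's state and own outputs; prerequisites are untouched."""
    task["state"] = "waiting"
    unset_outputs(task, completed_outputs)


def retrigger(task, completed_outputs):
    """Re-running a task unsets its outputs until it completes them again."""
    task["state"] = "running"
    unset_outputs(task, completed_outputs)


def is_ready(task, completed_outputs):
    """A waiting task can trigger once all upstream outputs it reads are complete."""
    return task["state"] == "waiting" and all(
        key in completed_outputs for key in task["prerequisites"]
    )
```

A reset task is ready again immediately if its upstream outputs are still complete, and waits if an upstream task has since been retriggered or reset.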
@hjoliver, sorry for the confusion, but from your latest comments I reckon we are pretty much on the same page now. With prerequisites reading rather than holding the state of other tasks' outputs, I think that the mental model of task behaviour (and the code too) will become simpler. |
Yes agreed - and I concede that my original implementation of prerequisites and outputs (which dates back a rather long time) lacked a certain purity of thought! |
@matthewrmshin - I think this essentially supersedes #1902, no? There would be no need to retain succeeded task proxies if their outputs are held in the new outputs dict (so long as we also solve #2143). (And this should also mean no need to even consider satisfying prerequisites from the DB, as per #1428). |
Just to note: when this issue gets done, we need to check that #1392 is solved. |
This issue needs to be re-evaluated post #3515 (spawn on demand): dependency matching is no longer relevant, but it might still be possible to handle prerequisites and outputs more cleanly. |
Continuation of #1794 and #2157. See also #1392 and #2348.