Improve interaction between task state, output and prerequisite #2329

matthewrmshin · 2017-06-22T18:51:08Z

Continuation of #1794 and #2157. See also #1392 and #2348.

matthewrmshin · 2017-07-07T08:38:19Z

Initial thoughts:

Task pool to hold a dict of available (and relevant) output objects in memory.
Prerequisites of each task proxy will simply be a list of output objects selected from the main dict.
On receiving task messages, the output objects will be updated, and task prerequisites will be satisfied automatically.
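A minimal sketch of the idea above, for illustration only. The class and method names (`Output`, `TaskProxy`, `TaskPool`, `get_output`, `process_message`) are hypothetical and are not taken from the cylc code base. The point it shows is that a prerequisite is just a reference to a shared output object, so updating the output on an incoming message satisfies every dependent prerequisite with no separate matching step.

```python
from dataclasses import dataclass, field


@dataclass
class Output:
    """One output of one task instance (hypothetical class)."""
    task_id: str
    message: str
    completed: bool = False


@dataclass
class TaskProxy:
    """Lightweight proxy: prerequisites are references to shared outputs."""
    task_id: str
    prerequisites: list = field(default_factory=list)

    def is_ready(self) -> bool:
        # Prerequisites hold no state of their own; they read the outputs.
        return all(output.completed for output in self.prerequisites)


class TaskPool:
    """Holds the main dict of output objects, keyed by (task_id, message)."""

    def __init__(self):
        self.outputs = {}

    def get_output(self, task_id, message):
        # Create the output on first request so producers and consumers
        # always share the same object.
        key = (task_id, message)
        if key not in self.outputs:
            self.outputs[key] = Output(task_id, message)
        return self.outputs[key]

    def process_message(self, task_id, message):
        # Updating the shared object is all that is needed.
        self.get_output(task_id, message).completed = True


pool = TaskPool()
bar = TaskProxy("bar.1", [pool.get_output("foo.1", "succeeded")])
ready_before = bar.is_ready()
pool.process_message("foo.1", "succeeded")
ready_after = bar.is_ready()
```

Here `bar.1` becomes ready as soon as the message from `foo.1` is processed, without anything touching `bar.1` itself.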

hjoliver · 2017-07-09T22:06:57Z

See also #1689. The bulk of the content of a task proxy object is only needed at job submit time, to generate the job script. Aside from that, in memory we only need prerequisites/outputs (for scheduling) and submit number (and timers...). Your proposal seems to be going some way toward this?

matthewrmshin · 2017-07-10T07:52:26Z

Yes, I am hoping that the task proxy object will become as lightweight as possible.

hjoliver · 2018-03-12T20:00:13Z

Note #2600 (comment) through #2600 (comment) - regarding persistence of prerequisite state and whether or not task and prerequisite state should be allowed to diverge.

hjoliver · 2018-03-14T01:58:35Z

@matthewrmshin - I really like the simplicity of your proposed implementation above. But we need to consider the implications, e.g. for task state, in light of manual triggering requirements etc. I'll come back with some thoughts on this...

hjoliver · 2018-03-14T03:37:46Z

This - by definition - makes task prerequisites and outputs absolutely consistent. (There's still a housekeeping problem: how to determine when each output can be forgotten and dropped from the dict, but presumably that's solvable).

But I think we still need task state and prerequisite/output state to be "divorceable", because we cannot assume that a task in a post-triggered state has (or had) its prerequisites satisfied. E.g. on manually triggering a task, or resetting it to a post-triggered state, forcing its prerequisites to be artificially completed would (under this proposal) force the outputs of its depended-on tasks to be artificially completed, which might have unintended consequences for other tasks that also depend on those. [Or would that be OK?! ... or it could be optional: cylc trigger --complete-prereqs?]

So, it seems to me, if a task gets manually triggered or reset, we should automatically complete its outputs (which will complete the prerequisites of downstream tasks that depend on it - but that is what's needed) but not its prerequisites [unless optionally, as per the previous paragraph].

matthewrmshin · 2018-03-14T10:11:18Z

On housekeeping. My most naive assumption is that we can do what we do now with task proxies, i.e. we'll housekeep any outputs that can no longer be prerequisites of any downstream tasks.

On prerequisites. I think the key concept of the proposal here is the separation of task states and prerequisite-output objects. An action/event on a task can only affect its outputs, but should have no effect on its prerequisites.
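To make that separation concrete, here is a rough sketch under assumed names (`Task`, `Output` and `reset_state` are hypothetical, not the cylc API). A manual reset changes the task's state and its own outputs only. Its prerequisites are references to upstream outputs and are never written to.

```python
class Output:
    def __init__(self, name):
        self.name = name
        self.completed = False


class Task:
    def __init__(self, name, prerequisites=()):
        self.name = name
        self.state = "waiting"
        # References to upstream Output objects; read, never modified here.
        self.prerequisites = list(prerequisites)
        self.outputs = {"succeeded": Output(f"{name}:succeeded")}

    def prerequisites_satisfied(self):
        return all(output.completed for output in self.prerequisites)

    def reset_state(self, state):
        """Manual reset: affects state and own outputs, not prerequisites."""
        self.state = state
        done = state == "succeeded"
        for output in self.outputs.values():
            output.completed = done


# Graph: a => b => c
a = Task("a")
b = Task("b", [a.outputs["succeeded"]])
c = Task("c", [b.outputs["succeeded"]])

b.reset_state("succeeded")
```

After the reset, `c` can run because `b`'s output is completed, while `a`'s output is untouched and `b`'s own prerequisites remain unsatisfied, so task state and prerequisite state are allowed to diverge.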

See also #1314.

hjoliver · 2018-03-14T10:53:32Z

On housekeep: agreed, the same logic should work.

On prerequisites: also agreed - I just mentioned this in light of the related discussion on #2600 about whether or not prerequisite state can also be consistent with, or inferred from, task state. The conclusion would appear to be no, it can't.

TomekTrzeciak · 2018-03-14T10:57:32Z

I might be missing something, but why would the prerequisites ever need to be artificially set to completed? If the task gets manually triggered, you could equally well have some flag to signify this fact rather than changing the state of its prerequisites. Isn't the very purpose of the manual trigger to run a task regardless of the state of its prerequisites?

Regarding output housekeeping - to avoid writing your own garbage collector for that, you could keep the outputs in a weakref.WeakValueDictionary. As soon as all task proxies that hold or reference (via one of their prerequisites) a given output are housekept, Python would be free to garbage collect that dictionary entry.
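A small sketch of the weak-reference idea, using the standard library `weakref.WeakValueDictionary`. The `Output` and `TaskProxy` classes are hypothetical stand-ins. The explicit `gc.collect()` call is only there to make the example deterministic on interpreters without reference counting.

```python
import gc
import weakref


class Output:
    def __init__(self, key):
        self.key = key
        self.completed = False


class TaskProxy:
    def __init__(self, name, prerequisites):
        self.name = name
        # Strong references: these keep the outputs alive.
        self.prerequisites = prerequisites


# The main dict holds only weak references to its values.
outputs = weakref.WeakValueDictionary()

foo_succeeded = Output(("foo.1", "succeeded"))
outputs[foo_succeeded.key] = foo_succeeded

bar = TaskProxy("bar.1", [foo_succeeded])
del foo_succeeded  # the only strong reference is now bar.prerequisites
held_while_referenced = ("foo.1", "succeeded") in outputs

del bar  # "housekeep" the last proxy that references the output
gc.collect()
dropped_after_housekeep = ("foo.1", "succeeded") not in outputs
```

The entry disappears from the dict once no proxy refers to it, so no separate bookkeeping of "which outputs can be forgotten" is needed in this sketch.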

hjoliver · 2018-03-14T11:05:18Z

@TomekTrzeciak - well then, apologies - I guess I misunderstood your comments here: #2600 (comment). I thought you were suggesting that (a) task state and prerequisite states should not diverge, and (b) prerequisites could be reconstructed from outputs - of the same task (like what we do already from task state on restart). However, on (b) at least I now see you probably did not mean "of the same task"!

Interesting idea on the housekeeping, sounds promising.

hjoliver · 2018-03-14T19:53:56Z

Just to note, we do currently have one important use-case for artificially setting prerequisites: reset to waiting sets the task state to waiting and its prerequisites to not-satisfied (to force them to get satisfied again, which might involve waiting on re-running upstream tasks) - under this proposal setting the prerequisites to not-satisfied would set upstream task outputs to not-completed. However, I think just resetting the task state to waiting will do. Then the task will either trigger again immediately (if its prerequisites are still satisfied) or wait (if the upstream tasks have been retriggered or reset, which will unset their outputs). In fact, that seems more sensible than what we're currently doing.
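The reset-to-waiting behaviour described above could look roughly like this. It is a sketch with hypothetical names (`Task`, `reset`, `can_trigger`), assuming a single "succeeded" output per task and that resetting a task unsets its own output.

```python
class Output:
    def __init__(self):
        self.completed = False


class Task:
    def __init__(self, name, upstream=()):
        self.name = name
        self.state = "waiting"
        self.succeeded = Output()
        # Prerequisites read the upstream outputs directly.
        self.prerequisites = [task.succeeded for task in upstream]

    def reset(self, state):
        # Only the task's own state and output change.
        self.state = state
        self.succeeded.completed = state == "succeeded"

    def can_trigger(self):
        return self.state == "waiting" and all(
            output.completed for output in self.prerequisites
        )


foo = Task("foo")
bar = Task("bar", upstream=[foo])
foo.reset("succeeded")
bar.reset("succeeded")

# Case 1: reset bar only. foo's output is still completed.
bar.reset("waiting")
triggers_immediately = bar.can_trigger()

# Case 2: foo is also reset, which unsets its output.
foo.reset("waiting")
waits_for_upstream = not bar.can_trigger()
```

In the first case `bar` is free to trigger again straight away. In the second it waits until `foo` completes its output again.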

TomekTrzeciak · 2018-03-19T10:46:17Z

@hjoliver, sorry for the confusion, but from your latest comments I reckon we are pretty much on the same page now. With prerequisites reading rather than holding the state of other tasks' outputs, I think that the mental model of task behaviour (and the code too) will become simpler.

hjoliver · 2018-03-19T19:55:47Z

Yes agreed - and I concede that my original implementation of prerequisites and outputs (which dates back a rather long time) lacked a certain purity of thought!

hjoliver · 2018-05-08T03:21:47Z

@matthewrmshin - I think this essentially supersedes #1902, no? There would be no need to retain succeeded task proxies if their outputs are held in the new outputs dict (so long as we also solve #2143). (And this should also mean no need to even consider satisfying prerequisites from the DB, as per #1428).

matthewrmshin · 2018-05-08T08:23:12Z

Yes, I think this supersedes #1902 (and #1392?).

hjoliver · 2018-05-08T20:43:54Z

Just to note: when this issue gets done, we need to check that #1392 is solved.

hjoliver · 2020-07-16T00:34:53Z

This issue needs to be re-evaluated post #3515 (spawn on demand): dependency matching is no longer relevant, but it might still be possible to handle prerequisites and outputs more cleanly.

matthewrmshin added this to the later milestone Jun 22, 2017

matthewrmshin self-assigned this Jun 22, 2017

hjoliver mentioned this issue Jun 22, 2017

spent task cleanup can be broken by some cylc-6 graphs #1392

Closed

matthewrmshin added the efficiency For notable efficiency improvements label Jun 22, 2017

hjoliver mentioned this issue Jul 7, 2017

parallel scheduling #2348

Closed

hjoliver mentioned this issue Aug 28, 2017

Universal trigger handling #2413

Open

matthewrmshin mentioned this issue Feb 27, 2018

add dependencies variable to job script #2561

Merged

matthewrmshin mentioned this issue Mar 12, 2018

Fix prerequisite and output manipulation on state changes. #2600

Merged

hjoliver mentioned this issue May 8, 2018

Follow up on persistent broker state #1902

Closed

matthewrmshin mentioned this issue Nov 16, 2018

Add missing value cycling_mode to cylc.flags #2865

Closed

matthewrmshin mentioned this issue Aug 23, 2019

spawn on demand / event driven graph #3304

Open

matthewrmshin removed their assignment Aug 28, 2019

matthewrmshin modified the milestones: later, cylc-9 Aug 28, 2019

oliver-sanders mentioned this issue Jul 15, 2020

Task proxy spawn on demand. #3515

Merged

11 tasks
