Flow labels and (re)flow metadata #3744
Comments
How do task retry and reflow relate? Isn't a reflow just a series of task retries? Could they be combined conceptually somehow to avoid the extra labels?
No, a reflow is essentially another instance of the workflow, triggered at a different point in the graph, managed by the same scheduler instance. Think of multiple "wave fronts" traveling along the graph concurrently. Each reflow can have normal task retries within it. (I just tested this myself to make sure it works as advertised - and it does).
That sounds quite complicated (another dimension to think about on top of everything else interacting in the workflow). Are these flow labels user-visible, or purely an internal implementation detail? Can one reflow only a part of the graph (set some reflow stopping criteria, or stop it manually)?
Not really. Firstly, you don't have to use reflow (and it won't happen by default). Secondly, it's what should happen: if you manually trigger a task and it generates outputs, the downstream tasks that depend on those outputs should follow on as normal (and so on after those tasks). The flow labels for now are only visible via the log, but we'll probably want to expose them in the UI somehow if multiple flows are in process at once. Users will need the label if they want to stop that flow from continuing, for example.
Yes. You can trigger any task with The documentation so far is: https://cylc.github.io/cylc-admin/proposal-spawn-on-d.html#reflow Example use case: re-run a whole product-generation sub-tree from a previous cycle point (after changing some input data manually, say) simply by re-triggering the first task in that sub-tree, while the main flow carries on unaffected.
It would be nice to issue flow labels in order; it's more obvious what's going on when flow
We could do that, but it might be hard to maintain order after a while in a workflow with rampant use of reflows. Note that flow merge can be incremental, and in general you can't predict which of multiple flows might end up being "the one" that carries on after others have stopped or merged. But I suppose we can always choose the next label from an ordered list, rather than a random one, for the next flow, even if the list of currently unused labels has some holes in it.
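To make the "next label in an ordered list, holes and all" idea concrete, here is a minimal sketch. It is not the actual Cylc implementation: the `LabelPool` class and its method names are hypothetical, and it assumes labels are single characters drawn from `string.ascii_letters`.

```python
import string


class LabelPool:
    """Hypothetical allocator for single-character flow labels."""

    def __init__(self, alphabet=string.ascii_letters):
        # Assumption: labels come from a-z then A-Z, in that order.
        self.alphabet = alphabet
        self.in_use = set()

    def acquire(self):
        """Return the first label, in alphabet order, that is not in use."""
        for char in self.alphabet:
            if char not in self.in_use:
                self.in_use.add(char)
                return char
        raise RuntimeError("no unused flow labels left")

    def release(self, label):
        """Return a label to the pool when its flow stops or merges."""
        self.in_use.discard(label)


pool = LabelPool()
first, second, third = pool.acquire(), pool.acquire(), pool.acquire()
pool.release(second)     # one flow stops, leaving a hole in the sequence
reused = pool.acquire()  # the hole is filled before any later label is issued
```

With this policy labels are always issued in a predictable order, at the cost that a reused label may later refer to an unrelated flow.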
Current flow label implementation supports up to 52 concurrent flows within a single workflow, plus partially merged flows in progress.
My thinking was: that's probably enough(?), and the simple character-based labels are very easy to log and to use. And we could re-implement if necessary, e.g. with sets of UUIDs, to support an arbitrary number of concurrent flows, plus user-supplied metadata for ease of use (to avoid having to type the UUIDs).
However, we probably want flow metadata even for the current flow labels (which while simple, are arbitrary) so that users can keep track of the purpose of each flow.
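As a rough illustration of the UUID-plus-metadata alternative, the sketch below pairs each flow ID with a user-supplied description so that a flow can be looked up by purpose rather than by typing its ID. The `Flow` and `FlowRegistry` names are hypothetical and nothing here reflects existing Cylc code; a merged flow is assumed to be represented as a set of flow IDs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Flow:
    """Hypothetical record tying a flow ID to user-supplied metadata."""

    description: str
    flow_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class FlowRegistry:
    """Hypothetical registry of the flows started in one workflow."""

    def __init__(self):
        self._flows = {}

    def start(self, description):
        """Create a new flow with a fresh UUID and the given description."""
        flow = Flow(description)
        self._flows[flow.flow_id] = flow
        return flow

    def find(self, text):
        """Look flows up by description, so users need not type UUIDs."""
        return [f for f in self._flows.values() if text in f.description]


registry = FlowRegistry()
main = registry.start("main flow")
rerun = registry.start("re-run products after fixing input data")

# Assumption: a task carried by merged flows belongs to the union of their IDs.
merged = frozenset({main.flow_id}) | frozenset({rerun.flow_id})
```

The same description field could be attached to the current character labels without changing how the labels themselves are allocated.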