reflow #468
Comments
Only if we need a history of proxies? I think the latest flow will overwrite the old one (with job submit numbers incrementing sequentially across flows). Also, if true, it would imply flow-specific edges and probably flow-specific data stores... But I think we backed off from this, as people can already run multiple workflows.
@dwsutherland is right - this would result in a single proxy with a merged flow ID.
That will be interesting for the UI. We will have an HTML element attached to the DOM representing the proxy and job with some data (progress/status/started time/etc). Then I think the UI will receive a delta, either "added" or "updated", with the new task proxy. We will then have to merge the data and display only one task proxy? Will the user be aware that the one visible in the UI has been updated by a reflow?
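A minimal sketch of the merge being asked about here, written in Python for brevity rather than in the UI's own language. Everything in it is an assumption made for illustration: the store layout, the `apply_delta` helper, and the `id`, `status` and `flow_labels` fields are hypothetical and are not the real Cylc data-store schema.

```python
from typing import Any, Dict

# Hypothetical UI-side store: one entry per task proxy id.
Store = Dict[str, Dict[str, Any]]


def apply_delta(store: Store, kind: str, proxy: Dict[str, Any]) -> bool:
    """Merge an "added" or "updated" delta into the store.

    Returns True when an existing proxy picked up a flow label it did
    not have before, so the UI could flag the change to the user.
    """
    if kind not in ("added", "updated"):
        raise ValueError(f"unknown delta kind: {kind}")
    existing = store.get(proxy["id"])
    if existing is None:
        # First time this proxy is seen: store a copy as-is.
        store[proxy["id"]] = dict(proxy)
        return False
    old_flows = set(existing.get("flow_labels", []))
    new_flows = set(proxy.get("flow_labels", old_flows))
    # Overwrite changed fields, but keep the union of flow labels so a
    # single proxy carries a merged flow identity.
    existing.update(proxy)
    existing["flow_labels"] = sorted(old_flows | new_flows)
    return not new_flows <= old_flows
```

With this shape the UI keeps one element per task proxy, and the boolean return value is one possible hook for telling the user that the proxy they are looking at was updated by a reflow.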
A flow badge (or badges) that appears on nodes for any non-default flow label?
When we have two flows, how does the status of a task change depending on the flow you are viewing? E.g. for the flow If we run from
So long as the second flow doesn't result in the creation of waiting tasks this will still be comprehensible-ish from the UI. You will just see completed tasks being rerun, which makes sense. Otherwise we will need a full
Not sure I understand the conflict. The second flow will create a waiting b that replaces the first flow b (that is no longer in the pool anyway).
Yeah, a reflow really just extends the concept of retrigger - so we already had the same problem (UI-wise) with multiple submits of a task. Now instead of just a new submit number, we get a new flow label and a new submit number (and unlike old-style retrigger, the flow continues downstream from the retriggered task). So the n-distance window is kind of agnostic to flow. It just has tasks in it, and those tasks have whatever flow labels they have, just like submit numbers. Presumably by default the UI should show, for a particular task, the latest submit/flow that occurred (luckily the submit number increments linearly, which makes that easy) and it could highlight somehow (a flow badge?) what flow that task belongs to. If we want to be able to filter by flow, regardless of latest submit number, that's slightly more interesting but it should work fine I think.
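A rough sketch of that default, assuming a hypothetical `Job` record with `task`, `submit_num` and `flow` fields (none of these names come from Cylc itself). It picks the latest submit for a task, optionally restricted to one flow.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    task: str        # hypothetical task id, e.g. "1/b"
    submit_num: int  # increments linearly across flows
    flow: str        # hypothetical flow label


def job_to_display(
    jobs: List[Job], task: str, flow: Optional[str] = None
) -> Optional[Job]:
    """Return the latest submit of `task`, optionally within one flow."""
    candidates = [
        j for j in jobs
        if j.task == task and (flow is None or j.flow == flow)
    ]
    # The highest submit number is the most recent submit.
    return max(candidates, key=lambda j: j.submit_num, default=None)
```

With no `flow` argument this gives the default view (latest submit, whose `flow` could feed a badge); passing `flow` gives the filter-by-flow case regardless of the latest submit number.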
Well yes, that is true; however, trigger only affects one task, so it is much easier to understand its effect from the displayed information. The difficulty is making it clear to the user that there are two flows and what that entails for their workflows. So reflow is a much bigger problem than re-triggering individual tasks.
It's a representation problem, not a data problem; how the task pool and data store handle this is the domain of Cylc Flow and the UI Server. I'll try to explain the representation problem with a diagram. Here are three alternative representations of a reflow in graph form: Composite: There are other options of course, and the "composite" option has two variants:
Thanks to the task-job separation, the composite representation is much easier to understand than it would be otherwise; however, we would require "visual filtering" (e.g. colour coding) to tell one flow from another. In the diagram I have shown (1); if the Scheduler inserts waiting task proxies for the second flow then I think we would get (2). (3) would just be confusing as heck.
Branched: The branched option "looks" like the best. However, for the main expected use cases reflow isn't really a branching problem, as the user intends to overwrite the results of the first flow with the second. The branched representation is therefore an unnecessary complication: these graphs could become very large, making it extremely difficult to associate a task from one flow with the same task in another.
I think the kind of questions users will want answers to are:
I think some nifty visual filtering can provide answers to these questions; I'll try and sketch something up soon...
I actually think your "composite" sketch is fine, at least as a first cut, perhaps with the latest flow label attached to tasks. Then filtering should allow you to see particular flows (one or other) of your "parallel" sketch. The composite view actually corresponds to my mental model, which is a single abstract graph that you can trigger real flows on in multiple places at once. The trouble with representing different flows as entirely separate is that it might give the impression they are entirely independent rather than merging if they catch up with one another. Note that in less linear graphs merging happens gradually, and the merge point can't really be anticipated before it happens.
I've not yet created any sketches for "visual filtering" so this is a little coarse, but as a rough guide, there would be a visual filtering dialogue which could be used to change the appearance of nodes based on different factors, e.g. family, parameterisation, (re)flow, etc. You would only be able to "visually filter" for one thing at a time (e.g. families OR (re)flows), and filtering can be toggled on or off independently for each tab. When a new flow is created (providing the view doesn't have a pre-existing filter) we could automatically activate a "visual filter" for (re)flow; when the flows merge we can disable it.
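The rules in that proposal can be sketched as a small per-tab state machine. This is only an illustration: the `TabFilter` class and its method names are hypothetical, and the choice to auto-disable only a filter that was auto-enabled is an added assumption, not something stated above.

```python
from typing import Optional, Set


class TabFilter:
    """Visual-filter state for one UI tab (hypothetical sketch)."""

    def __init__(self) -> None:
        # Only one visual filter at a time, e.g. "family" or "flow".
        self.active: Optional[str] = None
        # True when the filter was switched on automatically.
        self.auto: bool = False

    def set_filter(self, name: Optional[str]) -> None:
        """User picks a filter, or passes None to switch filtering off."""
        self.active = name
        self.auto = False

    def on_flows_changed(self, flows: Set[str]) -> None:
        """React to the set of live flow labels changing."""
        if len(flows) > 1 and self.active is None:
            # New flow and no pre-existing filter: highlight flows.
            self.active, self.auto = "flow", True
        elif len(flows) <= 1 and self.active == "flow" and self.auto:
            # Flows have merged: drop the automatic filter again.
            self.active, self.auto = None, False
```

Each tab would hold its own `TabFilter`, which keeps the toggling independent between tabs and leaves a user-chosen filter untouched when a new flow appears.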
A "composite" view in combination with "visual filtering" should work for most use cases I can think of as when we create a new flow we are anticipating/intending it to catch up and merge with an earlier flow. Parallel flow use-cases are not intended or supported? If so the last nasty question is this:
I like the "visual filtering" idea, looks really good. Would we still need (text) flow labels attached to tasks in the unfiltered case though? Especially if there are (god forbid) a large number of flows.
What do you mean by that? Disjoint parallel graph streams that will never merge?
Do we really need to allow that? The visualization works fine so long as a held task is held regardless of flow.
So what? Not a data problem as in the UI will separate the identical nodes by flow?
Well, if the UI separates the node deltas by flow, then problem solved? However, if it actually needs to be represented in the data store, then we would need to have a data structure per flow, i.e.:
The idea is that this is a design issue focused exclusively on the representation of reflow in the UI; how we get the required data to the UI is a matter for future issues in other repositories.
Attaching text labels to nodes would be pretty ugly, so if we can avoid this entirely that would be better.
"Disjoint parallel graph streams" are confusing when represented in a composite graph, it's hard to tell what will run next. This is a case where a "parallel" view would make much more sense.
From VC:
Closed by #2016 - see #2016 (comment)
SoD (cylc/cylc-flow#3515) brings with it the concept of reflow, whereby a single running workflow can have multiple parallel executions which can be stopped (or potentially held?) independently and merge back together.
This amazing functionality is going to be an interesting design challenge for the UI:
Initial thoughts:
Questions:
Pull requests welcome!