future task globbing #5763
Comments
Not reservations so much as complications and a lack of mental bandwidth to flesh out the details. The history:
Globbing over future tasks can probably be done, but globbing over future tasks gets tricky as:
If we're not careful then
There are multiple duplicates of this issue which I think should be closed as superseded, e.g.:
I've argued on these issues for a consistent approach employed in the central globbing interface rather than trying to implement this on a command by command basis (which you may have mistaken for objection?). Note, some commands (e.g. trigger) auto-insert tasks if not found in the pool. I think hold maintains a list of tasks which were requested to be held. See also: |
Yes, this creates a potential (but likely small) memory leak for
That's why I said above we don't need to (and probably should not!) support cycle point globbing. But globbing over task and family names at particular cycle points will be needed.
Not sure this is a problem anymore. #5698 already makes hold/release flow-specific. For triggering you can assign flows (or use the default).
Agreed, gotta be careful with this, but as above I'm suggesting we do NOT support future cycle-point glob - so infinity ceases to be a problem!
Yes I think that's what I was vaguely recalling, makes sense. I do agree with that point in general, of course, but only so far as the central globbing interface does actually apply to different commands. See for instance #5752 -
Yes, we're pretty strapped at the moment. But the reason I have raised this and related issues is that they are already coming up as support problems, and in important contexts. One of Tom C's problems on the forum, for example: if a family has internal dependencies it is currently very difficult to hold all members at once. |
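To make the proposal concrete, here is a minimal sketch of name and family globbing at an explicit cycle point, with cycle point globs rejected outright. It is not Cylc's implementation: the function `expand_future_ids`, its arguments, and the example task and family names are all hypothetical, and it assumes every named task is valid at the requested cycle point (a real implementation would need to check recurrences).

```python
from fnmatch import fnmatchcase

# Characters that indicate a glob pattern (hypothetical helper constant).
GLOB_CHARS = set("*?[")


def expand_future_ids(task_id, task_names, families):
    """Expand 'POINT/PATTERN' into concrete 'POINT/NAME' IDs.

    task_names: all task names defined in the workflow (assumed input).
    families: mapping of family name to member task names (assumed input).
    """
    point, _, pattern = task_id.partition("/")
    if not point or not pattern:
        raise ValueError(f"expected POINT/NAME, got: {task_id}")
    if GLOB_CHARS & set(point):
        # Globbing over future cycle points is unbounded, so refuse it.
        raise ValueError("cycle point globs are not supported for future tasks")
    matched = set()
    for name in task_names:
        if fnmatchcase(name, pattern):
            matched.add(name)
    for family, members in families.items():
        # A family name (or a glob matching it) selects all of its members.
        if fnmatchcase(family, pattern):
            matched.update(members)
    return sorted(f"{point}/{name}" for name in matched)


names = ["foo_1", "foo_2", "bar"]
fams = {"FOO": ["foo_1", "foo_2"]}
print(expand_future_ids("3/foo_*", names, fams))
print(expand_future_ids("3/FOO", names, fams))
```

Because matching is done against the workflow definition rather than the task pool, the result does not depend on which tasks happen to have been spawned, which is the point of doing this in one central place rather than per command.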
Yes sorry about that, I wanted to get this one up and didn't have time to find all related ones (they're not exact duplicates, this is more general)... I'll take a look ASAP and close them if possible. |
See also the closely related #5416 which is about selecting the start-tasks of a cycle which is essentially a subset of future task selection. Kinds of task selection:
Also consider the possibility of historical task selection (e.g. Now to try and work out a way to convey that in a consistent way that makes sense and doesn't break existing interfaces... |
I'm not familiar with SoD (spawn on demand) or SoS (spawn on submit) and I'm still reading through the Spawn on Demand Proposal, so I apologize if this is totally out of left field, but can spawned tasks inherit certain attributes from their parents, "parent" here being either the cycle point or family used in the selection? Something like an overall state (active, held, killed, etc.) so even when new tasks are spawned they'll inherit that attribute and some init code checks that state to determine what should be done (i.e., nothing if active, hold if held, or despawn if killed). |
@retro486 - good for you, reading the SoD proposal!! It was really not aimed at Cylc users, so don't feel bad if it was hard to follow! [Also, ~2 years post implementation it is now somewhat out of date].
Unfortunately the word "parent" is now "overloaded" in coding parlance. It can mean one of two entirely orthogonal concepts:
When talking about anything to do with the scheduling algorithm, it'll be the latter concept.
We have an as-yet unresolved discussion in the team on whether or not held tasks should spawn held children. At the moment, they do not. (In Cylc 8 a running task can be "held", which means it will not submit another job even if it fails and has retries lined up - but if it generates any outputs while running, they may still spawn tasks that are not themselves held).
SoD means "spawn on demand" - if you kill a task, there's no need to "despawn" downstream tasks that depend on its success, because they will not have been spawned in the first place. |
I think I got the gist, it was clear enough about the differences on how tasks spawn in Cylc 8 and now I have a better understanding on the behavior I'm seeing.
Got it.
Ok so it sounds like this isn't an issue of "how can we" but "should we". I would say from my own use that in testing suites, troubleshooting runs, etc., it would be very helpful to be able to re-run tasks that have previously succeeded without spawning a whole new flow. From what I could tell, there isn't a way to do that right now. So at this point I think I see where the issue is: if tasks that were paused spawn paused children, that could cause the suite to stall, which may not be what we want. I think a better description of what I'm looking for would be the ability to re-run successful tasks (in a new flow?) without spawning the rest of the downstream tasks. Something like:
This might be too specific for this particular issue, sorry about that... |
No, all good, and we can do what you need already:
How's that? (we should switch this back to discourse - it's a bit off topic now for this issue) |
Superseded by #5827 |
On current master, task commands (such as hold and trigger):
We don't need to support future task point globs: that would be dangerous, and rarely if ever useful (cylc trigger "*/*" - yikes! ... although if it turns out there is a valid case for this, I guess we could require explicit opt-in and stop at the runahead limit).
But we do need to support family name, and name globbing, for future tasks. If I want to trigger or hold a bunch of upcoming tasks, having to target each one individually is painful and unnecessary. Easy to do:
Ping @oliver-sanders - I can't see any good reason not to do this, but I have a vague recollection that you have reservations?
The reason might have been (?) that in the cylc hold case, the future "tasks-to-hold" list can potentially cause a memory leak (e.g. you might hold "future" tasks that are actually in the past, so they stay in the list forever)? But even so, adding multiple tasks-to-hold at once, by globbing, is no different in principle than doing the same thing one task at a time. And with #5750 we can release them all again just as easily.
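The memory-leak concern can be illustrated with a toy model of a tasks-to-hold list. This is a sketch only, not the scheduler's actual data structure: the class `FutureHoldList`, its methods, and the use of integer cycle points are assumptions made for illustration.

```python
class FutureHoldList:
    """Toy model of a list of not-yet-spawned tasks that should start held."""

    def __init__(self):
        self._pending = set()  # {(point, name)}, integer points assumed

    def hold(self, ids):
        # Adding many IDs at once (e.g. from a glob) is the same operation
        # as adding them one at a time.
        self._pending.update(ids)

    def release(self, ids):
        self._pending.difference_update(ids)

    def on_spawn(self, point, name):
        """Return True if a newly spawned task should start in the held state."""
        return (point, name) in self._pending

    def stale(self, oldest_active_point):
        """Entries behind the oldest active point (assumed unlikely to spawn)."""
        return {(p, n) for (p, n) in self._pending if p < oldest_active_point}

    def __len__(self):
        return len(self._pending)


holds = FutureHoldList()
holds.hold({(1, "a"), (5, "a"), (5, "b")})
print(holds.on_spawn(5, "a"))
print(holds.stale(3))
```

In this model the entry for point 1 lingers once the workflow has moved past it, which is the leak described above. Whether such entries should be pruned automatically or left for an explicit release is a design question this sketch does not settle.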