Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cylc remove proposal #6472

Merged
merged 31 commits into from
Nov 28, 2024
Merged

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Nov 11, 2024

Closes #5643. Supersedes #6370.

Summary

This fully implements the "Cylc Remove Extension" proposal.

Flow numbers

cylc remove now has a --flow option for removing a task from specific flows.

If not used, it will remove the task from all flows that it belongs to.

If the removed task is active/waiting, if it is removed from a subset of flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).

If a task is removed from all flows that it belongs to, it will become a no-flow task (flow=None).

For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums 1.

Historical tasks

cylc remove now can remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the task_states and task_outputs tables)2, and un-setting any prerequisites of active tasks that the removed task had naturally satisfied3. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.

The above also applies to active/waiting tasks that cylc remove is used on.

Kill submitted/running tasks

Using cylc remove on a submitted/running task will now kill it if you are removing the task from all flows that it belongs to.

Unlike with cylc kill, downstream tasks will not spawn off the :fail or :submit-fail outputs as the task is in flow=none, and also the failed and submission failed handlers will not run.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at Update Major Changes Page cylc-doc#791
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Footnotes

  1. Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.

  2. If removing flows would result in two rows in the DB no longer being unique, the SQLite UPDATE OR REPLACE statement is used, so the first entry will be removed and the most recent entry will remain.

  3. Prerequisites manually satisfied by cylc set --pre are not affected by cylc remove.

`json.dumps()`/`json.loads()` are relatively slow (~1us). But these functions are likely to be called many times with `flow={1}`.
- Update data store with changed prereqs
- Don't un-queue downstream task if:
  - the task is already preparing
  - the task exists in flows other than that being removed
  - the task's prereqs are still satisfied overall
- Remove the downstream task from the pool if it no longer has any satisfied prerequisite tasks
This will allow it to call the method to kill submitted/running tasks
Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still part way through, looks great so far.

cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/command_validation.py Outdated Show resolved Hide resolved
cylc/flow/command_validation.py Outdated Show resolved Hide resolved
cylc/flow/command_validation.py Outdated Show resolved Hide resolved
cylc/flow/platforms.py Show resolved Hide resolved
cylc/flow/scheduler.py Outdated Show resolved Hide resolved
cylc/flow/scheduler.py Outdated Show resolved Hide resolved
cylc/flow/scripts/remove.py Outdated Show resolved Hide resolved
cylc/flow/scripts/remove.py Outdated Show resolved Hide resolved
cylc/flow/scripts/remove.py Outdated Show resolved Hide resolved
@oliver-sanders
Copy link
Member

oliver-sanders commented Nov 26, 2024

(Does that mean you now agree with the behavior on this branch?)

Ish, there are two schools of thought, but I'm happy to go ahead with this.

If downstream tasks get removed, I think that should be logged at INFO level rather than DEBUG. It's potentially quite important information.

Note: We are only talking about SoD pre-spawning here (as we only remove children with partially satisfied prereqs).

SoD pre-spawning is really an implementation detail, so we could argue that these tasks should not be logged (I completely understand why Ronnie has logged this at DEBUG level). However, on the other hand, the n-window is now part of the model, it defines what the GUI sees and what commands users can issue against the task, so it is pertinent information.

Given that we log tasks that are pre-spawned by SoD at the INFO level, I agree that we should log their removal at the INFO level too.

Comment on lines 1071 to 1098
removed: Dict[str, FlowNums] = {}
not_removed: Set[str] = set()
to_kill: List[TaskProxy] = []

for itask in active:
fnums_to_remove = itask.match_flows(flow_nums)
if not fnums_to_remove:
not_removed.add(itask.identity)
continue
removed[itask.identity] = fnums_to_remove
if fnums_to_remove == itask.flow_nums:
# Need to remove the task from the pool.
# Spawn next occurrence of xtrigger sequential task (otherwise
# this would not happen after removing this occurrence):
self.pool.check_spawn_psx_task(itask)
self.pool.remove(itask, 'request')
to_kill.append(itask)
itask.removed = True
itask.flow_nums.difference_update(fnums_to_remove)

matched_task_ids = {
*removed.keys(),
*(quick_relative_id(cycle, task) for task, cycle in inactive),
}

for id_ in matched_task_ids:
point_str, name = id_.split('/', 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to form these cycle/task components into IDs only to split them back into components, we can just store them as tuples for later access.

(untested)

Suggested change
removed: Dict[str, FlowNums] = {}
not_removed: Set[str] = set()
to_kill: List[TaskProxy] = []
for itask in active:
fnums_to_remove = itask.match_flows(flow_nums)
if not fnums_to_remove:
not_removed.add(itask.identity)
continue
removed[itask.identity] = fnums_to_remove
if fnums_to_remove == itask.flow_nums:
# Need to remove the task from the pool.
# Spawn next occurrence of xtrigger sequential task (otherwise
# this would not happen after removing this occurrence):
self.pool.check_spawn_psx_task(itask)
self.pool.remove(itask, 'request')
to_kill.append(itask)
itask.removed = True
itask.flow_nums.difference_update(fnums_to_remove)
matched_task_ids = {
*removed.keys(),
*(quick_relative_id(cycle, task) for task, cycle in inactive),
}
for id_ in matched_task_ids:
point_str, name = id_.split('/', 1)
removed: Dict[Tuple(str, str), FlowNums] = {}
not_removed: Set[Tuple(str, str)] = set()
to_kill: List[TaskProxy] = []
for itask in active:
key = (itask.tokens.cycle, itask.tokens.task)
fnums_to_remove = itask.match_flows(flow_nums)
if not fnums_to_remove:
not_removed.add(key)
continue
removed[key] = fnums_to_remove
if fnums_to_remove == itask.flow_nums:
# Need to remove the task from the pool.
# Spawn next occurrence of xtrigger sequential task (otherwise
# this would not happen after removing this occurrence):
self.pool.check_spawn_psx_task(itask)
self.pool.remove(itask, 'request')
to_kill.append(itask)
itask.removed = True
itask.flow_nums.difference_update(fnums_to_remove)
matched_task_ids = {
*removed.keys(),
*((cycle, task) for task, cycle in inactive),
}
for cycle, task in matched_task_ids:

Copy link
Member Author

@MetRonnie MetRonnie Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my intention to begin with, however it ended up being cleaner IMO to work with the ID as the ID is what is being used for the majority of things

Copy link
Member

@oliver-sanders oliver-sanders Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code I'm highlighting here converts from components to IDs back to components, i.e. the IDs are an unnecessary intermediary.

Otherwise, use Tokens objects which can do the conversion for you.

cylc/flow/scheduler.py Outdated Show resolved Hide resolved
@hjoliver
Copy link
Member

hjoliver commented Nov 26, 2024

Note: We are only talking about SoD pre-spawning here (as we only remove children with partially satisfied prereqs).

SoD pre-spawning is really an implementation detail, so we could argue that these tasks should not be logged (I completely understand why Ronnie has logged this at DEBUG level). However, on the other hand, the n-window is now part of the model, it defines what the GUI sees and what commands users can issue against the task, so it is pertinent information.

Given that we log tasks that are pre-spawned by SoD at the INFO level, I agree that we should log their removal at the INFO level too

Yes, my thinking exackery.

Except that I sort of disagree that pre-spawning is just an implementation detail. Pre-spawned tasks are visible in n=0 because we decided to take partially satisfied prerequisites as an indicator of something wrong that should stall the workflow. And given that they are visible in n=0, their removal should be logged visibly too.

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this sucker, conditional on the small logging change. Nice work 🎖️

@MetRonnie
Copy link
Member Author

MetRonnie commented Nov 27, 2024

If downstream tasks get removed, I think that should be logged at INFO level rather than DEBUG. It's potentially quite important information.

... (I completely understand why Ronnie has logged this at DEBUG level). However, on the other hand...

N.B. The logging at DEBUG level is not something I have touched; this is the default for TaskPool.remove(itask), unless the task is preparing, submitted or running in which case it becomes WARNING.

But I'm happy to change to make it INFO for the case in question

cylc/flow/scheduler.py Outdated Show resolved Hide resolved
Copy link
Member

@wxtim wxtim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a decent look at the code, and made a real effort to break it in practice, without success. 🎉

@oliver-sanders
Copy link
Member

Just had a shock reading back through the proposal:

At present (8.1.4) the Cylc 8 intervention is:

8.1.4!

Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 🎉!

This is a biggie, well done @MetRonnie!

@oliver-sanders oliver-sanders merged commit 9ff50f8 into cylc:master Nov 28, 2024
27 checks passed
@MetRonnie MetRonnie deleted the cylc-remove branch November 28, 2024 18:46
@MetRonnie MetRonnie mentioned this pull request Nov 29, 2024
8 tasks
@MetRonnie MetRonnie added the schema change Change to the Cylc GraphQL schema label Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema change Change to the Cylc GraphQL schema
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remove: extend beyond the task pool
4 participants