Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make cylc remove flow-aware and extend to historical tasks #6370

Closed
wants to merge 18 commits into from

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Sep 11, 2024

Note

Superseded by #6472

Partially addresses #5643

Summary

This mostly implements the "Cylc Remove Extension" proposal.

Flow numbers

cylc remove now has a --flow option for removing a task from specific flows.

If not used, it will remove the task from all flows that it belongs to.

If the removed task is active/waiting, if it is removed from a subset of flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).

If a task is removed from all flows that it belongs to, it will become a no-flow task (flow=None).

For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums 1.

Historical tasks

cylc remove now can remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the task_states and task_outputs tables)2, and un-setting any prerequisites of active tasks that the removed task had naturally satisfied3. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.

The above also applies to active/waiting tasks that cylc remove is used on.

What's left to do

  • When removing an active task from all its flows, kill the task.

  • Should probably add a functional test with the --flow option.

  • Need to check this one:

    If removing a task causes all of the prerequisites on a downstream task to be unset (i.e. if the downstream task was spawned as a result of outputs from this task alone) then the downstream task shall be removed from the pool.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Footnotes

  1. Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.

  2. If removing flows would result in two rows in the DB no longer being unique, the SQLite UPDATE OR REPLACE statement is used, so the first entry will be removed and the most recent entry will remain.

  3. Prerequisites manually satisfied by cylc set --pre are not affected by cylc remove.

@MetRonnie MetRonnie added this to the 8.4.0 milestone Sep 11, 2024
@MetRonnie MetRonnie self-assigned this Sep 11, 2024
@MetRonnie MetRonnie marked this pull request as draft September 11, 2024 15:42
@MetRonnie MetRonnie changed the title cylc remove: make flow aware and extend to historical tasks Make cylc remove flow-aware and extend to historical tasks Sep 12, 2024
oliver-sanders
oliver-sanders previously approved these changes Sep 12, 2024
@oliver-sanders oliver-sanders dismissed their stale review September 12, 2024 14:45

Wrong PR, sorry.

@MetRonnie MetRonnie marked this pull request as ready for review September 13, 2024 16:43
Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good!

tests/conftest.py Outdated Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
cylc/flow/workflow_db_mgr.py Show resolved Hide resolved
cylc/flow/task_pool.py Outdated Show resolved Hide resolved
cylc/flow/task_pool.py Outdated Show resolved Hide resolved
@MetRonnie MetRonnie marked this pull request as draft September 25, 2024 16:23
@MetRonnie MetRonnie marked this pull request as ready for review September 27, 2024 16:28
@oliver-sanders

This comment was marked as resolved.

@MetRonnie MetRonnie marked this pull request as draft October 2, 2024 14:56
@MetRonnie

This comment was marked as outdated.

@MetRonnie MetRonnie marked this pull request as ready for review October 10, 2024 16:13
@MetRonnie MetRonnie requested a review from hjoliver October 10, 2024 16:13
@oliver-sanders
Copy link
Member

oliver-sanders commented Oct 11, 2024

Have been trying this out for sub-graph re-run use cases. The remove functionality is all working correctly 👍, I am able to re-run sub-graphs cleanly without using new flows 🚀.

I have encountered some hitches (not related to this PR, all for consideration in follow-on work):

  1. I've been hitting this issue a lot (pool: check prereqs on task spawn #6143) with any dependencies which span more than one "rank" in the graph. In these situations, the GUI and DB say one thing, but the task pool another which is highly confusing. Restarting (or possibly reloading?) the workflow should fix it, but we need a solution to this before remove really becomes a viable solution to re-running sub-graphs. Note this is a very similar limitation to the need to satisfy off-flow prereqs when re-running via new flows.
  2. Removing a task doesn't necessarily change anything in the GUI (i.e. there is no visual feedback). This was anticipated when the proposal was written and is why visual filtering cylc-ui#470 is on the Cylc roadmap. I wonder whether a global change to the appearance of n=0 tasks (e.g. making them translucent as originally proposed) would provide an easier workaround and also help with other things?
  3. Removed tasks retain their state in the GUI. E.G. if a task succeeded and was removed, it continues to be shown with a filled circle. It makes sense to preserve the job state, however, I wonder if we should reset the task state back to waiting when we remove the task to make it clearer?

@MetRonnie MetRonnie force-pushed the cylc-remove branch 2 times, most recently from fb49bcf to 8f72cb4 Compare October 16, 2024 11:13
@oliver-sanders
Copy link
Member

oliver-sanders commented Oct 17, 2024

What's the status on these TODO items from the OP:

What's left to do

When removing an active task from all its flows, kill the task.

Should probably add a functional test with the --flow option.

Need to check this one:

@MetRonnie
Copy link
Member Author

MetRonnie commented Oct 17, 2024

Will either come in a follow-up PR or this one depending on how soon @hjoliver reviews this

@MetRonnie MetRonnie requested a review from wxtim October 17, 2024 15:35
- Update data store with changed prereqs
- Don't un-queue downstream task if:
  - the task is already preparing
  - the task exists in flows other than that being removed
  - the task's prereqs are still satisfied overall
- Remove the downstream task from the pool if it no longer has any satisfied prerequisite tasks
@MetRonnie MetRonnie marked this pull request as ready for review November 1, 2024 18:48
@MetRonnie MetRonnie marked this pull request as draft November 4, 2024 15:08
@hjoliver
Copy link
Member

Getting in review mode here - sorry for the delay!

Comment on lines -337 to +345
def get_resolved_dependencies(self) -> List[str]:
def get_satisfied_dependencies(self) -> List[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Vaguely interesting historical aside: I think use of "resolved" here goes all the way back to Cylc 2, when we first got the ability to visualize a graph - not from the workflow definition because there was no workflow definition, just a bunch of task definition files with inputs and outputs - from dependencies "resolved" at runtime.]

Comment on lines +524 to +525
NOTE: If `flow_nums` is empty, it means 'all', whereas
if `self.flow_nums` is empty, it means this task is in the 'none' flow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK nice way to handle that.

Comment on lines +533 to +534
"""Return True if any of this task's prerequisite tasks are
satisfied."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Return True if any of this task's prerequisite tasks are
satisfied."""
"""Return True if any of this task's prerequisites are satisfied."""

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't fit

Copy link
Member

@hjoliver hjoliver Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(my main point was the wording ... I'll transfer to the superseding PR)

Comment on lines -1720 to 1743
# we can infer it was deliberately removed, so don't respawn it.
# we can infer it has just been deliberately removed (N.B. not
# by `cylc remove`), so don't immediately respawn it.
# TODO (follow-up work):
# - this logic fails if task removed after some outputs completed
# - this is does not conform to future "cylc remove" flow-erasure
# behaviour which would result in respawning of the removed task
# See github.com/cylc/cylc-flow/pull/6186/#discussion_r1669727292
LOG.debug(f"Not respawning {point}/{name} - task was removed")
Copy link
Member

@hjoliver hjoliver Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I wrote this comment myself, some time ago. Presumably it does not refer to prevention of "conditional respawning" because those tasks would have been removed with completed outputs. The only other only non-cylc-remove case is to prevent respawning of a task removed by suicide trigger, I think. Can we adapt the new machinery on this branch to handle that properly too - i.e. record in the DB that the task was removed by suicide trigger, to avoid having to imperfectly infer it? Could be follow-up work of course, if I'm not barking up the wrong tree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll transfer this to the superseding PR...)

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been through almost all of the non-test code, looks good. I still need to read the task_pool changes more carefully and do some testing.

@hjoliver
Copy link
Member

@MetRonnie - is this still in draft because of the two "what's left to do" items at the top?

@MetRonnie
Copy link
Member Author

Yes, because Oliver wanted to wait for the whole proposal to be addressed before merging. (FYI I'm going to close this PR and open a new one to stop GitHub collapsing recent review comments)

@MetRonnie MetRonnie closed this Nov 11, 2024
@MetRonnie MetRonnie mentioned this pull request Nov 11, 2024
8 tasks
@MetRonnie
Copy link
Member Author

Superseded by #6472

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants