Fix failure to clean on remote host with leftover contact file #4978

Merged
merged 4 commits into from
Jul 20, 2022

Conversation

MetRonnie
Member

@MetRonnie MetRonnie commented Jul 11, 2022

This is a small change with no associated Issue.

From Teams:

Cylc clean bug report... it looks like it is trying to perform the whole "is this workflow running" check on the remote side as well as the invocation side which isn't right:

$ cylc clean u-co906/run1
INFO - Cleaning u-co906/run1 on install target: ab
ERROR - Could not clean u-co906/run1 on install target: ab
   platform: abc - clean up did not complete
   COMMAND:
       ssh CYLC_VERSION=8.0rc4.dev CYLC_ENV_NAME=cylc-8.0rc4.dev-3 \
           bash --login -c 'exec "$0" "$@"' timeout 120 cylc clean \
           --local-only u-co906/run1
   RETURN CODE:
       1
   STDOUT:
   STDERR:
       CylcError: Cannot determine whether workflow is running on <scheduler-host>.
       ~fcm/cylc-8.0rc4.dev-3/bin/python3.9 ~fcm/cylc-8.0rc4.dev-3/bin/cylc play u-co906 --host=localhost

CylcError: Remote clean failed for u-co906/run1

Note: We cannot necessarily SSH back from remote platforms to the scheduler host.

Solution

If there is a leftover contact file when doing cylc clean, and its host field gives a host that is not the local host, it means we are doing a re-invoked cylc clean --local-only on the remote host, so we should not check whether the scheduler process is still running (it can't be).

The workflow DB only exists on the scheduler host. So if we are doing cylc clean --local-only and there is a leftover contact file but no DB, it probably means this is cylc clean being re-invoked on a remote host and we can ignore the contact.

Even if the fix gets hit when someone genuinely runs cylc clean --local-only on the scheduler host, it shouldn't matter.
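To make the DB-based rule above concrete, here is a minimal sketch in Python. It is not the code from this PR: the function name should_check_scheduler and the constants are hypothetical, and the .service/contact and .service/db paths are assumptions used for illustration. The merged implementation may differ, since the approach was reworked during review.

```python
from pathlib import Path

# Assumed locations relative to the run directory (illustrative only).
CONTACT_RELPATH = Path(".service") / "contact"
DB_RELPATH = Path(".service") / "db"


def should_check_scheduler(run_dir: Path, local_only: bool) -> bool:
    """Decide whether a leftover contact file warrants an "is it running?" check.

    Hypothetical helper: returns False when there is nothing to check, or when
    we appear to be a re-invoked ``cylc clean --local-only`` on a remote host.
    """
    contact_exists = (run_dir / CONTACT_RELPATH).is_file()
    db_exists = (run_dir / DB_RELPATH).is_file()

    if not contact_exists:
        # No contact file, so no sign of a scheduler to check.
        return False
    if local_only and not db_exists:
        # The workflow DB only exists on the scheduler host, so a contact
        # file without a DB suggests we are on a remote install target.
        return False
    # Otherwise treat the contact file as meaningful and check the scheduler.
    return True
```

With this rule, a remote re-invocation never tries to reach back to the scheduler host, which matters because SSH from the remote platform to the scheduler host is not guaranteed to work.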

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg and conda-environment.yml.
  • Appropriate tests are included (functional).
  • Appropriate change log entry included.
  • No documentation update required.

@MetRonnie MetRonnie added bug Something is wrong :( small labels Jul 11, 2022
@MetRonnie MetRonnie added this to the cylc-8.0.0 milestone Jul 11, 2022
@MetRonnie MetRonnie self-assigned this Jul 11, 2022
and not (path / 'workflow').exists()
and not (path / 'scheduler').exists()
Member Author

@MetRonnie MetRonnie Jul 12, 2022

Unrelated change but spotted this leftover today

Member

@wxtim wxtim left a comment

Happy with this - add a changelog entry before merging.

@MetRonnie MetRonnie marked this pull request as draft July 13, 2022 09:39
@MetRonnie
Member Author

Dang, just realised this approach is flawed - if there are configured remote run hosts, it will allow cylc clean to delete a running workflow
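
A rough sketch of why the host-comparison approach breaks, assuming the run directory is visible from both the host where cylc clean is invoked and a separate run host. The function name and host names below are hypothetical and only illustrate the reasoning in this comment.

```python
def host_rule_skips_running_check(contact_host: str, local_host: str) -> bool:
    """First (flawed) rule: skip the "is it running?" check whenever the
    contact file names a host other than the local one."""
    return contact_host != local_host


# Hypothetical scenario: the scheduler is running on a configured run host,
# and the user invokes `cylc clean` from a different host.
contact_host = "runhost1"  # host recorded in the contact file (assumed name)
local_host = "login1"      # host where cylc clean was invoked (assumed name)

# The rule skips the check even though the workflow is still running.
skipped = host_rule_skips_running_check(contact_host, local_host)
```

Here skipped is True, so the clean would go ahead against a live workflow, which is the failure mode described above.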

@MetRonnie MetRonnie marked this pull request as ready for review July 13, 2022 13:50
@MetRonnie
Member Author

Changed approach, relying on the fact that the workflow DB only exists on the scheduler host.

@oliver-sanders oliver-sanders self-requested a review July 14, 2022 16:13
@MetRonnie
Member Author

Re-done as per Oliver's suggestion

Member

@hjoliver hjoliver left a comment

(A very quick review but LGTM; I haven't tested it).

Member

@wxtim wxtim left a comment

I'm OK with this. @oliver-sanders - do you want to do a final check?

@oliver-sanders
Member

LGTM

@oliver-sanders oliver-sanders merged commit 2243c39 into cylc:master Jul 20, 2022
@MetRonnie MetRonnie deleted the cylc-clean branch July 20, 2022 12:58
Labels
bug Something is wrong :( small
4 participants