Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only poll non-waiting tasks #4658

Merged
merged 5 commits into from
Feb 14, 2022
Merged

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Feb 5, 2022

Currently we can poll a waiting task (as reloaded from the DB at restart ... and otherwise), and erroneously update it with the state of a previously submitted job of the same task. (This is partly because submit number does not increment until the task reaches the preparing state, before which time it matches the previous job ... but I don't think we need to poll non-active tasks anyway??)

[UPDATE] After the discussion below, pending resolution of #4678 this PR now avoids polling waiting tasks, rather than targeting active tasks.

This is a small change with no associated Issue.

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg and conda-environment.yml.
  • New integration test added.
  • Appropriate change log entry included.
  • No documentation update required.

@hjoliver hjoliver added the bug Something is wrong :( label Feb 5, 2022
@hjoliver hjoliver added this to the cylc-8.0rc1 milestone Feb 5, 2022
@hjoliver hjoliver self-assigned this Feb 5, 2022
@hjoliver hjoliver mentioned this pull request Feb 5, 2022
9 tasks
@oliver-sanders
Copy link
Member

Seems sensible.

@oliver-sanders
Copy link
Member

(can get away without a test for this one)

@hjoliver hjoliver marked this pull request as ready for review February 10, 2022 01:08
@oliver-sanders
Copy link
Member

I've got a bad idea that we were purposefully polling all task-pool tasks. Spoke to Dave, trying to see if we can remember why.

If this issue is only due to waiting tasks perhaps we should implement a filter to prevent polling of waiting tasks at a lower level (because it really doesn't make sense to poll waiting tasks does it?).

@hjoliver
Copy link
Member Author

hjoliver commented Feb 10, 2022

I've got a bad idea that we were purposefully polling all task-pool tasks.

I was wondering that myself. I have a vague recollection that I suggested it myself: if it has a job ID, why not poll it (at restart, or on a poll-all request)? But still, I don't recall any reason why we'd need to poll non-active tasks.

@hjoliver
Copy link
Member Author

hjoliver commented Feb 10, 2022

A-ha, I'm probably recalling this: #1792

@hjoliver
Copy link
Member Author

#1762 (comment)

This suggests polling of non-active tasks may have only been needed for some functional tests, which I guess is not relevant anymore because the tests all passed on this branch (after my one change).

@hjoliver
Copy link
Member Author

hjoliver commented Feb 10, 2022

Note also there is a current bug resulting from polling of failed tasks!

#4513
#4516

which this PR will presumably fix

@hjoliver
Copy link
Member Author

hjoliver commented Feb 10, 2022

So, I think the approach on this branch might be best (unless you recall other reasons for polling non-active tasks).

@dpmatthews
Copy link
Contributor

Looking at #1792, I think the polling of non-active tasks isn't just for functional tests. At the very least I think it relates to supporting resurrect-able tasks. I think we need to take a much closer look at this (post 8.0) to better understand what cases we're trying to support. Some of this is discussed in #4513.

@oliver-sanders
Copy link
Member

oliver-sanders commented Feb 11, 2022

I would quite like to hash out exactly what polling behaviour we would expect and what batch system quirks we are trying to support. There may be scope for simplification of code / reduction of polling operations. There may also be scope for testing within the integration battery (with simulated batch systems). Opened an issue #4678.

@oliver-sanders
Copy link
Member

I don't think it makes any sense to poll waiting tasks so we should be able to remove this functionality completely at a lower level (otherwise cylc poll '*' and possibly other poll operations could reproduce this bug?).

@hjoliver
Copy link
Member Author

hjoliver commented Feb 13, 2022

Yeah, all good. I'll switch to lower-level avoidance of waiting tasks for now... done; title and description updated.

@hjoliver hjoliver changed the title Only poll active tasks at restart. Only poll ~~active~~ non-waiting tasks ~~at restart~~ Feb 13, 2022
@hjoliver hjoliver changed the title Only poll ~~active~~ non-waiting tasks ~~at restart~~ Only poll non-waiting tasks Feb 13, 2022
Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
@oliver-sanders oliver-sanders merged commit e733725 into cylc:master Feb 14, 2022
@hjoliver hjoliver deleted the fix-poll-on-restart branch March 4, 2022 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :( small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants