Fix flow merge into no-flow task. #4645

hjoliver · 2022-02-03T01:43:50Z

These changes close #4644

parented tasks
parentless tassk

Requirements check-list

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
Applied any dependency changes to both setup.cfg and conda-environment.yml.
Appropriate tests are included (unit and/or functional).
Appropriate change log entry included.
No documentation update required.

hjoliver · 2022-02-03T03:04:36Z

This wasn't difficult to fix once I understood the problem, so bumping up to 8.0rc1.

MetRonnie

~~(Note this is not currently compatible with #4640):~~ Edit: now resolved

unheld -> not-held in cylc dump
spawn_successor -> spawn_successor_if_parentless

oliver-sanders · 2022-02-03T15:27:04Z

cylc/flow/task_pool.py

+                if completed_only:
+                    c_task.state.satisfy_me({
+                        (str(itask.point), itask.tdef.name, output)
+                    })
+                    self.data_store_mgr.delta_task_prerequisite(c_task)


This branch isn't covered, need to manually test...

I've run the test from #4651 against a flow.cylc and a suite.rc variants and god the same result, happy.

Something wrong with code coverage here. The new functional test here should hit this...

... brute force test: I put a print statement in there and it does get executed. 🤔

Any idea what could cause coverage to miss this?

I think the coverage is accurate in this case. I put a LOG.critical("Here") at L1273 and it didn't show up in the workflow log for tests/functional/spawn-on-demand/14-trigger-flow-blocker.t

No, completed_only is True for retroactive spawning on completed outputs (when an ongoing flow merges with a one-off manually-triggered task (thus making it part of the ongoing flow) ... i.e. what this PR is about). In back-compat mode, completed_only is False and we spawn on all outputs ahead of time (before they are completed) to approximate Cylc 7 spawning of tasks before they are needed.

I think the coverage is accurate in this case. I put a LOG.critical("Here") at L1273 and it didn't show up in the workflow log for tests/functional/spawn-on-demand/14-trigger-flow-blocker.t

Strange. I just tried the same, to double check, and it does show up in my workflow log. (I did have to disable the purge in the reftest shell function, to keep the log for inspection).

(Maybe the test is not actually working correctly in some environments, namely yours and GitHub actions ... I'll investigate later...)

Had a play but I can't get this branch to activate either.

Test fixed. It was flaky.

oliver-sanders · 2022-02-03T16:51:17Z

~~I think this functionality is broken when merged with #4640, see #4651~~

Ignore me, I think I messed up the merge.

oliver-sanders · 2022-02-03T18:19:04Z

Fixes the reported bug, however, is missing a DB check which causes case (4) to fail in the #4651 tests.

hjoliver · 2022-02-04T01:33:36Z

Rebased and deconflicted, post merge of #4640

hjoliver · 2022-02-04T01:36:27Z

Fixes the reported bug, however, is missing a DB check which causes case (4) to fail in the #4651 tests.

Not needed - see #4651 (comment)

hjoliver · 2022-02-04T04:28:23Z

OK, with my linked explanation of the controversial 4/x case, hopefully this is good to go as-is.

(Note also: cylc trigger --help rewrite, to clarify the basics of force triggering, for users)

cylc/flow/scripts/trigger.py

MetRonnie · 2022-02-04T10:59:09Z

cylc/flow/task_pool.py

+                if completed_only:
+                    c_task.state.satisfy_me({
+                        (str(itask.point), itask.tdef.name, output)
+                    })
+                    self.data_store_mgr.delta_task_prerequisite(c_task)


I think the coverage is accurate in this case. I put a LOG.critical("Here") at L1273 and it didn't show up in the workflow log for tests/functional/spawn-on-demand/14-trigger-flow-blocker.t

oliver-sanders · 2022-02-04T11:22:37Z

Fixes the reported bug, however, is missing a DB check which causes case (4) to fail in the #4651 tests.

Not needed - see #4651 (comment)

I think this is needed to maintain some consistencies / logic see #4651 (comment)

oliver-sanders

This does not work across a restart:

[scheduling]                                                  
    initial cycle point = 1                                   
    final cycle point = 5                                     
    cycling mode = integer                                    
    runahead limit = P1                                       
    [[graph]]                                                 
        P1 = x                                                
                                                              
[runtime]                                                     
    [[x]]                                                     
        script = """                                          
            if (( CYLC_TASK_CYCLE_POINT == 1 )); then         
                cylc trigger "${CYLC_WORKFLOW_ID}//4/x"       
            elif (( CYLC_TASK_CYCLE_POINT == 3 )); then       
                cylc stop "${CYLC_WORKFLOW_ID}" --now --now    
                contact="${HOME}/cylc-run/${CYLC_WORKFLOW_ID}/.service/contact"    
                while [[ -f "${contact}" ]]; do    
                    sleep 0.1                                                        
                done                                          
                cylc play "${CYLC_WORKFLOW_ID}"               
            fi                                                
        """

Log from the second run:

2022-02-04T12:20:38Z INFO - Scheduler: url=tcp://...:43024/ pid=50836
2022-02-04T12:20:38Z INFO - Workflow publisher: url=tcp://vld601.cmpd1.metoffice.gov.uk:43020
2022-02-04T12:20:38Z INFO - Run: (re)start=1 log=1
2022-02-04T12:20:38Z INFO - Cylc version: 8.0rc1.dev
2022-02-04T12:20:38Z INFO - Run mode: live
2022-02-04T12:20:38Z INFO - Initial point: 1
2022-02-04T12:20:38Z INFO - Final point: 5
2022-02-04T12:20:38Z INFO - [3/x running(runahead) job:01 flows:1] => running
2022-02-04T12:20:40Z INFO - [3/x running job:01 flows:1] (polled)started at 2022-02-04T12:20:35Z
2022-02-04T12:20:40Z INFO - [3/x running job:01 flows:1] health: execution timeout=None, polling intervals=PT15M,...
2022-02-04T12:20:40Z INFO - [4/x waiting(runahead) job:01 flows:1] (polled)succeeded at 2022-02-04T12:20:28Z
2022-02-04T12:20:40Z INFO - [4/x waiting(runahead) job:01 flows:1] => succeeded(runahead)
2022-02-04T12:20:41Z INFO - [3/x running job:01 flows:1] (received)succeeded at 2022-02-04T12:20:40Z
2022-02-04T12:20:41Z INFO - [3/x running job:01 flows:1] => succeeded
2022-02-04T12:20:41Z INFO - Workflow shutting down - AUTOMATIC
2022-02-04T12:20:41Z INFO - DONE

cylc/flow/scripts/trigger.py

hjoliver · 2022-02-04T13:34:25Z

This does not work across a restart:

That's a bug, but it's not in scope for this PR (and not really specific to reflow either).

submit no. does not increment until the preparing stage, so a) polling of a waiting(runahead) task at restart sees the previous job; and b) we should not be polling waiting tasks anyway
in the context of your example, this PR is about triggering 3/x so that the main flow catches up with it while it is still running; not 4/x which is a future task that the scheduler doesn't catch up with (in the sense of requiring a merge)

oliver-sanders · 2022-02-04T15:34:21Z

future task

If I subtract one from the trigger point I get the same result:

[scheduling]
    initial cycle point = 1
    final cycle point = 5
    cycling mode = integer
    runahead limit = P1
    [[graph]]
        P1 = x

[runtime]
    [[x]]
        script = """
            if (( CYLC_TASK_CYCLE_POINT == 1 )); then
                cylc trigger "${CYLC_WORKFLOW_ID}//3/x"
            elif (( CYLC_TASK_CYCLE_POINT == 2 )); then
                cylc stop "${CYLC_WORKFLOW_ID}" --now --now
                contact="${HOME}/cylc-run/${CYLC_WORKFLOW_ID}/.service/contact"
                while [[ -f "${contact}" ]]; do
                    sleep 0.1
                done
                cylc play "${CYLC_WORKFLOW_ID}"
            fi
        """

2022-02-04T15:30:57Z INFO - Initial point: 1
2022-02-04T15:30:57Z INFO - Final point: 5
2022-02-04T15:30:57Z INFO - [2/x running(runahead) job:01 flows:1] => running
2022-02-04T15:30:57Z WARNING - Task 3/x already spawned in {1}
2022-02-04T15:30:58Z INFO - [2/x running job:01 flows:1] (polled)started at 2022-02-04T15:30:51Z
2022-02-04T15:30:58Z INFO - [2/x running job:01 flows:1] health: execution timeout=None, polling intervals=PT15M,...
2022-02-04T15:30:59Z INFO - [2/x running job:01 flows:1] (received)succeeded at 2022-02-04T15:30:59Z
2022-02-04T15:30:59Z INFO - [2/x running job:01 flows:1] => succeeded
2022-02-04T15:31:00Z INFO - Workflow shutting down - AUTOMATIC
2022-02-04T15:31:00Z INFO - DONE

Whereas with this diff, the workflow runs on to cycle 5 correctly:

             elif (( CYLC_TASK_CYCLE_POINT == 2 )); then
+                return
                 cylc stop "${CYLC_WORKFLOW_ID}" --now --now

hjoliver · 2022-02-05T11:50:17Z

That's a bug, but it's not in scope for this PR (and not really specific to reflow either).
...
future task

If I subtract one from the trigger point I get the same result:

So:
-2. now your test does do something that this PR addresses (flow merge) ✔️
-1. but the reason it fails is still off-topic ❌ ...

... see #4658

Both variants run to completion with that change in.

Full credit, you're hammering this issue so hard that we're flushing other bugs out of the system, which is great 😁

hjoliver · 2022-02-05T12:01:39Z

(Added some logging and rebased again)

hjoliver · 2022-02-08T00:44:41Z

(Rebased again to pick up unit test fixes).

cylc/flow/task_pool.py

oliver-sanders · 2022-02-08T18:31:27Z

Full credit, you're hammering this issue so hard that we're flushing other bugs out of the system

So much so I think I got a new one #4665

oliver-sanders · 2022-02-08T18:33:44Z

I'm ok with this approach, due to the number of trigger-related bugs uncovered recently I'd like to test/understand more tomorrow (still need to get other PRs merged so not blocking). Would be good to generalise the trigger tests to make sure we are covering all cases correctly.

hjoliver · 2022-02-09T03:31:14Z

Full credit, you're hammering this issue so hard that we're flushing other bugs out of the system

So much so I think I got a new one #4665

Fixed! (And not really spawn-on-demand related, for the record).

oliver-sanders · 2022-02-09T13:36:01Z

One remaining issue with the coverage thing #4645 (comment)

hjoliver · 2022-02-14T06:40:36Z

~~Fast test fail needs #4684~~ cherry picked it

hjoliver · 2022-02-14T06:41:52Z

One remaining issue with the coverage thing #4645 (comment)

The test was flaky, so sometimes the merge didn't occur. Should be good now.

hjoliver · 2022-02-14T08:34:12Z

Coverage 100% 🎉

cylc/flow/scripts/trigger.py

hjoliver added the bug Something is wrong :( label Feb 3, 2022

hjoliver added this to the cylc-8.0rc2 milestone Feb 3, 2022

hjoliver self-assigned this Feb 3, 2022

hjoliver modified the milestones: cylc-8.0rc2, cylc-8.0rc1 Feb 3, 2022

hjoliver marked this pull request as ready for review February 3, 2022 03:46

hjoliver force-pushed the fix-4644 branch from 4d6bee9 to b0a3f1e Compare February 3, 2022 04:48

MetRonnie mentioned this pull request Feb 3, 2022

task_pool: fix spawning when runahead tasks are triggered #4621

Closed

7 tasks

MetRonnie approved these changes Feb 3, 2022

View reviewed changes

oliver-sanders reviewed Feb 3, 2022

View reviewed changes

oliver-sanders mentioned this pull request Feb 3, 2022

tests/f: add test to cover triggering bugs #4651

Closed

7 tasks

oliver-sanders approved these changes Feb 3, 2022

View reviewed changes

hjoliver force-pushed the fix-4644 branch from b0a3f1e to 109edba Compare February 4, 2022 01:32

hjoliver force-pushed the fix-4644 branch from 9315ca6 to e70a9ae Compare February 4, 2022 04:00

MetRonnie reviewed Feb 4, 2022

View reviewed changes

oliver-sanders requested changes Feb 4, 2022

View reviewed changes

MetRonnie reviewed Feb 4, 2022

View reviewed changes

cylc/flow/scripts/trigger.py Outdated Show resolved Hide resolved

cylc/flow/scripts/trigger.py Outdated Show resolved Hide resolved

hjoliver force-pushed the fix-4644 branch from 5d6072b to 09f9a3b Compare February 4, 2022 13:16

hjoliver mentioned this pull request Feb 5, 2022

Only poll non-waiting tasks #4658

Merged

7 tasks

hjoliver force-pushed the fix-4644 branch from 09f9a3b to 7eb67d5 Compare February 5, 2022 12:00

hjoliver force-pushed the fix-4644 branch 2 times, most recently from aafb0e3 to 4232627 Compare February 8, 2022 00:43

MetRonnie reviewed Feb 8, 2022

View reviewed changes

cylc/flow/task_pool.py Outdated Show resolved Hide resolved

oliver-sanders self-requested a review February 8, 2022 18:30

hjoliver force-pushed the fix-4644 branch 3 times, most recently from ed7d780 to 7e5f7ca Compare February 14, 2022 05:09

hjoliver added 7 commits February 14, 2022 19:28

Fix flow merge into no-flow task.

0f49fec

Add new func. test.

24ce86b

Update change log.

6c3d068

Post-rebase tweaks.

475435e

Flaky test needs more time.

9e0b92d

Improve cylc trigger --help.

9c870fb

Demote log message.

83c68fd

hjoliver force-pushed the fix-4644 branch from 7e5f7ca to 83c68fd Compare February 14, 2022 06:36

hjoliver added 2 commits February 14, 2022 20:16

Tweak docstring (to make codecov happy).

2d3abda

Fix for new flake8-simplify rule.

342dbeb

MetRonnie approved these changes Feb 14, 2022

View reviewed changes

cylc/flow/scripts/trigger.py Show resolved Hide resolved

cylc/flow/scripts/trigger.py Show resolved Hide resolved

oliver-sanders approved these changes Feb 14, 2022

View reviewed changes

oliver-sanders merged commit 81ca304 into cylc:master Feb 14, 2022

hjoliver deleted the fix-4644 branch February 14, 2022 18:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix flow merge into no-flow task. #4645

Fix flow merge into no-flow task. #4645

hjoliver commented Feb 3, 2022 •

edited

Loading

hjoliver commented Feb 3, 2022

MetRonnie left a comment •

edited

Loading

oliver-sanders Feb 3, 2022

oliver-sanders Feb 3, 2022

hjoliver Feb 4, 2022 •

edited

Loading

hjoliver Feb 4, 2022

MetRonnie Feb 4, 2022 •

edited

Loading

hjoliver Feb 4, 2022

hjoliver Feb 4, 2022

hjoliver Feb 4, 2022

oliver-sanders Feb 8, 2022

hjoliver Feb 14, 2022

oliver-sanders commented Feb 3, 2022 •

edited

Loading

oliver-sanders commented Feb 3, 2022

hjoliver commented Feb 4, 2022

hjoliver commented Feb 4, 2022

hjoliver commented Feb 4, 2022 •

edited

Loading

MetRonnie Feb 4, 2022 •

edited

Loading

oliver-sanders commented Feb 4, 2022 •

edited

Loading

oliver-sanders left a comment

hjoliver commented Feb 4, 2022

oliver-sanders commented Feb 4, 2022

hjoliver commented Feb 5, 2022 •

edited

Loading

hjoliver commented Feb 5, 2022 •

edited

Loading

hjoliver commented Feb 8, 2022

oliver-sanders commented Feb 8, 2022

oliver-sanders commented Feb 8, 2022 •

edited

Loading

hjoliver commented Feb 9, 2022 •

edited

Loading

oliver-sanders commented Feb 9, 2022

hjoliver commented Feb 14, 2022 •

edited

Loading

hjoliver commented Feb 14, 2022

hjoliver commented Feb 14, 2022

Fix flow merge into no-flow task. #4645

Fix flow merge into no-flow task. #4645

Conversation

hjoliver commented Feb 3, 2022 • edited Loading

hjoliver commented Feb 3, 2022

MetRonnie left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hjoliver Feb 4, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MetRonnie Feb 4, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

oliver-sanders commented Feb 3, 2022 • edited Loading

oliver-sanders commented Feb 3, 2022

hjoliver commented Feb 4, 2022

hjoliver commented Feb 4, 2022

hjoliver commented Feb 4, 2022 • edited Loading

MetRonnie Feb 4, 2022 • edited Loading

Choose a reason for hiding this comment

oliver-sanders commented Feb 4, 2022 • edited Loading

oliver-sanders left a comment

Choose a reason for hiding this comment

hjoliver commented Feb 4, 2022

oliver-sanders commented Feb 4, 2022

hjoliver commented Feb 5, 2022 • edited Loading

hjoliver commented Feb 5, 2022 • edited Loading

hjoliver commented Feb 8, 2022

oliver-sanders commented Feb 8, 2022

oliver-sanders commented Feb 8, 2022 • edited Loading

hjoliver commented Feb 9, 2022 • edited Loading

oliver-sanders commented Feb 9, 2022

hjoliver commented Feb 14, 2022 • edited Loading

hjoliver commented Feb 14, 2022

hjoliver commented Feb 14, 2022

hjoliver commented Feb 3, 2022 •

edited

Loading

MetRonnie left a comment •

edited

Loading

hjoliver Feb 4, 2022 •

edited

Loading

MetRonnie Feb 4, 2022 •

edited

Loading

oliver-sanders commented Feb 3, 2022 •

edited

Loading

hjoliver commented Feb 4, 2022 •

edited

Loading

MetRonnie Feb 4, 2022 •

edited

Loading

oliver-sanders commented Feb 4, 2022 •

edited

Loading

hjoliver commented Feb 5, 2022 •

edited

Loading

hjoliver commented Feb 5, 2022 •

edited

Loading

oliver-sanders commented Feb 8, 2022 •

edited

Loading

hjoliver commented Feb 9, 2022 •

edited

Loading

hjoliver commented Feb 14, 2022 •

edited

Loading