Monthly task in an otherwise hourly workflow has unexpected scheduling runahead limit behavior #5705
Comments
On startup, Cylc spawns tasks from 20230812T0000Z out to 20230910T0000Z. This is definitely not the expected behaviour, thank you for reporting this issue. The issue only occurs when both of the recurrences ( |
Thanks for the response. It also occurs if you list specific hours and not just
|
Using an integer interval rather than a datetime interval (i.e. |
This appears to be a bug in the runahead limit calculation in the start up case (i.e.
This diff is enough to get it working correctly:
diff --git a/cylc/flow/task_pool.py b/cylc/flow/task_pool.py
index e7d85f669..057f3dfa8 100644
--- a/cylc/flow/task_pool.py
+++ b/cylc/flow/task_pool.py
@@ -371,6 +371,13 @@ class TaskPool:
         else:
             count_cycles = True
 
+        if not self.main_pool:
+            points = [
+                point
+                for point in points
+                if point <= base_point + limit
+            ]
+
         # Get all cycle points possible after the runahead base point.
         if (
             not force
Note at this stage, |
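To illustrate what the added filter does, here is a minimal standalone sketch. It is not Cylc code: `cap_to_runahead` is a hypothetical helper, and plain integers stand in for cycle points, so treat it as a toy model of the list comprehension in the diff rather than the scheduler's actual behaviour.

```python
def cap_to_runahead(points, base_point, limit):
    """Keep only candidate points at or before base_point + limit.

    Mirrors the comprehension added in the diff, with integers
    standing in for Cylc cycle points (an assumption for illustration).
    """
    return [point for point in points if point <= base_point + limit]


# Hypothetical start-up candidates: hourly points 0, 1, 2 plus the first
# point of an offset recurrence that starts 24 hours in.
candidates = [0, 1, 2, 24]
capped = cap_to_runahead(candidates, base_point=0, limit=2)
print(capped)
```

With a limit of 2, the far-off point 24 is dropped, so it can no longer stretch the range of cycles considered at start-up.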
A more minimal example:
[scheduler]
    allow implicit tasks = True
[scheduling]
    initial cycle point = 2023
    runahead limit = PT2H
    [[graph]]
        PT1H = "foo"
        R/^+P1D/P1D = "foo => bar"
Which confirms it's the offset recurrence start point that screws up the limit, and just at start-up. It settles down to normal behaviour once the initial spawned tasks are done. (I guess none of our tests have recurrence start points beyond the initial runahead limit!) |
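A rough way to see why this example triggers the problem is to compare the first point of each recurrence with the runahead limit. The sketch below uses plain Python datetimes, not Cylc's own cycle point classes. It assumes `initial cycle point = 2023` means 2023-01-01T00:00Z, and `beyond_limit` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the minimal example (UTC assumed).
initial_point = datetime(2023, 1, 1, tzinfo=timezone.utc)
runahead_limit = timedelta(hours=2)  # runahead limit = PT2H

# First point of each recurrence in the graph.
first_hourly = initial_point                     # PT1H starts at the initial point
first_daily = initial_point + timedelta(days=1)  # R/^+P1D/P1D starts one day later


def beyond_limit(point, base, limit):
    """Return True if point is later than base + limit."""
    return point > base + limit


hourly_outside = beyond_limit(first_hourly, initial_point, runahead_limit)
daily_outside = beyond_limit(first_daily, initial_point, runahead_limit)
print(hourly_outside, daily_outside)
```

The offset recurrence's first point sits 24 hours out, well past the 2 hour limit, which is the start-up case the tests apparently did not cover.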
@jaworsks the fix will be in the upcoming 8.2.2 release. In the meantime you could patch your local installation as above. |
PR didn't auto-close issue. |
Description
We have been working to move from Cylc 7 to Cylc 8 and are currently using Cylc 8.2.1.
Our workflow has multiple jobs that run every hour, plus one job that is scheduled to run only on the 10th of every month. The workflow has a high volume of parallel tasks each cycle, some of which go into a retry state until data is available. To help manage performance we implemented a [scheduling]runahead limit = PT6H. However, it seems that loading the 10T00 entry from the graph causes all hourly cycles/tasks from the initial cycle point until the monthly task to be generated and tracked, causing a significant delay when playing the workflow or pulling it up in the UI.
Reproducible Example
Relevant portion of a simplified flow.cylc to reproduce the problem:
This will result in all cycles between 2023-08-12 and 2023-09-10 being tracked, with all tasks for those cycles in a waiting state. I've only put 3 sets of tasks here, but our production workflow can be much larger on the cardinal hours.
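For a sense of scale, the number of hourly cycles involved can be worked out with a few lines of Python. This is back-of-the-envelope arithmetic only. It assumes UTC, one cycle per hour, and that the runahead limit includes the point at base + PT6H, and `count_points` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

first_cycle = datetime(2023, 8, 12, tzinfo=timezone.utc)    # first tracked cycle
monthly_cycle = datetime(2023, 9, 10, tzinfo=timezone.utc)  # the 10T00 task
step = timedelta(hours=1)                                   # hourly cycling
runahead_limit = timedelta(hours=6)                         # runahead limit = PT6H


def count_points(start, stop, step):
    """Count evenly spaced points from start to stop, both inclusive."""
    return (stop - start) // step + 1


observed = count_points(first_cycle, monthly_cycle, step)
expected = count_points(first_cycle, first_cycle + runahead_limit, step)
print(observed, expected)
```

Under those assumptions, 697 hourly cycles are tracked at start-up where about 7 would be expected, and each of those cycles carries its full set of waiting tasks.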
Expected Behaviour
Don't know exactly what the expectation is here, other than not having such an impact on performance. If this is expected behavior, I'm looking for recommendations for a work-around. The only solution I've thought of is to create a separate workflow for our monthly tasks and have one workflow signal the other.