-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data store avoid graph re-walk #5660
Data store avoid graph re-walk #5660
Conversation
f10aa7b
to
65cb9fc
Compare
Coverage should improve with a n=2 window test. |
99a86e8
to
f110f63
Compare
714f758
to
58dbf6e
Compare
I've been having a play with this branch with the n=1 window in I've been playing with the following diff in order to track which children/parents are being walked over: diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index b6a2960fb..e78cd767c 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -732,6 +732,7 @@ class DataStoreMgr:
None
"""
+ print(f'# increment_graph_window({source_tokens.relative_id})')
# common refrences
active_id = source_tokens.id
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..d2517383b 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -36,6 +36,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# generate_graph_children({point}/{tdef.name})')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +79,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# generate_graph_parents({point}/{tdef.name})')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point): Using the graph from your example above with the default n=1 window: flow.cylc[scheduler]
cycle point format = CCYY
[scheduling]
runahead limit = P1
initial cycle point = 2020
[[graph]]
P1Y = """
a => b => c => d => e => f => g
b => h => i => f
g[-P1Y] => a
"""
[runtime]
[[root]]
script = sleep 2
[[b, c, d, e, f, i]]
[[a]]
[[g,h]] It looks like the algorithm is looking one edge further infront ( E.G. here's a snippet from
E.G. in the above output, when the n=0 window was When the children of |
Moving onto the n=2 window (and dumping out the n-window after each call) with this diff: diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index b6a2960fb..3a21bbbea 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -473,7 +473,7 @@ class DataStoreMgr:
self.updated_state_families = set()
# Update workflow state totals once more post delta application.
self.state_update_follow_on = False
- self.n_edge_distance = 1
+ self.n_edge_distance = 2
self.next_n_edge_distance = None
self.latest_state_tasks = {
state: deque(maxlen=LATEST_STATE_TASKS_QUEUE_SIZE)
@@ -732,6 +732,7 @@ class DataStoreMgr:
None
"""
+ print(f'# increment_graph_window({source_tokens.relative_id})')
# common refrences
active_id = source_tokens.id
@@ -1018,6 +1019,10 @@ class DataStoreMgr:
self.prune_trigger_nodes[active_id])
del self.prune_trigger_nodes[active_id]
+ task_proxies = self.data[self.workflow_id][TASK_PROXIES]
+ for tp_id in sorted(list(task_proxies) + list(self.added[TASK_PROXIES])):
+ print(f'$ {tp_id}')
+
def generate_edge(
self,
parent_tokens: Tokens,
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..d2517383b 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -36,6 +36,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# generate_graph_children({point}/{tdef.name})')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +79,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# generate_graph_parents({point}/{tdef.name})')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point): Here's the log:
Looks good, gives the right result, some possible savings to be made avoiding duplicate |
There are duplicate I spotted a [very] small optimisation moving that return a little higher up in the code: --- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -1108,11 +1108,15 @@ class DataStoreMgr:
Orphan tasks with no children return (True, None) respectively.
"""
+ tp_id = tokens.id
+ task_proxies = self.data[self.workflow_id][TASK_PROXIES]
+ if tp_id in task_proxies or tp_id in self.added[TASK_PROXIES]:
+ # ghost task already created
+ return
+
name = tokens['task']
point_string = tokens['cycle']
t_id = self.definition_id(name)
- tp_id = tokens.id
- task_proxies = self.data[self.workflow_id][TASK_PROXIES]
is_orphan = False
if name not in self.schd.config.taskdefs:
@@ -1120,8 +1124,6 @@ class DataStoreMgr:
if itask is None:
itask = self.schd.pool.get_task(point_string, name)
- if tp_id in task_proxies or tp_id in self.added[TASK_PROXIES]:
- return
if itask is None:
itask = TaskProxy( |
Tried out the new graphDepth field, the numbers came out as expected so this should close #5609 which will allow us to unlock the higher n-window views in the GUI! Is the graph window that the data store is currently configured to track available in the schema? That would be handy for the UI. |
diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index 03c77eb94..a085f2f3f 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -754,6 +754,7 @@ class DataStoreMgr:
# walk filling, that may not have been the entire walk).
# If walk already completed, must have gone from non-active to active
# again.. So redo walk (as walk nodes may be pruned).
+ print(f'# Active ID: {active_id}')
if (
active_id not in all_walks
or active_id in self.n_window_completed_walks
@@ -899,6 +900,7 @@ class DataStoreMgr:
)
continue
+ print(f'# Generate Children/Parents of {node_tokens.relative_id}')
# Children/downstream nodes
# TODO: xtrigger is workflow_state edges too
# Reference set for workflow relations
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..daae255f4 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -15,6 +15,7 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Task definition."""
+import inspect
from collections import deque
from typing import TYPE_CHECKING
@@ -36,6 +37,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# Children of {point}/{tdef.name} - Called by {inspect.currentframe().f_back.f_code.co_name}')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +80,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# parents of {point}/{tdef.name} - Called by {inspect.currentframe().f_back.f_code.co_name}')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point):
As you've spotted, I've been relying on
I think we can add that as a workflow field (if it's not already). I've also thought of an optimization for the depth finder, so will put that up in the morning. |
58dbf6e
to
bee6937
Compare
I changed it to reuse the
|
471ebc4
to
529c3ca
Compare
There were still some duplicate |
529c3ca
to
76aecbc
Compare
* Sometimes tasks may appear in the n-window, then dissappear only to reappear at a later time. * This is a difficult case for the `increment_graph_window` to handle. * See cylc#5660
f68a63b
to
7ca4035
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
* Sometimes tasks may appear in the n-window, then dissappear only to reappear at a later time. * This is a difficult case for the `increment_graph_window` to handle. * See cylc#5660
Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
* Test the added, updated and pruned deltas created by increment_graph_window as the window changes. * Test window resizing with n=1,2,3
7ca4035
to
bc1a017
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worked well in my testing
Profiling results for the following workflow in simulation mode: [task parameters]
x = 1..100
y = 1..100
[scheduling]
cycling mode = integer
initial cycle point = 1
final cycle point = 1
[[graph]]
P1 = d[-P1] => a => b<x> => c<y> => d
Good positive change, increases functionality whilst improving performance 💯! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your hard work @dwsutherland, this looked like a tricky one!
Added changelog for #5660
addresses #5485, #5609
At present we've resolved an efficiency problem at n=1 window size by restricting the graph walk to only parents or only children (i.e. 'c' 'p', or 'ccc' 'ppp' for n=3) .. However this excludes cousins, which is evident at n>1 (i.e. 'cp' or 'pc' at n=2)
This restriction was a cheap way to avoid graph re-walks.. #5475
Here the problem is actually fixed at any window size, by changing the graph walk algorithm from a recursive to a walk location approach.. From the method's description:
Window resizing will now happen as soon as it's set (not as the graph progresses)
N-window edge depth of nodes is now found/updated with new field
graph_depth
.. A good example workflow:Gives:
Performance
Taken from #5435
$ cylc play <id> --profile --no-detach
Run this for a few mins, when you get bored ctrl+c it, then:
This PR is even faster than current (#5475) at
n=1
:At
n=2
there's a bit more load:but clearly not squared or cubed.
Check List
CONTRIBUTING.md
and added my name as a Code Contributor.setup.cfg
(andconda-environment.yml
if present).CHANGES.md
entry included if this is a change that can affect users