Large suites: more efficient dependency matching. #1688
Comments
👍 |
(like #1689, this is also an old idea: #108 (comment)) |
I had a look, and I think we can get our performance boost just by using |
Some results for our N.B. I ran it with ~/cylc-run/busy symlinked to
(lots more below 10 seconds that I've omitted) The 10x busy suite I ran with (note: runtime uses wrong range for
#!jinja2
title = Suite-1000-x3
description = A suite of 1000 tasks per cycle
[cylc]
UTC mode = True # Ignore DST
[scheduling]
initial cycle point = 20130101T00
final cycle point = 20130103T00
runahead limit = PT12H
[[queues]]
[[[everything]]]
limit = 100
members = root
[[dependencies]]
[[[T00]]]
graph="""
{% for i0 in range(10) -%}
{% for i1 in range(10) -%}
{% for i2 in range(10) -%}
t{{i0}}{{i1}}{{i2}}9[-P1D] => t{{i0}}{{i1}}{{i2}}0
t{{i0}}{{i1}}{{i2}}0 => t{{i0}}{{i1}}{{i2}}1 => \
t{{i0}}{{i1}}{{i2}}2 => t{{i0}}{{i1}}{{i2}}3
t{{i0}}{{i1}}{{i2}}3 => t{{i0}}{{i1}}{{i2}}4 => \
t{{i0}}{{i1}}{{i2}}5 => t{{i0}}{{i1}}{{i2}}6
t{{i0}}{{i1}}{{i2}}6 => t{{i0}}{{i1}}{{i2}}7 => \
t{{i0}}{{i1}}{{i2}}8 => t{{i0}}{{i1}}{{i2}}9
{% endfor -%}
{% endfor -%}
{% endfor -%}
"""
[runtime]
[[root]]
command scripting = sleep 1
[[[event hooks]]]
succeeded handler = true
failed handler = true
retry handler = true
submission failed handler = true
submission timeout handler = true
execution timeout handler = true
execution timeout = PT6M
submission timeout = PT1M
{% for i0 in range(10) -%}
{% for i1 in range(10) -%}
{% for i2 in range(1) -%}
{% for i3 in range(10) -%}
[[t{{i0}}{{i1}}{{i2}}{{i3}}]]
{% endfor -%}
{% endfor -%}
{% endfor -%}
{% endfor -%}
That's amazing, good job. Let's get this in! 😀 |
Closing this as #1769 more or less does what was proposed. |
Companion to #1689. Dependency matching slows down as the number of task proxies goes up. Matching is currently indiscriminate: we get all waiting task proxies to ask all non-waiting ones to satisfy their prerequisites (albeit through a middle-man that records everyone's outputs). Given that the graph specifies exactly who depends on whom, we should be able to have a newly completed output immediately update the prerequisites of exactly those tasks that are known to depend on it. This should scale much better to huge suite sizes.
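A minimal sketch of the targeted approach described above, assuming a reverse index from each output to the tasks waiting on it. The names `TaskProxy`, `DependencyIndex`, `register` and `output_completed` are hypothetical and chosen for illustration; they are not Cylc's actual classes or API, and the output strings are made up.

```python
from collections import defaultdict


class TaskProxy:
    """Hypothetical stand-in for a waiting task and its prerequisites."""

    def __init__(self, name, prerequisites):
        self.name = name
        # Outputs of other tasks that must complete before this one can run.
        self.unsatisfied = set(prerequisites)

    def is_ready(self):
        return not self.unsatisfied


class DependencyIndex:
    """Maps each output to the tasks whose prerequisites mention it."""

    def __init__(self):
        self._waiters = defaultdict(set)

    def register(self, task):
        # Index the task under every output it is still waiting for.
        for output in task.unsatisfied:
            self._waiters[output].add(task)

    def output_completed(self, output):
        """Satisfy only the tasks that depend on `output`.

        Returns the tasks that became ready as a result.
        """
        ready = []
        # pop() drops the entry: once an output has completed,
        # nothing registered so far needs to wait on it again.
        for task in self._waiters.pop(output, ()):
            task.unsatisfied.discard(output)
            if task.is_ready():
                ready.append(task)
        return ready


index = DependencyIndex()
b = TaskProxy("b", ["a succeeded"])
c = TaskProxy("c", ["a succeeded", "b succeeded"])
for task in (b, c):
    index.register(task)

ready_after_a = sorted(t.name for t in index.output_completed("a succeeded"))
ready_after_b = sorted(t.name for t in index.output_completed("b succeeded"))
```

With this layout the work done per completed output is proportional to the number of tasks that depend on it, not to the total number of task proxies in the pool. A real implementation would also have to handle tasks registered after an output has already completed, which this sketch ignores.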