limit task pool size #987
Moved this from the re-factor proposal:
As discussed, limiting the task pool size will improve performance for much larger suites, and in principle it would allow us to drop the runahead limit. But in my opinion we will still need a runahead limit as well as limiting the pool size. Otherwise even small suites - if they contain quick-running data-retrieval tasks that are not constrained by clock triggers, or similar - could fill the task pool. This may not be a real problem unless the run-ahead tasks fill up the disk, but it will make the suite hard to monitor (graph view) and it may cause panic in users (OMG - my get_data task just ran out to 100 years ahead!). If this is not considered a problem, the user can just set the runahead limit very high.
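To make the run-ahead concern concrete, here is a toy Python sketch (not Cylc code; the run time and cycle interval are illustrative assumptions) comparing how far a quick, unconstrained task could get against a clock-triggered one in the same wall-clock window:

```python
# Toy illustration (not Cylc code): a quick-running task with no clock trigger
# can complete cycles as fast as it runs, while a clock-triggered task can only
# complete one cycle per real-time cycle interval.

def cycles_completed(wall_clock_s, task_run_time_s=1, cycle_interval_s=3600):
    """Cycles completed by a free-running task vs a clock-limited task."""
    unconstrained = wall_clock_s // task_run_time_s   # limited only by its own run time
    clock_limited = wall_clock_s // cycle_interval_s  # must wait for each cycle's clock trigger
    return unconstrained, clock_limited

# In one hour a 1-second "get_data"-style task could rack up ~3600 cycles,
# while the clock-triggered parts of the suite complete just one.
print(cycles_completed(3600))  # -> (3600, 1)
```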
Issue title changed from "rethink the runahead limit" to "limit task pool size". As per my previous comment, the runahead limit, although related, is a different concept and will probably still be needed in addition to limiting the pool size.
Limiting the pool size would require knowing which waiting tasks will be needed first, as excluding the wrong ones from dependency matching could stall the suite (see also #993). How to do this? I guess we'll need to use dependency information from the suite graph: if a task has reached the 'submitted' state or beyond, its downstream dependants could be needed soon and so should be created (#993) and added to the task pool, as sketched below.
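One possible heuristic, sketched in Python with hypothetical names (this is not the actual Cylc task pool API): once an upstream task reaches 'submitted' or beyond, admit its downstream dependants from the suite graph into the dependency-matching pool.

```python
# Hypothetical sketch (not the actual Cylc API): only admit a waiting task to
# the active (dependency-matching) pool once at least one of its upstream
# prerequisites has reached 'submitted' or beyond.

ACTIVE_STATES = {"submitted", "running", "succeeded"}

def tasks_to_activate(graph_edges, task_states):
    """graph_edges: iterable of (upstream, downstream) task names from the suite graph.
    task_states: dict of task name -> current state ('waiting', 'submitted', ...).
    Returns the waiting downstream tasks that should be created and added to
    the task pool because they could be needed soon."""
    needed = set()
    for upstream, downstream in graph_edges:
        if task_states.get(upstream) in ACTIVE_STATES:
            if task_states.get(downstream, "waiting") == "waiting":
                needed.add(downstream)
    return needed

# Example: once get_data has submitted, its dependant process_data is admitted.
edges = [("get_data", "process_data"), ("process_data", "archive")]
states = {"get_data": "submitted", "process_data": "waiting"}
print(tasks_to_activate(edges, states))  # -> {'process_data'}
```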
[meeting]
When we change the spawning behaviour (see the superseded issue #1538) so that there is no longer an implicit submit dependency between a task and its successor, we need to be able to report possible changes in behaviour in cylc validate.
Currently (on the iso8601 branch) the runahead limit is a suite-wide parameter that acts as follows: each task proxy spawns a 'waiting' successor when it enters the 'submitted' state. If a waiting task is beyond the runahead limit, it goes into a special pool that does not participate in dependency matching. Tasks under the limit drop back into the main pool. So at least one instance of every defined task proxy exists at any one time, more if one or more instances are submitted or running at the time. Only those below the runahead limit will impact scheduling performance (but all affect cylc's memory footprint). Note that tasks do not spawn multiple waiting instances out to the runahead limit.
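A minimal sketch of that two-pool arrangement, assuming integer cycle points and bare dicts in place of the real iso8601-branch task proxies (illustrative only):

```python
# Minimal sketch of the two-pool behaviour described above (simplified: integer
# cycle points and dicts instead of real task proxies).

def partition_pools(task_proxies, base_point, runahead_limit):
    """Split task proxies into the main pool (participates in dependency
    matching) and the runahead pool (held back, but still in memory).
    task_proxies: list of dicts with 'name', 'point' and 'state' keys.
    base_point: the earliest active cycle point in the suite.
    runahead_limit: how far ahead of base_point waiting tasks may be matched."""
    main_pool, runahead_pool = [], []
    for proxy in task_proxies:
        too_far_ahead = proxy["point"] > base_point + runahead_limit
        if proxy["state"] == "waiting" and too_far_ahead:
            runahead_pool.append(proxy)   # excluded from dependency matching
        else:
            main_pool.append(proxy)       # matched against prerequisites
    return main_pool, runahead_pool

proxies = [
    {"name": "get_data", "point": 10, "state": "waiting"},
    {"name": "model", "point": 3, "state": "running"},
    {"name": "model", "point": 4, "state": "waiting"},
]
main, runahead = partition_pools(proxies, base_point=3, runahead_limit=2)
print([p["name"] for p in runahead])  # -> ['get_data'] (beyond the limit)
```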
@matthewrmshin has some ideas on how to avoid a suite-wide runahead limit...
[edit: it seems the focus of this issue is really on limiting the task pool size, which is one function of the runahead limit but not the only one - we may still need it as well (see below).]