Large suites: light weight task proxies. #1689

hjoliver · 2015-12-06T22:09:25Z

A major limiting factor for very large suites is memory use by task proxies. This issue is about minimizing the size of task proxies (the total number of them matters too, but less so if they become very small).

Task proxies have two distinct purposes:

1. Scheduling: they keep track of task prerequisites and outputs.
   - Small, used throughout the task proxy life cycle (for dependency matching).
2. Holding task runtime settings, which are written to the job file submitted to run the task.
   - Potentially very large, but only needed at job submit time.

Clearly we need to either eliminate all duplication of runtime settings that could be shared by task proxies or only load runtime settings when needed at job submit time, then immediately forget them again.

hjoliver · 2015-12-06T23:08:32Z

Eliminate duplication of runtime settings?

Default settings are now shared by task proxies, by means of intervening in the dict look-up mechanism: if a requested item doesn't exist, look it up in the shared defaults dict (#1500).
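As a rough illustration of the look-up trick described above (not the actual code from #1500), a dict subclass can fall back to a single shared defaults dict when a key is missing. The class name `SharedDefaultsDict` and the setting names below are hypothetical, chosen only for the example.

```python
class SharedDefaultsDict(dict):
    """Dict that falls back to a shared defaults dict for missing keys."""

    def __init__(self, defaults, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Keep a reference to the one shared defaults dict, not a copy.
        self._defaults = defaults

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        # Note that dict.get() does NOT call __missing__.
        return self._defaults[key]


# Hypothetical defaults, held once and shared by every task proxy.
DEFAULTS = {"job submission method": "background", "retry delays": ()}

foo = SharedDefaultsDict(DEFAULTS, {"script": "echo foo"})
bar = SharedDefaultsDict(
    DEFAULTS, {"script": "echo bar", "job submission method": "pbs"}
)

print(foo["job submission method"])  # falls back to the shared default
print(bar["job submission method"])  # uses the task's own override
```

Each task dict stores only its own overrides, so the per-task cost of the defaults is one reference rather than a full copy.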

Inherited settings could also be shared, not duplicated across the inheriting tasks (I think we're still doing that...). Could this be done like the defaults, or is storing references to the same data enough?

However, the minimum size with all possible sharing of data is still potentially very large: it is the amount of information under [runtime] after Jinja2 processing but before inheritance is worked through (plus defaults).

hjoliver · 2015-12-06T23:47:58Z

Load runtime settings only at job submit time?

This is clearly the right thing to do given that (a) the minimum size of all task runtimes is potentially very large; and (b) it is not needed by the suite daemon for any purpose other than job submission.

Choices:

1. Write the runtime dict for each task proxy class to disk at start-up. At job submit time, load it, write the job file, forget it.
   - The extra disk I/O at job submission time will have no impact: it occurs at exactly the point that we already do a similar amount of I/O, to write the job script.
2. Hold the minimal complete runtime (above) in memory, and work through inheritance again each time before job submission.
   - If inherited settings can all be shared by task proxies instead of duplicated (above) then there is no advantage in delaying inheritance processing until the data is needed. Even if there were an advantage, this would not help suites in which the bulk of runtime settings are not acquired through inheritance.
3. Read and parse the whole suite definition off disk again prior to every job submission, to extract the runtime for the task.
   - This is clearly a bad idea: it involves a lot more I/O than option 1 and a lot more processing. Parsing a large suite can take a significant amount of time (and maybe just as much memory use as we're trying to avoid, although it could possibly be done by the job submission command).

Therefore, by a process of unassailable logic I propose that we implement 1. above. 😬 Possible caveat, next comment below.
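A minimal sketch of the disk-based option, assuming the runtime settings for each task name are a JSON-serializable dict. The function names, the `.json` file layout, and the job file contents are all hypothetical, and real job scripts would need proper shell quoting.

```python
import json
import tempfile
from pathlib import Path


def write_runtime_files(runtimes, conf_dir):
    """At start-up: write one small runtime file per task name."""
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)
    for name, runtime in runtimes.items():
        (conf_dir / f"{name}.json").write_text(json.dumps(runtime))


def write_job_file(name, point, conf_dir, job_dir):
    """At job submit time: load the runtime, write the job file, forget it."""
    runtime = json.loads((Path(conf_dir) / f"{name}.json").read_text())
    lines = ["#!/bin/bash"]
    for key, value in sorted(runtime.get("environment", {}).items()):
        lines.append(f"export {key}={value}")  # no shell quoting (sketch)
    lines.append(runtime["script"])
    job_path = Path(job_dir) / f"{name}.{point}.job"
    job_path.write_text("\n".join(lines) + "\n")
    # 'runtime' goes out of scope on return, so nothing is retained.
    return job_path


with tempfile.TemporaryDirectory() as tmp:
    write_runtime_files(
        {"foo": {"script": "echo hello", "environment": {"N": "1"}}}, tmp
    )
    job = write_job_file("foo", "20150101T00", tmp, tmp)
    job_text = job.read_text()
    print(job_text)
```

The same runtime file is read for every instance of `foo`, so the number of files depends on the number of task names, not on the number of jobs.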

(Note that #1428 which - in effect - dropped all task runtime info immediately after job submission demonstrated a factor of six reduction in memory use for a large suite with a lot of runahead; however, the implementation there had some negative consequences that this proposal does not have, e.g. on monitoring and the ability to re-trigger tasks that have already finished. It may be that optimal sharing of task runtime data could also have achieved a big reduction here.)

hjoliver · 2015-12-07T02:12:14Z

Possible caveat: can the task runtime conf files be generated incrementally at start-up without loading the entire runtime configuration - without defaults - into memory first, for inheritance processing? If not, is brief high memory use at start-up that much better than ongoing high memory use? The disk-based solution would still be much simpler (no need to bother with the extra complexity of ensuring optimal sharing of all settings).

hjoliver · 2015-12-07T10:55:00Z

(this is in fact rather an old idea: #108 (comment))

matthewrmshin · 2015-12-07T11:07:46Z

With your proposed solution 1, are we going to end up with a new small file per task/job? The only concern is that it increases inode usage on the file system (and some file systems are very unfriendly to lots of small files), but maybe it does not matter.

(On a similar note, but unrelated to this issue, perhaps we should move the job-activity.log file one level up, as it is shared between jobs of the same task.)

hjoliver · 2015-12-07T11:17:31Z

No, it'll just be one new small file for each task ~~class~~ (parameterized type I guess; no longer a class), created at suite start-up. I.e. one for each task name. They'll get re-used for each task instance/job.

matthewrmshin · 2015-12-07T11:19:40Z

Sounds like a good compromise.

hjoliver · 2015-12-08T20:47:36Z

To get the full benefit of reduced task proxy size, we need to avoid the initial memory high water mark caused by parsing the suite in its entirety (which includes all the task proxy runtime info). Even when this data is garbage collected after writing the new task proxy runtime config files, the memory may not be returned to the OS by the Python interpreter (although it will be re-used internally) - i.e. the external "resident memory size" of the suite daemon may not go down after suite parsing.

So, we have agreed on the following (via email):

low-memory suite parsing design

(Noting that when a process finishes, all of its memory is released to the OS).

1. Generate Jinja2-processed file in a subprocess (this is necessarily monolithic, but the whole file is just treated as text by Jinja2).
2. Read and parse the processed file, skipping over (ignoring) all lines in the [runtime] section.
3. Read the [runtime] section line by line just to extract the "inherit" item (i.e. not full parsing) for each namespace, then compute the C3 linearization (inheritance order).
4. Use a process pool to generate the new task config files concurrently, one process for each task: read and parse just the [runtime] namespaces in the inheritance list for the task (and maybe the inheritance can even be done in-place in a single data structure rather than one for each member of the parent list).
5. At job submit time, have the job-submit command (which is in a sub-process) read the task config file, not the suite daemon.
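The "inherit" scan and C3 linearization step could be sketched roughly as follows. This is an illustration under simplifying assumptions, not Cylc's parser: `scan_inherit` and `c3_linearize` are hypothetical names, the scan ignores multi-line values and line continuation, and implicit inheritance from `root` is not modelled.

```python
import re

HEADER = re.compile(r"^\s*(\[+)\s*(.*?)\s*(\]+)\s*$")


def scan_inherit(lines):
    """Return {namespace: [parents]} from [runtime], without full parsing."""
    parents = {}
    in_runtime = False
    current = []  # namespace names from the current [[...]] header
    depth = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = HEADER.match(stripped)
        if match:
            depth = len(match.group(1))
            if depth == 1:
                in_runtime = match.group(2) == "runtime"
                current = []
            elif depth == 2 and in_runtime:
                current = [n.strip() for n in match.group(2).split(",")]
                for name in current:
                    parents.setdefault(name, [])
            continue
        # Only an "inherit" item directly under a namespace counts.
        if in_runtime and depth == 2 and "=" in stripped:
            key, _, value = stripped.partition("=")
            if key.strip() == "inherit":
                plist = [p.strip() for p in value.split(",") if p.strip()]
                plist = [p for p in plist if p != "None"]
                for name in current:
                    parents[name] = plist
    return parents


def c3_linearize(name, parents, _seen=()):
    """Return the C3 linearization (inheritance order) for a namespace."""
    if name in _seen:
        raise ValueError(f"circular inheritance at {name}")
    plist = parents.get(name, [])
    seqs = [c3_linearize(p, parents, _seen + (name,)) for p in plist]
    seqs.append(list(plist))
    seqs = [s for s in seqs if s]
    result = [name]
    while seqs:
        for seq in seqs:
            head = seq[0]
            # A valid head must not appear in the tail of any sequence.
            if not any(head in other[1:] for other in seqs):
                break
        else:
            raise ValueError(f"no consistent linearization for {name}")
        result.append(head)
        seqs = [[x for x in s if x != head] for s in seqs]
        seqs = [s for s in seqs if s]
    return result


TEXT = """
[scheduling]
    [[dependencies]]
        graph = foo => bar
[runtime]
    [[root]]
        script = true
    [[A]]
    [[B]]
        inherit = A
    [[C]]
        inherit = A
    [[foo]]
        inherit = B, C
        [[[environment]]]
            inherit = not a parent list
"""

parents = scan_inherit(TEXT.splitlines())
order = c3_linearize("foo", parents)
print(order)
```

Only the namespace names and parent lists are held in memory here; the full settings for each namespace would be read later, per task, by the worker processes.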

hjoliver · 2016-06-22T11:06:36Z

[meeting]

- worth trying and profiling
- some concern about extra I/O, extra files on disk, however:
  - one file per task name, not per instance
  - at read time (just prior to job submit) we already do I/O (write the job script)
  - could use one db for all these files?
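The "one db" idea could look something like this sketch, which stores one row per task name in a single SQLite file instead of one small file each. The table name, column names, and function names are hypothetical, and an in-memory database stands in for a file on disk.

```python
import json
import sqlite3


def open_runtime_db(path):
    """Open (and if necessary create) the task runtime database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_runtime ("
        "name TEXT PRIMARY KEY, runtime TEXT NOT NULL)"
    )
    return conn


def store_runtimes(conn, runtimes):
    """At start-up: store one row per task name."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO task_runtime VALUES (?, ?)",
            [(name, json.dumps(rt)) for name, rt in runtimes.items()],
        )


def load_runtime(conn, name):
    """At job submit time: fetch the runtime for a single task name."""
    row = conn.execute(
        "SELECT runtime FROM task_runtime WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return json.loads(row[0])


conn = open_runtime_db(":memory:")
store_runtimes(
    conn, {"foo": {"script": "echo foo"}, "bar": {"script": "echo bar"}}
)
foo_runtime = load_runtime(conn, "foo")
print(foo_runtime)
```

This keeps inode usage constant regardless of the number of task names, at the cost of a database look-up at job submit time.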

hjoliver · 2020-07-16T00:38:35Z

#3515 (spawn on demand) does not address this issue but will reduce task pool size so much that it may not be relevant anymore. But before closing this we should consider the ideas above for reducing the memory footprint due to parsing the suite at start-up.

This was referenced Dec 7, 2015

Large suites: more efficient dependency matching. #1688

Closed

Allowing cycling tasks to run out of order. #1538

Closed

matthewrmshin added this to the soon milestone Dec 7, 2015

hjoliver self-assigned this Dec 8, 2015

This was referenced Jun 22, 2016

Ideas for more efficient scheduling of very large suites #108

Closed

Memory use in large suites #1222

Closed

hjoliver modified the milestones: later, soon Jun 23, 2016

hjoliver added the efficiency For notable efficiency improvements label Jun 23, 2016

hjoliver mentioned this issue Aug 22, 2016

Optionally spawn to max active cycle points. #1966

Merged

hjoliver mentioned this issue Jul 9, 2017

Improve interaction between task state, output and prerequisite #2329

Open

oliver-sanders mentioned this issue Aug 21, 2019

spawn on demand / event driven graph #3304

Open

matthewrmshin modified the milestones: later, cylc-9 Aug 28, 2019

oliver-sanders mentioned this issue Jul 15, 2020

Task proxy spawn on demand. #3515

Merged
