Spawning + datastore with thousands of "ready" tasks #5437

Closed
hjoliver opened this issue Mar 29, 2023 · 1 comment
Labels
  • bug: Something is wrong :(
  • duplicate: This is a duplicate of something else

Comments

@hjoliver
Member

Not strictly a bug, but some extreme optimization is required in Cylc 8 for pathological workflows where a huge number of tasks hit the active window at once.

Here the scheduler has to spawn 7000 tasks all at once, off of a:succeed, which takes ~5 minutes on my fairly powerful laptop, during which time the scheduler is unresponsive.

[task parameters]
   m = 0..6999  # !!! 
[scheduling]
   [[queues]]
      [[[default]]]
         limit = 4
   [[graph]]
      R1 = "a => b<m>"  # !!!
[runtime]
   [[a]]
      script = sleep 10
   [[b<m>]]

Initial profiling results from @oliver-sanders show:

  1. primarily, the datastore n-window computation is responsible
  2. secondarily (much less time), each spawned task needs a database read to get its submit number and flow info
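
The secondary cost (one database read per spawned task) could in principle be amortised into a single batched query. A minimal sketch, assuming a simple sqlite table; the table and column names here are illustrative stand-ins, not Cylc's actual schema:

```python
import sqlite3

# Illustrative schema, NOT Cylc's actual workflow database layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_states (name TEXT, submit_num INTEGER)")
conn.executemany(
    "INSERT INTO task_states VALUES (?, ?)",
    [(f"b{i}", 1) for i in range(500)],
)

# One query for all spawned tasks, instead of one read per task.
wanted = [f"b{i}" for i in range(500)]
placeholders = ",".join("?" * len(wanted))
rows = conn.execute(
    f"SELECT name, submit_num FROM task_states WHERE name IN ({placeholders})",
    wanted,
).fetchall()
submit_nums = dict(rows)
print(len(submit_nums))  # 500 rows fetched in a single round trip
```

Note sqlite caps the number of bound parameters per statement, so a real implementation would chunk very large task lists.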

Ok, I got bored after 20 minutes or so and cut the run off at that point. FYI, if you ctrl+c your workflow, the profile.prof file still gets generated.

  • The spawn_on_output function itself took 0.1562s.
  • The increment_graph_window function in the data store took 1139s (including its resulting calls).

So it's the data store, not the task pool. The increment_graph_window function was called 4,325 times, but called itself recursively 30,276,650 times, which is where the CPU gets soaked up.
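
To see why recursion without visit tracking soaks up CPU like this, here is a toy sketch (not Cylc code; the graph and the `expand` function are purely illustrative): with no visited set, every path re-expands nodes it has already seen, so call counts grow multiplicatively with depth even on a tiny graph.

```python
def expand(graph, node, depth, counter):
    """Recursively visit neighbours out to `depth`, with NO visited set."""
    counter[0] += 1
    if depth == 0:
        return
    for neighbour in graph.get(node, []):
        expand(graph, neighbour, depth - 1, counter)

# A small dense graph: 8 nodes, every node links to every other node.
nodes = [f"t{i}" for i in range(8)]
graph = {n: [m for m in nodes if m != n] for n in nodes}

calls = [0]
expand(graph, "t0", 4, calls)
print(calls[0])  # 2801 calls to reach 8 nodes: grows as ~7^depth
```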

This is more-or-less as expected; we knew this function was being called more times than necessary, see this comment - #5319 (comment)

Two suggestions:

  1. If possible, batch the increment_graph_window / task spawning to reduce the number of top-level calls to increment_graph_window.
     • Would require heavy refactoring: the function is designed to expand the graph around one task at a time.
     • Potential savings ~4000x
  2. Come up with a more efficient approach to increment_graph_window.
     • I.e. remember which nodes we have already visited to avoid repeat visits.
     • Potential savings somewhere between 750x and 43,000x depending on the impact of batching.
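
The second suggestion (remembering visited nodes) can be sketched as a breadth-first expansion over the same toy graph; again this is illustrative only, not Cylc's actual data-store types:

```python
def expand_once(graph, start, depth):
    """Breadth-first window expansion that visits each node at most once."""
    visited = {start}
    frontier = [start]
    calls = 0
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            calls += 1  # one expansion per node, ever
            for neighbour in graph.get(node, []):
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return visited, calls

# Same dense 8-node graph as before: every node links to every other.
nodes = [f"t{i}" for i in range(8)]
graph = {n: [m for m in nodes if m != n] for n in nodes}

visited, calls = expand_once(graph, "t0", 4)
print(len(visited), calls)  # all 8 nodes found with only 8 expansions
```

The call count is now linear in the number of reachable nodes rather than exponential in depth, which is where the order-of-magnitude savings would come from.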

The end result of these increment_graph_window calls is 30,392,831 detokenise calls, but detokenise is not really the culprit here. There are 7000 tasks and 7000 dependencies, so only ~14,000 detokenise calls should be needed, meaning we are calling the interface ~2000 times more often than we should be. If we can make detokenise faster, great, but reducing the number of calls is where the order-of-magnitude improvements we need will come from.
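
If the residual detokenise cost ever matters, one cheap mitigation is memoising repeated calls. A sketch, assuming the function is pure in its inputs; `detokenise` here is a hypothetical stand-in, not the real cylc.flow.id implementation:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def detokenise(cycle, task, job=None):
    """Hypothetical ID-building function; the real signature may differ."""
    parts = [cycle, task] + ([job] if job else [])
    return "/".join(parts)

# Repeated calls with the same tokens hit the cache instead of rebuilding.
ids = [detokenise("1", f"b{i % 10}") for i in range(1000)]
print(len(set(ids)), detokenise.cache_info().hits)  # 10 unique ids, 990 hits
```

This only shaves the per-call cost, though; as noted above, the real win is making ~2000x fewer calls in the first place.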

@hjoliver hjoliver added the bug Something is wrong :( label Mar 29, 2023
@hjoliver hjoliver mentioned this issue Mar 29, 2023
8 tasks
@oliver-sanders oliver-sanders added this to the cylc-8.2.0 milestone Mar 29, 2023
@hjoliver hjoliver removed this from the cylc-8.2.0 milestone Apr 26, 2023
@hjoliver hjoliver added the duplicate This is a duplicate of something else label Apr 26, 2023
@hjoliver
Member Author

Closing as a duplicate of #5435 (although I think this was the original)
