Efficiency improvements #23

Merged
myxie merged 4 commits into development on Jun 7, 2022

myxie (Owner) commented on Jun 7, 2022

This PR includes a number of changes that improve scheduling performance, specifically in the context of:

  • Reading in environments and storing/calculating task runtime
  • Calculating the Estimated Start Time (EST) of a task, which previously checked every allocation.

This PR also adds multiprocessing examples for use with large graphs, which were what prompted the efficiency investigation.
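
As a rough illustration of the multiprocessing idea (this is not the example code added in this PR), independent scheduling runs can be spread across worker processes. Everything here is hypothetical: `estimate_makespan` is a toy greedy stand-in for a real scheduling call, and the `(name, task_costs, n_machines)` tuple is an assumed input shape.

```python
from multiprocessing import Pool


def estimate_makespan(workflow):
    """Hypothetical stand-in for one scheduling run.

    `workflow` is an assumed (name, task_costs, n_machines) tuple.
    """
    name, task_costs, n_machines = workflow
    # Toy greedy placement: put each task on the machine that frees up first.
    finish = [0] * n_machines
    for cost in task_costs:
        idx = finish.index(min(finish))
        finish[idx] += cost
    return name, max(finish)


def schedule_all(workflows, processes=2):
    """Run one independent scheduling job per workflow in a process pool."""
    with Pool(processes=processes) as pool:
        return dict(pool.map(estimate_makespan, workflows))


if __name__ == "__main__":
    workflows = [
        ("small", [3, 1, 2], 2),
        ("large", list(range(1, 101)), 4),
    ]
    print(schedule_all(workflows))
```

The worker function is defined at module level so it can be pickled and sent to the pool, and the `__main__` guard keeps the pool from being re-created when worker processes import the module.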

myxie added 4 commits March 27, 2022 15:28
Previously we pre-calculated all task costs on machines and then accessed them through calculated_runtime.

When running with large files, this was ridiculously wasteful and would cause SHADOW to hang.

SHADOW can now process an example skaworkflow (>4400 tasks) on an HPC environment (~900 machines) in ~1 minute.

FCFS was also incorrect, so we fixed that as well.
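
A minimal sketch of the on-demand approach described above, assuming runtime is simply a task's compute demand divided by a machine's speed. `RuntimeTable`, its attributes, and that cost model are illustrative assumptions, not SHADOW's actual API.

```python
class RuntimeTable:
    """Hypothetical helper: compute task runtimes lazily and cache them."""

    def __init__(self, task_flops, machine_flops_per_sec):
        # Both arguments are assumed to be plain dicts keyed by name.
        self.task_flops = task_flops
        self.machine_speed = machine_flops_per_sec
        self._cache = {}

    def runtime(self, task, machine):
        """Return the runtime of `task` on `machine`, computing it at most once."""
        key = (task, machine)
        if key not in self._cache:
            # Only pairs the scheduler actually asks about are ever computed.
            self._cache[key] = self.task_flops[task] / self.machine_speed[machine]
        return self._cache[key]


table = RuntimeTable({"t1": 100.0, "t2": 400.0}, {"m1": 50.0, "m2": 25.0})
print(table.runtime("t1", "m1"))
```

With roughly 4400 tasks and 900 machines, a full precomputed table holds about four million entries, whereas the cache above only grows with the pairs that are actually queried.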
Previously we would generate slots for all allocations, rather
than excluding allocations that finish before the earliest
possible start time of the current task. Now we only consider
the necessary allocations, rather than all of them.
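
The filtering described above can be sketched roughly as follows. This assumes each machine keeps a list of non-overlapping `(start, finish)` allocations sorted by start time; `earliest_start` and its parameters are hypothetical names rather than SHADOW's own.

```python
def earliest_start(allocations, ready_time, duration):
    """Find the earliest start >= ready_time for a task of length `duration`.

    `allocations` is assumed to be a list of non-overlapping (start, finish)
    tuples on one machine, sorted by start time.
    """
    # Allocations that finish by ready_time cannot affect the result, so skip them.
    relevant = [(s, f) for s, f in allocations if f > ready_time]

    candidate = ready_time
    for start, finish in relevant:
        if candidate + duration <= start:
            # The task fits in the gap before this allocation.
            return candidate
        # Otherwise the task cannot start until this allocation has finished.
        candidate = max(candidate, finish)
    return candidate


busy = [(0, 5), (5, 10), (20, 30)]
print(earliest_start(busy, ready_time=12, duration=5))
```

In this example only the `(20, 30)` allocation is examined, because the first two finish before the task could start.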
myxie merged commit c8583bf into development on Jun 7, 2022