Cross Pool Home-Away Preemption #3988

Merged
merged 3 commits into from
Oct 8, 2024

Conversation

d80tb7
Collaborator

@d80tb7 commented Oct 8, 2024

Implementation of cross-pool home-away preemption. The main changes are as follows:

* Pools must now be enumerated up front in the scheduler config rather than determined dynamically.
* A pool can now have one or more awayPools associated with it. Jobs from the pool may run as away jobs on each of its awayPools.
* A pool that has only awayPools set is essentially a synthetic pool: it contains no first-class nodes, and jobs scheduled on it can only run as away jobs on other pools.
* Synthetic pools otherwise behave like normal pools, except that their size is the capacity of their away pool(s) minus the resources used by any home jobs running on those pools.
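
The sizing rule for synthetic pools can be illustrated with a minimal Go sketch. It is not the code from this PR: `PoolConfig`, `syntheticPoolSize`, and the single-integer resource model (for example, CPU cores) are hypothetical names and simplifications chosen for illustration, and the real scheduler tracks multi-dimensional resources.

```go
package main

import "fmt"

// PoolConfig is a hypothetical stand-in for one pool entry in the
// scheduler config; the real config type may differ.
type PoolConfig struct {
	Name      string
	AwayPools []string
}

// syntheticPoolSize sums, over every away pool, that pool's capacity
// minus the resources used by home jobs running on it. Resources are
// modelled as one int64 per pool, which is an assumption of this sketch.
func syntheticPoolSize(pool PoolConfig, capacity, homeUsage map[string]int64) int64 {
	var size int64
	for _, away := range pool.AwayPools {
		size += capacity[away] - homeUsage[away]
	}
	return size
}

func main() {
	// "cpu-away" has only away pools set, so it owns no nodes itself.
	synthetic := PoolConfig{Name: "cpu-away", AwayPools: []string{"gpu"}}

	capacity := map[string]int64{"gpu": 64}  // total capacity of the gpu pool
	homeUsage := map[string]int64{"gpu": 24} // used by gpu's own home jobs

	fmt.Printf("size of %s: %d\n", synthetic.Name, syntheticPoolSize(synthetic, capacity, homeUsage))
}
```

With these illustrative numbers the synthetic pool has 40 units available, and that figure shrinks as home jobs on the gpu pool take up more of its capacity.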

James Murkin and others added 3 commits October 8, 2024 11:12
* First step

* Step 2

* Step 4

* Remove change

* Remove change

* Make nodedb and sctx upfront

* Lint

* WIP

* wip - now compiling

* Hack submitcheck to assign pools correctly

* Minimal working version

* Remove scheduling order

* Fix tests

* Lint

* fix tests

* Fix tests

* Remove unused function

* Add tests

* Update submitcheck tests

* Add ScheduledAway field to PodSchedulingContext

* Add tests for scheduling_algo away scheduling

* Remove unused testfixtures funcs

* Remove debug code

* Simplify pool config + add default config

* Remove intersection

* wip

* Populate ResourceUsageByQueueAndPool on executor

* Remove unused func

* fix merge error

* wip

* Populate pool on pods + fix usage metric calculation

* Simplify

* Fix api tests

* Update job_repository tests

* fixed test

* added extra test

* lint

* linting

* remove stray file

* Revert "remove stray file"

This reverts commit faaaa969d6c4c93e933b316768c08650e178321a.

* remove floatingResources for nodedb

* fix test

* renmae

* rename to UnallocatableResources

---------

Co-authored-by: chrismar503 <chris.martin@gresearch.co.uk>
* First step

* Step 2

* Step 4

* Remove change

* Remove change

* Make nodedb and sctx upfront

* Lint

* WIP

* wip - now compiling

* Hack submitcheck to assign pools correctly

* Minimal working version

* Remove scheduling order

* Fix tests

* Lint

* fix tests

* Fix tests

* Remove unused function

* Add tests

* Update submitcheck tests

* Add ScheduledAway field to PodSchedulingContext

* Add tests for scheduling_algo away scheduling

* Remove unused testfixtures funcs

* Remove debug code

* Simplify pool config + add default config

* Remove intersection

* wip

* Populate ResourceUsageByQueueAndPool on executor

* Remove unused func

* fix merge error

* wip

* Populate pool on pods + fix usage metric calculation

* Simplify

* Fix api tests

* Update job_repository tests

* fixed test

* added extra test

* lint

* linting

* remove stray file

* Revert "remove stray file"

This reverts commit faaaa969d6c4c93e933b316768c08650e178321a.

* remove floatingResources for nodedb

* fix test

* renmae

* rename to UnallocatableResources

* fix test

---------

Co-authored-by: jamesmur913 <james.murkin@gresearch.co.uk>
@d80tb7 merged commit e0ad8d9 into master Oct 8, 2024
24 checks passed
@d80tb7 deleted the sendToGitHub/cross-pool-home-away branch October 8, 2024 11:20