client: avoid overscheduling CPUs in presence of MT jobs #5257
In 20ff585 we changed the sched policy so that e.g. if there are 2 4-CPU jobs on a 6-CPU host, it runs them both. I.e. overscheduling the CPUs is better than starving them.
This commit refines this a bit: if in addition to the MT jobs there are some 1-CPU jobs, it runs one MT job and two of the 1-CPU jobs.
Also: show resource usage in cpu_sched_debug messages
Also: if CPUs are starved, trigger a work request. This logic was mistakenly hidden in an if (log_flags.cpu_sched_debug)
Also: don't ignore log flags in the simulator
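The policy described above can be illustrated with a small two-pass selection. This is only a sketch under stated assumptions, not the client's actual scheduler: `SketchJob`, `pick_jobs`, and `total_cpus` are hypothetical names, jobs are assumed to arrive already ordered by priority, and GPU usage and deadlines are ignored.

```cpp
#include <cassert>
#include <vector>

// Hypothetical job record; not the BOINC client's real data structure.
struct SketchJob {
    int ncpus;  // 1 = single-core job, >1 = multithreaded (MT) job
};

// Select jobs from a priority-ordered list; returns indices of chosen jobs.
std::vector<int> pick_jobs(const std::vector<SketchJob>& jobs, int host_cpus) {
    assert(host_cpus > 0);
    std::vector<int> chosen;
    std::vector<int> skipped;  // jobs that did not fit in the first pass
    int used = 0;

    // Pass 1: take every job that fits without exceeding host_cpus.
    for (int i = 0; i < static_cast<int>(jobs.size()); ++i) {
        if (used + jobs[i].ncpus <= host_cpus) {
            chosen.push_back(i);
            used += jobs[i].ncpus;
        } else {
            skipped.push_back(i);
        }
    }

    // Pass 2: if CPUs would otherwise sit idle, overschedule rather than starve.
    for (int i : skipped) {
        if (used >= host_cpus) break;
        chosen.push_back(i);
        used += jobs[i].ncpus;
    }
    return chosen;
}

// Total CPUs claimed by the chosen jobs.
int total_cpus(const std::vector<SketchJob>& jobs, const std::vector<int>& chosen) {
    int n = 0;
    for (int i : chosen) n += jobs[i].ncpus;
    return n;
}
```

With two 4-CPU jobs on a 6-CPU host, the second pass starts both (8 CPUs claimed). With two 4-CPU jobs followed by three 1-CPU jobs, the first pass already fills the host with one MT job and two 1-CPU jobs, so no overscheduling happens.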
Thanks David. I'll wait until the CI checks have completed and the artifacts have been built, then test it.
Codecov Report
Additional details and impacted files
@@            Coverage Diff            @@
##             master    #5257   +/-   ##
=========================================
- Coverage     10.86%   10.86%   -0.01%
  Complexity     1064     1064
=========================================
  Files           279      279
  Lines         35968    35969      +1
  Branches       8275     8275
=========================================
  Hits           3909     3909
- Misses        31667    31668      +1
  Partials        392      392
I've loaded the artifacts from this PR (f0685c4), and re-created the scenario from yesterday. I don't think we've quite hit the mark - in fact, I think we've rather overshot it on the other side. Here's a full, unedited, log cycle for cpu_sched_debug:
I'm currently running two GPU tasks specified to use one full CPU core each, and four separate CPU single core tasks. One of the single core tasks has just finished - yesterday, in this situation, the first available 3-core Amicable Numbers MT task was started, and (one minute later) two further single core tasks were suspended. Today, the problem is at the very end:
The overcommit protection has kicked in too soon, and no MT task can run at all - a new single core task has been started instead. In this situation, if the single core tasks finish singly, no MT task will run until forced by deadline pressure.
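The starvation described here can be made concrete with a toy count. The sketch below assumes a strict rule that never overcommits, a host where every busy CPU belongs to a single-core task, and tasks that finish one at a time; `completions_before_mt_starts` is a hypothetical helper, and deadline pressure is deliberately not modeled.

```cpp
#include <cassert>

// Counts how many single-core tasks must finish before an MT job needing
// mt_ncpus can start, if the scheduler never overcommits and always
// backfills a freed CPU with a queued single-core task when one exists.
// Returns -1 if the MT job can never fit on the host.
int completions_before_mt_starts(int host_cpus, int busy_cpus,
                                 int mt_ncpus, int queued_single) {
    assert(host_cpus > 0 && mt_ncpus > 0);
    assert(busy_cpus >= 0 && busy_cpus <= host_cpus);
    int completions = 0;
    while (host_cpus - busy_cpus < mt_ncpus) {
        if (busy_cpus == 0) return -1;  // host is idle and MT job still does not fit
        --busy_cpus;                    // one running single-core task finishes
        ++completions;
        if (host_cpus - busy_cpus >= mt_ncpus) break;  // MT job finally fits
        if (queued_single > 0) {        // otherwise a queued 1-CPU task takes the slot
            --queued_single;
            ++busy_cpus;
        }
    }
    return completions;
}
```

On a fully busy 6-CPU host with a 3-CPU MT job waiting, an empty single-core queue lets the MT job start after 3 completions, while a queue of 10 single-core tasks pushes that out to 13. The MT job waits for the whole single-core queue to drain, which is the behaviour reported above.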
But then again, nearly an hour later, this happened:
(I'd turned off cpu_sched_debug, and this log is taken from remote monitoring on a Windows machine). I'd also suspended all remaining unstarted single core tasks, so there wasn't one available to replace the one which finished - not a "normal running" situation. Notice the 1 minute preempt delay has occurred again.
run MT jobs even if they overcommit CPUs. The problem with running 1-CPU jobs instead is that the MT job may never run until it's in deadline pressure.
Ah. If we let 1-CPU jobs cut in front of the MT job, this may go on indefinitely. This PR contains other improvements so I'm keeping it.
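One of the other improvements in the PR description is the starvation check that was "mistakenly hidden in an if (log_flags.cpu_sched_debug)". The general shape of that kind of bug can be sketched as follows. This is not the client's actual code: `SketchLogFlags`, `SketchClient`, `work_request_pending`, and both functions are hypothetical stand-ins, and only the flag name `cpu_sched_debug` comes from the PR text.

```cpp
#include <cassert>
#include <cstdio>

struct SketchLogFlags { bool cpu_sched_debug = false; };    // stand-in for log_flags
struct SketchClient { bool work_request_pending = false; }; // hypothetical client state

// Buggy shape: the side effect sits inside the logging branch, so the
// work request is only triggered when debug logging is switched on.
void check_starved_buggy(int cpus_used, int host_cpus,
                         const SketchLogFlags& flags, SketchClient& client) {
    assert(host_cpus > 0);
    if (cpus_used < host_cpus) {
        if (flags.cpu_sched_debug) {
            std::printf("[cpu_sched_debug] using %d of %d CPUs\n", cpus_used, host_cpus);
            client.work_request_pending = true;
        }
    }
}

// Fixed shape: logging stays conditional, the work request does not.
void check_starved_fixed(int cpus_used, int host_cpus,
                         const SketchLogFlags& flags, SketchClient& client) {
    assert(host_cpus > 0);
    if (cpus_used < host_cpus) {
        if (flags.cpu_sched_debug) {
            std::printf("[cpu_sched_debug] using %d of %d CPUs\n", cpus_used, host_cpus);
        }
        client.work_request_pending = true;
    }
}
```

With the flag off and 4 of 6 CPUs in use, the buggy version leaves `work_request_pending` false while the fixed version sets it, so behaviour no longer depends on a logging option.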
We also need to consider the other boundary condition - work fetch. In this test, the single-core task (numberfields) is my regular backfill project: the MT tasks (Amicable) are new for this test. They have the same resource share. Amicable has already fetched a new task: numberfields has not - that's to be expected from fetch priorities. The machine is likely to have run dry of single-core tasks in around 3 hours - but after I've gone to bed. It will be interesting to see in the morning what it's done overnight.
Yes, as I half expected, work fetch failed to take account of the need for single-core tasks.
That new Amicable task was the second to run concurrently, so we're back to 8-core overcommitment. But manually forcing a work fetch brought the core count back to 6:
(the task I suspended was cached, unstarted)
Fixes #5254