scx_layered: Implement empty LLC draining #1092

Merged: 4 commits, Dec 11, 2024

htejun (Contributor) commented on Dec 10, 2024

A layer is served by per-LLC DSQs. Only tasks that can be serviced by all
CPUs in an LLC are put in that LLC's DSQ, so as long as the layer has a CPU
assigned in the LLC, forward progress is guaranteed. However, if a layer-LLC
loses all its CPUs, nothing guarantees forward progress apart from the
antistall mechanism. For a confined layer, no CPU will visit the empty LLCs.
Even for a grouped or open layer, execution from an LLC without assigned
CPUs has lower priority than owned execution and can easily starve.

To resolve the problem, implement an LLC draining mechanism. When a
layer-LLC loses all its CPUs while tasks are still queued in it, draining is
turned on, and the other CPUs assigned to the layer alternate between their
own execution and draining the LLCs that have no CPU. The interlocking
between the involved code paths - refresh_cpumasks(), layered_enqueue() and
layered_dispatch() - is rather intricate, as it must guarantee that no task
ends up sitting in a CPU-less LLC. See the code comments for details.
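
The alternation described above can be illustrated with a small userspace
simulation. This is only a sketch under stated assumptions, not the
scx_layered BPF code: `struct llc_state`, `struct layer_state`,
`refresh_draining()`, `consume_draining()` and `dispatch_one()` are
hypothetical names, queues are modeled as plain counters, and none of the
real interlocking between enqueue, dispatch and cpumask refresh is modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LLCS 4

/* Hypothetical per-layer, per-LLC state; queues are modeled as counters. */
struct llc_state {
	int nr_cpus;	/* CPUs the layer currently owns in this LLC */
	int nr_queued;	/* tasks waiting in this layer-LLC queue */
	bool draining;	/* set when the LLC has tasks but no CPU */
};

struct layer_state {
	struct llc_state llcs[MAX_LLCS];
	int nr_llcs;
};

/* After CPU assignment changes, mark CPU-less LLCs that still hold tasks. */
static void refresh_draining(struct layer_state *layer)
{
	for (int i = 0; i < layer->nr_llcs; i++) {
		struct llc_state *llc = &layer->llcs[i];

		llc->draining = llc->nr_cpus == 0 && llc->nr_queued > 0;
	}
}

/* Take one task from the first draining LLC. Returns its index or -1. */
static int consume_draining(struct layer_state *layer)
{
	for (int i = 0; i < layer->nr_llcs; i++) {
		struct llc_state *llc = &layer->llcs[i];

		if (!llc->draining || llc->nr_queued == 0)
			continue;
		if (--llc->nr_queued == 0)
			llc->draining = false;	/* fully drained */
		return i;
	}
	return -1;
}

/*
 * One dispatch attempt by a CPU whose own LLC is @own_llc. @drain_turn
 * flips on every call so that the CPU alternates between its own queue
 * and draining LLCs. Returns the LLC index consumed from, or -1 if
 * nothing was runnable.
 */
static int dispatch_one(struct layer_state *layer, int own_llc,
			bool *drain_turn)
{
	bool try_drain_first = *drain_turn;
	int idx;

	assert(own_llc >= 0 && own_llc < layer->nr_llcs);
	*drain_turn = !*drain_turn;

	if (try_drain_first) {
		idx = consume_draining(layer);
		if (idx >= 0)
			return idx;
	}

	if (layer->llcs[own_llc].nr_queued > 0) {
		layer->llcs[own_llc].nr_queued--;
		return own_llc;
	}

	/* Own queue is empty: drain instead of idling, unless just tried. */
	return try_drain_first ? -1 : consume_draining(layer);
}
```

With one CPU in LLC 0 and two tasks queued in each of LLC 0 and a CPU-less
LLC 1, successive `dispatch_one()` calls in this model consume from LLC 0
and LLC 1 in turn, so the tasks in the CPU-less LLC cannot starve behind
owned execution.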

Use < instead of <= when comparing against xllc_mig_min_ns so that 0 can
disable it completely.
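
The effect of the comparison change can be seen in isolation. The sketch
below is an assumption-laden illustration: the quantity compared against
xllc_mig_min_ns is not stated here, so `sample_ns` and both helper names are
hypothetical. It only shows why a strict comparison lets a threshold of 0
disable the check, while an inclusive one still fires for a sample of 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Strict form: with a threshold of 0 this can never be true. */
static bool under_min_strict(uint64_t sample_ns, uint64_t xllc_mig_min_ns)
{
	return sample_ns < xllc_mig_min_ns;
}

/* Inclusive form: still true for sample_ns == 0 when the threshold is 0. */
static bool under_min_inclusive(uint64_t sample_ns, uint64_t xllc_mig_min_ns)
{
	return sample_ns <= xllc_mig_min_ns;
}
```

Since both arguments are unsigned, `sample_ns < 0` is false for every
input, which is what makes 0 a clean "off" value for the strict form.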

This is trivial to count from BPF but we'll also add per-LLC counts, so
let's just do whatever we can do from userspace in userspace.

htejun added this pull request to the merge queue on Dec 11, 2024
Merged via the queue into main with commit 1946c01 on Dec 11, 2024
46 checks passed
htejun deleted the htejun/layered-updates branch on December 11, 2024 01:29
htejun restored the htejun/layered-updates branch on December 11, 2024 01:44