
scx_layered: Make low fallback DSQs useful and other changes #1076

Merged 12 commits into main from htejun/layered-updates on Dec 8, 2024

Conversation

@htejun htejun (Contributor) commented Dec 8, 2024

  • Implement starvation prevention for low fallback DSQs and use them to execute tasks with custom affinities as well as tasks from empty layers.
  • Simplify owned execution protection.
  • Guarantee layer growth of high util_range layers.
  • Other changes.

htejun added 12 commits December 6, 2024 19:05

Currently, DSQ IDs are allocated consecutively and it's not trivial to tell
which layer a given DSQ ID belongs to because that depends on the number of
LLCs in the system. Instead, assign fixed ranges to DSQs so that, when
printed in hex, it's trivial to tell what type of DSQ it is and which
layer/LLC it belongs to (a rough sketch of the idea is included below).

While at it:
- Remove a duplicate antistall_set() call on HI_FALLBACK_DSQ_BASE.
- Remove unused rotate_layer_id() and rotate_llc_id().

And drop llc_ from the dsq_id functions so that they are more consistent
with layer_dsq_id(). The lo fallback DSQs are still unused; the next patch
will implement starvation prevention for lo fallbacks and use them for
affinity-violating tasks.
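
As a rough illustration of the fixed-range idea (the bases, field widths,
and helper names below are hypothetical, not the actual layout in the
patch), encoding the DSQ class and indices into dedicated bit ranges makes
a hex dump self-describing:

```rust
// Hypothetical fixed-range DSQ ID encoding: the top byte names the DSQ
// class and the low bits carry the layer/LLC indices.
const HI_FALLBACK_DSQ_BASE: u64 = 0x0100_0000; // one per LLC
const LAYER_DSQ_BASE: u64 = 0x0200_0000;       // one per (layer, LLC)
const LO_FALLBACK_DSQ_BASE: u64 = 0x0300_0000; // lo fallbacks

fn hi_fallback_dsq_id(llc: u64) -> u64 {
    HI_FALLBACK_DSQ_BASE | llc
}

fn layer_dsq_id(layer: u64, llc: u64) -> u64 {
    LAYER_DSQ_BASE | (layer << 8) | llc
}

fn main() {
    // 0x02000103 reads as "layer DSQ, layer 1, LLC 3" at a glance.
    println!("{:#010x}", layer_dsq_id(1, 3));
    println!("{:#010x}", hi_fallback_dsq_id(2));
    println!("{:#010x}", LO_FALLBACK_DSQ_BASE);
}
```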

Track CPU usage of the fallback DSQs so that we can tell how much CPU they
consume.

Repeating the pattern "bpf_intf::*_stat_id_ID as usize" in the code is too
distracting. Define Rust consts up top instead.
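
A minimal sketch of the shape of that cleanup (the bpf_intf module and stat
names below are stand-ins for the generated bindings, not the actual
identifiers):

```rust
// Stand-in for the bindgen-generated bindings module.
#[allow(non_upper_case_globals)]
mod bpf_intf {
    pub const layer_stat_id_LSTAT_SEL_LOCAL: u32 = 0;
    pub const layer_stat_id_LSTAT_ENQ_LOCAL: u32 = 1;
}

// Cast the generated IDs into usize consts once, up top, instead of
// repeating "bpf_intf::*_stat_id_* as usize" at every use site.
const LSTAT_SEL_LOCAL: usize = bpf_intf::layer_stat_id_LSTAT_SEL_LOCAL as usize;
const LSTAT_ENQ_LOCAL: usize = bpf_intf::layer_stat_id_LSTAT_ENQ_LOCAL as usize;

fn main() {
    let mut stats = [0u64; 2];
    stats[LSTAT_SEL_LOCAL] += 1; // was: stats[bpf_intf::layer_stat_id_LSTAT_SEL_LOCAL as usize] += 1;
    stats[LSTAT_ENQ_LOCAL] += 1;
    println!("{:?}", stats);
}
```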

Because low fallback DSQs were prone to starvation, we couldn't use them for
tasks with custom affinities; instead, those tasks were put on hi fallback
DSQs. This isn't what we want: we want to discourage the use of custom
affinities, not encourage them.

Implement low fallback DSQ starvation prevention: they are guaranteed a
share of each CPU after a set delay. As this makes low fallback DSQs safe to
use, use them to execute tasks with custom affinities.
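
The enforcement lives in the BPF scheduler; the core of the guarantee can be
sketched as a timing check on the dispatch path (the delay value and helper
below are made up for illustration, not the actual implementation):

```rust
// Hypothetical model of the lo fallback guarantee: the lo fallback DSQ is
// normally consumed last, but once its oldest task has waited longer than
// a set delay it gets the next slice on the CPU, so it can't be starved
// indefinitely.
const LO_FALLBACK_DELAY_NS: u64 = 100_000_000; // illustrative: 100ms

fn lo_fallback_must_run(now_ns: u64, oldest_queued_at_ns: Option<u64>) -> bool {
    match oldest_queued_at_ns {
        Some(t) => now_ns.saturating_sub(t) >= LO_FALLBACK_DELAY_NS,
        None => false, // lo fallback DSQ is empty
    }
}

fn main() {
    let now = 1_000_000_000u64;
    // Queued 50ms ago: keep serving layer and hi fallback DSQs first.
    assert!(!lo_fallback_must_run(now, Some(now - 50_000_000)));
    // Queued 150ms ago: the lo fallback DSQ is guaranteed a turn.
    assert!(lo_fallback_must_run(now, Some(now - 150_000_000)));
    println!("ok");
}
```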

Now that low fallback DSQs are usable, we can put empty-layer tasks there
instead of depending on the fallback CPU. The fallback CPU is still needed
because race conditions can put stray tasks in empty layers, but this should
be rare and the impact negligible.

As the fallback CPU was already prioritizing empty layers right after the hi
fallback DSQs, the owned protection wasn't doing much and is completely
unnecessary now. Rip it out.

Also add a fallback CPU util metric to verify that it always stays close
to zero.

Rust init code already sets layer->slice_ns to the default slice_ns if not
specified otherwise. Remove the unnecessary accessor function. Also, the
yield path was incorrectly using the system slice_ns; switch to
layer->slice_ns.

Modulating owned execution protection doesn't show any noticeable benefit
over always protecting fully, while complicating the behavior unnecessarily.
Simplify.

Because CPU allocations are on core boundaries, layers sometimes fail to
grow when the amount of growth requested is smaller than the core size.
Guarantee growth by activating the force_free logic whenever there are
layers that want to grow.
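
A rough model of that decision (the struct and field names are hypothetical,
not the scheduler's actual data structures): whenever some layer wants more
CPUs than it has and no core is free, take the force_free path so a core is
freed up for it.

```rust
// Hypothetical sketch of the growth guarantee. CPU allocation happens in
// whole cores, so a layer asking for less than a core's worth of CPUs can
// otherwise stall; force freeing a core whenever any layer wants to grow.
struct Layer {
    cpus_wanted: usize,
    cpus_assigned: usize,
}

fn wants_growth(layer: &Layer) -> bool {
    layer.cpus_wanted > layer.cpus_assigned
}

fn should_force_free(layers: &[Layer], free_cores: usize) -> bool {
    free_cores == 0 && layers.iter().any(wants_growth)
}

fn main() {
    let layers = [
        Layer { cpus_wanted: 4, cpus_assigned: 4 },
        // Wants one more CPU, i.e. less than a full (e.g. 2-thread) core.
        Layer { cpus_wanted: 5, cpus_assigned: 4 },
    ];
    println!("force_free = {}", should_force_free(&layers, 0)); // true
}
```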

Owned execution is almost always protected now. Not interesting anymore.
- Layer util sum didn't include open usage because the sum end index was off
  by one. Fix it.

- Make calc_frac() a proper function and use it in more places.
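
For reference, calc_frac() in this kind of stats code is typically just a
guarded ratio; a minimal version (the real one may differ in details) is:

```rust
/// Fraction a/b, guarded so that an idle interval (b == 0) reports 0.0
/// instead of NaN or infinity. Useful for util fractions such as
/// layer_util / total_util.
fn calc_frac(a: f64, b: f64) -> f64 {
    if b != 0.0 { a / b } else { 0.0 }
}

fn main() {
    // e.g. a layer that used 2.5 CPUs' worth of time on a 16-CPU machine.
    println!("{:.3}", calc_frac(2.5, 16.0)); // 0.156
    println!("{:.3}", calc_frac(0.0, 0.0));  // 0.000 rather than NaN
}
```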

@likewhatevs likewhatevs (Contributor) left a comment

LGTM, rly like the stats cleanup

@htejun htejun added this pull request to the merge queue Dec 8, 2024
Merged via the queue into main with commit 00dac9e Dec 8, 2024
46 checks passed
@htejun htejun deleted the htejun/layered-updates branch December 8, 2024 10:18