
scx_layered: Make low fallback DSQs useful and other changes #1076

Merged 12 commits into main from htejun/layered-updates on Dec 8, 2024

Conversation

@htejun htejun (Contributor) commented Dec 8, 2024

  • Implement starvation prevention for low fallback DSQs and use them to execute tasks with custom affinities as well as tasks from empty layers.
  • Simplify owned execution protection.
  • Guarantee layer growth of high util_range layers.
  • Other changes.

htejun added 12 commits December 6, 2024 19:05

Currently, DSQ IDs are allocated consecutively and it's not trivial to tell
which layer a given DSQ ID belongs to because that depends on the number of
LLCs in the system. Instead, assign fixed ranges to DSQs so that, when
printed in hex, it's trivial to tell what type of DSQ it is and which
layer/LLC it belongs to (a rough sketch of the idea is included below).

While at it:
- Remove a duplicate antistall_set() call on HI_FALLBACK_DSQ_BASE.
- Remove unused rotate_layer_id() and rotate_llc_id().

And drop llc_ from the dsq_id functions so that they are more consistent
with layer_dsq_id(). The lo fallback DSQs are still unused; the next patch
will implement starvation prevention for lo fallbacks and use them for
affinity-violating tasks.
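
As a rough illustration of the fixed-range idea (the bases, field widths,
and helper names below are hypothetical, not the actual layout in the
patch), encoding the DSQ class and indices into dedicated bit ranges makes
a hex dump self-describing:

```rust
// Hypothetical fixed-range DSQ ID encoding: the top byte names the DSQ
// class and the low bits carry the layer/LLC indices.
const HI_FALLBACK_DSQ_BASE: u64 = 0x0100_0000; // one per LLC
const LAYER_DSQ_BASE: u64 = 0x0200_0000;       // one per (layer, LLC)
const LO_FALLBACK_DSQ_BASE: u64 = 0x0300_0000; // lo fallbacks

fn hi_fallback_dsq_id(llc: u64) -> u64 {
    HI_FALLBACK_DSQ_BASE | llc
}

fn layer_dsq_id(layer: u64, llc: u64) -> u64 {
    LAYER_DSQ_BASE | (layer << 8) | llc
}

fn main() {
    // 0x02000103 reads as "layer DSQ, layer 1, LLC 3" at a glance.
    println!("{:#010x}", layer_dsq_id(1, 3));
    println!("{:#010x}", hi_fallback_dsq_id(2));
    println!("{:#010x}", LO_FALLBACK_DSQ_BASE);
}
```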

Track CPU usage of the fallback DSQs so that we can tell how much CPU they
consume.

Repeating the pattern "bpf_intf::*_stat_id_ID as usize" in the code is too
distracting. Define Rust consts up top instead.
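
A minimal sketch of the shape of that cleanup (the bpf_intf module and stat
names below are stand-ins for the generated bindings, not the actual
identifiers):

```rust
// Stand-in for the bindgen-generated bindings module.
#[allow(non_upper_case_globals)]
mod bpf_intf {
    pub const layer_stat_id_LSTAT_SEL_LOCAL: u32 = 0;
    pub const layer_stat_id_LSTAT_ENQ_LOCAL: u32 = 1;
}

// Cast the generated IDs into usize consts once, up top, instead of
// repeating "bpf_intf::*_stat_id_* as usize" at every use site.
const LSTAT_SEL_LOCAL: usize = bpf_intf::layer_stat_id_LSTAT_SEL_LOCAL as usize;
const LSTAT_ENQ_LOCAL: usize = bpf_intf::layer_stat_id_LSTAT_ENQ_LOCAL as usize;

fn main() {
    let mut stats = [0u64; 2];
    stats[LSTAT_SEL_LOCAL] += 1; // was: stats[bpf_intf::layer_stat_id_LSTAT_SEL_LOCAL as usize] += 1;
    stats[LSTAT_ENQ_LOCAL] += 1;
    println!("{:?}", stats);
}
```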

Because low fallback DSQs were prone to starvation, we couldn't use them for
tasks with custom affinities; instead, those tasks were put on hi fallback
DSQs. This isn't what we want: we want to discourage the use of custom
affinities, not encourage them.

Implement low fallback DSQ starvation prevention: they are guaranteed a
share of each CPU after a set delay. As this makes low fallback DSQs safe to
use, use them to execute tasks with custom affinities.
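
The enforcement lives in the BPF scheduler; the core of the guarantee can be
sketched as a timing check on the dispatch path (the delay value and helper
below are made up for illustration, not the actual implementation):

```rust
// Hypothetical model of the lo fallback guarantee: the lo fallback DSQ is
// normally consumed last, but once its oldest task has waited longer than
// a set delay it gets the next slice on the CPU, so it can't be starved
// indefinitely.
const LO_FALLBACK_DELAY_NS: u64 = 100_000_000; // illustrative: 100ms

fn lo_fallback_must_run(now_ns: u64, oldest_queued_at_ns: Option<u64>) -> bool {
    match oldest_queued_at_ns {
        Some(t) => now_ns.saturating_sub(t) >= LO_FALLBACK_DELAY_NS,
        None => false, // lo fallback DSQ is empty
    }
}

fn main() {
    let now = 1_000_000_000u64;
    // Queued 50ms ago: keep serving layer and hi fallback DSQs first.
    assert!(!lo_fallback_must_run(now, Some(now - 50_000_000)));
    // Queued 150ms ago: the lo fallback DSQ is guaranteed a turn.
    assert!(lo_fallback_must_run(now, Some(now - 150_000_000)));
    println!("ok");
}
```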

Now that low fallback DSQs are usable, we can put empty-layer tasks there
instead of depending on the fallback CPU. The fallback CPU is still needed
because race conditions can put stray tasks in empty layers, but this should
be rare and the impact negligible.

As the fallback CPU was already prioritizing empty layers right after the hi
fallback DSQs, the owned protection wasn't doing much and is completely
unnecessary now. Rip it out.

Also add a fallback CPU util metric to verify that it always stays close
to zero.

Rust init code already sets layer->slice_ns to the default slice_ns if not
specified otherwise. Remove the unnecessary accessor function. Also, the
yield path was incorrectly using the system slice_ns; switch to
layer->slice_ns.

Modulating owned execution protection doesn't show any noticeable benefit
over always protecting fully, while complicating the behavior unnecessarily.
Simplify.

Because CPU allocations are on core boundaries, layers sometimes fail to
grow when the amount of growth requested is smaller than the core size.
Guarantee growth by activating the force_free logic whenever there are
layers that want to grow.
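
A rough model of that decision (the struct and field names are hypothetical,
not the scheduler's actual data structures): whenever some layer wants more
CPUs than it has and no core is free, take the force_free path so a core is
freed up for it.

```rust
// Hypothetical sketch of the growth guarantee. CPU allocation happens in
// whole cores, so a layer asking for less than a core's worth of CPUs can
// otherwise stall; force freeing a core whenever any layer wants to grow.
struct Layer {
    cpus_wanted: usize,
    cpus_assigned: usize,
}

fn wants_growth(layer: &Layer) -> bool {
    layer.cpus_wanted > layer.cpus_assigned
}

fn should_force_free(layers: &[Layer], free_cores: usize) -> bool {
    free_cores == 0 && layers.iter().any(wants_growth)
}

fn main() {
    let layers = [
        Layer { cpus_wanted: 4, cpus_assigned: 4 },
        // Wants one more CPU, i.e. less than a full (e.g. 2-thread) core.
        Layer { cpus_wanted: 5, cpus_assigned: 4 },
    ];
    println!("force_free = {}", should_force_free(&layers, 0)); // true
}
```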

Owned execution is almost always protected now. Not interesting anymore.
- Layer util sum didn't include open usage because the sum end index was off
  by one. Fix it.

- Make calc_frac() a proper function and use it in more places.
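
For reference, calc_frac() in this kind of stats code is typically just a
guarded ratio; a minimal version (the real one may differ in details) is:

```rust
/// Fraction a/b, guarded so that an idle interval (b == 0) reports 0.0
/// instead of NaN or infinity. Useful for util fractions such as
/// layer_util / total_util.
fn calc_frac(a: f64, b: f64) -> f64 {
    if b != 0.0 { a / b } else { 0.0 }
}

fn main() {
    // e.g. a layer that used 2.5 CPUs' worth of time on a 16-CPU machine.
    println!("{:.3}", calc_frac(2.5, 16.0)); // 0.156
    println!("{:.3}", calc_frac(0.0, 0.0));  // 0.000 rather than NaN
}
```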

@likewhatevs likewhatevs (Contributor) left a comment

LGTM, rly like the stats cleanup

@htejun htejun added this pull request to the merge queue Dec 8, 2024
Merged via the queue into main with commit 00dac9e Dec 8, 2024
46 checks passed
@htejun htejun deleted the htejun/layered-updates branch December 8, 2024 10:18