
uma: Remove a bogus setbounds operation #2271

Open · wants to merge 1 commit into dev
Conversation

@markjdb (Contributor) commented Dec 18, 2024

No description provided.

It's up to the backend allocator to apply bounds.  When UMA is allocating
memory, this is already done in the keg (slab allocator) layer.  For cache
zones, where UMA is just providing a frontend to some custom allocator, we
shouldn't be touching bounds at all; we don't even know that what we're
allocating is memory, since UMA is just providing per-CPU caching for
pointer-sized objects.
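The hunk being removed (in zone_alloc_item(), per the discussion below):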
#ifdef __CHERI_PURE_CAPABILITY__
	if ((zone->uz_flags & UMA_ZONE_PCPU) == 0)
		item = cheri_setboundsexact(item, zone->uz_size);
#endif
Contributor commented:
I'm slightly confused by this. I think the original reason it's here is to ensure that the resulting item has tight bounds even if the uz_import function does not set them. After closer inspection I think it is redundant and shouldn't break bounds in UMA; note, however, that uma_zalloc_arg() relies on it to establish proper bounds on items when cache_alloc_retry() cannot get hold of a bucket.
Perhaps we should replace it with an assertion (see the sketch below)?
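A minimal sketch of what such an assertion might look like, assuming cheri_getlen() from <cheri/cheric.h> and keeping the existing UMA_ZONE_PCPU guard (the uz_size-vs-uk_rsize question below would apply here too):

#ifdef __CHERI_PURE_CAPABILITY__
	/*
	 * Hypothetical replacement: assert that the import path has already
	 * applied bounds no looser than the zone's item size, rather than
	 * re-deriving the capability here.
	 */
	if ((zone->uz_flags & UMA_ZONE_PCPU) == 0)
		KASSERT(cheri_getlen(item) <= zone->uz_size,
		    ("uma: item with loose bounds in zone %s", zone->uz_name));
#endif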

However, I'm not sure how this breaks things, because uz_import() should always return an object of size uz_size here, and that should be representable. Maybe the mistake was to use uz_size instead of uk_rsize.

There are also implications for zeroing representability padding here that I need to think more about.

Contributor Author (@markjdb) commented:

UMA cannot really assume that it's returning capabilities, outside of the keg/slab layer. UMA is a generic resource cache[*]; it can be used as the frontend for any allocator. For example, it can be used to implement vmem's quantum cache.

Here, we can't unconditionally use uk_rsize because the zone might not have a keg associated with it. Any assertion/setbounds operation would have to be conditional on the zone having a keg.

I believe the problem is as follows. In a purecap kernel, a kstack is 5 pages plus a guard page at the "bottom" of the stack. The allocation size according to UMA is 6*PAGE_SIZE. kstack_import() brings kstacks into the UMA cache; it relies on vmem to set bounds, and it returns the address of the first non-guard page in the kstack, so the capability offset is PAGE_SIZE.

Then, when UMA gets a hold of the stack, it tries to set bounds that are larger than the current bounds on the kstack capability, clearing the tag. Why don't we see this problem more often? There are two places where UMA imports items from the backend, zone_alloc_bucket() and zone_alloc_item(), but we only set bounds in the latter, which is a slow path used when there's no free memory available to allocate UMA buckets.
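To make the failure concrete, a self-contained illustration of the arithmetic (plain integers rather than CHERI intrinsics; the 4 KiB page size and base address are assumptions for the example):

#include <assert.h>
#include <stdio.h>

#define PAGE_SIZE	4096UL		/* assumed page size */

int
main(void)
{
	unsigned long base = 0x40000000UL;		/* hypothetical allocation base */
	unsigned long cap_top = base + 6 * PAGE_SIZE;	/* upper bound applied by vmem */
	unsigned long item = base + PAGE_SIZE;		/* kstack_import() skips the guard page */

	/* The removed code requests uz_size (6 * PAGE_SIZE) bytes from 'item'. */
	unsigned long requested_top = item + 6 * PAGE_SIZE;

	/*
	 * Capability bounds are monotone: a derived capability may only
	 * shrink them.  This request grows the upper bound by a page, so
	 * cheri_setboundsexact() clears the tag.
	 */
	assert(requested_top > cap_top);
	printf("requested top exceeds capability top by %lu bytes\n",
	    requested_top - cap_top);
	return (0);
}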

[*] In fact it is not fully generic: it cannot cache the value 0, since that's overloaded to signal allocation failure.

Collaborator commented:

Note, btw, that we really should be able to run with KSTACK_GUARD_PAGES 0 for purecap kernels, since bounds on the kernel stack pointer should already provide that protection. I thought I had changed that back when merging Bojan's changes for dealing with kstack object page indices, but perhaps I didn't do it yet.

Contributor (@qwattash) commented Dec 21, 2024:

I thought I did change KSTACK_GUARD_PAGES to 0 at some point. Should we open an issue to avoid forgetting?

> UMA cannot really assume that it's returning capabilities, outside of the keg/slab layer. UMA is a generic resource cache[*]; it can be used as the frontend for any allocator. For example, it can be used to implement vmem's quantum cache.

We have a flag in vmem for the same reason, so in principle we could/should propagate VMEM_CAPABILITY_ARENA to UMA, or more likely have a negative flag that prevents the zone from setting bounds if necessary.

I see that the problem is also that you may get a capability with an offset from uz_import(). That's clearer now. I think it is fine to rely on the backing allocator to set bounds in this case.

Contributor Author (@markjdb) commented:

> I thought I did change KSTACK_GUARD_PAGES to 0 at some point. Should we open an issue to avoid forgetting?

I opened #2275 for this.

> > UMA cannot really assume that it's returning capabilities, outside of the keg/slab layer. UMA is a generic resource cache[*]; it can be used as the frontend for any allocator. For example, it can be used to implement vmem's quantum cache.
>
> We have a flag in vmem for the same reason, so in principle we could/should propagate VMEM_CAPABILITY_ARENA to UMA, or more likely have a negative flag that prevents the zone from setting bounds if necessary.

We could if necessary. This would probably be a flag to uma_zcache_create().
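For illustration, such a flag might be used like this; UMA_ZONE_NOSETBOUNDS is invented for this sketch, and the uma_zcache_create() arguments are modeled loosely on the kstack cache setup rather than copied from it:

/* Hypothetical flag; nothing with this name exists today. */
#define UMA_ZONE_NOSETBOUNDS	0x80000000	/* backend owns capability bounds */

kstack_cache = uma_zcache_create("kstack_cache", kstack_pages * PAGE_SIZE,
    NULL, NULL, NULL, NULL, kstack_import, kstack_release, NULL,
    UMA_ZONE_NOSETBOUNDS);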

> I see that the problem is also that you may get a capability with an offset from uz_import(). That's clearer now. I think it is fine to rely on the backing allocator to set bounds in this case.

Are there other cases where we need the frontend to set capability bounds? Looking through existing uma_zcache_create() callers, I don't think so. I note for example that ktls_buffer_import() sets bounds before handing items to UMA.

It is a bit fragile that we rely on uma_zcache_create() consumers to set bounds themselves. Maybe having UMA do it by default, and letting zones opt out as needed, is the right way to go. That's a bit ugly, though, as it requires a second loop over all the imported items after calling uz_import() (a sketch follows below).
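A rough sketch of what that default could look like at the zone_alloc_bucket() import site, reusing the hypothetical UMA_ZONE_NOSETBOUNDS flag from above (surrounding declarations assumed; not actual CheriBSD code):

	n = zone->uz_import(zone->uz_arg, bucket->ub_bucket,
	    MIN(max, bucket->ub_entries), domain, flags);
#ifdef __CHERI_PURE_CAPABILITY__
	/* Second pass over the imported items: apply bounds by default. */
	if ((zone->uz_flags & (UMA_ZONE_PCPU | UMA_ZONE_NOSETBOUNDS)) == 0)
		for (i = 0; i < n; i++)
			bucket->ub_bucket[i] = cheri_setboundsexact(
			    bucket->ub_bucket[i], zone->uz_size);
#endif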

In any case, I believe this patch should go in regardless, as the way we're handling things today is not correct: we don't apply bounds at the site of the other uz_import() call, which is the common case.
