A debug allocator which removes overalignment from align < 8 allocations #99074

saethlin · 2022-07-09T01:52:31Z

This reorganizes the implementation of the System allocator to permit adding various debuggig features. Currently, all that this implements is a scheme that allocates a little extra space for low-alignment allocations then returns a pointer into the actual allocation which is offset so that it is not over-aligned.

This is a huge aid in discovering accidental reliance on over-alignment. Allocators designed for C can be relied upon to produce over-aligned pointers, so alignment-related bugs can be latent for a long time. Currently I am aware of ~100 crates in the wild where Miri detects misaligned pointer access. Many of these take a &[u8], sometimes from a Vec<u8> and attempt to do reads by converting to a more-aligned type without checking if the access is aligned.

On its own, this PR does basically nothing to detect bugs, but we already have debug assertions in a number of standard library APIs which are enabled along with this allocator in -Zbuild-std. And if I ever finish #98112, we will be able to catch uses which just do misaligned pointer dereferences directly.

This implementation is factored so accommodate other patches to the default allocator which can help in detecting other sources of UB.

@rustbot label +T-libs

This technique I'm trying to use breaks a UI test and I do not know why and I can't figure out how to fix it. I feel like this suggestion that's now coming out of a UI test doesn't even make sense. I made the per-system System allocators pub(crate) but they're suggested outside of std I think? Help? resolved by #99091 ❤️

rustbot · 2022-07-09T01:52:34Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

rust-highfive · 2022-07-09T01:52:34Z

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

ghost · 2022-07-09T07:34:33Z

I wonder if that's because global_allocator is an attribute macro that expands to rust code. That rust code could have mentioned the crate internal one. It didn't, and never will, but the hygiene might not be smart enough to know that.

I mean, it's still wrong, it shouldn't be doing that, but I wonder if this issue would crop up anywhere else.

ghost · 2022-07-09T07:56:20Z

Yep, this is indeed an issue for any macro, rust will calculate the suggestions based on what the inner bits of the macro can see, not what the calling crate can see. I can't seem to find an open issue about that, but I'll open one.

ghost · 2022-07-10T11:04:10Z

library/std/src/alloc.rs

Should/can we check to ensure the input pointer has the lowest bits set according to the align, and abort if not (because that means they gave us the wrong alignment, which happens).

Are you referring to detecting an attempt to dealloc a different pointer value than one returned by alloc? I think that's a useful property for allocators to have, but it seems like glibc's allocator already does the full implementation of that (as opposed to just checking a few bits), and perhaps as a result people don't make this mistake. I haven't found any code in the wild that would be caught by such a check.

In this PR I'm more interested to know if t-libs is willing to take such an allocator, and I'm starting with a behavior that I know exposes a lot of latent bugs in the wild. We can always add more things to this later if it is accepted.

No, I mean if you alloc with align 2, then try and dealloc with align 1, the offset calculated will be wrong, wouldn't it? Because the offset is calculated based on the alignment.

Recovering the alignment from the given to dealloc address (if it ends in 0b111, assume alignment 1, if it ends in 0b110, assume 2, 0b100 is 4) and then aborting with an error if it's wrong, because we would have tried to free with the wrong address seems sensible.

As in, the address we were given was the same as the address we passed back from alloc, but the alignment is different.

Ahhhhh I see. Yes the current implementation is definitely not good here, because it turns code which is UB but works into an error that only makes any sense if you know about this implementation.

I think eventually we should detect and report that error, but for this PR I am wary of scope creep and I'm not even sure how to print in an allocator.

I think you could allocate a bit more space to store some information in the bytes directly above the pointer you hand back to to the user (either the real Layout, the originally allocated pointer, etc. There are multiple approaches that could be taken). Then you can extract it with read_unaligned. This would also make it easier to add future checks that use this information to verify other things, such as the layout being correct.

I think it might be slightly clearer than the bitwise trickery in this PR, and it's pretty typical of an allocator to have information in some sort of "header" anyway.

thomcc

In this PR I'm more interested to know if t-libs is willing to take such an allocator, and I'm starting with a behavior that I know exposes a lot of latent bugs in the wild.

So, as someone who has written a bunch of smaller ad-hoc allocation error checkers to figure out what was going on, I'm in favor of this kind of thing, and mostly in favor of it living in std when built with debug assertions.

As it is, it's nice that it's opt-in¹, and even if nothing else, it's a good place to insert checks that verify the stdlib itself is violating any rules (for example, this is a different type of misuse, but ideally the bug that #96642, ideally would be caught by a "size not zero" check in the allocator).

That said, user code can already easily add an allocator that detects these things (although I don't know of a currently existing debug allocator which is well-maintained and catches rust-specific misuse), and the scope of these can get wildly out of hand.

That said, this particular issue is one I don't care that much about. Or rather, I think it probably isn't worth the hassle², except for that we have to start somewhere and can use this as scaffolding for future checks.

That is to say, IMO, if we left it here after this, it would be a bit pointless, so should be prepared to invest in some amount effort into this, which probably deserves a plan (or... at least a plan to make a plan 😅).

Making some sort of a proposal also allow t-libs to make the broader decision on whether or not we want to maintain this kind of thing, rather than just having it start small but grow organically into something very complex. (Also, IMO the plan should include what we eventually do with this thing -- if it's just for std and users "in the know", I think we have less stomach for it, but anything bigger definitely requires planning³).

That said, it's totally possible that others disagree with all of this. I've considered adding checks to the allocator, a bunch of times, but didn't want to add them into __rust_alloc (etc), and there wasn't a great place to add them. So despite my hesitation around having a plan, it's possible that on its own this is already a win.

Sorry for writing so much, it's a problem I've thought about some.

Eventually turning it on for all debug_assertions code somehow would be nice, so long as we find a way to be sure we aren't making any UB "worse" (more exploitable), which is fairly tricky. ↩
Obviously it's a bug, but it's one of the lowest impact bug to detect in the allocator -- for example, it's not exploitable (and almost all other allocator-related bugs are). Also, it's not something that is likely to be a real bug in practice -- most custom allocators tend to return 8-byte aligned data anyway (even if they don't have to). And while certainly not guaranteed or universally true, on most targets misaligned access will succeed. ↩
We don't want to make it so release+asserts builds become more dangerous, since people do use those in production sometimes. That said, it could be a new flag in cargo profiles, or something. (Or perhaps we could be very careful that it doesn't "unharden" things?) ↩

thomcc · 2022-07-11T13:31:45Z

library/std/src/alloc.rs

I think you could allocate a bit more space to store some information in the bytes directly above the pointer you hand back to to the user (either the real Layout, the originally allocated pointer, etc. There are multiple approaches that could be taken). Then you can extract it with read_unaligned. This would also make it easier to add future checks that use this information to verify other things, such as the layout being correct.

I think it might be slightly clearer than the bitwise trickery in this PR, and it's pretty typical of an allocator to have information in some sort of "header" anyway.

thomcc · 2022-07-11T14:43:32Z

library/std/src/alloc.rs

I do want to say that I think this is the right place to insert such checks, since it neatly sidesteps questions about what programs which override the global allocator get to rely on. Thanks!

thomcc · 2022-07-11T14:56:18Z

library/std/src/alloc.rs

IMO this really needs to report the issue/abort/something. What it does now seems to just be returning a misaligned pointer silently, and hoping the caller notices. This seems likely to end up not noticed at all, or noticed far away from the allocation site.

I don't have strong thoughts on how to handle it, but I think printing output to stderr and aborting might be a start (or perhaps seeing what handle_alloc_error does? Does it have some solution for the "report-and-abort without unwinding" problem?).

Given that this is mostly the scaffolding for other checks... I think it's worth figuring this out sooner rather than later.

An issue is that stderr might not exist, this is in alloc. We always can panic with a message, though, but we'd need a panic_nounwind!() usable in alloc/core, which would be generally useful even outside of this PR, since we have debug assertions that panic when code might not be unwind safe, and we have debug assertions that abort, but then they can't print any info.

handle_alloc_error seems to be for what callers do when their allocation fails. Looks like it can unwind, if someone sets a hook that panics. I don't see anything that specifically prevents that. Haven't tested that though.

An issue is that stderr might not exist, this is in alloc

The default allocator is in std. If it were in alloc, it couldn't call malloc/HeapAlloc/etc. This does not need to be used for cases where the user does not have the system allocator, or cases where they have overridden it with their own allocator (and I would be fairly strongly opposed to doing anything like this when the allocator has been overridden).

I didn't mean actually use handle_alloc_error for this, I meant see what we do by default in that function, which presumably both prints some message and aborts.

Also to be clear, even in std, you are correct that there may be no standard error stream (such as on wasm*-unknown-unknown). I think it's okay for that to be best effort, so long as the effort is actually pretty good and usually does help the user, (although the aborting should be unconditional, IMO).

And it's a debugging feature, one that is only available via unstable flags. It not being possible on some targets doesn't mean it can't be available on any. We can always just not enable this on targets where we'd have no method of doing anything about detected misuse. That said, I think just about everything with std should be able to do something here (perhaps wasm has no stderr, though).

What it does now seems to just be returning a misaligned pointer silently, and hoping the caller notices.

I'm pretty sure this doesn't return a misaligned pointer silently. The effect of this strategy is to populate the lowest 3 bits to the greatest extent possible on the application side, and on the actual system allocator side keep them always clear. So this just indiscriminately clears the lowest 3 bits when handing a pointer back to the system allocator, which I'm pretty sure makes deallocation with the wrong layout totally fine if the underlying system allocator also doesn't care about layout.

saethlin · 2022-07-11T17:06:39Z

Making some sort of a proposal also allow t-libs

This is the sort of feedback I was looking for. Is there any template for such a proposal? Would such a change be the libs-specific Change Proposal? I'm happy to just freehand it if there isn't anything.

saethlin · 2022-07-26T21:39:21Z

Per feedback from @thomcc I'm going to write something up. No need to prioritize review until then.

@rustbot author

bors · 2022-08-12T19:45:15Z

☔ The latest upstream changes (presumably #100456) made this pull request unmergeable. Please resolve the merge conflicts.

This reorganizes the implementation of the System allocator to permit adding various debuggig features. Currently, all that this implements is a scheme that allocates a little extra space for low-alignment allocations then returns a pointer into the actual allocation which is offset so that it is not over-aligned. This is a huge aid in discovering accidental reliance on over-alignment. Allocators designed for C can be relied upon to produce over-aligned pointers, so alignment-related bugs can be latent for a long time. This implementation is also factored so accommodate other patches to the default allocator which can help in detecting other sources of UB.

saethlin · 2022-09-20T21:56:56Z

@rustbot label +I-libs-api-nominated

saethlin · 2022-09-20T22:00:38Z

@rustbot label -I-libs-api-nominated +I-libs-nominated

m-ou-se · 2022-10-05T14:35:49Z

As discussed on Zulip in #t-libs, this seems like something that (at least at first) can be provided by a crate on crates.io independent from the standard library.

saethlin · 2022-10-05T15:04:26Z

👍 This can be revived later or in another PR, when it's been proven out.

rust-highfive assigned m-ou-se Jul 9, 2022

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 9, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 9, 2022

This comment has been minimized.

Sign in to view

ghost mentioned this pull request Jul 9, 2022

"The following types implement trait..." suggets crate private items, if inside macro. #99080

Closed

ghost reviewed Jul 10, 2022

View reviewed changes

This comment has been minimized.

Sign in to view

thomcc reviewed Jul 11, 2022

View reviewed changes

saethlin force-pushed the debug-allocator branch from b923233 to b67e033 Compare July 11, 2022 17:04

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 26, 2022

saethlin mentioned this pull request Aug 9, 2022

A debug allocator for std rust-lang/libs-team#83

Closed

saethlin force-pushed the debug-allocator branch 2 times, most recently from 444951f to 2710457 Compare August 12, 2022 20:07

saethlin force-pushed the debug-allocator branch from 2710457 to d641a1d Compare August 12, 2022 20:12

rustbot added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Sep 20, 2022

rustbot added I-libs-nominated Nominated for discussion during a libs team meeting. and removed I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Sep 20, 2022

m-ou-se removed the I-libs-nominated Nominated for discussion during a libs team meeting. label Oct 5, 2022

saethlin closed this Oct 5, 2022

A debug allocator which removes overalignment from align < 8 allocations #99074

A debug allocator which removes overalignment from align < 8 allocations #99074

Uh oh!

Conversation

saethlin commented Jul 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Jul 9, 2022

Uh oh!

rust-highfive commented Jul 9, 2022

Uh oh!

This comment has been minimized.

ghost commented Jul 9, 2022

Uh oh!

ghost commented Jul 9, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

saethlin Jul 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

thomcc left a comment

Choose a reason for hiding this comment

Footnotes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thomcc Jul 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

saethlin commented Jul 11, 2022

Uh oh!

saethlin commented Jul 26, 2022

Uh oh!

bors commented Aug 12, 2022

Uh oh!

saethlin commented Sep 20, 2022

Uh oh!

saethlin commented Sep 20, 2022

Uh oh!

m-ou-se commented Oct 5, 2022

Uh oh!

saethlin commented Oct 5, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

saethlin commented Jul 9, 2022 •

edited

Loading

saethlin Jul 10, 2022 •

edited

Loading

thomcc Jul 11, 2022 •

edited

Loading