Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri disables debug_assertions in std, and so misses some library UB it could be reporting #2497

Closed
asquared31415 opened this issue Aug 19, 2022 · 9 comments
Labels
C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out

Comments

@asquared31415
Copy link
Contributor

Miri should not disable debug_assertions in std (it should in fact explicitly enable them) to improve the detection of bugs in programs, even when they are not language level UB, "only" library UB.

std has several additional library invariants that are not language level UB, for example Vec's len being less than or equal to its capacity, slice::get_unchecked needing an index that is in bounds in element count not just bytes (the case where the resulting byte index is out of bounds is handled by other language rules), CStr::from_bytes_with_nul_unchecked having a trailing null byte and no interior null bytes, and String containing UTF-8 bytes. Some of these conditions are checked by std using debug assertions, specifically a macro that when debug_assertions are enabled, aborts when the condition fails. When debug_assertions are disabled, it conditionally compiles out to a no-op. Currently Miri disables debug_assertions explicitly, noting that it helps performance. I know nothing about this performance nor have I tested performance with or without debug_assertions.

As a concrete example, the following code:

fn main() {
    let _foo = unsafe { *[()].get_unchecked(1) };
}

passes miri with no warnings or errors currently (nightly 2022-08-18), however, it fails to complete execution successfully under a -Zbuild-std with debug_assertions enabled, aborting due to an illegal instruction caused by the abort.

Modifying Miri to enable the flag produces an error, which, even though it's not optimal, is much better than nothing. This error could also be improved with cooperation from std to improve the debug checks.

error: abnormal termination: the program aborted execution
   --> .rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/slice/index.rs:225:13
    |
225 |             assert_unsafe_precondition!(self < slice.len());
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the program aborted execution
    |
<snip trace through std internals>
note: inside `main` at src/main.rs:2:26
   --> src/main.rs:2:26
    |
2   |     let _foo = unsafe { *[()].get_unchecked(1) };
    |                          ^^^^^^^^^^^^^^^^^^^^^
    = note: this error originates in the macro `assert_unsafe_precondition` (in Nightly builds, run with -Z macro-backtrace for more info)

This, with no other modifications, raises an error, points to the code the user passed, mentions assert_unsafe_precondition, and has the condition that failed, which should tell the user to check that they're satisfying the unsafe conditions of the function they're calling. This is a sufficient, but not ideal error.

std could also benefit from an additional pass over unsafe conditions for debug checks. Currently most of the checks are on pointer-like operations. The Vec example given above, and Strings's UTF-8 bytes and several other cases of library-only UB are not checked even in debug mode. This would have even more benefit if Miri also ran those checks by enabling debug_assertions.

I would be willing to add these assertions and improve the output from the failure of these checks under Miri.

Downsides

  • Running debug assertions is running more code, which inherently will have some cost.

  • Several debug assertions check things that Miri would already check via language UB, such as the slice::get_unchecked case but for a T that is not zero sized. Importantly, in these cases that Miri would have already checked, the error gets worse since it's now an abort. (!!this is really bad!!)

I believe that there could be a workaround to have certain language level UB not be asserted on if it would already be checked in Miri, but it would be more complicated, involve manually determining whether an unsafe condition causes language or library UB, and being conservative with assertions if it could be both.

@RalfJung
Copy link
Member

RalfJung commented Aug 20, 2022

You have already listed the downsides which explain why at least currently, I don't think Miri should enable debug assertions in the stdlib.

To add to that: the main goal and primary purpose of Miri is to detect language UB. If we can also get some library UB, sure, but (a) it should always be clear which is which, and (b) that should not come at the expense of its ability to detect language UB. (So, losing notable performance due to this is bad. Miri is already so slow that some code cannot reasonably be tested with Miri.) Once you have established with Miri that there is no language UB, I think there are better ways than Miri to detect library UB -- namely, you can now just run regular compiled code with debug assertions, and since there is no language UB, you know that nothing weird will happen and the library UB detection is in full effect.

I know it is currently hard to run the standard library with debug assertions, and Miri already compiles its own standard library so it is easy for Miri to enable debug assertions, but that does not make Miri the right tool for "running the standard library with debug assertions". Instead we should work on making it easier to run code against a standard library built with debug assertions. Honestly that's not even that hard to do... maybe I can spend a bit of time on that, a cargo std-with-debug run/test kind of a thing. Does anyone have a good idea for how to call such a tool? cargo hardened? cargo careful?

@RalfJung RalfJung added the C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out label Aug 22, 2022
@saethlin
Copy link
Member

The precipitating context for this is:

fn main() {
    let _foo = unsafe { *[()].get_unchecked(1) };
}

Would this be considered UB?
(context: I've found an API in a library that can issue OOB get_uncheckeds, but only with slices to ZSTs, and I'm trying to figure out if it's unsound)

(OP runs the code under Miri, no error)

confusion ensues, someone asks:

why can't miri run a debug build-std?
i honestly kinda thought it did already


So I think we have two problems here. Miri is seen as the tool to find UB, and yet it is opting out of UB checks. So I think we have a user experience problem here because we betray the user's expectations. I just ran the benchmarks on my tag-gc branch with debug assertions enabled and disabled, and I see regressions between 50% and 0%. So the overhead currently is concerningly large, but I'd like to look into knocking that down before concluding that we can't have these assertions on. I've never worked with Miri or the implementation of these to get the overhead down so who knows maybe there's a lot of wiggle room here.

Secondly, I'm concerned that this case maybe shouldn't be UB. I think slice::get_unchecked is UB when out of bounds because that is a requirement of Stacked Borrows (and probably similar aliasing models). So if this is UB because of Stacked Borrows, then I'm concerned that we have language UB being widened and converted to library UB that actually nobody needs, and we're then punting on checking it in Miri because "it's library UB". Which seems pretty derpy if it's only documented as UB in the library because the model that Miri wants it for non-ZSTs.

@RalfJung
Copy link
Member

The precipitating context for this is:

Yeah I highly doubt this will ever be language UB.

I do agree we need to get better at distinguishing language UB and library UB. To an extent we currently use that distinction to hide the fact that language UB has not been precisely specified yet. That's not great, of course, but also not something we can fix until we do have a more precise spec of language UB.

@saethlin
Copy link
Member

rust-lang/rust#101079 would bring those regressions down to between 0% and 10%. I could make the regressions smaller, but I'm trying to balance against complicating the standard library code. It's not easy to make unoptimized code faster.

@RalfJung
Copy link
Member

I was also concerned about checks somewhere inside data structures, such as HashMap or BTreeMap. If there are debug assertions there are check data structure invariants, those might be quite expensive to check in Miri.

@RalfJung
Copy link
Member

See https://github.com/RalfJung/cargo-careful for a nice way to get a standard library with debug assertions. :)

@RalfJung
Copy link
Member

RalfJung commented Nov 22, 2022

In rust-lang/rust#104726 @the8472 got surprised that Miri did not catch library UB.

But I still think we don't want all stdlib debug assertions in Miri. The stdlib contains two kinds of debug assertions:

  • some catching library UB
  • and many asserting consistency of internal data structures (in particular the btree is full of these)

We don't want the second kind in Miri. Maybe we don't even want them in cargo-careful? Looks like we want another macro that's more like library_ub_debug_assertion which is enabled for cfg(any(miri, careful, debug_assertions)) or so? We probably don't want to list all the tools here though, so we should find something more general.

Cc @thomcc

@the8472
Copy link
Member

the8472 commented Nov 22, 2022

My case is invalid. Miri wouldn't have caught this and neither would have running std tests with debug asserts because the issue is data-dependent and we didn't have a test case covering it.

@RalfJung RalfJung changed the title Miri disables debug_assertions, and so misses some library UB it could be reporting Miri disables debug_assertions in std, and so misses some library UB it could be reporting May 13, 2023
@RalfJung
Copy link
Member

This is fixed by rust-lang/rust#121662: the library UB checks in the stdlib are now controlled by whether the final binary crate is run with -Cdebug-assertions, not the value of that flag during the sysroot build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out
Projects
None yet
Development

No branches or pull requests

4 participants