Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using Option to get undefined behaviour in a const #95332

Open
arnaudgolfouse opened this issue Mar 26, 2022 · 26 comments
Open

Using Option to get undefined behaviour in a const #95332

arnaudgolfouse opened this issue Mar 26, 2022 · 26 comments
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team

Comments

@arnaudgolfouse
Copy link
Contributor

I tried this code:

use std::num::NonZeroU8;

pub const UNDEFINED: Option<NonZeroU8> = Some(unsafe { NonZeroU8::new_unchecked(0) });

And assumed the compiler would reject it, but there was no compilation errors.

Note that the following also compiled:

use std::num::NonZeroU8;

pub const UNDEFINED: Option<NonZeroU8> = Some(unsafe { NonZeroU8::new_unchecked(0) });

// Sometime using the undefined constant triggers an error: not here...
pub use_undefined() -> Option<NonZeroU8> { UNDEFINED }

// This is not limited to Option !
pub enum MyOption {
    Some(NonZeroU8),
    None
}

pub const UNDEFINED2: MyOption = MyOption::Some(unsafe { NonZeroU8::new_unchecked(0) });

pub use_undefined2() -> MyOption { UNDEFINED2 }

I am assuming this is caused by layout optimizations ?

Meta

rustc --version --verbose:

rustc 1.59.0 (9d1b2106e 2022-02-23)
binary: rustc
commit-hash: 9d1b2106e23b1abd32fce1f17267604a5102f57a
commit-date: 2022-02-23
host: x86_64-unknown-linux-gnu
release: 1.59.0
LLVM version: 13.0.0

rustc +nightly --version --verbose:

rustc 1.61.0-nightly (68369a041 2022-02-22)
binary: rustc
commit-hash: 68369a041cea809a87e5bd80701da90e0e0a4799
commit-date: 2022-02-22
host: x86_64-unknown-linux-gnu
release: 1.61.0-nightly
LLVM version: 14.0.0
@arnaudgolfouse arnaudgolfouse added the C-bug Category: This is a bug. label Mar 26, 2022
@RalfJung
Copy link
Member

And assumed the compiler would reject it, but there was no compilation errors.

We do not guarantee that all Undefined Behavior is detected, not at run-time and not at compile-time. When you use unsafe, it is up to you to make sure you follow the rules.

We do detect some UB, e.g. here:

pub const UNDEFINED: NonZeroU8 = unsafe { NonZeroU8::new_unchecked(0) };

But exhaustively checking for all UB is super expensive at best. Even Miri, which does a lot more work than CTFE, does not detect all UB.

So I consider this not a bug.
Cc @rust-lang/wg-const-eval

@saethlin
Copy link
Member

Miri detects this if the code is shifted to normal runtime eval. Is it worth considering moving this check into CTFE or does that just open up too many questions?

@Lokathor
Copy link
Contributor

on the one hand i would like CTFE to catch it, on the other I'm worried if people might think that it'll then catch the runtime version as well (which it won't).

@saethlin
Copy link
Member

I'm not so sure about that. C++ has set a precedent that const-eval has extra UB checking powers that normal runtime eval does not have.

@oli-obk
Copy link
Contributor

oli-obk commented Mar 26, 2022

As Ralf noted, this can be very expensive. We can benchmark it to see how expensive and make a decision based on that.

We can also consider using a lint for these extra checks and only turning them on if the lint is warn/deny (and set it to allow by default for speed).

Oh, even funnier idea: clippy can overwrite the const eval query to make it run with validation. This way cargo clippy will catch more UB than cargo check

@ChrisDenton
Copy link
Member

Specifically in the narrow case of NonZero* integer types, would it be quicker for Rust to warn against any use of new_unchecked while in a const context?

@saethlin
Copy link
Member

I am suspicious of claims that this check is expensive. Miri is poorly-benchmarked, does CTFE have any benchmarks?

@oli-obk
Copy link
Contributor

oli-obk commented Mar 26, 2022

Yes, CTFE is benchmarked with the rustc benchmarks. Trying it out is as simple as opening a PR removing

fn enforce_validity(_ecx: &InterpCx<$mir, $tcx, Self>) -> bool {
and manually adding it to the two use sites of the macro. Then flip the bool for the const_eval case.

@fee1-dead
Copy link
Member

It feels like it can just have debug_assert! inside the unchecked functions, or would it not be ideal?

@saethlin
Copy link
Member

@fee1-dead I've had a PR up for the past few months that does just that :) #92686

@RalfJung
Copy link
Member

That wouldn't make any difference here though, since the stdlib we ship is compiled without debug assertions.

@saethlin
Copy link
Member

saethlin commented Mar 27, 2022

sigh of course the bigger problem is that all those checks are disabled in const eval. And I did that because I assumed while writing it that const eval would catch all these things. Around and around we go...

@oli-obk What's wrong with just doing this? I found out what is wrong with this (MIR transforms panic if they do validation).

diff --git a/compiler/rustc_const_eval/src/interpret/machine.rs b/compiler/rustc_const_eval/src/interpret/machine.rs
index 7d75c84d108..121121cdab2 100644
--- a/compiler/rustc_const_eval/src/interpret/machine.rs
+++ b/compiler/rustc_const_eval/src/interpret/machine.rs
@@ -422,7 +422,7 @@ fn force_int_for_alignment_check(_memory_extra: &Self::MemoryExtra) -> bool {
 
     #[inline(always)]
     fn enforce_validity(_ecx: &InterpCx<$mir, $tcx, Self>) -> bool {
-        false // for now, we don't enforce validity
+        true
     }
 
     #[inline(always)]

@fee1-dead
Copy link
Member

That wouldn't make any difference here though, since the stdlib we ship is compiled without debug assertions.

Then an unconditional check using const_eval_select would suffice?

@RalfJung
Copy link
Member

Then an unconditional check using const_eval_select would suffice?

Yes that would be an option.

@RalfJung
Copy link
Member

I found out what is wrong with this (MIR transforms panic if they do validation).

The compiler contains two instances of the interpret::Machine trait: one for constant propagation (a MIR transform, i.e. an opportunistic optimization), and one for constant evaluation (the thing that evaluates const/static, array lengths, etc).

You'll want to only enable validation for the constant evaluator, not for constant propagation; that's what @oli-obk described.

But I think even that is going to be way too slow. If we truly want to do something here we should have a type-based heuristic to only perform validation on certain 'interesting' types like scalars with a non-total validity range. But that might mean even more people expect us to actually catch all UB...

@Lokathor
Copy link
Contributor

I know nothing about the const evaluator, but can a single extra check against zero, in limited situations, really be that much of a slow down?

And if so, could it be done based on if some cargo key is enabled or not? Like only do the extra checks if debug_assertions are enabled?

@RalfJung
Copy link
Member

RalfJung commented Mar 27, 2022

I know nothing about the const evaluator, but can a single extra check against zero, in limited situations, really be that much of a slow down?

The check we are talking about -- the check Miri performs -- is to check on every single assignment that the value being assigned indeed satisfies validity of the type the assignment happens at.

That's why I mentioned a typed-based heuristic to only check in certain situations, e.g. only assignments of type NonZero* (and similar non-null types). But that will miss some bugs, like when transmuting 0u16 to type (u8, NonZeroU8).

@saethlin
Copy link
Member

saethlin commented Mar 27, 2022

Just so everyone knows, I'm working on the suggested PR above, but this breaks a lot of UI tests that x test --bless doesn't fix so it might be a bit.

@saethlin
Copy link
Member

I've implemented my ideas in #95377. The general idea is to try to detect when we are executing unsafe code, and toggle on checking for that code. The implementation is not clean, I'm fiddling with another iteration locally that will make things a lot more readable.

That PR detects the UB that was reported here, along with the test for #64506. Which I think is pretty cool.

The naive approach reports:

Arithmetic mean of relevant regressions: 6.7%
Arithmetic mean of all relevant changes: 6.6%
Largest regression in instruction counts: 50.6% on full builds of ctfe-stress-4 check

My best work so far is:

Arithmetic mean of relevant regressions: 0.8%
Arithmetic mean of all relevant changes: 0.7%
Largest regression in instruction counts: 1.9% on full builds of ctfe-stress-4 check

My biggest problem now is that I can't figure out where those ~1.9% of instructions is coming from. My best guess at the moment is that simply having the validation code in the CTFE code instead of compiled out because it's behind an if false has caused a handful of inlining changes for the worse (I know that the inlining changed, I'm just not sure if that is the cause). Locally I measure a 1.4% regression with a much higher significance factor, and adding up percentages in each of the functions I'm adding doesn't come close to 1.4%.

So on the one hand, I'm tempted to make some compensatory optimizations to bring the regression down. But I don't know if that's cheating. We are turning on more checks. It would be kind of crazy if that didn't cause a slowdown. Are compensatory optimizations of interested here, such as to get the inlining back to what it was? What is considered an acceptable performance regression for merging a PR that turns on some validation in CTFE?

@RalfJung
Copy link
Member

RalfJung commented Apr 1, 2022

Could you describe at a high level when the extra check kicks in? EDIT: Oh I can't read, you said that. Inside unsafe code. Hm...

I am not sure if we want to do something ad-hoc that catches this UB but still leaves many similar cases of UB undetected.

@RalfJung
Copy link
Member

RalfJung commented Apr 1, 2022

The general idea is to try to detect when we are executing unsafe code, and toggle on checking for that code.

Unfortunately, I don't think we have any benchmarks that actually execute unsafe const code. So those numbers are not very useful to gauge the perf impact of this change. It would probably still be prohibitive for const code that does use unsafe intensively...

We also agree that this check is not water-tight, right? There are ways to 'delay' the UB until after the unsafe block. It also misses cases like rust-lang/unsafe-code-guidelines#84. However I agree this heuristic is probably going to catch the vast majority of real code that creates invalid values. I am not sure how much perf cost that is worth, I don't feel like that is a judgment call I can make.

@saethlin
Copy link
Member

saethlin commented Apr 1, 2022

Just to make sure we understand, the point of this is not to catch all UB in const eval, or all invalid data. The idea is to selectively turn on validation where it's likely to be particularly helpful.

I am not sure if we want to do something ad-hoc that catches this UB but still leaves many similar cases of UB undetected.

The validation that CTFE does is already flaky with respect to how code is factored, so I am not convinced that this patch is any worse in this particular regard.

Unfortunately, I don't think we have any benchmarks that actually execute unsafe const code.

Sure we do. Basically all serious software executes unsafe const code. unsafe ends up threaded through a fair bit of code because it underlies things like constructing a Cell. While building the compiler, the UnsafeVisitor finds unsafe 158 times, here are the top few (x build | counts with a println! of the DefId when unsafe is found):

(  1)       40 (25.3%, 25.3%): DefId(0:9242 ~ core[cdca]::sync::atomic::atomic_load)
(  2)       20 (12.7%, 38.0%): DefId(0:1018 ~ test[5b9c]::term::terminfo::parm::format)
(  3)       12 ( 7.6%, 45.6%): DefId(0:4750 ~ std[40f7]::io::default_read_to_end)
(  4)        5 ( 3.2%, 48.7%): DefId(0:10318 ~ std[40f7]::panicking::rust_panic_with_hook)
(  5)        4 ( 2.5%, 51.3%): DefId(0:262 ~ test[5b9c]::bench::fmt_thousands_sep)
(  6)        3 ( 1.9%, 53.2%): DefId(0:13333 ~ std[40f7]::sys::unix::kernel_copy::sendfile_splice)
(  7)        3 ( 1.9%, 55.1%): DefId(0:258 ~ test[5b9c]::bench::fmt_bench_samples)
(  8)        3 ( 1.9%, 57.0%): DefId(0:5990 ~ std[40f7]::os::unix::net::addr::sockaddr_un)
(  9)        2 ( 1.3%, 58.2%): DefId(0:10143 ~ std[40f7]::alloc::default_alloc_error_hook)
( 10)        2 ( 1.3%, 59.5%): DefId(0:10231 ~ std[40f7]::panicking::set_hook)
( 11)        2 ( 1.3%, 60.8%): DefId(0:10232 ~ std[40f7]::panicking::take_hook)
( 12)        2 ( 1.3%, 62.0%): DefId(0:10326 ~ std[40f7]::panicking::rust_panic)
( 13)        2 ( 1.3%, 63.3%): DefId(0:13733 ~ std[40f7]::sys::unix::os::error_string)
( 14)        2 ( 1.3%, 64.6%): DefId(0:50 ~ std[40f7]::rt::init)
( 15)        2 ( 1.3%, 65.8%): DefId(0:701 ~ core[cdca]::num::flt2dec::strategy::dragon::format_exact)

While building the hyper-0.14.18 benchmark, it finds unsafe 200 times.

(  1)        2 ( 1.0%,  1.0%): DefId(0:10066 ~ tokio[f9b9]::task::local::CURRENT)
(  2)        2 ( 1.0%,  2.0%): DefId(0:6491 ~ tokio[f9b9]::runtime::basic_scheduler::CURRENT)
(  3)        2 ( 1.0%,  3.0%): DefId(0:7088 ~ tokio[f9b9]::runtime::thread_pool::worker::CURRENT)
(  4)        1 ( 0.5%,  3.5%): DefId(0:10024 ~ futures_util[351e]::stream::stream::peek::_#3)
(  5)        1 ( 0.5%,  4.0%): DefId(0:10067 ~ tokio[f9b9]::task::local::CURRENT::FOO)
(  6)        1 ( 0.5%,  4.5%): DefId(0:10073 ~ futures_util[351e]::stream::stream::peek::_#4)
(  7)        1 ( 0.5%,  5.0%): DefId(0:10124 ~ futures_util[351e]::stream::stream::skip::_)
(  8)        1 ( 0.5%,  5.5%): DefId(0:10170 ~ futures_util[351e]::stream::stream::skip_while::_)
(  9)        1 ( 0.5%,  6.0%): DefId(0:10237 ~ futures_util[351e]::stream::stream::take::_)
( 10)        1 ( 0.5%,  6.5%): DefId(0:10283 ~ futures_util[351e]::stream::stream::take_while::_)
( 11)        1 ( 0.5%,  7.0%): DefId(0:10350 ~ futures_util[351e]::stream::stream::take_until::_)
( 12)        1 ( 0.5%,  7.5%): DefId(0:10407 ~ futures_util[351e]::stream::stream::then::_)
( 13)        1 ( 0.5%,  8.0%): DefId(0:10472 ~ futures_util[351e]::stream::stream::zip::_)
( 14)        1 ( 0.5%,  8.5%): DefId(0:10526 ~ futures_util[351e]::stream::stream::chunks::_)
( 15)        1 ( 0.5%,  9.0%): DefId(0:10574 ~ futures_util[351e]::stream::stream::ready_chunks::_)

I picked hyper-0.14.18 because it's the real-world benchmark with the largest regression. Now, I think maybe there is an argument that some very unsafe const code sees an unfair regression from this and maybe that's possible. But I think the rustc-perf benchmark suite has a basic sampling of unsafe const code, as it is written for normal software.

@RalfJung
Copy link
Member

RalfJung commented Apr 1, 2022

Sure we do.

Hm, fair. Out stress test benchmark I think doesn't do this though.
To evaluate the worst-case scenario here, it might be worth editing the ctfe-stress benchmark to put everything into an unsafe block and see what happens? Or is the expected result then that we get the 50% that the initial attempt produced?

@saethlin
Copy link
Member

saethlin commented Apr 3, 2022

The worst-case scenario appears to be this sort of shenanigans, which produces a ~58% regression (at a guess the 8% is related to deciding to enable the checks):

diff --git a/collector/benchmarks/ctfe-stress-4/src/lib.rs b/collector/benchmarks/ctfe-stress-4/src/lib.rs
index 074a18c5..1119b3b1 100644
--- a/collector/benchmarks/ctfe-stress-4/src/lib.rs
+++ b/collector/benchmarks/ctfe-stress-4/src/lib.rs
@@ -11,10 +11,10 @@ use std::mem::MaybeUninit;
 macro_rules! const_repeat {
     // Base case: Use 16 at the end to avoid function calls at the leafs as much as possible.
     ([16] $e: expr, $T: ty) => {{
-        $e; $e; $e; $e;
-        $e; $e; $e; $e;
-        $e; $e; $e; $e;
-        $e; $e; $e; $e
+        unsafe{$e}; unsafe{$e}; unsafe{$e}; unsafe{$e};
+        unsafe{$e}; unsafe{$e}; unsafe{$e}; unsafe{$e};
+        unsafe{$e}; unsafe{$e}; unsafe{$e}; unsafe{$e};
+        unsafe{$e}; unsafe{$e}; unsafe{$e}; unsafe{$e}
     }};
     ([1] $e: expr, $T: ty) => {{
         $e

Simply putting the whole benchmark in an unsafe block or making every fn an unsafe fn does not produce anything measurably worse than the all-safe overhead (which I think I just drove down to 0.8%), because this check is sloppy enough that it basically doesn't validate anything.

@RalfJung
Copy link
Member

RalfJung commented Apr 3, 2022

Simply putting the whole benchmark in an unsafe block or making every fn an unsafe fn does not produce anything measurably worse than the all-safe overhead (which I think I just drove down to 0.8%), because this check is sloppy enough that it basically doesn't validate anything.

So the check interacts badly with macros, or so? Because that should still all be unsafe code then, right?

@saethlin
Copy link
Member

saethlin commented Apr 3, 2022

I suspect the check interacts badly with functions. It's extremely local and sloppy, in the name of doing little work when there is no unsafe code. It could definitely be tightened up if you or Oli have suggestions on a more principled but 🤔 appropriately thrifty way to do this.

Whenever we call load_mir, the HIR for the loaded Instance is searched for unsafe blocks. If one is found, checking is on. If one isn't found, checking is turned off.

If we call enforce_validity without seeing a call to load_mir and we have a stack with one frame, we check it for unsafe. If it has no unsafe blocks in it, checking is off. If it does contain an unsafe block, or there is more than one stack frame, checking is on. Subsequent calls to enforce_validity will not not change whether or not checking is turned on. Only a call to load_mir can do that.

We also defensively assume that MIR from outside the current crate contains unsafe.

@ChrisDenton ChrisDenton added the needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. label Jul 16, 2023
@fmease fmease added A-const-eval Area: Constant evaluation (MIR interpretation) C-discussion Category: Discussion or questions that doesn't represent real issues. T-opsem Relevant to the opsem team T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. C-bug Category: This is a bug. labels Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team
Projects
None yet
Development

No branches or pull requests

8 participants