add const-ub RFC #3016
Does this RFC try to make It is obvious that if
Maybe this way should be the main way instead of an alternative? |
(All the invariants for the non- More specifically, it grants access to constructs which would have to be checked by MIRI as part of their compile-time execution for UB to be detected by the compiler. |
The "freedom" turned on by But it's still just as wrong to cause UB inside
It depends on what you mean by "not-actually-unsafe".
Why would that be a good idea? |
+1 to this. A few questions though:
|
Promotion is completely independent. It should not be possible to cause UB in promotion, as the analysis that allows promotion essentially just allows basic math and some field accesses.
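To make the "basic math" remark concrete, here is a minimal sketch of promotion as it appears in ordinary code. The function names are hypothetical and chosen only for illustration. Each function can return a `'static` reference only because the operand is simple enough to be promoted to a static.

```rust
// Hypothetical helper: the arithmetic on literals is promoted to a static,
// which is why the returned reference may have the 'static lifetime.
fn promoted_sum() -> &'static i32 {
    &(1 + 2)
}

// Hypothetical helper: a plain literal is promoted in the same way.
fn promoted_literal() -> &'static u32 {
    &42
}

fn main() {
    println!("{} {}", promoted_sum(), promoted_literal());
}
```

Nothing here involves `unsafe`, which fits the point above that the promotion analysis leaves no room to cause UB.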
The RFC states that any UB in intrinsic invocation must be caught, so
Restricting dereferences beyond "must be in a live allocation" is actually harder than not doing so. I don't think Rust has any notion of an "object".
Even if we disallowed pointer casts, you could always use transmute, so this is impossible to protect against afaict. |
Did I understand correctly that for following program:
RFC states that
|
For the specific example (
will always be an error, and all compilers must emit an error But there are cases of UB that we do not reliably detect:
A compiler may choose to emit an error, warning or nothing at all if it encounters this kind of UB. This means on 64 bit systems (not on 16 or 32 bit, due to transmute size checks) you are free to do

    const x: String = unsafe { std::mem::transmute((1, 2, 3, 4, 5, 6)) };
    fn main() {
        print!("{}", x);
    }

which actually compiles on stable and will die horribly. |
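The "transmute size checks" remark can be illustrated without any `unsafe` code. The sketch below only compares sizes. It assumes the current standard library layout, in which a `String` occupies three pointer-sized words; that is an implementation detail, not a guarantee. The names `SixInts`, `TUPLE_SIZE`, `STRING_SIZE` and `sizes_match` are made up for this example.

```rust
use std::mem::size_of;

// Hypothetical alias for the tuple type used in the transmute example.
type SixInts = (i32, i32, i32, i32, i32, i32);

// Both sizes are computed during const evaluation.
const TUPLE_SIZE: usize = size_of::<SixInts>();
const STRING_SIZE: usize = size_of::<String>();

// `transmute` only compiles when the source and target sizes are equal.
fn sizes_match() -> bool {
    TUPLE_SIZE == STRING_SIZE
}

fn main() {
    println!(
        "tuple: {} bytes, String: {} bytes, transmutable by size: {}",
        TUPLE_SIZE,
        STRING_SIZE,
        sizes_match()
    );
}
```

On a 64-bit target both sizes should come out equal, which is why the transmute passes the size check there and is rejected on targets with narrower pointers.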
I'm not saying that it should not be allowed to error, rather it should generally be unspecified. It may be entirely unreasonable to do anything other than error. However, there are some cases where diagnosing errors for some intrinsics may be difficult itself.
The issue with pointer casts and transmute involving pointers actually goes with usize, and was something I was made aware of when I looked to C++'s std-proposals list to see if it was viable to lift some
And I presume that if Promotion and required Constant evaluation are evaluated together, it would still be permissible (and in fact, required) that failed promotion downgrades to runtime.
I would assume that producing an invalid value should always be an error, and, absent pointer transmutes, is probably quite easy to diagnose. |
Another question: |
we have not encountered this yet, so we could also just wait until it comes up with a new intrinsic
yes, this is what happens with
it's very expensive to do that check, we could, but it would slow down a lot of totally fine code. We do check final values of constants though, just not the intermediate states. So you could have a bool with the bit pattern
Depends on the method you're suggesting. Ideally you'd just create a new intrinsic and invoke that. It could be as simple as invoking |
@chorman0773 @oli-obk
I don't understand what you mean by downgrading. It is not in the promotion RFC.
Every check of things you don't use is a performance loss, so you don't want to do it "in release builds". |
Keeping in mind that the intrinsics provided by the compiler are not specified as part of the stable language (with the possible exception of the limited ones marked
What I mean here by promotion is I want to convert constant expressions (which is a property of the expression, not of when it is evaluated) to evaluate at compile time, similar to what a C++ compiler can do. If attempting to perform constant evaluation of a non-mandatory constant expression fails, "downgrade" here means that it just abandons constant evaluation and leaves it to runtime. |
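A small sketch may help separate the two situations being discussed. The same `const fn` is used once in a context where compile-time evaluation is mandatory (a `const` item) and once in an ordinary call, where an implementation may evaluate it at runtime. The names `safe_div` and `QUOTIENT` are hypothetical, and the example does not model any particular compiler's "downgrade" behaviour.

```rust
// Hypothetical helper: division that reports a zero divisor instead of panicking.
const fn safe_div(a: u32, b: u32) -> Option<u32> {
    if b == 0 {
        None
    } else {
        Some(a / b)
    }
}

// Mandatory constant evaluation: the initializer of a `const` item
// has to be evaluated at compile time.
const QUOTIENT: Option<u32> = safe_div(10, 2);

fn main() {
    // The divisor is only known at runtime, so this call cannot be
    // folded into a compile-time constant.
    let divisor = std::env::args().count() as u32;
    println!("{:?} {:?}", QUOTIENT, safe_div(10, divisor));
}
```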
Intrinsics are never stable. The I don't think Rust's memory is typed, and thus something like C++'s |
As the name of the intrinsic in question may imply, it's an intrinsic provided by |
What I would propose is the following:
|
Oh awesome! I didn't know about that. Won't your const evaluator need to support checking whether a pointer points to a valid object of the appropriate type anyway? |
Ah interesting, we stop focusing on intrinsics, which are not part of the language anyway and only address the stable standard library API and language features. I wonder if we can specify this in a way that's better than listing specific things that are caught, but would not be averse to just having an explicit list that we grow. One way to specify this in a general way could be to merge the rules for operations/branching and "intrinsics" by including the latter in "operations". Just like |
As a concrete example: I guess under this definition it would be ok for you to ignore overflow in |
An RFC for this is under development.
Good point -- I wasn't fully considering other implementations with a different set of intrinsics when writing the RFC. I think intrinsics should likewise either error or do the "obvious" sensible thing (and if there is nothing "obvious" they must error). I am not sure how to make that wording more precise though. Do you think this makes sense?
If your memory behaves "as-if" it is untyped, then there cannot be any UB caused by "type mismatches" (as those cannot occur in untyped memory), so this should not be a concern relevant for this RFC.
I don't think we want to go over all stdlib functions here and classify them like this. Plus, detecting UB even for the ones you stated would be very expensive (in the way rustc is currently architected), so this is not a good option IMO. The RFC deliberately says nothing about library UB, it is only about language UB. The issue is that intrinsics make this line a bit harder to draw than I'd like.
Well, the RFC also says that this is not a stable guarantee. So it will always be an error until another RFC changes the rules. |
I should rephrase: to the extent necessary to emulate the observable behaviour of the rust abstract machine, it operates as-if memory is untyped (which, pending answers to UCG #2, #84, and #236, merely requires that strict-aliasing is not applied to code compiled from rust). This is distinguishable via the launder builtin, which notably is absent from rust, and present only because it's a C++ intrinsic that needs to communicate with the optimizer.
It was primarily used as an expository argument against blanket requiring intrinsic UB to be diagnosed. Any compiler could come up with some wild intrinsic that has hard-to-diagnose UB. In this case, its existence is serving a compelling interest, implementing the std::launder function from C++ in a way that generally preserves the optimizations it serves to inhibit.
I believe the three I identified would already be diagnosed under the intrinsic rules (transmute is an intrinsic, zeroed calls init, an intrinsic, and unreachable_unchecked is an intrinsic, and rather easy to diagnose, imo). Also, production/copies/moves of trivially-invalid values should be some of the easiest to diagnose, as I presume rustc knows about the bitwise validity invariants of every type.
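As a rough illustration of what a "bitwise validity invariant" looks like, the following sketch spells out the check for `bool`, whose only valid bit patterns are `0x00` and `0x01`. It is written as an ordinary `const fn` and is not meant to describe how rustc or any other compiler checks validity internally. The names `bool_from_byte`, `VALID` and `INVALID` are invented for the example.

```rust
// Hypothetical helper: accept a byte only if it is a valid bit pattern for `bool`.
const fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        // Any other pattern would be an invalid `bool`; producing one
        // (for example via `transmute`) is undefined behaviour.
        _ => None,
    }
}

// Both checks run during const evaluation.
const VALID: Option<bool> = bool_from_byte(1);
const INVALID: Option<bool> = bool_from_byte(3);

fn main() {
    println!("{:?} {:?}", VALID, INVALID);
}
```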
As I stated, I think it should be unspecified whether intrinsics are diagnosed, though maybe promoting it to requiring the implementation to document the intrinsics where it might not diagnose. I'm sure any implementation would choose to diagnose as much as is possible and reasonable to implement. If this is a reasonable direction, I'd propose the following wording:
I'd argue that at the least, it should say that it is unspecified whether library UB is diagnosed. That way, implementations can do so. |
Having a special case just for EDIT: Actually,
The question is, is it easy to implement this intrinsic just ignoring UB? I assume this can be treated like we treat e.g. aliasing violations, where checking memory accesses for UB is actually highly non-trivial due to aliasing constraints, but there is "obvious" behavior to fall back to if this UB cannot be detected: just perform the access ignoring the aliasing rules.
That's fair. OTOH, there is a tension here between documenting precisely what rustc does, and documenting what should apply to all implementations. |
@rfcbot fcp merge Not everyone has read this yet (including me, ha!) but I'm going to propose FCP anyway based on the summary of what is in it, and because this has been extensively discussed. |
Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Folks in @rust-lang/lang can check their boxes. |
I'm assuming that if we hit const UB then rustc will still produce the same binary outputs bit for bit? It would be unfortunate if we couldn't get reproducible build outputs. (I.e. same file hashes) |
In |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
So it does not need a tracking issue.
…metics_as_const, r=joshtriplett Mark unsafe methods NonZero*::unchecked_(add|mul) as const. Now that rust-lang/rfcs#3016 has landed, these two unstable `std` functions can be marked `const`, according to this detail of rust-lang#84186.
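For contrast with the unchecked methods mentioned above, here is a sketch that uses the checked counterparts in const context, where overflow shows up as `None` instead of being UB. It assumes a toolchain in which `NonZeroU32::checked_add` and `NonZeroU32::checked_mul` are callable in const contexts. The helper `nz` and the constants `SUM` and `OVERFLOW` are hypothetical names.

```rust
use std::num::NonZeroU32;

// Hypothetical helper: build a NonZeroU32 in const context. A zero argument
// in a `const` initializer is reported as a compile-time error.
const fn nz(value: u32) -> NonZeroU32 {
    match NonZeroU32::new(value) {
        Some(n) => n,
        None => panic!("value must be non-zero"),
    }
}

// Checked arithmetic evaluated at compile time: overflow yields `None`.
const SUM: Option<NonZeroU32> = nz(40).checked_add(2);
const OVERFLOW: Option<NonZeroU32> = nz(u32::MAX).checked_mul(nz(2));

fn main() {
    println!("{:?} {:?}", SUM, OVERFLOW);
}
```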
This RFC was drafted by me with lots of input from @oli-obk.
Define UB during const evaluation to lead to an unspecified result for the affected CTFE query, but not otherwise infect the compilation process.
Cc @rust-lang/wg-const-eval
Rendered (latest version)