-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform validation in const_eval_raw, return valtree from const_eval and decouple MIR-embedded consts from the rest #72396
Comments
With #73513 we have a |
MCP for the const representation change: rust-lang/compiler-team#323 |
@oli-obk did we agree on names for the two different types representing consts and the corresponding queries? For the types I think we toyed with |
the latter query will be rarely invoked I think. Mostly you'll be using the |
|
My assumption with this plan was that for Does this also apply to larger valtree-representable types, e.g. a big struct full of integers? We could try to get a valtree and fall back to a raw allocation when that fails. Or maybe we just want to do with this scalars, in which case we could check the layout before deciding if we want to ask for a valtree or a raw allocation. |
In another thread, @oli-obk wrote something that confused me a lot:
I do not understand this statement. Are you saying mir::Const will not just be an AllocId? I think that is the wrong approach, I'd rather make the caller that has these weird special needs (i.e., codegen for consts) deal with "I want raw representation except sometimes", than making everyone deal with it. @oli-obk, could you clarify? :) |
We don't need it at all if I'm understanding your proposal correctly, as it would just be a What we do need is for the |
Yeah that sounds accurate.^^
Indeed, I think that's what I had in mind.
Okay, so we actually have two places where we do not want to use I would treat those separately. In particular, codegen is just one of several consumers of the final value of a For But there's still a problem with that... I'd think a enum NotSureWhatToCallThis<T> {
Unevaluated(...),
Param(...),
Error(...),
Evaluated(T)
}
mod ty { type Const = NotSureWhatToCallThis<Valtree>; }
mod mir { type Const = NotSureWhatToCallThis<&'tcx Allocation>; } Then use |
I think we're in agreement here, but I think this wording is more precise: It's something that should be evaluable to a valtree and will error if trying to do so fails.
This is similar to my first attempt, but it failed badly because getting there is very hard with the way I'm not sure we can just use |
👍
Those would be two different types then,
Good point, I had forgotten about that. It would be more like enum MirConstValue {
Literal(Valtree),
Alloc(&'tcx Allocation),
}
mod ty { type Const = NotSureWhatToCallThis<Valtree>; }
mod mir { type Const = NotSureWhatToCallThis<MirConstValue>; } |
We don't really need the full //! The evaluated result of a constant in the MIR.
enum MirConstValue {
Scalar(...),
ScalarPair(...), // or "Slice" or so -- I suppose we cannot entirely avoid this
Alloc(&'tcx Allocation),
}
mod ty { type Const = NotSureWhatToCallThis<Valtree>; }
mod mir { type Const = NotSureWhatToCallThis<MirConstValue>; } |
I just realized |
the change is massive if I do this, I was hoping to give reviewable PRs, but it's a huge huge change with lots of moving parts. I can split it into reviewable commits, but it won't be like each of the commits actually builds successfully |
Hm, I don't think I understand why the change would become massive, or rather why it would become more massive than any other proposal which starts distinguishing If we want to not distinguish
|
I don't think that proposal is per-se harder, but I am currently trying to separate the generics parts of
Yea, and all of these are reasons to go with a scheme like you described, I'm just trying to find a way to get there that isn't as bad as the miri merger (where I did the change from the old constants to the new ones in one go, which caused lots of regressions) |
I think the names we agreed on now are What is not clear to me yet is how to handle the problem that we permit consts in patterns that cannot be converted to valtrees -- raw pointers and function pointers. I assume here that pattern matching needs to use the valtree query? Well I guess we could keep the "destructure" query (on "allocation" constants) around and use it for patterns... we'll need some code like this anyway for creating the valtree in the first place. |
@oli-obk and me discussed a bit more the |
I had another long chat with @oli-obk, here's the design that we are at right now as I understood it. Most of the open questions are around pattern matching really -- we want to make sure that patterns can share infrastructure with const generics as much as is possible and makes sense. |
…RalfJung Validate constants during `const_eval_raw` This PR implements the groundwork for rust-lang#72396 * constants are now validated during `const_eval_raw` * to prevent cycle errors, we do not validate references to statics anymore beyond the fact that they are not dangling * the `const_eval` query ICEs if used on `static` items * as a side effect promoteds are now evaluated to `ConstValue::Scalar` again (since they are just a reference to the actual promoted allocation in most cases).
…lfJung Validate constants during `const_eval_raw` This PR implements the groundwork for rust-lang#72396 * constants are now validated during `const_eval_raw` * to prevent cycle errors, we do not validate references to statics anymore beyond the fact that they are not dangling * the `const_eval` query ICEs if used on `static` items * as a side effect promoteds are now evaluated to `ConstValue::Scalar` again (since they are just a reference to the actual promoted allocation in most cases).
I'm wondering whether we should also have |
I think having We then need to be careful to never use the normal aggregate representation for these types, as that would break equality checking. |
Ugh yea, I forgot about the equality part. I guess let's go with full expansion at first and then check whether doing an optimization like that actually has any benefits |
I would be very surprised if this was not necessary to avoid some enormous regressions, but let's see. |
Since this will only be used for constants in the type system, I don't see what this could regress. We don't have any constants but |
Refactorings in preparation for const value trees cc rust-lang#72396 This PR changes the `Scalar::Bits { data: u128, size: u8 }` variant to `Scalar::Bits(ScalarInt)` where `ScalarInt` contains the same information, but is `repr(packed)`. The reason for using a packed struct is to allow enum variant packing to keep the original size of `Scalar` instead of adding another word to its size due to padding. Other than that the PR just gets rid of all the inspection of the internal fields of `Scalar::Bits` which were frankly scary. These fields have invariants that we need to uphold and we can't do that without making the fields private. r? `@ghost`
…ination, r=varkor stabilize `#![feature(min_const_generics)]` in 1.51 *A new Kind* *A Sort long Prophesized* *Once Fragile, now Eternal* blocked on rust-lang#79073. # Stabilization report This is the stabilization report for `#![feature(min_const_generics)]` (tracking issue rust-lang#74878), a subset of `#![feature(const_generics)]` (tracking issue rust-lang#44580), based on rust-lang/rfcs#2000. The [version target](https://forge.rust-lang.org/#current-release-versions) is ~~1.50 (2020-12-31 => beta, 2021-02-11 => stable)~~ 1.51 (2021-02-111 => beta, 2021-03-25 => stable). This report is a collaborative effort of `@varkor,` `@shepmaster` and `@lcnr.` ## Summary It is currently possible to parameterize functions, type aliases, types, traits and implementations by types and lifetimes. With `#![feature(min_const_generics)]`, it becomes possible, in addition, to parameterize these by constants. This is done using the syntax `const IDENT: Type` in the parameter listing. Unlike full const generics, `min_const_generics` is limited to parameterization by integers, and constants of type `char` or `bool`. We already use `#![feature(min_const_generics)]` on stable to implement many common traits for arrays. See [the documentation](https://doc.rust-lang.org/nightly/std/primitive.array.html) for specific examples. Generic const arguments, for now, are not permitted to involve computations depending on generic parameters. This means that const parameters may only be instantiated using either: 1. const expressions that do not depend on any generic parameters, e.g. `{ foo() + 1 }`, where `foo` is a `const fn` 1. standalone const parameters, e.g. `{N}` ### Example ```rust #![feature(min_const_generics)] trait Foo<const N: usize> { fn method<const M: usize>(&mut self, arr: [[u8; M]; N]); } struct Bar<T, const N: usize> { inner: [T; N], } impl<const N: usize> Foo<N> for Bar<u8, N> { fn method<const M: usize>(&mut self, arr: [[u8; M]; N]) { for (elem, s) in self.inner.iter_mut().zip(arr.iter()) { for &x in s { *elem &= x; } } } } fn function<const N: u16>() -> u16 { // Const parameters can be used freely inside of functions. (N + 1) / 2 * N } fn main() { let mut bar = Bar { inner: [0xff; 3] }; // This infers the value of `M` from the type of the function argument. bar.method([[0b11_00, 0b01_00], [0b00_11, 0b00_01], [0b11_00, 0b00_11]]); assert_eq!(bar.inner, [0b01_00, 0b00_01, 0b00_00]); // You can also explicitly specify the value of `N`. assert_eq!(function::<17>(), 153); } ``` ## Motivation Rust has the built-in array type, which is parametric over a constant. Without const generics, this type can be quite cumbersome to use as it is not possible to generically implement a trait for arrays of different lengths. For example, this meant that, for a long time, the standard library only contained trait implementations for arrays up to a length of 32. This restriction has since been lifted through the use of const generics. Const parameters allow users to naturally specify variants of a generic type which are more naturally parameterized by values, rather than by types. For example, using const generics, many of the uses of the crate [typenum](https://crates.io/crates/typenum) may now be replaced with const parameters, improving compilation time as well as code readability and diagnostics. The subset described by `min_const_generics` is self-contained, but extensive enough to help with the most frequent issues: implementing traits for arrays and using arbitrarily-sized arrays inside of other types. Furthermore, it extends naturally to full `const_generics` once the remaining design and implementation questions have been resolved. ## In-depth feature description ### Declaring const parameters *Const parameters* are allowed in all places where types and lifetimes are supported. They use the syntax `const IDENT: Type`. Currently, const parameters must be declared after lifetime and type parameters. Their scope is equal to the scope of other generic parameters. They live in the value namespace. `Type` must be one of `u8`, `u16`, `u32`, `u64`, `u128`, `usize`, `i8`, `i16`, `i32`, `i64`, `i128`, `isize`, `char` and `bool`. This restriction is implemented in two places: 1. during name resolution, where we forbid generic parameters 1. during well-formedness checking, where we only allow the types listed above The updated syntax of parameter listings is: ``` GenericParams: (OuterAttr* LifetimeParam),* (OuterAttr* TypeParam),* (OuterAttr* ConstParam),* OuterAttr: '#[' ... ']' LifetimeParam: ... TypeParam: ... ConstParam: 'const' IDENT ':' Type ``` Unlike type and lifetime parameters, const parameters of types can be used without being mentioned inside of a parameterized type because const parameters do not have issues concerning variance. This means that the following types are allowed: ```rust struct Foo<const N: usize>; enum Bar<const M: usize> { A, B } ``` ### Const arguments Const parameters are instantiated using *const arguments*. Any concrete const expression or const parameter as a standalone argument can be used. When applying an expression as const parameter, most expressions must be contained within a block, with two exceptions: 1. literals and single-segment path expressions 1. array lengths This syntactic restriction is necessary to avoid ambiguity, or requiring infinite lookahead when parsing an expression as a generic argument. In the cases where a generic argument could be resolved as either a type or const argument, we always interpret it as a type. This causes the following test to fail: ```rust type N = u32; struct Foo<const N: usize>; fn foo<const N: usize>() -> Foo<N> { todo!() } // ERR ``` To circumvent this, the user may wrap the const parameter with braces, at which point it is unambiguously accepted. ```rust type N = u32; struct Foo<const N: usize>; fn bar<const N: usize>() -> Foo<{ N }> { todo!() } // ok ``` Operations depending on generic parameters are **not** allowed, which is enforced during well-formedness checking. Allowing generic unevaluated constants would require a way to check if they would always evaluate successfully to prevent errors that are not caught at declaration time. This ability forms part of `#![feature(const_evaluatable_checked)]`, which is not yet being stabilised. Since we are not yet stabilizing `#![feature(lazy_normalization_consts)]`, we must not supply the parent generics to anonymous constants except for repeat expressions. Doing so can cause cycle errors for arrays used in `where`-bounds. Not supplying the parent generics can however lead to ICEs occurring before well-formedness checking when trying to use a generic parameter. See rust-lang#56445 for details. Since we expect cases like this to occur more frequently once `min_const_generics` is stabilized, we have chosen to forbid generic parameters in anonymous constants during name resolution. While this changes the ICE in the situation above to an ordinary error, this is theoretically a breaking change, as early-bound lifetimes were previously permitted in repeat expressions but now are disallowed, causing the following snippet to break: ```rust fn late_bound<'a>() { let _ = [0; { let _: &'a (); // ICE ==> ERR 3 }]; } fn early_bound<'a>() where &'a (): Sized { let _ = [0; { let _: &'a (); // ok ==> ERR 3 }]; } ``` ### Using const parameters Const parameters can be used almost everywhere ordinary constants are allowed, except that they may not be used in the construction of consts, statics, functions, or types inside a function body and are subject to the generic argument restrictions mentioned above. Expressions containing const parameters are eligible for promotion: ```rust fn test<const N: usize>() -> &'static usize { &(3 + N) } ``` ### Symbol mangling See the [Rust symbol name mangling RFC](https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html) for an overview. Generic const parameters take the form `K[type][value]` when the value is known, or `Kp` where the value is not known, where: - `[type]` is any integral type, `bool`, or `char`. - `[value]` is the unsigned hex value for integers, preceded by `n` when negative; is `0` or `1` for `bool`; is the hex value for `char`. ### Exhaustiveness checking We do not check the exhaustiveness of impls, meaning that the following example does **not** compile: ```rust struct Foo<const B: bool>; trait Bar {} impl Bar for Foo<true> {} impl Bar for Foo<false> {} fn needs_bar(_: impl Bar) {} fn generic<const B: bool>() { let v = Foo::<B>; needs_bar(v); } ``` ### Type inference The value of const parameters can be inferred during typeck. One interesting case is the length of generic arrays, which can also be inferred from patterns (implemented in rust-lang#70562). Practical usage of this can be seen in rust-lang#76825. ### Equality of constants `#![feature(min_const_generics)]` only permits generic parameters to be used as standalone generic arguments. We compare two parameters to be equal if they are literally the same generic parameter. ### Associated constants Associated constants can use const parameters without restriction, see rust-lang#79135 (comment) for more details. ## Future work As this is a limited subset of rust-lang/rfcs#2000, there are quite a few extensions we will be looking into next. ### Lazy normalization of constants Stabilizing `#![feature(lazy_normalization_consts)]` (tracking issue rust-lang#72219) will remove some special cases that are currently necessary for `min_const_generics`, and unblocks operations on const parameters. ### Relaxing ordering requirements between const and type parameters We currently restrict the order of generic parameters so that types must come before consts. We could relax this, as is currently done with `const_generics`. Without this it is not possible to use both type defaults and const parameters at the same time. Unrestricting the order will require us to improve some diagnostics that expect there to be a strict order between type and const parameters. ### Allowing more parameter types We would like to support const parameters of more types, especially`&str` and user-defined types. Both are blocked on [valtrees]. There are also open questions regarding the design of `structural_match` concerning the latter. Supporting generic const parameter types such as `struct Foo<T, const N: T>` will be a lot harder and is unlikely to be implemented in the near future. ### Default values of const parameters We do not yet support default values for const parameters. There is work in progress to enable this on nightly (see rust-lang#75384). ### Generic const operations With `#![feature(min_const_generics)]`, only concrete const expressions and parameters as standalone arguments are allowed in types and repeat expressions. However, supporting generic const operations, such as `N + 1` or `std::mem::size_of::<T>()` is highly desirable. This feature is in early development under `#![feature(const_evaluatable_checked)]`. ## Implementation history Many people have contributed to the design and implementation of const generics over the last three years. See rust-lang#44580 (comment) for a summary. Once again thank you to everybody who helped out here! [valtrees]: rust-lang#72396 --- r? `@varkor`
I think this is resolved? The main const-eval query is There is more cleanup that I can imagine, but it should probably be in a new issue. |
Right now, we have two "const_eval" queries:
const_eval_raw
is internal and returns a "by-reference" constant (MPlaceTy
) pointing to some global allocation intcx.alloc_map
. Andconst_eval
is the entry point supposed to be used by the rest of the compiler, and it returns aty::Const
, representing some constants of some types "by-value".Beyond the by-ref to by-val conversion,
const_eval
also performs validation of the constant to make sure it matches its type. The fact thatconst_eval_raw
does not do that is a constant source of concern for me, e.g. we have somedelay_span_bug
in that query that basically rely on validation happening eventually and I think some of those might actually be "exploitable" ICEs.Furthermore, "validated or not" as a question seems separate from "by-val or by-ref". By-val representation is appropriate for many uses of
const
(in particular around array lengths and pattern matching), but AFAIK never makes any sense forstatic
. We currently force statics to go throughty::Const
which seems odd. In fact, in the context of rust-lang/const-eval#42 and #70889, we determined that pattern matching and const generics likely want a different, even more higher-level representation of const values than what we currently have forty::Const
.So I propose we perform validation in
const_eval_raw
(whose name should be changed then... maybeconst_eval_ref
?). Then all thesedelay_span_bug
issues disappear. But the query should still return anMPlaceTy
. The idea is that this is the query used forstatic
and other clients (like Miri) that can handle constants of any type and only care about their value at the byte level.Then we can have a second query which, like current
const_eval
(new name maybeconst_eval_val
?), returnsty::Const
. This would only be used for things like array lengths, pattern matching, const generics -- things which need a more "structured" view of consts thanMPlaceTy
provides. I think this would be a great place to put some check that "all allocations reachable from a const are read-only" (as proposed here). Also, my hypothesis is that this query should only ever be called on "structural match" types -- maybe we can assert that? Long-term, this could become the "integer tree" representation @eddyb proposed.The main technical hurdle is that we have to avoid query cycles. That's why validation was split out into a separate pass in the first place. We can achieve that by stopping validation when we hit a
GlobalAlloc::Static(DefId)
allocation. I think that is reasonable -- we already stop there if theDefId
is in a different crate, so we cannot rely on this for soundness anyway.@rust-lang/wg-const-eval does this all make sense?
I think if this works it quite be a really nice cleanup, and I would finally be much less bothered by the existence of two queries and two different representations of consts. :D Some time ago I thought we could kill
ty::Const
and use by-ref consts for everything, and I think @oli-obk even tried that and it went nowhere. But "by-ref" just seemed like the canonical representations of consts, and we ended up with ugly hack where whether we are in a static affects how we convertMPlaceTy
intoty::Const
... ugh. Now I finally understand what that is the case: we have two different kinds of consumers of consts in the compiler, with different needs, and they should use different queries with different return types. Some are happy withMPlaceTy
, others require a more high-level view. I feel like we finally found the right structure to represent this. :DHere's that design worked out a bit more.
The text was updated successfully, but these errors were encountered: