CTFE keeps values in padding, breaking structural equality of const generic args. #70889

eddyb · 2020-04-07T15:00:09Z

This example should do one of these three things, but it doesn't (playground):

compile successfully (PADDED == FILLED after all, field-wise)
error on FILLED's definition (due to evaluated constant not fitting type)
error when FILLED is used as an argument to Phantom

#![feature(const_generics)]

union Transmute<T: Copy, U: Copy> {
    from: T,
    to: U,
}

const PADDED: (u8, u16) = (0, 0);
const FILLED: (u8, u16) = unsafe { Transmute { from: [0u8; 4] }.to };

struct Phantom<const X: (u8, u16)>;

fn test(x: Phantom<PADDED>) -> Phantom<FILLED> {
    x
}

Instead, Phantom<PADDED> and Phantom<FILLED> are considered different types.

If we want to make it compile, we could normalize FILLED to also have that padding byte marked as "undef", but I'm not sure if we can do this normalization if the values were behind a reference.

So we might want to error if e.g. &[0u8; 4] was transmuted to &(u8, u16), because normalizing it would change what runtime code would see. Again, we have two places where we can error.

If we want to error without causing no breaking changes, either always or just indirect case, we can do so by introducing a second, stricter, validity check in ty::Const Well-Formed rules (which we should be able to post-#70107).
That check should enforce that the value is a tree of constructors (&_ would be treated as a constructor) with integer leaves (no relocations, i.e. no raw/fn pointers), where any user ADTs are structurally-matchable, and all padding bytes (not occupied by leaves) are "undef".

cc @rust-lang/wg-const-eval @varkor @yodaldevoid

The text was updated successfully, but these errors were encountered:

eddyb · 2020-04-07T15:23:25Z

Actually, I think the error is because of a different bug, the allocation values in error the look identical.

I only noticed that because I expected this to work, but it errors (playground):

trait Foo {}
impl Foo for Phantom<PADDED> {}
impl Foo for Phantom<FILLED> {}

And the indirect version errors in an unexpected way, probably because different parts of the compiler handle by-ref ty::Consts differently.

RalfJung · 2020-04-07T16:22:22Z

Instead, Phantom<PADDED> and Phantom<FILLED> are considered different types.

TBH, at least from a pure const-eval perspective, this seems fine to me.
Actually formalizing Rust's "value domain" is a huge undertaking, and even then (ignoring const generics) I am not sure if we want to use that in const-eval.

So I think it would be acceptable if const generics used "identity of the underlying storage". But maybe I just want to avoid implementation complexity. :P

Though, on the other hand... does this mean that if we ever change the Miri engine to no longer preserve padding on copies, that would be a breaking change with const-generics? That seems quite problematic, then.

eddyb · 2020-04-07T18:09:14Z

TBH, at least from a pure const-eval perspective, this seems fine to me.

It's not, v0 mangling will require a pure tree of integer leaves to function. Encoding the underlying representation is not an option (a hash thereof might be, but it's a terrible compromise).

Actually formalizing Rust's "value domain" is a huge undertaking

Not if you treat it as a constructor tree with integral leaves. I agree that anything else is hard, but const generics must fundamentally use purely structural values for the typesystem to be able to embed them.

RalfJung · 2020-04-07T18:18:52Z

Not if you treat it as a constructor tree with integral leaves.

Hm... I suppose things are easier because we only have to compare values of equal types, so we don't actually need the full complexity of the "value domain".

But still, you keep talking about "trees" where if you turn the memory links into edges, you actually get a general graph. What notion would you use for graph equivalence? Two (&i32, &i32) where in one both have the same address but in the other we have two distinct shared references -- do you consider them equal?

And what about floating point leaves? NaNs are not usually equal to themselves even.

eddyb · 2020-04-07T19:02:26Z

But still, you keep talking about "trees" where if you turn the memory links into edges, you actually get a general graph.

It's a DAG (enforced by miri atm IIRC, or at least I remember @oli-obk implying that it is enforced), where sharing is an unobservable implementation optimization (same as with Ty). Therefore: tree.

Two (&i32, &i32) where in one both have the same address but in the other we have two distinct shared references -- do you consider them equal?

Of course, const generics are at least as restrictive as pattern-matching, with its structural == requirement. There's no concept of "address", and interning must be effective enough to normalize such differences (except for references to statics but those can be disallowed entirely in values passed to const generics).

And what about floating point leaves? NaNs are not usually equal to themselves even.

Floating-point values are not structurally ==, nor Eq.

It's funny you should mention this, given the history of type-level values and floating-point.
IIRC, idris-lang/Idris-dev#2609 was discovered after the subject of floating-point values in future Rust const-generics was brought up in #rust-offtopic - also see rust-lang/rfcs#1038 (comment).

RalfJung · 2020-04-08T16:59:13Z

It's a DAG (enforced by miri atm IIRC, or at least I remember @oli-obk implying that it is enforced), where sharing is an unobservable implementation optimization (same as with Ty). Therefore: tree.

Once const-code can mutate things, it's not a DAG any more but a general tree.
Also, why is sharing unobservable, assuming CTFE allows raw ptr comparisons eventually?

Of course, const generics are at least as restrictive as pattern-matching, with its structural == requirement.

Oh I see, so I cannot use just any type in const generics, but only types that have certain properties? Okay that's a totally different game, then the "value domain" is even more restricted of course.

I enforcing via some validity checks that padding is "undef" is the wrong approach though. It seems much more sensible to me to have a smarter comparison function that takes the type of the to-be-compared values into account -- basically, an interpreter-level implementation of == for supported types.

eddyb · 2020-04-08T17:39:30Z

Also, why is sharing unobservable, assuming CTFE allows raw ptr comparisons eventually?

Sorry, unobservable by type equality. The same way you can't tell from within the language that this always results in a non-tree DAG of interned types, for any type passed to it:

type Dup<X> = (X, X);

e.g.:

Dup<u8> = (u8, u8)
           \   /
             u8

It seems much more sensible to me to have a smarter comparison function that takes the type of the to-be-compared values into account

The typesystem does not admit such an approach for normalized types, a type can either:

have "unnormalized" components (like ty::Projection or ty::ConstKind::Unevaluated)
- normalization should fully replace these components, when monomorphic
- these components must be cheaply detectable
be equal to itself, and only to itself, in its generic "domain" (and modulo lifetime subtyping)
- when monomorphic, this is even stronger, as the entire type is known

Once you put a constant value in a type, all of the same rules apply to it as well.
If you have some way to create structurally-equal but distinct constants, you can do one of:

ban them in types (via WF checks)
normalize the difference away before they ever reach types
track them as unnormalized and have a normalization rule
- this is 2. but deferred and probably the worst of all worlds, implementation-wise

RalfJung · 2020-04-09T09:15:44Z

Sorry, unobservable by type equality.

Okay, so type equality has its own notion of equivalence, distinct from what CTFE can observe. Are there some crucial properties that relate the two, like one being strictly more fine-grained than the other or so? I suppose CTFE-observable-equal consts should always be type-equal (because why not), but I am not sure if that is crucial for soundness.

You seem to say that if CONST1 == CONST2 (in the sense of looking up the PartialEq instance and evaluating that code) then they must be type-equal. That seems sensible but I do not understand why it is crucial for soundness. Is this all written down somewhere?

If equality of references is defined by equality of the referent, then indeed a tree makes sense. I suppose raw pointers will not be permitted?

The typesystem does not admit such an approach for normalized types

So you are saying, for the type system we need to define equality by normalization, not via a comparison function. I can see how that helps with interning etc, but it is quite the complication here.

However, I think I'd be opposed to banning non-undef padding in "normal" CTFE results. That's a huge foot-gun, it would break consts like MaybeUninit::<(u8, u16)>::zeroed().assume_init(). And if we ban it just for consts used in types, that's still a big foot-gun as it means a random CONST I find somewhere could work fine for normal use until I try to put it into a type, and then the compiler complains about it having non-undef padding.

I suppose we could apply "typed copy" rules on the result of CTFE, and "reset" the padding on the final CTFE value, e.g. as part of interning. But this will be really hard to do for "nested" allocations where we might not have reliable type information, so that's not great either.

In that light, transforming the constant into some kind of tree representation designed specifically for this purpose makes a lot of sense indeed. Hopefully this can share at least something with ty::Const so we can avoid having three representations of constants in the compiler...

eddyb · 2020-04-09T18:17:57Z

You seem to say that if CONST1 == CONST2 (in the sense of looking up the PartialEq instance and evaluating that code) then they must be type-equal. That seems sensible but I do not understand why it is crucial for soundness. Is this all written down somewhere?

The reason we require "structurally matchable" ADTs (i.e. with #[derive(PartialEq, Eq)], not custom) is mostly for consistency (with regular == and patterns), because anything embedded into the typesystem is closer to a bitwise comparison (except references, which recurse), and if we ever extend this in the future, it could require e.g. const impl PartialEq (although this seems unlikely).

What's crucial about soundness is how this is all implemented in the compiler.
Oh and we need to be able to hash types (for TypeId and symbol names - the new mangling scheme is even stricter, but worst case it could fall back to hashes), in a way that, barring hash collisions, == on Ty is equivalent with first applying Hash/StableHash (the latter having to remain stable between compilations) and then comparing.

So variation in undef bytes might be allowable (the same way bitcast NaN floats would be if we went the "bitwise comparison even when it doesn't match ==).

I suppose the thing that would really not be is addresses (of anything other than a static), because there's no way to "purify" them, other than interning them by-value (this might also mean non-const-generic uses of const-evaluation may be incrementally-unsound without said interning).

Overall, I'm not sure what the best path forward is, but I'd rather start out with very tight restrictions as far as const generics are concerned, and try to go from there.

"Trees of ADTs/references with integer leaves" is the "obviously pure and sound" subset, everything else is a minefield that has to be carefully navigated, and I'd prefer not to let in haphazard implementation-dictated behavior.

I suppose we could apply "typed copy" rules on the result of CTFE, and "reset" the padding on the final CTFE value, e.g. as part of interning.

We might be doing that already, see #70889 (comment).

I'm not sure what's going on, maybe ConstValues can compare equal but result in distinct ty::Consts?

RalfJung · 2020-04-10T16:51:45Z

We might be doing that already,

I highly doubt it, that would be a non-trivial piece of code.
Miri might want something like that, too, but for every copy.

Overall, I'm not sure what the best path forward is, but I'd rather start out with very tight restrictions as far as const generics are concerned, and try to go from there.

Fully agreed.

I'm not sure what's going on, maybe ConstValues can compare equal but result in distinct ty::Consts?

Could it be a ByRef vs. "immediate representation" (the other ConstValue variants) thing? I think what we do is normalizing in the sense that constants with the same type should result in the same representation, but I know that at least when I reviewed that code I didn't know that would be important one day.

RalfJung · 2020-05-05T08:10:14Z

@eddyb I recently learned that raw pointers and function pointers are also allowed as constants in patterns -- and thus maybe also in const generics? For those, your integer-based "value tree" doesn't work as their equality test compares pointers, which are nothing like integers.

oli-obk · 2020-06-28T09:57:58Z

Const generics now statically exclude function pointers and raw pointers. Not sure how that holds up if they are fields of aggregates, but once we move to a value tree format, we won't see any of this anymore anyway.

eddyb · 2020-07-16T03:07:43Z

@RalfJung Yes, and they're "nothing like integers" in ways that are not suitable for embedding into a typesystem, IMO.

I would be fine with raw pointers containing integer addresses, and maybe even those pointing to statics (which have unique addresses, so we can use the identity of the static as a "nominal" value), but anything else I'd avoid.

RalfJung · 2020-07-16T07:44:55Z

For now I propose we just do not allow raw pointers and function pointers when converting to the valtree (which will be a type-directed translation I assume).

lcnr · 2022-03-30T11:25:51Z

#70889 (comment)

Also, why is sharing unobservable, assuming CTFE allows raw ptr comparisons eventually?

This is the only thing not currently tracked in #29 or the valtree design. Opened a topic for this on zulip and am going to close this issue.

eddyb added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. A-const-generics Area: const generics (parameters and arguments) F-const_generics `#![feature(const_generics)]` labels Apr 7, 2020

eddyb mentioned this issue Apr 7, 2020

Enforce a whitelist on the constant primitive types allowed in patterns. #70872

Closed

pnkfelix added the requires-nightly This issue requires a nightly compiler in some way. label Apr 7, 2020

eddyb mentioned this issue Apr 7, 2020

function pointers as match patterns have optimization-dependent behavior #70861

Closed

RalfJung mentioned this issue May 2, 2020

Why can consts not refer to statics? rust-lang/const-eval#11

Open

RalfJung mentioned this issue May 20, 2020

Perform validation in const_eval_raw, return valtree from const_eval and decouple MIR-embedded consts from the rest #72396

Closed

oli-obk added the A-valtree Area: Value trees or fixed by value trees label Mar 15, 2021

lcnr closed this as completed Mar 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

CTFE keeps values in padding, breaking structural equality of const generic args. #70889

CTFE keeps values in padding, breaking structural equality of const generic args. #70889

eddyb commented Apr 7, 2020

eddyb commented Apr 7, 2020 •

edited

Loading

Uh oh!

RalfJung commented Apr 7, 2020

Uh oh!

eddyb commented Apr 7, 2020

Uh oh!

RalfJung commented Apr 7, 2020

Uh oh!

eddyb commented Apr 7, 2020 •

edited

Loading

Uh oh!

RalfJung commented Apr 8, 2020

Uh oh!

eddyb commented Apr 8, 2020

Uh oh!

RalfJung commented Apr 9, 2020

Uh oh!

eddyb commented Apr 9, 2020

Uh oh!

RalfJung commented Apr 10, 2020

Uh oh!

RalfJung commented May 5, 2020 •

edited

Loading

Uh oh!

oli-obk commented Jun 28, 2020

Uh oh!

eddyb commented Jul 16, 2020

Uh oh!

RalfJung commented Jul 16, 2020

Uh oh!

lcnr commented Mar 30, 2022

Uh oh!

CTFE keeps values in padding, breaking structural equality of const generic args. #70889

CTFE keeps values in padding, breaking structural equality of const generic args. #70889

Comments

eddyb commented Apr 7, 2020

eddyb commented Apr 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RalfJung commented Apr 7, 2020

Uh oh!

eddyb commented Apr 7, 2020

Uh oh!

RalfJung commented Apr 7, 2020

Uh oh!

eddyb commented Apr 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RalfJung commented Apr 8, 2020

Uh oh!

eddyb commented Apr 8, 2020

Uh oh!

RalfJung commented Apr 9, 2020

Uh oh!

eddyb commented Apr 9, 2020

Uh oh!

RalfJung commented Apr 10, 2020

Uh oh!

RalfJung commented May 5, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oli-obk commented Jun 28, 2020

Uh oh!

eddyb commented Jul 16, 2020

Uh oh!

RalfJung commented Jul 16, 2020

Uh oh!

lcnr commented Mar 30, 2022

Uh oh!

eddyb commented Apr 7, 2020 •

edited

Loading

eddyb commented Apr 7, 2020 •

edited

Loading

RalfJung commented May 5, 2020 •

edited

Loading