CTFE keeps values in padding, breaking structural equality of const generic args. #70889
Comments
Actually, I think the error is because of a different bug: the allocation values in the error look identical. I only noticed that because I expected this to work, but it errors (playground):
trait Foo {}
impl Foo for Phantom<PADDED> {}
impl Foo for Phantom<FILLED> {}
And the indirect version errors in an unexpected way, probably because different parts of the compiler handle by-ref |
TBH, at least from a pure const-eval perspective, this seems fine to me. So I think it would be acceptable if const generics used "identity of the underlying storage". But maybe I just want to avoid implementation complexity. :P Though, on the other hand... does this mean that if we ever change the Miri engine to no longer preserve padding on copies, that would be a breaking change with const-generics? That seems quite problematic, then. |
It's not,
Not if you treat it as a constructor tree with integral leaves. I agree that anything else is hard, but const generics must fundamentally use purely structural values for the typesystem to be able to embed them. |
Hm... I suppose things are easier because we only have to compare values of equal types, so we don't actually need the full complexity of the "value domain". But still, you keep talking about "trees" where if you turn the memory links into edges, you actually get a general graph. What notion would you use for graph equivalence? Two And what about floating point leaves? NaNs are not usually equal to themselves even. |
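To make the floating-point concern concrete, here is a minimal sketch contrasting `==` on floats with a comparison of the raw bit patterns. `bitwise_eq` and `nan_reflexivity` are hypothetical helpers written for illustration, not compiler APIs.

```rust
/// Compares two `f32` values by their IEEE 754 bit patterns instead of `==`.
/// (Hypothetical helper for illustration only.)
fn bitwise_eq(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

/// Returns `(equal under ==, equal under bit comparison)` for a NaN
/// compared with itself.
fn nan_reflexivity() -> (bool, bool) {
    let nan = f32::NAN;
    (nan == nan, bitwise_eq(nan, nan))
}
```

Under `==`, NaN is not equal to itself, while `0.0` and `-0.0` are equal despite having different bits. A bitwise comparison flips both answers, so neither notion lines up with the other for float leaves.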
It's a DAG (enforced by miri atm IIRC, or at least I remember @oli-obk implying that it is enforced), where sharing is an unobservable implementation optimization (same as with
Of course, const generics are at least as restrictive as pattern-matching, with its structural
Floating-point values are not structurally It's funny you should mention this, given the history of type-level values and floating-point. |
Once const-code can mutate things, it's not a DAG any more but a general graph.
Oh I see, so I cannot use just any type in const generics, but only types that have certain properties? Okay that's a totally different game, then the "value domain" is even more restricted of course. I think enforcing via some validity checks that padding is "undef" is the wrong approach though. It seems much more sensible to me to have a smarter comparison function that takes the type of the to-be-compared values into account -- basically, an interpreter-level implementation of |
Sorry, unobservable by type equality. The same way you can't tell from within the language that this always results in a non-tree DAG of interned types, for any type passed to it:
type Dup<X> = (X, X);
e.g.:
Dup<u8> = (u8, u8)
            \  /
             u8
The typesystem does not admit such an approach for normalized types, a type can either:
Once you put a constant value in a type, all of the same rules apply to it as well.
|
Okay, so type equality has its own notion of equivalence, distinct from what CTFE can observe. Are there some crucial properties that relate the two, like one being strictly more fine-grained than the other or so? I suppose CTFE-observable-equal consts should always be type-equal (because why not), but I am not sure if that is crucial for soundness. You seem to say that if If equality of references is defined by equality of the referent, then indeed a tree makes sense. I suppose raw pointers will not be permitted?
So you are saying, for the type system we need to define equality by normalization, not via a comparison function. I can see how that helps with interning etc, but it is quite the complication here. However, I think I'd be opposed to banning non-undef padding in "normal" CTFE results. That's a huge foot-gun, it would break consts like I suppose we could apply "typed copy" rules on the result of CTFE, and "reset" the padding on the final CTFE value, e.g. as part of interning. But this will be really hard to do for "nested" allocations where we might not have reliable type information, so that's not great either. In that light, transforming the constant into some kind of tree representation designed specifically for this purpose makes a lot of sense indeed. Hopefully this can share at least something with |
The reason we require "structurally matchable" ADTs (i.e. with What's crucial about soundness is how this is all implemented in the compiler. So variation in undef bytes might be allowable (the same way bitcast NaN floats would be if we went the "bitwise comparison even when it doesn't match I suppose the thing that would really not be is addresses (of anything other than a Overall, I'm not sure what the best path forward is, but I'd rather start out with very tight restrictions as far as "Trees of ADTs/references with integer leaves" is the "obviously pure and sound" subset, everything else is a minefield that has to be carefully navigated, and I'd prefer not to let in haphazard implementation-dictated behavior.
We might be doing that already, see #70889 (comment). I'm not sure what's going on, maybe |
I highly doubt it, that would be a non-trivial piece of code.
Fully agreed.
Could it be a |
@eddyb I recently learned that raw pointers and function pointers are also allowed as constants in patterns -- and thus maybe also in const generics? For those, your integer-based "value tree" doesn't work as their equality test compares pointers, which are nothing like integers. |
Const generics now statically exclude function pointers and raw pointers. Not sure how that holds up if they are fields of aggregates, but once we move to a value tree format, we won't see any of this anymore anyway. |
@RalfJung Yes, and they're "nothing like integers" in ways that are not suitable for embedding into a typesystem, IMO. I would be fine with raw pointers containing integer addresses, and maybe even those pointing to |
For now I propose we just do not allow raw pointers and function pointers when converting to the valtree (which will be a type-directed translation I assume). |
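A rough sketch of what such a type-directed translation could look like, assuming a toy type and value model. `Ty`, `Value`, `ValTree` and `to_valtree` are all hypothetical names for illustration and do not reflect rustc's actual data structures. References are treated as constructors, integers become leaves, and raw or function pointers are rejected.

```rust
/// Hypothetical value tree: constructors with integer leaves.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum ValTree {
    Leaf(u128),
    Branch(Vec<ValTree>),
}

/// Toy model of the types that drive the translation (assumption).
#[derive(Debug, Clone)]
enum Ty {
    Int,
    Ref(Box<Ty>),
    Tuple(Vec<Ty>),
    RawPtr,
    FnPtr,
}

/// Toy model of evaluated constant values (assumption).
#[derive(Debug, Clone)]
enum Value {
    Int(u128),
    Ref(Box<Value>),
    Tuple(Vec<Value>),
    Ptr(usize),
}

/// Converts a value to a tree, guided by its type.
/// Returns `None` for raw/fn pointers and for type/value mismatches.
fn to_valtree(ty: &Ty, val: &Value) -> Option<ValTree> {
    match (ty, val) {
        (Ty::Int, Value::Int(n)) => Some(ValTree::Leaf(*n)),
        // `&_` is treated as a constructor with a single child.
        (Ty::Ref(inner), Value::Ref(v)) => {
            Some(ValTree::Branch(vec![to_valtree(inner, v)?]))
        }
        (Ty::Tuple(tys), Value::Tuple(vals)) if tys.len() == vals.len() => tys
            .iter()
            .zip(vals)
            .map(|(t, v)| to_valtree(t, v))
            .collect::<Option<Vec<_>>>()
            .map(ValTree::Branch),
        // Raw pointers, function pointers, and anything ill-typed are rejected.
        _ => None,
    }
}
```

Because the result derives `Eq` and `Hash`, two constants of the same type are equal exactly when their trees are equal. Equality by normalization therefore comes for free, and a pointer anywhere inside an aggregate makes the whole conversion fail.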
This example should do one of these three things, but it doesn't (playground):
- compile (PADDED == FILLED after all, field-wise)
- error at FILLED's definition (due to the evaluated constant not fitting the type)
- error when FILLED is used as an argument to Phantom
Instead, Phantom<PADDED> and Phantom<FILLED> are considered different types.
If we want to make it compile, we could normalize FILLED to also have that padding byte marked as "undef", but I'm not sure if we can do this normalization if the values were behind a reference.
So we might want to error if e.g. &[0u8; 4] was transmuted to &(u8, u16), because normalizing it would change what runtime code would see. Again, we have two places where we can error.
If we want to error without causing any breaking changes, either always or just in the indirect case, we can do so by introducing a second, stricter, validity check in ty::Const Well-Formed rules (which we should be able to do post-#70107).
That check should enforce that the value is a tree of constructors (&_ would be treated as a constructor) with integer leaves (no relocations, i.e. no raw/fn pointers), where any user ADTs are structurally-matchable, and all padding bytes (not occupied by leaves) are "undef".
cc @rust-lang/wg-const-eval @varkor @yodaldevoid