Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When exactly is a constant well formed? #49

Open
lcnr opened this issue Jul 29, 2020 · 21 comments
Open

When exactly is a constant well formed? #49

lcnr opened this issue Jul 29, 2020 · 21 comments

Comments

@lcnr
Copy link

lcnr commented Jul 29, 2020

So my current understanding is that all constants used in types must be well formed.

I wasn't able to find any discussion on what exactly that means though 😅

I think that for concrete constants the following must hold:

Constants evaluate successfully

This means that all of 1usize - 2, panic!(), loop {} and [1, 2, 3][3] are not well formed as constants.

An evaluated constant must satisfy all lang invariants of its type

So something like unsafe { std::mem::transmute::<u8, bool>(2) } is not wf.
I do think that the following would be well formed though, as it only breaks library invariants:

const V: &'static str = unsafe { std::mem::transmute::<&[u8], &str>(&[0xff, 0xff, 0xff, 0xff]) };

The more interesting case are polymorphic constants as we can't just evaluate them.
For these I think the following rule is appropriate

A polymorphic constant is well formed, iff all possible concrete instances are well formed

Some examples:

use std::mem::size_of;

fn foo<T>() -> [u8; size_of::<T>()] {}
// well formed, as `size_of::<T>()` returns a wf constant no matter the type of `T`.

fn bar<T>() -> [u8; 64 / size_of::<T>()] {}
// not well formed, as `64 / size_of::<()>()` does not evaluate successfully and is therefore not wf.

struct Baz<const V: bool>;
fn baz<T>() -> Baz<{ unsafe { std::mem::transmute::<u8, bool>(size_of::<T>() as u8) }> {}
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

Does this definition make sense?

@lcnr
Copy link
Author

lcnr commented Jul 29, 2020

cc @oli-obk @RalfJung @eddyb

@oli-obk
Copy link
Contributor

oli-obk commented Jul 29, 2020

Nit:

fn baz<T>() -> [u8; unsafe { std::mem::transmute<u8, bool>(size_of::<T>()) }]
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

needs to be

fn baz<T>() -> [u8; unsafe { std::mem::transmute<u8, bool>(size_of::<T>() as u8) as usize }]
// not well formed, for any `T` with `size_of::<T>() > 3` the final value is not a valid bool. 

We currently do not detect this case because of performance reasons. It is trivial (a single flag) to turn on that locals are checked for validty after each modifiction

@lcnr
Copy link
Author

lcnr commented Jul 29, 2020

We detect it in the final value of a constant afaict. So const N: bool = unsafe { std::mem::transmute::<u8, bool>(2) }; already causes a compile time error.

error[E0080]: it is undefined behavior to use this value
 --> src/lib.rs:1:1
  |
1 | const N: bool = unsafe { std::mem::transmute::<u8, bool>(2) };
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed: encountered 0x02, but expected a boolean
  |
  = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 29, 2020

Yes, but we don't detect it in intermediate computations like when it's stored in a local variable but never returned.

@programmerjake
Copy link
Member

This means that all of 1 - 2, panic!(), loop {} and [1, 2, 3][3] are not well formed as constants.

1 - 2 evaluates successfully for signed types. maybe you meant something like: 1 - 2usize?

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2020

A polymorphic constant is well formed, iff all possible concrete instances are well formed

The problem with this definition is that it is uncheckable (except with major effort).

I thought the point of well-formedness is for the compiler to check it? Similar to how it checks that e.g. lifetimes appropriately outlive each other? If that is the case, we need a notion of well-formedness that is efficiently decidable -- which the one I quoted above is not.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 5, 2020

The problem with this definition is that it is uncheckable (except with major effort).

The way that it is proposed to check this is very cheap: If it is still generic, you need a 1:1 copy of the constant in a where bound (so exposing the fact that you need this constant to be evaluable to the caller), and then the caller either does the same, or (if it's monomorphic now), evaluates it. If it evaluates successfully, the bound is satisfied and the constant is well formed.

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2020

So basically this system punts the question of polymorphic constants? We don't even define what well-formedness means for them, but we ensure that the context always assumes well-formedness?

I'm somewhat surprised this works, and it also seems very restrictive, but -- yeah, sounds plausible. I'd change the text then and not say things like "polymorphic constants are well-formed if all instances are well-formed". Such a statement says that e.g. this is well-formed:

fn bar<T>() -> [u8; 64 + align_of::<T>() * 0] {}

Or this:

fn bar<T>() -> [u8; 64 / std::min(size_of::<T>(), 1)] {}

But it's not well-formed, due to a lack of where clause. There is simply no rule one can use to prove a polymorphic const well-formed, which makes "forward assumption to caller" the only option.

@lcnr
Copy link
Author

lcnr commented Aug 5, 2020

I don't see why "The problem with this definition is that it is uncheckable" is a problem in practice.

Just like the borrow checker prevents programs which are theoretically correct, so too can our const well formed check forbid theoretically correct problems.

So at least with how I understand this, both 64 + align_of::<T>() * 0 and 64 / std::min(size_of::<T>(), 1) are well formed, we just aren't currently able to prove this.

So we could theoretically add a potentially unsafe function fn std::constant::evaluatable_unchecked<T>(v: T) -> T { v } in the future, which allows the user to write const expressions we assume to be wf without checking it

@oli-obk
Copy link
Contributor

oli-obk commented Aug 5, 2020

So basically this system punts the question of polymorphic constants?

yes

We don't even define what well-formedness means for them, but we ensure that the context always assumes well-formedness?

yes

I'm somewhat surprised this works, and it also seems very restrictive, but -- yeah, sounds plausible

it is very restrictive, but it's not actually worse than the status quo (typenum's trait bounds work exactly the same way). We can in the future add more general WF restrictions like X > 10 or sth, which will stop requiring WF bounds for X - 10, X - 9, ... but still require them for X - 11, but any such improvements can be added in a nonbreaking way later.

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2020

So at least with how I understand this, both 64 + align_of::() * 0 and 64 / std::min(size_of::(), 1) are well formed, we just aren't currently able to prove this.

But then we need a different name for "the thing the compiler checks". So far, "well-formedness" was the thing that the compiler checked, not a more relaxed thing that we could theoretically also allow. This RFC does a good job of describing that thing (though it may be outdated). In the syntax of that RFC, there would simply be no rule that says when polymorphic consts are well-formed, which implicitly makes forwarding the requirement the only way.

If you suddenly use the term "well-formed" for something that is not what the compiler actually checks, that is going to be very confusing. Please don't change the meaning of established terms like that.


I thus propose that we say something like:

  • All constants are considered well-formed if they also appear in a where clause, thus forwarding the well-formedness requirements to the caller. (This is not specific to consts, it applies to all forms of well-formedness. We just repeat it here for clarity.)
  • Additionally, monorphic consts are well-formed if ...

Regarding the monomorphic case, shouldn't it also require that the constant must be convertible to a value tree?

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2020

So we could theoretically add a potentially unsafe function fn std::constant::evaluatable_unchecked(v: T) -> T { v } in the future, which allows the user to write const expressions we assume to be wf without checking it

I doubt we can do that, actually. Much of the compiler relies on well-formedness as an invariant, as far as I know. Such a method would likely trigger ICEs all over the place.

I don't think the theoretical possibility of re-architecting everything to allow such a method justifies what I consider to be a change in the meaning of the pre-existing term "well-formedness".

@lcnr

This comment has been minimized.

@RalfJung

This comment has been minimized.

@lcnr

This comment has been minimized.

@oli-obk

This comment has been minimized.

@RalfJung

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 5, 2020

Summary: we can change what "well formed constant" means, but when talking about well formed constants, it must be whatever the compiler currently accepts as well formed. We don't talk about "stable" things when talking about things that "could be stable", either. Please let's get back to discussing what it means to have a "well formed constant" and explicitly mention when something is "we could change well formedness rules to accept this in the future".

@lcnr
Copy link
Author

lcnr commented Aug 5, 2020

Ok, so from what I can tell, a given crate/type/whatever is well formed iff the language accepts it. IOW both things which are not well formed but are currently accepted and things which are well formed and not accepted are bugs of the compiler.

Is that correct?

At this point I am not actually thinking about "well formedness" here and need a better word for this 🤔


So the "correct" definition for well formed contants would be:

Well formed constants are currently either concrete constants (with the above properties) or const params.

And after rust-lang/compiler-team#340 is implemented, constants are also well formed if they are mentioned in the caller_bounds of a given type/function I guess?


What would [u8; 64 + align_of::<T>() * 0] be then?
64 + align_of::<T>() * 0 is evaluatable for all T 🤔
Would that mean that the thing I am trying to define is simply "generic const evaluatability"?
i.e. 64 + align_of::<T>() * 0 would be const evaluatable, even if our current checks do not allow it?

@oli-obk
Copy link
Contributor

oli-obk commented Aug 5, 2020

What would [u8; 64 + align_of::() * 0] be then?

Not well formed, unless T is monomorphic. We could change well formedness rules to accept that, as it has a straight forward rule on the term tree (Note: not value tree! "term trees" are part of rust-lang/compiler-team#340) for the constant. Performing such term tree simplifications should probably have its own RFC, as it's probably not too hard to come up with a corner case where it's not clear to users why one piece of code is accepted and another isn't.

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2020

@lcnr

At this point I am not actually thinking about "well formedness" here and need a better word for this thinking

Regarding the terminology, this is very similar to things like "well-typed". A program is "well-typed" if it gets accepted by the compiler. The purpose of writing down a type system precisely and formally is to have a representation of that that's easier to read than 10k lines of code doing type-checking.

The following code is not well-typed, even though it could be (because nothing can ever go wrong when executing that function):

fn foo() -> i32 {
  if 0 == 1 { false } else { 13 }
}

So just like the above function is not well-typed even though a smarter compiler could accept it, we also consider types not well-formed based on what we actually can reasonably check and want to implement, not on what is theoretically possible.

For "well-typed", we say that functions like foo above that we could accept but do not are "sound" or "semantically (as opposed to syntactically) well-typed". I do not know of corresponding standard terminology for "well-formed", which is "one level meta" from "well-typed".

@oli-obk

What would [u8; 64 + align_of::() * 0] be then?

Not well formed, unless T is monomorphic.

Agreed.

We could change well formedness rules to accept that, as it has a straight forward rule on the term tree (Note: not value tree! "term trees" are part of rust-lang/compiler-team#340) for the constant. Performing such term tree simplifications should probably have its own RFC, as it's probably not too hard to come up with a corner case where it's not clear to users why one piece of code is accepted and another isn't.

But before we go down that rabbit hole, we should have some idea for where we are going. This is moving towards arbitrary powerful static analysis to determine if CTFE will succeed with every possible input. We could throw an SMT solver at this but we probably do not want to. I think it is better to accept no such code than to have a piecemeal solution that accepts some random things but not others, in an unpredictable way.

That's for way in the future anyways. I hope.^^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants