Skip to content

Commit

Permalink
Remove ascription on patterns. Add details about coercions.
Browse files Browse the repository at this point in the history
  • Loading branch information
nrc committed Feb 26, 2015
1 parent 32eadf4 commit 05e10ca
Showing 1 changed file with 43 additions and 111 deletions.
154 changes: 43 additions & 111 deletions text/0000-type-ascription.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,10 @@

# Summary

Add type ascription to expressions and patterns.
Add type ascription to expressions. (An earlier version of this RFC covered type
ascription in patterns too, that has been postponed).

Type ascription on expression has already been implemented. Type ascription on
patterns can probably wait until post-1.0.
Type ascription on expression has already been implemented.

See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and
[rust issue 10502](https://github.com/rust-lang/rust/issues/10502).
Expand All @@ -16,12 +16,12 @@ See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and
# Motivation

Type inference is imperfect. It is often useful to help type inference by
annotating a sub-expression or sub-pattern with a type. Currently, this is only
possible by extracting the sub-expression into a variable using a `let`
statement and/or giving a type for a whole expression or pattern. This is un-
ergonomic, and sometimes impossible due to lifetime issues. Specifically, a
variable has lifetime of its enclosing scope, but a sub-expression's lifetime is
typically limited to the nearest semi-colon.
annotating a sub-expression with a type. Currently, this is only possible by
extracting the sub-expression into a variable using a `let` statement and/or
giving a type for a whole expression or pattern. This is un- ergonomic, and
sometimes impossible due to lifetime issues. Specifically, where a variable has
lifetime of its enclosing scope, but a sub-expression's lifetime is typically
limited to the nearest semi-colon.

Typical use cases are where a function's return type is generic (e.g., collect)
and where we want to force a coercion.
Expand Down Expand Up @@ -90,18 +90,6 @@ let x: T = {
let x: T foo(): U<_>;

This comment has been minimized.

Copy link
@1fish2

1fish2 Feb 26, 2015

(^ missing an =.) Is there a non-transitive effect that means this can't be written today as let x: U<_> = foo(); or let x: T = foo();?

[This was supposed to be a line note on line 90.]

This comment has been minimized.

Copy link
@nrc

nrc Feb 26, 2015

Author Owner

Sort of, the U<_> is required somewhere so that type inference can compute a type (because the return type of foo is generic. Currently that requires a temporary variable. With type ascription it can be denoted directly in the expression. Note that there is no coercion due to the type ascription, that is just supplying extra type info to inference. The coercion is converting the rvalue result of foo() into x which coerces from U<_> to T.

```

In patterns:

```
struct Foo<T> { a: T, b: String }
// Current
fn foo(Foo { a, .. }: Foo<i32>) { ... }
// With type ascription.
fn foo(Foo { a: i32, .. }) { ... }
```


# Detailed design

Expand All @@ -122,72 +110,44 @@ At runtime, type ascription is a no-op, unless an implicit coercion was used in
type checking, in which case the dynamic semantics of a type ascription
expression are exactly those of the implicit coercion.

The syntax of patterns is extended to include an optional type ascription.
Old syntax:

```
PT ::= P: T
P ::= var | 'box' P | ...
e ::= 'let' (PT | P) = ... | ...
```

where `PT` is a pattern with optional type, `P` is a sub-pattern, `T` is a type,
and `var` is a variable name. (Formal arguments are `PT`, patterns in match arms
are `P`).
@eddyb has implemented the expressions part of this RFC,
[PR](https://github.com/rust-lang/rust/pull/21836).

New syntax:

```
PT ::= P: T | P
P ::= var | 'box' PT | ...
e ::= 'let' PT = ... | ...
```
### coercion and `as` vs `:`

Type ascription in patterns has the narrowest precedence, e.g., `box x: T` means
`box (x: T)`. In particular, in a struct initialiser or patter, `x : y : z` is
parsed as `x : (y: z)`, i.e., a field named `x` is initialised with a value `y`
and that value must have type `z`. If only `x: y` is given, that is considered
to be the field name and the field's contents, with no type ascription.
A downside of type ascription is the overlap with explicit coercions (aka casts,
the `as` operator). Type ascription makes implicit coercions explicit. In RFC
401, it is proposed that all valid implicit coercions are valid explicit
coercions. However, that may be too confusing for users, since there is no

This comment has been minimized.

Copy link
@1fish2

1fish2 Feb 26, 2015

To clarify and to define terms for the language spec: With this RFC there will be (1) implicit coercion, (2) explicit coercion using type ascription syntax, and (3) explicit casting/coercion using as syntax.

Is casting a kind of coercion?

Can casting/coercion do data conversion such as widening a number?

Is changing from usize to u32 a coercion? A cast? Subtyping? Varies by target architecture? Would the proposed numeric cast lint vary by target architecture?

Is changing from a type to one of its implemented traits considered coercion to a supertype? I'd expect subtyping to be all about Liskov substitutability and thus include this case but the definition in rust-lang#401 only includes lifetime variance -- unless I misunderstand the phrase "type equality."

"Type ascription makes implicit coercions explicit." Suggestion: Define type ascription, e.g., explicitly requesting a coercion that could happen implicitly to the target type.

"In RFC 401, it is proposed that all valid implicit coercions are valid explicit coercions." -- Is the proposal here to make valid implicit coercions (== valid explicit coercions via type ascription) be invalid as casts?

This comment has been minimized.

Copy link
@nrc

nrc Feb 26, 2015

Author Owner

2 is not really a separate category, it is the same as 1. There are a bunch of places which cause coercion, intuitively, anywhere that a type can be derived by annotation, and the sub-expression in a type ascription expression is just another one of those places. There are no special rules for type ascription.

Casting is not a coercion. Both casting and coercion are kinds of type conversation. Casting is always explicit (using as), coercion is always implicit. Subtyping is a third kind of type conversion.

Both casting and coercion do data conversion at runtime in some (most) cases. Casting is used for those where the programmer needs to take care (e.g., numeric narrowing); coercion where there is no risk (e.g., converting from a sized to an unsized array). Widening is currently a cast, but in the future will probably become a coercion.

Subtyping in Rust is the only type conversion used in type inference and never change the runtime representation. Subtyping only affects lifetimes at the moment (including some funky stuff with quantifiers in types with higher ranked lifetimes. This is not a guarantee but is unlikely to change.

Changing from a trait object to a super-trait object is not possible at all at the moment. As part of rust-lang#401 it will become an implicit coercion. It can't be a subtype because the vtable must change.

I believe all three kinds of type conversion satisfy the Liskov substitution principle, but I haven't thought about that too hard. Some of the casts probably do not, in fact...

The intention in rust-lang#401 was that all implicit coercions would be valid as casts. This is adjusted here with the lints.

This comment has been minimized.

Copy link
@1fish2

1fish2 Feb 26, 2015

Apologies! I'm trying hard to help but not succeeding in building a conceptual model.

2 [a written type a-script-ion] ... is the same as 1 [implicit coercion]. ... coercion is always implicit. ... no coercion due to the type ascription, that is just supplying extra type info to inference.

Written != implicit. Is there a British-American English variance in these terms?

So the categories of conversions are (a) "casting" (== "writing as"? except where implicit coercion could work? except possibly-lossy cases? or rather a semantic operation that can change the data representation such as number width or vtable pointer? only the lossy cases?), (b) "coercion" (== any conversion that the compiler could insert automatically?), (c) "subtyping" (== at-least-as-long lifetime), and (d) "supplying extra type info to type inference" for generics. Does type ascription == a way to explicitly request (b) and (d)? Where does changing from a concrete type to a trait (e.g. bool to Display) fit into these categories? Are these categories at the level of syntax or semantics?

The Scala Guide says "Ascription is basically just an up-cast", e.g. Set("Hi": Object) to construct a Set[Object] rather than a Set[String], but that's subtyping. Does that match your example foo.enumerate().collect() : Vec<_>?

foo() : U<_> : T is a case where a type ascription chain is useful. "Sort of" non-transitive? What's a T where foo() : T won't work?

Writing a numeric-cast is needed to convert isize or usize to/from a fixed-size integer, so it doesn't seem like that should generate a warning when selecting a target architecture where it converts to >= width.

This comment has been minimized.

Copy link
@nrc

nrc Feb 27, 2015

Author Owner

Probably best to ignore the Scala definition. In Rust, a type ascription can be thought of as a type assertion. I.e., e: T asserts that e has type T. It doesn't imply any kind of type conversion. If the type system cannot derive the type T for e, then it's a type error. That is all type ascription does. This is of course, fully explicit.

Now, the type system during inference can choose to coerce an expression into a type. This coercion is always implicit - it is not denoted in the source at all. Like many other other places in Rust code, the sub-expression of type ascription may be coerced. The type system gets to e and tries to derive the type T for that sub-expression, it can only derive U so it then tries to coerce U to T. If so then it essentially replaces e with coerce<U->T>(e). It can then continue type checking the outer expression (the type ascription) and that succeeds because the sub-expression has type T. Note that the coercion and the type ascription are totally separate from the type checker's point of view.

So, there is no British/American thing here. It's just that I am talking about implicitness in very precise terms - the ascription is explicit, but the coercion is not (unfortunately my early draft was not very precise, I hope I have addressed that to some extent).

Your categories of conversion are pretty much right, except that d is not really a conversion. But in those terms, type ascription is purely category d.

Changing a concrete type to a trait object is b, an implicit coercion.

Casts have different syntax from the others. All three have different static semantics. Casts and coercions have similar dynamic semantics, but this depends on the individual conversion.

For ascription chains, as you write, then, in general, ascription is not transitive. Lets assume foo is collect, and we have a trait Bar, which is implemented by Vec, then &....collect() : &Vec<_> : &Bar will hold, but &....collect() : &Bar will not, because type inference cannot figure out what type collect will return and thus which instance of collect to call.

The details of widening as implicit coercion have not been worded out yet. Currently it requires a cast, so it is never an unnecessary cast and won't cause a warning. I imagine some users might want a warning for unnecessarily casting a numeric type, so it seems like a lint is appropriate.

This comment has been minimized.

Copy link
@1fish2

1fish2 Feb 27, 2015

Thanks, Nick! That's clearer.

Does that model have different implications than simply "e : T means coerce e to type T"? Are they enough better to be worth the usability cost? (The syntax is also complicated since in key: value : type the two colons have different associativity and precedence. Also : is harder to search the web for than a keyword.)

Maybe it's a mathematician/engineer thing.

This comment has been minimized.

Copy link
@1fish2

1fish2 Feb 27, 2015

Oh, oh. Surely you're jiving us. Please say you're jiving.

reason to use type ascription rather than `as` (if there is some coercion).
Furthermore, if programmers do opt to use `as` as the default whether or not it
is required, then it loses its function as a warning sign for programmers to
beware of.

The chagnes to pattern syntax mean that in some contexts where a pattern
previously required a type annotation, it is no longer required if all variables
can be assigned types via the ascription. Examples,
To address this I propose three lints which check for: trivial casts, coercible
casts, and trivial numeric casts. Other than these lints we stick with the
proposal from #401 that unnecessary casts will no longer be an error.

```
struct Foo {
a: Bar,
b: Baz,
}
fn foo(x: Foo); // Ok, type of x given by type of whole pattern
fn foo(Foo { a: x, b: y}: Foo) // Ok, types of x and y found by destructuring
fn foo(Foo { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions
fn foo(Foo { a: x: Bar, _ }) // Ok, we can still deduce the type of x and the whole argument
fn foo(Foo { a: x, b: y}) // Ok, type of x and y given by Foo
struct Qux<X> {
a: Bar,
b: X,
}
fn foo(x: Qux<Baz>); // Ok, type of x given by type of whole pattern
fn foo(Qux { a: x, b: y}: Qux<Baz>) // Ok, types of x and y found by destructuring
fn foo(Qux { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions
fn foo(Qux { a: x: Bar, _ }) // Error, can't find the type of the whole argument
fn foo(Qux { a: x, b: y}) // Error can't find type of y or the whole argument
```
A trivial cast is a cast `x as T` where `x` has type `U` and `U` is a subtype of
`T` (note that subtyping includes reflexivity).

Note the above changes mean moving some errors from parsing to later in type
checking. For example, all uses of patterns have optional types, and it is a
type error if there must be a type (e.g., in function arguments) but it is not
fully specified (currently it would be a parsing error).
A coercible cast is a cast `x as T` where `x` has type `U` and `x` can be
implicitly coerced to `T`, but `U` is not a subtype of `T`.

In type checking, if an expression is matched against a pattern, when matching
a sub-pattern the matching sub-expression must have the ascribed type (again,
this check includes subtyping and implicit coercion). Types in patterns play no
role at runtime.
A trivial numeric cast is a cast `x as T` where `x` has type `U` and `x` is
implicitly coercible to `T` or `U` is a subtype of `T`, and both `U` and `T` are
numeric types.

@eddyb has implemented the expressions part of this RFC,
[PR](https://github.com/rust-lang/rust/pull/21836).
Like any lints, these can be customised per-crate by the programmer. The trivial
cast lint is 'deny' by default (i.e., causes an error); the coercible cast and
trivial numeric cast lints are 'warn' by default.

Although this is a somewhat complex scheme, it allows code that works today to
work with only minor adjustment, it allows for a backwards compatible path to
'promoting' type conversions from explicit casts to implicit coercions, and it
allows customisation of a contentious kind of error (especially so in the
context of cross-platform programming).

# Drawbacks

Expand All @@ -205,11 +165,6 @@ difficult to support the same syntax as field initialisers.

We could do nothing and force programmers to use temporary variables to specify
a type. However, this is less ergonomic and has problems with scopes/lifetimes.
Patterns can be given a type as a whole rather than annotating a part of the
pattern.

We could allow type ascription in expressions but not patterns. This is a
smaller change and addresses most of the motivation.

Rely on explicit coercions - the current plan [RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md)
is to allow explicit coercion to any valid type and to use a customisable lint
Expand All @@ -221,35 +176,12 @@ which require more programmer attention. This also does not help with patterns.

We could use a different symbol or keyword instead of `:`, e.g., `is`.

# Unresolved questions

Is the suggested precedence correct? Especially for patterns.

Does type ascription on patterns have backwards compatibility issues?

Given the potential confusion with struct literal syntax, it is perhaps worth
re-opening that discussion. But given the timing, probably not.
# Unresolved questions

Should remove integer suffixes in favour of type ascription?
Is the suggested precedence correct?

### `as` vs `:`
Should we remove integer suffixes in favour of type ascription?

A downside of type ascription is the overlap with explicit coercions (aka casts,
the `as` operator). Type ascription makes implicit coercions explicit. In RFC
401, it is proposed that all valid implicit coercions are valid explicit
coercions. However, that may be too confusing for users, since there is no
reason to use type ascription rather than `as` (if there is some coercion). It
might be a good idea to revisit that decision (it has not yet been implemented).
Then it is clear that the user uses `as` for explicit casts and `:` for non-
coercing ascription and implicit casts. Although there is no hard guideline for
which operations are implicit or explicit, the intuition is that if the
programmer ought to be aware of the change (i.e., the invariants of using the
type change to become less safe in any way) then coercion should be explicit,
otherwise it can be implicit.

Alternatively we could remove `as` and require `:` for explicit coercions, but
not for implicit ones (they would keep the same rules as they currently have).
The only loss would be that `:` doesn't stand out as much as `as` and there
would be no lint for trivial coercions. Another (backwards compatible)
alternative would be to keep `as` and `:` as synonyms and recommend against
using `as`.
Style guidelines - should we recommend spacing or parenthesis to make type
ascription syntax more easily recognisable?

0 comments on commit 05e10ca

Please sign in to comment.