Enum variant types #2593
Closed
14 commits
b9b9cd1
Initial draft for enum variant types
varkor ad4c445
Add a Future possibilities section
varkor 6278421
Address first round of comments
varkor e64df31
Acquiesce to Centril's hatred of one
varkor 507464c
Add extra clarification
varkor 508b43b
Update the start date
varkor f9451c3
Mention refutable matching on variant types
varkor 2b00420
Mention Scala's Either type
varkor 20e6d9d
Minor clarifications
varkor 56f71e6
Switch to a type-inference based method
varkor 415af1f
Mention space-optimised enum variant types
varkor c821c29
Reference optimize(size)
varkor 9e34d30
Add impls as an unanswered question
varkor d4b6187
More conservative type inference
varkor
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,330 @@ | ||
- Feature Name: `enum_variant_types` | ||
- Start Date: 10-11-2018 | ||
- RFC PR: | ||
- Rust Issue: | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Consider enum variants as types in their own right. This allows them to be irrefutably matched | ||
upon. Where possible, type inference will infer variant types, but as variant types may always be | ||
treated as enum types this does not cause any issues with backwards-compatibility. | ||
|
||
```rust | ||
enum Either<A, B> { L(A), R(B) } | ||
|
||
fn all_right<A, B>(b: B) -> Either<A, B>::R { | ||
    Either::R(b) | ||
} | ||
|
||
let Either::R(b) = all_right::<(), _>(1729); | ||
println!("b = {}", b); | ||
``` | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
When working with enums, it is frequently the case that some branches of code have assurance that | ||
they are handling a particular variant of the enum ([1], [2], [3], [4], [5], etc.). This is especially the case when abstracting | ||
behaviour for a certain enum variant. However, currently, this information is entirely hidden from the | ||
compiler and so the enum types must be matched upon even when the variant is certainly known. | ||
|
||
[1]: https://github.com/rust-lang/rust/blob/69a04a19d1274ce73354ba775687e126d1d59fdd/src/liballoc/borrow.rs#L245-L248 | ||
[2]: https://github.com/rust-lang/rust/blob/69a04a19d1274ce73354ba775687e126d1d59fdd/src/liballoc/raw_vec.rs#L424 | ||
[3]: https://github.com/rust-lang/rust/blob/69a04a19d1274ce73354ba775687e126d1d59fdd/src/librustc_mir/transform/simplify.rs#L162-L166 | ||
[4]: https://github.com/rust-lang/rust/blob/69a04a19d1274ce73354ba775687e126d1d59fdd/src/librustc_resolve/build_reduced_graph.rs#L301 | ||
[5]: https://github.com/rust-lang/rust/blob/69a04a19d1274ce73354ba775687e126d1d59fdd/src/librustc_resolve/macros.rs#L172-L175 | ||
|
||
By treating enum variants as types in their own right, this kind of abstraction is made cleaner, | ||
avoiding the need for code patterns such as: | ||
- Passing a known variant to a function, matching on it, and using `unreachable!()` arms for the other | ||
variants. | ||
- Passing individual fields from the variant to a function. | ||
- Duplicating a variant as a standalone `struct`. | ||
|
||
However, though abstracting behaviour for specific variants is often convenient, it is understood | ||
that such variants are intended to be treated as enums in general. As such, the variant types | ||
proposed here have identical representations to their enums; the extra type information is simply | ||
used for type checking and permitting irrefutable matches on enum variants. | ||
|
||
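The first two workarounds listed above can be sketched in current Rust. This is only an illustration: the `Sum` enum mirrors the example used later in this RFC, and the function names `describe_a` and `describe_a_fields` are hypothetical.

```rust
// Hypothetical enum mirroring the RFC's `Sum` example.
#[derive(Debug)]
enum Sum {
    A(u32),
    B,
    C,
}

// Workaround 1: the caller "knows" it holds `Sum::A`, but the signature can
// only say `Sum`, so the remaining arms still have to be written out.
fn describe_a(a: &Sum) -> String {
    match a {
        Sum::A(x) => format!("a is {}", x),
        Sum::B | Sum::C => unreachable!("describe_a called with {:?}", a),
    }
}

// Workaround 2: pass the variant's fields instead of the variant itself.
fn describe_a_fields(x: u32) -> String {
    format!("a is {}", x)
}
```

In both cases the compiler cannot check that only the `A` variant reaches the function; the first version fails at run time if the assumption is wrong, and the second loses the connection to `Sum` entirely.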
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
The variants of an enum are considered types in their own right, though they are necessarily | ||
more restricted than most user-defined types. This means that when you define an enum, you are more | ||
precisely defining a collection of types: the enumeration itself, as well as each of its | ||
variants. However, the variant types act identically to the enum type in the majority of cases. | ||
|
||
Specifically, variant types act differently to enum types in the following cases: | ||
- When pattern-matching on a variant type, only the constructor corresponding to the variant is | ||
considered possible. Therefore you may irrefutably pattern-match on a variant: | ||
|
||
```rust | ||
enum Sum { A(u32), B, C } | ||
|
||
fn print_A(a: Sum::A) { | ||
    let Sum::A(x) = a; | ||
    println!("a is {}", x); | ||
} | ||
``` | ||
However, to be backwards-compatible with existing handling of variants as enums, matches on | ||
variant types will permit (and simply ignore) arms that correspond to other variants: | ||
|
||
|
||
```rust | ||
let a: Sum::A = Sum::A(20); | ||
|
||
match a { | ||
    Sum::A(x) => println!("a is {}", x), | ||
    Sum::B => println!("a is B"), // ok, but unreachable | ||
    Sum::C => println!("a is C"), // ok, but unreachable | ||
} | ||
``` | ||
|
||
To avoid this behaviour, a new lint, `strict_variant_matching`, will be added that will forbid | ||
matching on other variants. | ||
|
||
- You may project the fields of a variant type, similarly to tuples or structs: | ||
|
||
```rust | ||
fn print_A(a: Sum::A) { | ||
    println!("a is {}", a.0); | ||
} | ||
``` | ||
|
||
Variant types, unlike most user-defined types, are subject to the following restriction: | ||
- Variant types may not have inherent impls, or implemented traits. That means `impl Enum::Variant` | ||
and `impl Trait for Enum::Variant` are forbidden. This dissuades inclinations to implement | ||
abstraction using behaviour-switching on enums (for example, by simulating inheritance-based | ||
subtyping, with the enum type as the parent and each variant as a child), rather than using traits | ||
as is natural in Rust. | ||
[Review comment] I'm a fan of the proposed style, but it might be worth stating why Rust the language wants to dissuade this pattern.
|
||
|
||
```rust | ||
enum Sum { A(u32), B, C } | ||
|
||
impl Sum::A { // ERROR: variant types may not have specific implementations | ||
// ... | ||
} | ||
``` | ||
|
||
``` | ||
error[E0XXX]: variant types may not have specific implementations | ||
--> src/lib.rs:3:6 | ||
| | ||
3 | impl Sum::A { | ||
| ^^^^^^ | ||
| | | ||
| `Sum::A` is a variant type | ||
| help: you can try using the variant's enum: `Sum` | ||
``` | ||
|
||
Variant types may be aliased with type aliases: | ||
|
||
```rust | ||
enum Sum { A(u32), B, C } | ||
|
||
type SumA = Sum::A; | ||
// `SumA` may now be used identically to `Sum::A`. | ||
``` | ||
|
||
If a value of a variant type is explicitly coerced or cast to the type of its enum using a type | ||
annotation, `as`, or by passing it as an argument or return-value to or from a function, the variant | ||
information is lost (that is, a variant type *is* different to an enum type, even though they behave | ||
similarly). | ||
|
||
Note that enum types may not be coerced or cast to variant types. Instead, matching must be | ||
performed to guarantee that the enum type truly is of the expected variant type. | ||
|
||
```rust | ||
enum Sum { A(u32), B, C } | ||
|
||
let s: Sum = Sum::A(5); | ||
|
||
let a = s as Sum::A; // error | ||
let a: Sum::A = s; // error | ||
|
||
if let a @ Sum::A(_) = s { | ||
    // ok, `a` has type `Sum::A` | ||
    println!("a is {}", a.0); | ||
} | ||
``` | ||
|
||
If multiple variants are bound with a single binding variable `x`, then the type of `x` will simply | ||
be the type of the enum, as before (i.e. binding on variants must be unambiguous). | ||
|
||
Variant types interact as expected with the proposed | ||
[generalised type ascription](https://github.com/rust-lang/rfcs/pull/2522) (i.e. the same as type | ||
coercion in `let` or similar). | ||
|
||
## Type parameters | ||
Consider the following enum: | ||
```rust | ||
enum Either<A, B> { | ||
    L(A), | ||
    R(B), | ||
} | ||
``` | ||
Here, we are defining three types: `Either`, `Either::L` and `Either::R`. However, we have to be | ||
careful here with regard to the type parameters. Specifically, the variants may not make use of | ||
every generic parameter in the enum. Since variant types are generally considered simply as enum | ||
types, this means that the variants need all the type information of their enums, including all | ||
their generic parameters. This explicitness has the advantage of preserving variance for variant | ||
types relative to their enum types, as well as permitting zero-cost coercions from variant types to | ||
enum types. | ||
|
||
So, in this case, we have the types: `Either<A, B>`, `Either<A, B>::L` and `Either<A, B>::R`. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
A new variant, `Variant(DefId, VariantDiscr)`, will be added to `TyKind`, whose `DefId` points to | ||
the enclosing enum for the variant and `VariantDiscr` is the discriminant for the variant in | ||
question. In most cases, the handling of `Variant` will simply delegate any behaviour to its `enum`. | ||
However, pattern-matching on the variant allows irrefutable matches on the particular variant. In | ||
effect, `Variant` is only relevant to type checking/inference and the matching logic. | ||
|
||
The discriminant of a `Variant` (as observed by [`discriminant_value`](https://doc.rust-lang.org/nightly/std/intrinsics/fn.discriminant_value.html)) is the discriminant | ||
of the variant (i.e. identical to the value observed if the variant is first coerced to the enum | ||
type). | ||
|
||
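In current Rust, the stable `std::mem::discriminant` function exposes the same notion of discriminant for ordinary enum values. The following is a minimal sketch, assuming a `Sum` enum like the one used in this RFC; the helper `same_variant` is hypothetical. Under this proposal, a value of type `Sum::A` would be expected to report the same discriminant as the `Sum` value it coerces to.

```rust
use std::mem::discriminant;

// Hypothetical enum mirroring the RFC's `Sum` example.
enum Sum {
    A(u8),
    B,
    C,
}

// Hypothetical helper: two values are the same variant exactly when their
// discriminants compare equal, regardless of any payload.
fn same_variant(x: &Sum, y: &Sum) -> bool {
    discriminant(x) == discriminant(y)
}
```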
Constructors of variants, as well as pattern-matching on particular enum variants, are still | ||
inferred to have enum types, rather than variant types, for backwards compatibility. | ||
|
||
```rust | ||
enum Sum { | ||
    A(u8), | ||
    B, | ||
    C, | ||
} | ||
|
||
let x: Sum::A = Sum::A(5); // x: Sum::A | ||
let Sum::A(y) = x; // ok, y = 5 | ||
|
||
let z = Sum::A(5); // z: Sum | ||
let Sum::A(y) = z; // error! | ||
|
||
fn sum_match(s: Sum) { | ||
    match s { | ||
        a @ Sum::A(_) => { | ||
            let x = a; // ok, a: Sum::A | ||
        } | ||
        b @ Sum::B => { | ||
            // b: Sum::B | ||
        } | ||
        c @ Sum::C => { | ||
            // c: Sum::C | ||
        } | ||
    } | ||
} | ||
``` | ||
|
||
In essence, a value of a variant is considered to be a value of the enclosing `enum` in every matter | ||
but pattern-matching. | ||
|
||
Explicitly coercing or casting to the `enum` type forgets the variant information. | ||
|
||
```rust | ||
let x: Sum = Sum::A(5); // x: Sum | ||
let Sum::A(y) = x; // error: refutable match | ||
|
||
let x = Sum::A(5) as Sum; // x: Sum | ||
let Sum::A(y) = x; // error: refutable match | ||
``` | ||
|
||
Inference of the type of an `enum` or variant remains entirely conservative: without explicit type | ||
annotations, a value whose type is an `enum` or variant will always have the `enum` type inferred. | ||
That is to say: variant types are never inferred for values and must instead be explicitly declared. | ||
|
||
This is backwards-compatible with existing code, as variants act as `enum`s except in cases that | ||
were previously invalid (i.e. pattern-matching, where the extra typing information was previously | ||
unknown). | ||
|
||
Note that because a variant type, e.g. `Sum::A`, is not a subtype of the enum type (rather, it can | ||
simply be coerced to the enum type), a type like `Vec<Sum::A>` is not a subtype of `Vec<Sum>`. | ||
(However, this should not pose a problem as it should generally be convenient to coerce `Sum::A` to | ||
`Sum` upon either formation or use.) | ||
|
||
Note that we do not make any guarantees of the variant data representation at present, to allow us | ||
flexibility to explore the design space in terms of trade-offs between memory and performance. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
- The loose distinction between the `enum` type and its variant types could be confusing to those | ||
unfamiliar with variant types. Error messages might specifically mention a variant type, which could | ||
be at odds with expectations. However, since they generally behave identically, this should not | ||
prove to be a significant problem. | ||
- As variant types need to include generic parameter information that is not necessarily included in | ||
their definitions, it will be necessary to include explicit type annotations more often than is | ||
typical. Although this is unfortunate, it is necessary to preserve all the desirable behaviour of | ||
variant types described here: namely complete backwards-compatibility, precise type inference and | ||
variance (e.g. allowing `x` in `let x = Sum::A;` to have type `Sum::A` without explicit type | ||
annotations). | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
The advantages of this approach are: | ||
- It naturally allows variants to be treated as types, intuitively. | ||
- As variant types and enum types are represented identically, there are no coercion costs. | ||
- It doesn't require value-tracking (save the degenerate kind performed by type inference) | ||
or complex type system additions such as | ||
[refinement types](https://en.wikipedia.org/wiki/Refinement_type). | ||
- It makes no backwards incompatible type inference changes. | ||
- Since complete (enum) type information is necessary for variant types, this should be forwards | ||
compatible with any extensions to enum types (e.g. | ||
[GADTs](https://en.wikipedia.org/wiki/Generalized_algebraic_data_type)). | ||
|
||
One obvious alternative is to represent variant types differently to enum types and then coerce them | ||
when used as an enum. This could potentially reduce memory overhead for smaller variants | ||
(additionally no longer requiring the discriminant to be stored) and reduce the issue with providing | ||
irrelevant type parameters. However, it makes coercion more expensive and complex (as a variant | ||
could coerce to various enum types depending on the unspecified generic parameters). It is proposed | ||
here that zero-cost coercions are more important. (In addition, simulating smaller variants is | ||
possible by creating separate mirroring structs for each variant for which this is desired and | ||
converting manually (though this is obviously not ideal), whereas simulating the proposed behaviour | ||
with the alternative is much more difficult, if possible at all.) | ||
|
||
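The manual simulation mentioned above (a separate mirroring struct plus hand-written conversions) might look roughly like the following in current Rust. This is a sketch under stated assumptions, not part of the proposal: `Thing`, `ThingTwo` and `bytes_saved` are hypothetical names, and the payload sizes are chosen only to make the size difference visible. The widening conversion copies the payload into a full-sized `Thing`, which is exactly the cost the zero-cost coercion proposed here avoids.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

// Hypothetical enum with one large and one small variant.
enum Thing {
    One([u64; 16]),
    Two(u8),
}

// Mirror of `Thing::Two` as a standalone struct: it stores only the `u8`.
struct ThingTwo(u8);

// Widening back to the enum is infallible, but it is a real conversion
// rather than a zero-cost coercion.
impl From<ThingTwo> for Thing {
    fn from(two: ThingTwo) -> Self {
        Thing::Two(two.0)
    }
}

// Narrowing must inspect the value; on failure the original is handed back.
impl TryFrom<Thing> for ThingTwo {
    type Error = Thing;

    fn try_from(thing: Thing) -> Result<Self, Self::Error> {
        match thing {
            Thing::Two(x) => Ok(ThingTwo(x)),
            other => Err(other),
        }
    }
}

// Space saved by holding the mirror struct instead of the whole enum.
fn bytes_saved() -> usize {
    size_of::<Thing>() - size_of::<ThingTwo>()
}
```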
Variant types have [previously been proposed for Rust](https://github.com/rust-lang/rfcs/pull/1450). | ||
However, it was closed due to uncertainty around the interaction with other forms of type inference | ||
(numeric type inference and default type parameters). The conservative type inference described here | ||
should not directly interact with these and it is sensible to open up this RFC without more | ||
substantial changes to the proposed implementation method. Furthermore, the method proposed here is | ||
implementationally simpler and more intuitive. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
Type-theoretically, enums are sum types. A sum type `S := A + B` is a type, `S`, defined in relation | ||
to two other types, `A` and `B`. Variants are specifically types, but in programming it's usually | ||
useful to consider particular variants in relation to each other, rather than standalone (which is | ||
why `enum` *defines* types for its variants rather than using pre-existing types for its variants). | ||
|
||
However, it is often useful to briefly consider these variant types alone, which is what this | ||
RFC proposes. | ||
|
||
Although sum types are becoming increasingly common in programming languages, most do not choose to | ||
allow the variants to be treated as types in their own right. There are some languages that have | ||
analogues, however: Scala's [`Either` type](https://www.scala-lang.org/api/2.9.3/scala/Either.html) | ||
has `Left` and `Right` subclasses that may be treated as standalone types, for instance. Regardless | ||
of the scarcity of variant types, however, we propose that the patterns in Rust make variant types | ||
more appealing than they might be in other programming languages with variant types. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- Is disallowing `impl`s on variant types too conservative or restrictive? Should we instead permit | ||
them (and potentially provide clippy lints to point out unidiomatic patterns)? | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
It would be possible to remove some of the restrictions on enum variant types in the future, such as | ||
permitting `impl`s, supporting variant types that don't contain all (irrelevant) generic parameters | ||
or permitting variant types to be subtypes of enum types. This RFC has been written intentionally | ||
conservatively in this regard. | ||
|
||
In addition, we could offer a way to space-optimise variant types (rather than minimising | ||
conversion costs). By not committing to a specific representation now, this allows us to make a | ||
decision as to how to support this use case in the future, possibly through attributes on the enum, | ||
such as the [`#[optimize(size)]` attribute](https://github.com/rust-lang/rfcs/pull/2412); or through | ||
anonymous enum types, which often come up in such discussions. |
I disagree that this goal is going to be as widely achieved by this RFC as I would like due to the following point:
That means that if I have an enum with large variants:
Even the "small" variants (e.g. `Thing::Two`) are still going to take "a lot" of space.

If space is a concern then we could have it so variant types only convert to their enum by-value, so e.g. a `&Thing::Two` wouldn't be a valid `&Thing`.
That's weaker than something more akin to refinement typing, but maybe it's enough?

@eddyb I think that's already the case; the RFC doesn't state anywhere, as far as I can tell, that `&Thing::Two` is a valid `&Thing`. Also note that the RFC explicitly states that `Thing::Two` and `Thing` having the same layout is not a guarantee so we could change the layout to be more space efficient.