
New trait: core::convert::IntoUnderlying #3046

Open
Conversation

@illicitonion commented Dec 27, 2020

This adds a trait to allow conversions from enums with a primitive
representation to their primitive representation.

This spun out of the related #3040.

[Rendered](https://github.com/illicitonion/rfcs/blob/into-underlying/text/3046-into-underlying.md)
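
For reference, the motivating example from the RFC text (IntoUnderlying and its derive are the items proposed here, not existing std APIs):

#[derive(IntoUnderlying)]
#[repr(u8)]
enum Number {
    Zero,
    One,
}

fn main() {
    assert_eq!(Number::Zero.into_underlying(), 0_u8);
}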
illicitonion added a commit to illicitonion/rust that referenced this pull request on Dec 27, 2020, with the commit message:
Currently this tends to be done using `as`, which has some flaws:
1. It will silently truncate. e.g. if you have a `repr(u64)` enum, and
   you write `value as u8` (perhaps because the repr changed), you get
   no warning or error that truncation will occur.
2. You cannot use enums in code which is generic over `From` or `Into`.
3. Not all enums _want_ to expose conversion to their underlying repr as
   part of their public API.

Instead, by allowing an `IntoUnderlying` implementation to be easily
derived, we can avoid these issues.
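
A minimal sketch of the silent truncation in point 1, using a hypothetical Id enum (the name and value are made up for illustration):

#[repr(u64)]
enum Id {
    Big = 0x1_0000_0000,
}

fn main() {
    // Compiles with no warning or error, but the value is silently
    // truncated from 0x1_0000_0000 down to 0.
    println!("{}", Id::Big as u8);
}

With a derived into_underlying() the result would be a u64, so any later narrowing to u8 would have to be written as an explicit, checkable conversion (e.g. u8::try_from).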

There are a number of crates which provide macros to provide this
implementation, as well as `TryFrom` implementations. I believe this is
enough of a foot-gun that it should be rolled into `std`.

See rust-lang/rfcs#3046 for more information.
@scottmcm added the T-lang and T-libs-api labels on Dec 29, 2020
@scottmcm (Member)

/// This trait should only be implemented where conversion is trivial; for non-trivial conversions,
/// prefer to implement [`Into`].
pub trait IntoUnderlying {
    type Underlying;
    fn into_underlying(self) -> Self::Underlying;
}

That "should" got me thinking. How important is it? To explore:

  • It's not actually any more important than in From. Then it's kinda like trait IntoUnderlying : Into<Self::Underlying> { type Underlying; }.

  • It's very important, and we want people to be able to rely on it. Then it's like unsafe trait IntoUnderlying { type Underlying; }, but that's very much like the "archetype" idea from safe transmute. So it could also be thought of as trait IntoUnderlying : safe_transmute::MuckableInto<Self::Underlying> { type Underlying; }.

But I guess both of those lose the "method that doesn't need type annotations" part of the proposal. I still think it's interesting, though. Maybe there's a way to combine them? Sketch of an undeveloped idea:

pub trait Wrapper {
    type Underlying;
    fn into_underlying(self) -> Self::Underlying
        where Self: Into<Self::Underlying>
    {
        self.into()
    }
    fn from_underlying(x: Self::Underlying) -> Self
        where Self::Underlying: Into<Self>
    {
        x.into()
    }
    fn try_into_underlying(self) -> Result<Self::Underlying, Self::Error>
        where Self: TryInto<Self::Underlying>
    {
        self.try_into()
    }
    fn try_from_underlying(x: Self::Underlying) -> Result<Self, <Self::Underlying as TryInto<Self>>::Error>
        where Self::Underlying: TryInto<Self>
    {
        x.try_into()
    }
}

So that there'd never be a reason to implement this directly. That does still have derive problems, though.
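
For concreteness, one possible reading of that sketch, reduced to just the into_underlying half (the Kind enum and its From impl are hypothetical; the implementor only writes an ordinary From impl and the choice of Underlying):

pub trait Wrapper {
    type Underlying;

    fn into_underlying(self) -> Self::Underlying
    where
        Self: Into<Self::Underlying>,
    {
        self.into()
    }
}

enum Kind {
    A,
    B,
}

impl From<Kind> for u8 {
    fn from(k: Kind) -> u8 {
        match k {
            Kind::A => 0,
            Kind::B => 1,
        }
    }
}

impl Wrapper for Kind {
    type Underlying = u8;
}

fn main() {
    // The default method body resolves to the ordinary From/Into impl above,
    // so into_underlying itself is never written out by the implementor.
    assert_eq!(Kind::A.into_underlying(), 0_u8);
    assert_eq!(Kind::B.into_underlying(), 1_u8);
}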

@illicitonion (Author)

/// This trait should only be implemented where conversion is trivial; for non-trivial conversions,
/// prefer to implement [`Into`].
pub trait IntoUnderlying {
    type Underlying;
    fn into_underlying(self) -> Self::Underlying;
}

That "should" got me thinking. How important is it? To explore:

  • It's not actually any more important than in From. Then it's kinda like trait IntoUnderlying : Into<Self::Underlying> { type Underlying; }.

I think it's not super important, but there is value in having what is effectively a marker trait to show the conversion is expected to be ~free. Maybe there should be a marker trait for Into (akin to the relationship between Clone and Copy), something like TrivialInto, which shows that the conversion is trivial? Though maybe that's in practice just const Into vs Into... That said:

  • It's very important, and we want people to be able to rely on it. Then it's like unsafe trait IntoUnderlying { type Underlying; }, but that's very much like the "archetype" idea from safe transmute. So it could also be thought of as trait IntoUnderlying : safe_transmute::MuckableInto<Self::Underlying> { type Underlying; }.

I hadn't read up on the safer transmute work, but now that I have, I think this is exactly equivalent to what in an earlier version of the RFC was:

#[derive(PromiseTransmutableInto)]
#[repr(u8)]
enum Number {
    Zero,
    One,
}

fn main() {
    let n: u8 = Number::Zero.transmute_into();
    assert_eq!(n, 0_u8);
}

which is nicely more general.

Per the current RFC, I think we'd want Number: IntoUnderlying<Underlying=u8> to imply u8: TransmuteFrom<Number> which seems like a reasonable blanket impl. Unless I'm missing something, the RFC doesn't currently provide a TransmuteInto trait, but it feels like:

#[derive(TransmutableInto)]
#[repr(u8)]
enum Number {
    Zero,
    One,
}

would be a reasonable derive, and would obviate the need for IntoUnderlying - TransmutableInto feels enough less general than Into that the repr type doesn't need to be named in the derive. @jswrenn do you have thoughts, both on this general proposal, and a guess as to the near-term future of #2981?

But I guess both of those lose the "method that doesn't need type annotations" part of the proposal. I still think it's interesting, though. Maybe there's a way to combine them? Sketch of an undeveloped idea:

pub trait Wrapper {
    type Underlying;
    fn into_underlying(self) -> Self::Underlying
        where Self: Into<Self::Underlying>
    {
        self.into()
    }
    fn from_underlying(x: Self::Underlying) -> Self
        where Self::Underlying: Into<Self>
    {
        x.into()
    }
    fn try_into_underlying(self) -> Result<Self::Underlying, Self::Error>
        where Self: TryInto<Self::Underlying>
    {
        self.try_into()
    }
    fn try_from_underlying(x: Self::Underlying) -> Result<Self, <Self::Underlying as TryInto<Self>>::Error>
        where Self::Underlying: TryInto<Self>
    {
        x.try_into()
    }
}

So that there'd never be a reason to implement this directly. That does still have derive problems, though.

I think this is an interesting thing to consider, but as you note, we'd still need some way of indicating the derive, with or without this mechanism, at which point I feel like this would be an interesting layer, but probably something to contemplate alongside/after, rather than instead...

One thing on naming worth noting is that I'm somewhat wary of the term "wrapper" - I think there are two importantly distinct concepts here: a wrapping type and a transmutable type. A primitive-repr'd enum doesn't wrap a number, it is represented by a number. An Rc or a BufWriter wraps a value, but itself isn't transmutable into one. The grey areas are newtypes (where I'd say a #[repr(transparent)] newtype can be considered to do either, and a non-#[repr(transparent)] newtype is less clear). So if we were going down that path, I'd suggest Equivalent, Convertible, Interchangeable or similar, rather than Wrapper.

@jswrenn (Member) commented Dec 30, 2020

Thanks for pushing on this! Rather than trying to abstract over newtypes and enums, might I suggest a slightly more limited/gradual approach?

A Gradual Approach to Enum Discriminant Reform

Enums are tricky. For instance, although all enums have a discriminant type, that type isn't necessarily the same type used for storing the in-memory tag. And, although all variants have a logical discriminant, you can't necessarily access it using as. (There are also all sorts of issues with our existing vocabulary for talking about these intricacies.)

1. Referring to the Discriminant Type

Since all enums have a discriminant type, let's consider starting there, with just this compiler-intrinsic (i.e., automatically implemented + sealed) trait:

#[lang = "enum_trait"]
pub trait Enum {
    /// The discriminant type of the enum.
    type Discriminant;
}

With this, I can use as without shooting myself in the foot. Instead of writing:

enum Foo { Bar, Baz }

Foo::Bar as someprimitivetype

...I'd instead write:

Foo::Bar as Foo::Discriminant

With this associated type, the foot-gunny Foo::Bar as someprimitivetype form could be linted for and perhaps even be deprecated.

2. Uniform Discriminant Access

Although all variants have a logical discriminant, you can't necessarily access it using as. For instance:

enum Foo {
    Bar(),
    Baz,
}

assert_eq!(Foo::Baz as isize, 1);

The discriminant of Bar is 0. We know this because the discriminant of Baz is 1, because the discriminant of a variant, if unset, is specified to be equal to one plus the discriminant of the previous variant. But, if I try Foo::Bar as isize, I don't get 0, I get the result of casting a function pointer to an isize. (Footgun!)

If we add a field to Bar, we can't as cast Baz at all:

#![feature(arbitrary_enum_discriminant, never_type)]

#[repr(u8)]
enum Foo {
    Bar(!) = 42,
    Baz, // discriminant/tag is 43
}

let _ = Foo::Baz as u8; // error[E0605]: non-primitive cast: `Foo` as `u8`

Let's fix this. Since all enum variants arguably have a logical discriminant, let's extend our trait with a method to retrieve it:

#[lang = "enum_trait"]
pub trait Enum {
    /// The discriminant type of the enum.
    type Discriminant;

    /// Produces the discriminant of the given enum variant.
    fn discriminant(&self) -> Self::Discriminant { ... }
}

With this method, as can be altogether discouraged/deprecated as a mechanism for retrieving the discriminant of an enum.

3. Tag Reform?

Discriminants are logical values associated with enum variants that sometimes correspond to the tags used in the in-memory representation. Can we provide a method for retrieving a variant's tag only when the discriminant and tag match? Yes, with safe transmute!

#[lang = "enum_trait"]
pub trait Enum {
    /// The discriminant type of the enum.
    type Discriminant;

    /// Produces the discriminant of the given enum variant.
    pub fn discriminant(&self) -> Self::Discriminant { ... }

    /// Produces the tag of the given enum variant.
    pub fn tag(&self) -> Self::Discriminant
    where
        Self::Discriminant: TransmuteFrom<Self>
    {
        Self::Discriminant::transmute_from(self)
    }
}

That where bound restricts this method to only types where the enum is transmutable into its discriminant; e.g., repr(C) and repr(<primitive>) enums, but not option-like repr(Rust) enums (whose layout is somewhat well-specified), nor any other repr(Rust) enums.

With this method, you no longer need unsafe code to get the tag of an enum!
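
For comparison, this is the kind of unsafe tag read such a method would replace today. The Message enum is hypothetical; the read is sound for field-carrying #[repr(u8)] enums because their guaranteed layout (RFC 2195) puts the tag first:

#[allow(dead_code)]
#[repr(u8)]
enum Message {
    Ping,      // tag 0
    Data(u32), // tag 1
}

fn tag(m: &Message) -> u8 {
    // Reads the leading tag byte directly; exactly the sort of unsafe code
    // a safe Enum::tag method would make unnecessary.
    unsafe { *(m as *const Message as *const u8) }
}

fn main() {
    assert_eq!(tag(&Message::Ping), 0);
    assert_eq!(tag(&Message::Data(7)), 1);
}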

@programmerjake (Member)

    pub fn tag(&self) -> Self::Discriminant
    where
        Self::Discriminant: TransmuteFrom<Self>
    {
        Self::Discriminant::transmute_from(self)
    }

That doesn't produce the discriminant for Option<NonZeroU8> -- it just produces the u8 value or 0.

@jswrenn (Member) commented Dec 30, 2020

That doesn't produce the discriminant for Option<NonZeroU8> -- it just produces the u8 value or 0.

Yep, well spotted. It does produce the tag for None, but for Some, the "tag" isn't really a thing that exists. This sort of edge case is why I put a question mark in that heading.

We'd need to reflect a bit more information about layouts into the type system before an Enum::tag method would be viable — namely their repr. Something like this:

/// Implemented for all enum `repr`s that have an in-memory tag.
/// Namely: `C`, the primitive reprs (e.g., `u8`), and combinations of `C` and a primitive repr.
pub trait Tagged: Sealed {}

#[lang = "enum_trait"]
pub trait Enum {
    /// The discriminant type of the enum.
    type Discriminant;

    /// A type encoding the `repr` of the enum.
    type Repr;

    /// Produces the discriminant of the given enum variant.
    pub fn discriminant(&self) -> Self::Discriminant { ... }

    /// Produces the tag of the given enum variant.
    pub fn tag(&self) -> Self::Discriminant
    where
        Self::Repr: Tagged,
        Self::Discriminant: TransmuteFrom<Self>
    {
        Self::Discriminant::transmute_from(self)
    }
}

(Not pictured: a ton of design work and bikeshedding that would need to happen! But, let's not fill up this RFC's issue with comments on this tangent.)

Anyways, there's definitely a path all the way to tag retrieval with this gradual approach. But, even just doing the first two steps would comprehensively address the perils of as.

@illicitonion (Author)

Thanks @jswrenn for your thoughts here! I was not aware of the fn-pointer footgun, that's special.

On fn-pointers

I think that the bug here is allowing Bar() to be considered a primitive enum variant at all, rather than the fact that it happens to cast to a function pointer.

Ignoring stability concerns, I would suggest that the following simply be compile errors:

#[repr(u8)]
enum Number {
    Zero(),
}

and

enum Number {
    Zero(),
}

fn main() {
    println!("{}", Number::Zero as isize);
}

in the same way that Option cannot be given a primitive repr, and cannot be cast to a primitive. That said, we are where we are.

Possibly more controversially, I would also argue that empty tuple-like or struct-like enum variants should just be outright forbidden, or at least that their presence should stop the enum from being treated as a primitive-style (C-like) enum.

In practice, for this proposal, I think the refinement needed is to state that the derive-trait will only be supported where none of the variants have tuple-like or struct-like syntax (at least for the first cut). Manually implementing the trait should still be fine, though, and we could potentially expand support in the future. Equivalently, if we end up with an auto Enum trait, my suggestion would be that it not have a numeric Discriminant associated type (perhaps type Discriminant=!;).

On an automatically implemented intrinsic vs an explicitly implemented trait

One of the goals of this RFC is to make stability guarantees more clear, so I'm not sure about automatically implementing a trait vs needing to explicitly implement one (possibly via a derive).

as conversions currently force things like variant ordering to be part of the public API, whether or not that was the author's intent, and I suspect (without data) that this leads to accidental semver-breaking changes in the wild. These probably aren't a significant problem, but it feels in keeping with the rest of the language that this API should be an opted-in offer of stability, rather than something implemented by default.

There's already std::mem::discriminant for comparing enum discriminants without leaking the underlying discriminant values.
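
For reference, std::mem::discriminant is stable today and compares variants without exposing any numeric value (the Shape enum below is made up):

use std::mem;

#[allow(dead_code)]
enum Shape {
    Circle(f64),
    Square(f64),
}

fn main() {
    // Same variant: the opaque Discriminant values compare equal.
    assert_eq!(
        mem::discriminant(&Shape::Circle(1.0)),
        mem::discriminant(&Shape::Circle(2.0))
    );
    // Different variants: unequal, but no integer is ever exposed.
    assert_ne!(
        mem::discriminant(&Shape::Circle(1.0)),
        mem::discriminant(&Shape::Square(1.0))
    );
}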

Is there a strong argument for automatically implementing a trait, rather than requiring opt-in?

On naming

I don't have strong views on having an enum-specific Enum/Discriminant vs a more general IntoUnderlying/Underlying.

On tags

I'd ideally like to avoid tags in this RFC - they feel like a more general form of metadata, rather than specifically concerned with conversion. If we end up with a sealed enum-specific trait we can naturally extend it in the future if we want, otherwise I suggest this should live elsewhere (either a separate trait, or alongside std::mem::discriminant).

@jswrenn (Member) commented Jan 1, 2021

Is there a strong argument for automatically implementing a trait, rather than requiring opt-in?

Yes: all enums already have an associated discriminant type, and knowing that type is important to as casting variants of the enum without shooting yourself in the foot. Both the discriminant type and the ability (or lack thereof) to as cast variants of an enum are dictated solely by the definition of the enum — not by the implementation of traits. Rust will currently tell you whether you can as cast a variant, but it won't tell you the correct discriminant type; you need to look at the definition yourself to find it. An intrinsic mem::Enum trait with an associated Discriminant type closes this gap.

The point of such a trait would be to make as casting less of a foot-gun, not to make stability guarantees clearer. Combined with a lint, it'd be a small-yet-broadly-applicable improvement to the clarity of enum as casting in existing codebases.

This trait should be placed in mem (not convert), since, like most everything else in mem, it reflects the memory-layout properties of a type rather than its stability properties.


If such a trait makes stability guarantees clearer, it's only insofar as it might draw more attention to the incredibly subtle stability implications of enum discriminants that already exist. If we could roll back the clock to pre-1.0, I'd say there isn't a compelling argument for discriminants and as-casting on #[repr(Rust)] enums. They are terribly implicit, whereas Rust favors the explicit. In virtually every use-case for as-casting on #[repr(Rust)] enums, an explicit From implementation would be a better choice.

In contrast, discriminants on #[repr(C)] and #[repr(<primitive>)] enums have a real purpose: they control the memory layout of variants. Here, a long-term approach to enum reform could include discouraging as casting and, once arbitrary_enum_discriminant is stabilized, discouraging reliance on implicitly set discriminants.

Broadly: I see comprehensive enum reform as a long-term social challenge. Given that we can't merely turn back the clock, the best we can do is to nudge people in the right direction with lints over a suitably long period of time and then perhaps an edition change. Skipping right to introducing a new core::convert::IntoUnderlying trait could give the misleading impression that enum discriminants in the absence of an IntoUnderlying implementation aren't part of an enum's stability guarantee — but they are! We should tread carefully.

@illicitonion (Author)

tl;dr: If we're trying to make as-casts more safe, I think a blanket Enum implementation is a great incremental step. But it's a local optimum and a compromise. If we think we can get to a point where as-casts could be deprecated or even removed, we should avoid automatically exposing the discriminant, because it's a step in the opposite direction, and would be hard to undo.

I think that all generally makes sense, but there's a question as to whether in the long term we want discriminants to always be part of an enum's stability guarantee.

There are enums where their representation is an incidental detail, and not intended to be stable; imagine:

enum Color {
    Red,
    Green,
    Blue,
    ...
}

The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future.

I'm hoping we have a choice as to the road forwards: we can either accept that all enums must include their representation in their stability guarantee, or we can work towards letting authors keep the representation out of their stability guarantees. Maybe, because we shipped as-casts in 1.0, we don't have a choice here and all enums must always have a stable representation, but I'm hoping that after a perhaps very long path we can enable that optionality. Introducing an unconditional way of retrieving the repr, though, makes that much harder to do.

I guess an alternative road forwards could be to introduce something like #[repr(opaque)] to explicitly opt out of stability guarantees, but where possible the language tends to favour guarantees that are opted in to rather than opted out of.

@jswrenn (Member) commented Jan 2, 2021

If we think we can get to a point where as-casts could be deprecated or even removed, we should avoid automatically exposing the discriminant, because it's a step in the opposite direction, and would be hard to undo.

Not so hard: we'd just deprecate the trait! :) But, the trait would continue to be very useful in the case of #[repr(C/primitive)] enums, since having such a trait would reduce some memory-safety hazards in the sort of code those enums tend to be used with (because tag manipulations often involve unsafe code). And, again, nothing in mem carries semver stability guarantees so I don't see this posing much of a novel hazard. It's just like how the existence of size_of doesn't mean type size is part of stability guarantees; rather, it makes programmer assumptions about type size trivial to check programmatically.


The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future.

I totally agree with this ideal. Unfortunately, many stability hazards (and more!) stem from the existence of as casting. They're stupidly subtle. For instance, it can be a breaking change to add variants to this enum:

#[non_exhaustive]
enum Foo {
    Bar
}

...because if we add the variant Baz(u8), suddenly all prior Foo::Bar as isize casts become compile errors.

And the stability hazards unrelated to as are even worse. For instance, in:

#[non_exhaustive]
pub enum Foo {
  Bar,
  // Baz,
}

#[repr(transparent)]
struct Fizz(u8, Foo);

...uncommenting the Baz variant causes this snippet to produce a compile error at the definition of Fizz.

A design of a more opaque enum would need to thoroughly account for these stability hazards, and more. (@joshlf's 2020 call for "private enum variants" comes to mind.) Otherwise, we might just end up with a new kind of enum that merely has its own distinct set of subtleties to worry about.

@jswrenn (Member) commented Jan 2, 2021

The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future.

Additionally: As written, I'm not sure that this RFC moves towards this goal. Take, for instance, this example in the RFC:

#[derive(IntoUnderlying)]
#[repr(u8)]
enum Number {
    Zero,
    One,
}

fn main() {
    assert_eq!(Number::Zero.into_underlying(), 0_u8);
}

This still exposes the ordering issues of implicit discriminants and as casting! If we reorder the variants, or insert a variant before One or Zero, the assertion will fail.

Here, as with as casts, the devil is in implicitly set discriminants. If the author had instead written:

#[repr(u8)]
enum Number {
    Zero = 0,
    One = 1,
}

...there would be no danger in reordering those variants, or adding new unit variants.

So, what would improve this situation is linting against implicitly set discriminants.

@mbartlett21 (Contributor)

Having an Enum::try_from(int) would be useful.

For example, in a Pratt parser, to get the next level of precedence I am at the moment using mem::transmute(precedence as u8 + 1) to produce the next variant of the enum (with a dummy variant at the end).
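
A sketch of that pattern under stated assumptions (the Precedence enum is hypothetical; the dummy Max variant keeps the + 1 within the declared variants, which is what makes the transmute sound):

use std::mem;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(u8)]
enum Precedence {
    Lowest,
    Sum,
    Product,
    Max, // dummy variant so that `+ 1` below never leaves the enum's range
}

fn next(p: Precedence) -> Precedence {
    assert!(p != Precedence::Max);
    // Sound only because every value in 0..=Max is a declared variant; this is
    // exactly the unsafe step a derivable TryFrom<u8> would turn into a checked one.
    unsafe { mem::transmute(p as u8 + 1) }
}

fn main() {
    assert_eq!(next(Precedence::Lowest), Precedence::Sum);
    assert_eq!(next(Precedence::Sum), Precedence::Product);
}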

@illicitonion (Author)

If we think we can get to a point where as-casts could be deprecated or even removed, we should avoid automatically exposing the discriminant, because it's a step in the opposite direction, and would be hard to undo.

Not so hard: we'd just deprecate the trait! :)

But we don't like doing this in std!

But, the trait would continue to be very useful in the case of #[repr(C/primitive)] enums, since having such a trait would reduce some memory-safety hazards in the sort of code those enums tend to be used with (because tag manipulations often involve unsafe code).

How would you feel about automatically deriving this trait only for explicitly repr'd enums, with no fields? i.e. no Enum impl for:

enum Color {
    Red,
    Blue,
}

or

#[repr(u8)]
enum Color {
    RGB(u8, u8, u8),
}

But one for:

#[repr(u8)]
enum Color {
    Red,
    Blue,
}

The idea being that by being explicit about the layout of your enum, you're opting in to exposing that information (and as you note, in a non-stability-opting-in way). I suspect (again, without data) that most enums which aren't intended to be interchangeable with numbers or passed over FFI boundaries don't have an explicit repr, though I'm not sure how confident I would be in claiming that most enums which do have an explicit repr are intended for these purposes.

That's the reason I'm looking for an explicit attribute, and I think an explicit repr is a sufficiently explicit attribute.

And, again, nothing in mem carries semver stability guarantees so I don't see this posing much of a novel hazard. It's just like how the existence of size_of doesn't mean type size is part of stability guarantees; rather, it makes programmer assumptions about type size trivial to check programmatically.

👍

The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future.

I totally agree with this ideal. Unfortunately, many stability hazards (and more!) stem from the existence of as casting. They're stupidly subtle. For instance, it can be a breaking change to add variants to this enum:

#[non_exhaustive]
enum Foo {
    Bar
}

...because if we add the variant Baz(u8), suddenly all prior Foo::Bar as isize casts become compile errors.

The thing I find upsetting about this is that this is valid:

#[repr(u8)]
pub enum Foo {
  Bar,
  Baz(u16),
}

which arguably it shouldn't be - what does this repr attribute even mean? But I think if we're happy with "explicit repr, no fields or constructor means you get an Enum definition", we can make this hazard go away easily.

And the stability hazards unrelated to as are even worse. For instance, in:

#[non_exhaustive]
pub enum Foo {
  Bar,
  // Baz,
}

#[repr(transparent)]
struct Fizz(u8, Foo);

...uncommenting the Baz variant causes this snippet to produce a compile error at the definition of Fizz.

A design of a more opaque enum would need to thoroughly account for these stability hazards, and more. (@joshlf's 2020 call for "private enum variants" comes to mind.) Otherwise, we might just end up with a new kind of enum that merely has its own distinct set of subtleties to worry about.

I did not realise this one! Again, though, requiring an explicit repr attribute makes this problem go away, as:

#[non_exhaustive]
#[repr(u8)]
pub enum Foo {
  Bar,
}

is not zero-sized.
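
A quick check of that claim (the enum names here are made up):

use std::mem::size_of;

#[allow(dead_code)]
#[non_exhaustive]
pub enum Implicit {
    Bar,
}

#[allow(dead_code)]
#[non_exhaustive]
#[repr(u8)]
pub enum Explicit {
    Bar,
}

fn main() {
    // With the default repr, a single-variant fieldless enum is zero-sized,
    // so it can silently satisfy a #[repr(transparent)] wrapper today.
    assert_eq!(size_of::<Implicit>(), 0);
    // With an explicit repr it is one byte, so a struct like
    // #[repr(transparent)] struct Fizz(u8, Explicit); is rejected up front.
    assert_eq!(size_of::<Explicit>(), 1);
}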

The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future.

Additionally: As written, I'm not sure that this RFC moves towards this goal. Take, for instance, this example in the RFC:

#[derive(IntoUnderlying)]
#[repr(u8)]
enum Number {
    Zero,
    One,
}

fn main() {
    assert_eq!(Number::Zero.into_underlying(), 0_u8);
}

This still exposes the ordering issues of implicit discriminants and as casting! If we reorder the variants, or insert a variant before One or Zero, the assertion will fail.

Here, as with as casts, the devil is in implicitly set discriminants. If the author had instead written:

#[repr(u8)]
enum Number {
    Zero = 0,
    One = 1,
}

...there would be no danger in reordering those variants, or adding new unit variants.

So, what would improve this situation is linting against implicitly set discriminants.

I think that the explicitness of the attributes makes this ok; either (and definitely both) of setting #[derive(IntoUnderlying)] and #[repr(u8)] is the author stating "The values of these enum variants are intentional", whether they're explicitly or implicitly stated. It's really the explicitness of "I've thought about these values" that I'm looking to have people opt in to, and I don't really mind how that opt-in happens.

@illicitonion (Author)

Having an Enum::try_from(int) would be useful.

For example, in a Pratt parser, to get the next level of precedence I am at the moment using mem::transmute(precedence as u8 + 1) to produce the next variant of the enum (with a dummy variant at the end).

I completely agree! In the meantime, might I suggest the num_enum crate, which allows you to derive a TryFrom implementation :)
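
For example, a minimal sketch with num_enum's TryFromPrimitive derive (the crate version and the Precedence enum are illustrative assumptions):

// Cargo.toml: num_enum = "0.5"  (version assumed for illustration)
use num_enum::TryFromPrimitive;
use std::convert::TryFrom;

#[derive(Debug, PartialEq, TryFromPrimitive)]
#[repr(u8)]
enum Precedence {
    Lowest,
    Sum,
    Product,
}

fn main() {
    // Checked conversion instead of mem::transmute:
    assert_eq!(Precedence::try_from(1_u8).ok(), Some(Precedence::Sum));
    assert_eq!(Precedence::try_from(9_u8).ok(), None);

    // The "next precedence level" step becomes fallible rather than unsafe.
    let next = Precedence::try_from(Precedence::Sum as u8 + 1).ok();
    assert_eq!(next, Some(Precedence::Product));
}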

@jswrenn (Member) commented Jan 4, 2021

It's really the explicitness of "I've thought about these values" that I'm looking to have people opt in to, and I don't really mind how that opt-in happens.

If you want to signal that your enum should be SemVer-stably convertible to or from a primitive value, you (probably) should do so with the same mechanism you'd use for any other type: by implementing From, Into and Try{From,Into}. To make this transition extra ergonomic:

  • permit #[derive(Into)] on C-like enums with explicit discriminants
  • permit #[derive(TryFrom)] on data-free enums with explicit discriminants
Why the C-like and data-free restrictions?
  • A C-like enum is one in which every variant is a unit variant (i.e. not ending with {} or ()). Currently, it is only for C-like enums that every variant is as-castable into its discriminant.
  • A data-free enum is one in which every variant is fieldless. It is only for these enums that every variant can be wholly created from just its discriminant (in the logical sense; not necessarily in the in-memory layout sense).
Why require explicit discriminants?
  • If you were to write these implementations out by hand, you would need to be explicit about the correspondence between variants and values. Requiring the same degree of explicitness for these derives demonstrates that you've thought about the particulars of the conversion.
Why not require an explicit repr?
  • Whether or not you use an explicit repr is mostly orthogonal to the issue of conversion. The primary purpose of an explicit repr is to manipulate the in-memory layout of an enum. There is very good reason to leave out the repr if you don't care about in-memory layout. The discriminant type of enum Foo { Bar = 0isize, Baz = 1isize } is isize, but Foo is represented with only one byte! That's not the case if you annotate Foo with repr(isize).
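
As a concrete sketch, #[derive(Into)] on a C-like enum with explicit discriminants might expand to something like the following (a hypothetical expansion; the proposal above doesn't pin down the details):

enum Level {
    Low = 1,
    High = 2,
}

// Hypothetical output of #[derive(Into)]: a From impl targeting the
// discriminant type (isize for a default-repr enum).
impl From<Level> for isize {
    fn from(level: Level) -> isize {
        level as isize
    }
}

fn main() {
    let n: isize = Level::Low.into();
    assert_eq!(n, 1);
    assert_eq!(isize::from(Level::High), 2);
}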

My primary concern about stability reform is that I'm sure that there have been Rust programmers who assumed that discriminants are SemVer stable because they're observable with as casting. In the absence of clear, official guidance, that's a totally reasonable position to take. But this sort of enum stability reform would effectively be a breaking change for these folks. I'm not saying it's impossible, but it's more of a social challenge than a technical one.

@illicitonion (Author)

It's really the explicitness of "I've thought about these values" that I'm looking to have people opt in to, and I don't really mind how that opt-in happens.

If you want to signal that your enum should be SemVer-stably convertible to or from a primitive value, you (probably) should do so with the same mechanism you'd use for any other type: by implementing From, Into and Try{From,Into}. To make this transition extra ergonomic:

  • permit #[derive(Into)] on C-like enums with explicit discriminants
  • permit #[derive(TryFrom)] on data-free enums with explicit discriminants

If you take a look at #3040, that was actually how this proposal started (specifically #[derive(Into)]), but we changed direction because:

  1. #[derive(Into)] is not 100% clear what target type Into is being derived for.
  2. Into in general makes no claims about the cost of the conversion, and some users of as value that attestation of cheapness.

Maybe, though, the real answer is that we want both, for different purposes: a sealed, auto-implemented Enum trait which gives access to the discriminant type (and potentially other things), and the ability to derive stable conversions; the two needn't be linked.

My primary concern about stability reform is that I'm sure that there have been Rust programmers who assumed that discriminants are SemVer stable because they're observable with as casting. In the absence of clear, official guidance, that's a totally reasonable position to take. But this sort of enum stability reform would effectively be breaking change for these folks. I'm not saying it's impossible, but it's more of a social challenge than a technical one.

Yes, I completely agree. I think that the aim of #3040 is to start the conversation about what that official guidance around as conversions should be. Regardless of that, though, I think that several of the steps (making it easier to perform the conversion with a trait, be that Into or IntoUnderlying), and linting for potentially truncating conversions, are stand-alone useful steps.

Thanks @jswrenn, by the way, for all of the time, thought, and effort you're putting in to help shape this direction (and all of the terrifying edge-cases you've been pointing out on the way), I really appreciate your thoughtful and knowledgeable help!

@jswrenn (Member) commented Jan 4, 2021

1. #[derive(Into)] is not 100% clear what target type Into is being derived for.

If you require explicit discriminants for #[derive(Into)], it is pretty clear what the type and particulars of the conversion are: they're whatever the discriminants are.

I won't say it's 100% clear, but it's pretty close — the case that enum discriminants have a default type of isize still remains, but we're stuck with that. If we really want to make this case more explicit, we could implement a lint that warns on unsuffixed discriminant values on default repr enums. (And this could be done independently from any new traits.)

Non-trivial derives always involve a sacrifice of explicitness. For instance, it's not explicitly clear how derive(Hash) computes hashes, nor how derive(PartialOrd) defines orderings — but that's just part of using a shorthand.

2. Into in general makes no claims about the cost of the conversion, and some users of as value that attestation of cheapness.

Those users would be mistaken to do so; as carries no formal promise of cheapness! And what constitutes "cheap" depends on the situation. It is potentially infinitely cheaper to work with values of enum Foo { Bar } than it is to work with Foo::Bar as isize. And while some as conversions are cheap truncations, others involve relatively expensive manipulations (like conversions to/from floating point numbers).

The issue of "cheapness" aside, there's already a version of Into that does carry some connotation of cheapness: AsRef.

Concretely, this:

#[derive(AsRef)]
enum Foo {
    A = 1,
    B = 2,
    C = 3,
}

...could expand to this:

impl AsRef<isize> for Foo {
    fn as_ref(&self) -> &'static isize {
        match self {
            Foo::A => &1,
            Foo::B => &2,
            Foo::C => &3,
        }
    }
}

So, let me amend my proposal:

  • permit #[derive(Into)] and #[derive(AsRef)] on C-like enums with explicit discriminants
  • permit #[derive(TryFrom)] on data-free enums with explicit discriminants

Maybe, though, the real answer is that we want both, for different purposes: a sealed, auto-implemented Enum trait which gives access to the discriminant type (and potentially other things), and the ability to derive stable conversions; the two needn't be linked.

Yep, this is my thinking, too — these issues can be (and should be) treated basically independently from each other.

@gilescope

bikeshedding: CastInto might be a better name to imply it's trivial rather than calling it TrivialInto?

@liigo (Contributor) commented Jun 18, 2021

IntoRaw?

@scottmcm (Member)

@liigo Looks like it's going in the direction of AsRepr, per the libs conversation in rust-lang/rust#81642 (comment)
