Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add enumerations (tagged unions) #988

Open
jfecher opened this issue Mar 15, 2023 · 9 comments
Open

Add enumerations (tagged unions) #988

jfecher opened this issue Mar 15, 2023 · 9 comments
Labels
enhancement New feature or request

Comments

@jfecher
Copy link
Contributor

jfecher commented Mar 15, 2023

Problem

It would be useful in Noir to support types like Option and Result which hold different values based upon a tag value. Optional values in particular would be useful for future Noir contracts.

Proposed solution

Add rust-style enumerations into noir, pattern matching and all.

The main concern is how to represent enumerations given we cannot reinterpret-cast data into another type. I propose representing an arbitrary enumeration:

enum Foo {
    A,
    B(BData),
    C(CData, Other),
    ...,
}

As a struct:

struct Foo {
    tag: Field, // Set internally, 0=A, 1=B, 2=C, etc.
    a_data: (),
    b_data: BData,
    c_data: (CData, Other),
    ...,
}

Now, given a constructor call such as Foo::B(my_b_data), the compiler can translate this into a struct initialization. There is the question of what to do for the fields of other variants like (), CData, and Other. Since these values would be guaranteed to be inaccessible by the compiler when the tag value is B, and since all values in Noir are comprised of Fields, we can initialize all these values to 0 without changing behavior:

let b = Foo::B(my_b_data);

// translated to:
let b = Foo {
    tag: 1, // B is the second variant
    a_data: dep::std::unsafe::zeroed(),
    b_data: my_b_data,
    c_data: dep::std::unsafe::zeroed(),
};

A concrete example

Option in this scheme could be typed as:

enum Option<T> {
    None,
    Some(T),
}

fn main(x: Option<Field>) {
    let y = Some(2);

    let x_value = match x {
        None => 0,
        Some(value) => value,
    };

    constrain y == Some(x_value);
}

Which would be translated internally as:

struct Option<T> {
    tag: Field,
    some_value: T,
}

fn main(x: Option<Field>) {
    let y = Option { tag: 1, some_value: 2 };

    let x_value = 
        if x.tag == 0 { 0 }
        else { x.some_value }; // compiler guarantees that x != 0 implies x == 1

    constrain y == Option { tag: 1, some_value: x_value };
}

Alternatives considered

As a slight variant of the above translation of match, we could consider an alternate translation that explicitly checks each tag and fails if an impossible case is somehow reached:

let x_value = 
    if x.tag == 0 { 0 }
    else if x.tag == 1 { x.some_value }
    else { panic("Invalid tag for Option enum") }

Where panic would be defined as:

fn panic<T, N>(msg: str<N>) -> T {
    std::println(msg);
    constrain false;
    
    // Now construct a value of any type T.
    // It will never be used anyway due to the previous line
    dep::std::unsafe::zeroed::<T>()
}

Additional context

The actual translation of enums into structs could occur during monomorphisation since this would be the point all generic types become known and we could thus fill in default values for them. This would also be strictly after completeness checking for match expressions which would likely occur during type checking.

@kevaundray
Copy link
Contributor

@jfecher This seems like a good addition, along with the match syntax. The match syntax desugaring to if conditions could also be another separate issue that is orthogonal to this (it will be limited without enums)

@jfecher
Copy link
Contributor Author

jfecher commented Jul 17, 2023

Something else to note about this proposal is that it still wouldn't allow us to write recursive types. For example we wouldn't be able to define a List:

enum List<T> {
    Empty,
    Cons(T, List<T>),
}

// Translated as:
struct List<T> {
    tag: Field, // 0 = Empty, 1 = Cons
    empty_data: (),
    cons_data: (T, List<T>), // oh, no infinite recursion in types is not allowed
}

@Savio-Sou
Copy link
Collaborator

Lowering this to P-LOW for now as we are aiming to tackle this post-1.0.

@kevaundray kevaundray added this to the 0.24.0 milestone Jan 15, 2024
@Savio-Sou Savio-Sou modified the milestones: 0.24, 0.25 Jan 26, 2024
@Savio-Sou Savio-Sou removed this from the 0.25 milestone Feb 9, 2024
@michaeljklein
Copy link
Contributor

Proposed approach:

  • If time-pressure to implement, start with #[repr(u8)]-style enums
  • Otherwise, start with struct-style ((u8, A, B, ..)) enums so that abstractions include type metadata
    (which aren't needed for #[repr(u8)]-style enums)
  • Start with the simplest type of match .. { .. }: all cases must be matched
  • Then implement #[repr(T)] enums:
// T: IsDiscriminatable
#[repr(T)]
enum Foo {
  // T0: ToFrom<T>
  Case0(T0),

  // T1: ToFrom<T>
  Case1(T1),
  ..
}

trait IsDiscriminatable {
  fn discriminant(&self) -> u8;
}

trait ToFrom<T> {
  fn to_repr(self) -> T;
  fn from_repr(x: T) -> Self;
}

Summary:

  • repr(T) requires T to implement IsDiscriminatable and ToFrom<T> for each case type
  • (This requires T0, T1, .. to be distinct,
    which could be solved by including the case index or name,
    i.e. ToFrom<T, 0> or ToFrom<T, "Case0">.)

Runtime match:

match x {
    Case0(y) => f(y),
    ..
}

Becomes:

if x.discriminant() == discriminant_of(Case0) {
    f(ToFrom::from_repr(x)) // with some Case0 annotation
} else {
    ..
}

Runtime construction:

let y = Case0(x);

Becomes:

let y = x.to_repr(); // with some Case0 annotation

@jfecher
Copy link
Contributor Author

jfecher commented Jul 17, 2024

if x.discriminant() == discriminant_of(Case0) {

I think the one thing we're missing is how to determine discriminant_of(Case0) assuming we cannot construct a Case0 value.

repr(T) requires T to implement IsDiscriminatable and ToFrom for each case type

We should probably go with a name more specific than ToFrom<T> especially since it sounds like we'll need to include the discriminant somehow. Perhaps Variant in the name somewhere would make sense.

@michaeljklein
Copy link
Contributor

I think the one thing we're missing is how to determine discriminant_of(Case0) assuming we cannot construct a Case0 value.

It would be more accurate to say discriminant_of("ident_path::Case0"), since we'd be looking up the definition at compile-time from it's path/identifier to determine the discriminant, e.g. from user-listed values Red = 2, Blue = 3 or from their index in the definition.

We should probably go with a name more specific than ToFrom<T> especially since it sounds like we'll need to include the discriminant somehow. Perhaps Variant in the name somewhere would make sense.

Sounds good to me

@jfecher
Copy link
Contributor Author

jfecher commented Jul 19, 2024

e.g. from user-listed values Red = 2, Blue = 3 or from their index in the definition.

This is the part I think we're still missing. How should users specify this on their struct for example?
I'm not pointing this out because it is unsolvable - I'm just suggesting we should nail down what this'd look like.

@michaeljklein
Copy link
Contributor

Similarly to how they're specified in Rust here.

In short:

#[repr(u8)]
enum Enum {
    Unit = 3,

    // overlap is an error:
    // Overlapping = 3,

    Tuple(u16), // implicitly 4

    Struct {
        a: u8,
        b: u16,
    } = 10,

    Large = 255,
    TooBig, // implicitly 256, which overflows u8 and so is an error
}

@jfecher
Copy link
Contributor Author

jfecher commented Jul 19, 2024

@michaeljklein ok, I think I misunderstood us wanting to allow this on structs as those structs wouldn't also have a separate enum declaration. So if you have a struct that you want to treat as an enum representation you'll still need to define a separate type to actually act as the enum and declare its variants then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: 📋 Backlog
Development

No branches or pull requests

4 participants