Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enums should disallow multiple enumerations with the same value #2115

Closed
tiehuis opened this issue Mar 28, 2019 · 19 comments
Closed

enums should disallow multiple enumerations with the same value #2115

tiehuis opened this issue Mar 28, 2019 · 19 comments
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@tiehuis
Copy link
Member

tiehuis commented Mar 28, 2019

@andrewrk: I re-opened this to propose reversing the decision


A C enum allows multiple values. e.g.

enum Foo {
    OPTION1 = 0,
    OPTION2 = 1,
    DEFAULT = OPTION1,
};

However, zig extern enum's do not (note: we cannot declare a member based on another member, another possibly improvement)

const Foo = extern enum {
    Option1 = 0,
    Option2 = 1,
    Default = 0,
};

test "asd" {
    const a = Foo.Default;
}

Fails with compilation

/tmp/t.zig:4:15: error: enum tag value 0 already taken
    Default = 0,
              ^
/tmp/t.zig:2:15: note: other occurrence here
    Option1 = 0,
              ^

We should allow extern enums to have the same tag value for multiple members. The specific use-case here would be for translate-c. Encountered this when trying to process gmp.h which has this enum present:

/* Available random number generation algorithms.  */
typedef enum
{
  GMP_RAND_ALG_DEFAULT = 0,
  GMP_RAND_ALG_LC = GMP_RAND_ALG_DEFAULT /* Linear congruential.  */
} gmp_randalg_t;

Alternatively for translate-c enums could be translated as sets of constants but this feels wrong. There are possibly some questions regarding switch cases and handling of duplicate values there that I haven't considered.

@tiehuis tiehuis added bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels Mar 28, 2019
@tiehuis tiehuis added this to the 0.5.0 milestone Mar 28, 2019
@Hejsil
Copy link
Contributor

Hejsil commented Mar 28, 2019

I'll add something related to this, that I've hit before.
C users also uses enums for flags, so sometimes you'll see this in headers:

enum flags {
    A = 1 << 0,
    B = 1 << 1,
};
enum flags all = A | B;

translate_c will also fail to translate something like this if all is ever used.

@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. and removed bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels Mar 28, 2019
@tgschultz
Copy link
Contributor

tgschultz commented Mar 28, 2019

It's possible to do this using definitions:

pub const Compression = extern enum(u32)
{
    Uncompressed    =   0,
    RLE8            =   1,  //8bpp only
    RLE4            =   2,  //4bpp only
    Bitfields       =   3,  //Windows, 16bpp/32bpp
    
    //These seem to only exist for passthrough to
    // devices like printers and don't actually
    // exist in .bmp files
    JPEG            =   4,  //Windows v3
    PNG             =   5,  //Windows v3
    
    //This is legit
    AlphaBitfields  =   6,  //Windows v4
    
    //I think these only show up in Windows Metafiles,
    // not actual .bmp files
    CMYK            =   11,
    RLE8CMYK        =   12,
    RLE4CMYK        =   13,
    
    //These only appear in OS/2 files
    pub const Huffman1D = @This().Bitfields; //OS/2, Monochrome only
    pub const RLE24   = @This().JPEG;        //OS/2, 24bpp only
};

However the definitions don't work as enum literals.

@eira-fransham
Copy link

eira-fransham commented Jun 6, 2019

I think that C enums should be converted to:

pub const mylib_enumname: type = struct {
    pub const A: comptime_int = 0;
    pub const B: comptime_int = 1;
    // ...
};

Since that matches how C sees them. At the moment there are plenty of C headers that cannot be imported without modification.

@kkartaltepe
Copy link
Contributor

This also shows up in clang-c/libclang https://clang.llvm.org/doxygen/Index_8h_source.html in the CXCursor enumeration.

@gruebite
Copy link

Encountered this when creating bindings for Godot script API: https://github.com/GodotNativeTools/godot_headers/blob/master/nativescript/godot_nativescript.h#L45

@SpexGuy
Copy link
Contributor

SpexGuy commented Nov 20, 2019

I've been thinking about the current definition of extern enum, and I've come to the conclusion that @cImport should never use it. Here's a summary of my thoughts:

Zig's C translator replaces enums with integer types of the correct size. The constant values, however, are generated as extern enum. At first glance, this means that for all code that touches the external API, the zig code must use @enumToInt when passing values and @intToEnum when receiving values. But there's a problem that arises here related to Zig's strict rules about enum semantics. Consider the case where a C function returns an error code of an enumerated type. The temptation here is to write switch(@intToEnum(returnedValue)). But this approach has the potential to introduce dangerous undefined behaviour in the event that the C API returns an unnamed value. Such a case is totally valid according to the C API, and can happen easily if a DLL is updated. This means that the generated enum values can never be used safely without wrapping them in @enumToInt. Since Zig is trying to 'beat C at its own game', this presents an ergonomics problem that should be addressed. I see two valid fixes:

  • Adopt @Vurich's suggestion to compile enums to namespaced comptime_int values. This would allow us to define the C enum type as an integer type of a known size, which would also make dealing with flags much easier. This also allows us to put the "enum type" in structs and function prototypes, improving readability.
  • Expand the definition of extern enum (or make a new c_enum) to match the C definition. This would mean that extern enums are allowed to take on values that are not named, and switches on them must always specify an else case. We could also expand a step further and define integer operations on extern enums (but not implicit conversions). This would give us a chance to provide some improved syntax for flags, such as const value: MyEnumFlags = .ROOT_SET_BIT | .IGNORE_BIT;

Personally I like the second option better, but either of these would improve ergonomics and make interfacing with C code safer.

@daurnimator
Copy link
Contributor

daurnimator commented Nov 20, 2019

The temptation here is to write switch(@intToEnum(returnedValue)). But this approach has the potential to introduce dangerous undefined behaviour in the event that the C API returns an unnamed value.

@intToEnum should have checked behaviour (you should get the error "has no tag matching integer value"); if you expect such a thing to occur, that's the usecase for non-exhaustive enums

@SpexGuy
Copy link
Contributor

SpexGuy commented Nov 20, 2019

The behaviour is checked, but only in debug and safe builds. Mismatched dll versions are the sort of thing that happens often in production and very rarely during development.

@SpexGuy
Copy link
Contributor

SpexGuy commented Nov 20, 2019

Ah, I hadn't seen #2524. Having @cImport generate non-exhaustive enums solves the UB problem, but there is still a usability issue with C APIs that use flags requiring @enumToInt and @intToEnum. Defining bitwise operations on non-exhaustive enums of the same type might help but that's not really in line with the intended semantics of non-exhaustive enums. I think I'm still looking for a slightly different use case to make dealing with imported C APIs less painful.

@Vexu
Copy link
Member

Vexu commented Dec 14, 2019

I think the best way extern enum can work is @SpexGuy's second option with integer operations and exhaustive enums.

Pros:

  • Type safety of Zig enums. If you want to pass an extern enum to an integer you woukd still need to use @EnumToInt.
  • Easier interop with c code. Integer operations between same type extern enum are allowed.
  • translate-c can be simpler.

Cons:

  • Someone might abuse it somehow?

I have enums implemented in translate-c-2 in a way that would benefit from this proposal getting accepted.

@Vexu
Copy link
Member

Vexu commented Dec 14, 2019

I think the best way extern enum can work is ...

Implemented here Vexu@7353e17 for reference.

@andrewrk andrewrk added the accepted This proposal is planned. label Dec 14, 2019
@andrewrk
Copy link
Member

andrewrk commented Dec 14, 2019

Here's what's accepted:

  • extern enum to allow multiple fields with the same value.
  • Expand the definition of extern enum (or make a new c_enum) to match the C definition. This would mean that extern enums are allowed to take on values that are not named, and switches on them must always specify an else case.

What I'm not convinced of:

  • Allowing integer operations on extern enums.

Note that extern enums do not necessarily mean "compatible with the semantics of C" - they mean "compatible with the ABI of C". Enums are not a valid way to represent flags. An enum is a scalar value; flags is a set of booleans. As far as what this means for translate-c, I think the @enumToInt / @intToEnum conversions will be fine. The messiness is a good signal that this code could be better if it were converted to zig types properly, e.g. a struct with align(0) bools. One thing to note is that @intToEnum for extern enums would have no safety check, since an extern enum would allow values that are not named.

"Beating C at its own game" -> as long as the .h file does proper C types. Using enums for flags even in C is an abuse of the type system, and so I'm happy with that being awkward to deal with. The awkwardness is correct.

@andrewrk
Copy link
Member

This is now implemented and in master branch. However, note the bug #3991

@andrewrk andrewrk added the accepted This proposal is planned. label Apr 29, 2021
andrewrk added a commit that referenced this issue May 11, 2021
See #2115. The concept of `packed enum` and `extern enum` is
getting removed from the language.
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021
ehaas added a commit to ehaas/zig that referenced this issue Jun 21, 2021
Translate enum types as the underlying integer type. Translate enum constants
as top-level integer constants of the correct type (which does not necessarily
match the enum integer type).

If an enum constant's type cannot be translated for some reason, omit it.

See discussion ziglang#2115 (comment)

Fixes ziglang#9153
ehaas added a commit to ehaas/zig that referenced this issue Jun 22, 2021
Translate enum types as the underlying integer type. Translate enum constants
as top-level integer constants of the correct type (which does not necessarily
match the enum integer type).

If an enum constant's type cannot be translated for some reason, omit it.

See discussion ziglang#2115 (comment)

Fixes ziglang#9153
Vexu pushed a commit that referenced this issue Jun 23, 2021
Translate enum types as the underlying integer type. Translate enum constants
as top-level integer constants of the correct type (which does not necessarily
match the enum integer type).

If an enum constant's type cannot be translated for some reason, omit it.

See discussion #2115 (comment)

Fixes #9153
elerch added a commit to elerch/aws-sdk-for-zig that referenced this issue Jun 30, 2021
There are two recent changes in zig that effect awshttp.

1. cf65ab8 disallows unused variables. Fair enough and backward
compatible.
2. ziglang/zig#2115 (comment)

This comment resulted in a backward incompatible change to use the
underlying value from a C enum rather than its symbol. This reduces
edge cases in the compiler. Ultimately we may want awshttp to define
zig enums that mirror the C enums, but for now I've commented the
definitions of the C enums used.
@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 20, 2021
@andrewrk andrewrk changed the title extern enum should disallow multiple enumerations with the same value enums should disallow multiple enumerations with the same value Nov 26, 2021
@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022
@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Jun 29, 2023
@cryptocode
Copy link
Contributor

cryptocode commented Apr 23, 2024

People mentioned enum flags being used in C. Here's an idiom I've been using when interacting with certain OS data structures whose enums have certain values based on other enum values:

/// Page protection flags
pub const Protection = EnumFromDecl(vm_prot_t, false, struct {
    pub const READ: u32 = 0x01;
    pub const WRITE: u32 = 0x02;
    pub const READ_WRITE: u32 = READ | WRITE;
    ...
});

This mirrors the C definition/documentation nicely, while allowing use-sites to switch on the enum:

switch (seg64.initprot) {
    ...
    .READ_WRITE => std.debug.print("Read-write case\n", .{}),
    ...
}

EnumFromDecl just walks through the declarations and produce enum fields, basically the inverse to EnumFieldStruct which already exists in the std library.

The bool parameter determines if the resulting enum should be exhaustive or not.

@mlugg
Copy link
Member

mlugg commented Apr 23, 2024

@cryptocode, enum flags should be represented using packed struct.

pub const Protection = packed struct(u32) {
    read: bool, // this is the LSB
    write: bool, // and the next bit
    execute: bool, // etc
    _: u29 = 0, // remaining (padding) bits
};

@cryptocode
Copy link
Contributor

@mlugg yeah most of time either that or just constants. But sometimes you really do want an enum, it just so happens that some fields are based on others. That is, on use-sites you wanna switch on the named combinations rather than inspecting individual flags. Narrow usecase, but when it fits it's really nice as you get the benefits of using switch (exhaustiveness being one)

Not proposing any changes to language or std, just sharing an idiom that's been useful, say, when all named flag combinations must be handled.

@andrewrk
Copy link
Member

This was indeed reversed, the language now disallows multiple enumerations with the same value.

https://ziglang.org/download/0.8.0/release-notes.html#No-More-Extern-or-Packed-Enums

#9938 provides an alternative to enum tag aliasing.

@andrewrk andrewrk modified the milestones: 0.14.0, 0.8.0 Jan 25, 2025
@NewbiZ
Copy link

NewbiZ commented Jan 30, 2025

@cryptocode, enum flags should be represented using packed struct.

This works, until it doesn't. The whole posix and os packages are littered with this idiom, and they do not age well.

The packed struct trick is neat, but it falls down whenever you realize that using it implies a guarantee that the flags do not, and will never overlap. I'm not sure I would easily give that promise easily on my own code, much less on code over which I have no control, such as when using translate-c.

Also the whole renaming of enums O_x to O.X is just half baked abstraction that mostly confuses users without real benefits.

I support very much the decision to drop these idioms from any manual or translate-c and just stick to compime_ints. Resulting code will be less idiomatic, but much saner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

Successfully merging a pull request may close this issue.