Remove anonymous struct types from the language #16865

Closed
mlugg opened this issue Aug 17, 2023 · 23 comments · Fixed by #21817
Labels
accepted: This proposal is planned.
proposal: This issue suggests modifications. If it also has the "accepted" label then it is planned.
Comments

@mlugg
Member

mlugg commented Aug 17, 2023

This proposal is extracted from #16512, with more details and justification.

Background

Zig currently has the concept of an "anonymous struct type". This is a type which comes from an anonymous struct literal (.{ ... }) with no known result type. These types are special: they allow coercions based on structural equivalence which normal structs do not allow. For instance:

const anon = .{ .x = 123 };
const S = struct { x: u32 };
const s: S = anon;
_ = s;

This works because anon has an anonymous struct type, and all of its field types (in this case comptime x: comptime_int = 123) are coercible to those of S, so @TypeOf(anon) coerces to S field-wise. Anonymous struct types also allow even stranger coercions, such as allowing these coercions through pointers by creating new constants (e.g. *const @TypeOf(anon) coerces to *const S).

Justification

I'm not entirely sure why anonymous struct types exist. My guess is that they originated before RLS (Result Location Semantics), as the method for anonymous initializers to initialize concrete types. In that world, the concept makes sense, but today - with RLS - untyped anonymous initializers are virtually never used. Retaining anonymous struct types significantly complicates the language:

  • It introduces a new kind of type which cannot be differentiated by metaprogramming
  • It introduces flawed coercions
  • It hides potentially expensive copies which could be trivially avoided when coercing to equivalent types
  • It leads to less readable code

To pick up on the last point in particular: the only case where anonymous struct types are really used today is when writing code such as the above example. This kind of code would really benefit from a type annotation: it's unclear what anon is meant to be! Beginners sometimes write this kind of code expecting Zig to use the information from the later lines in type inference (inferring that anon should have type S). Anonymous struct types actually mask the issue here, potentially making code "work" whilst being harder to read, slower, and potentially buggier.

Lastly, time to quantify a statement I made a moment ago:

...untyped anonymous initializers are virtually never used.

I looked at the ZIR for a few random files of real Zig code, and noted the following things:

  • The total number of struct init instructions
  • The number of anonymous struct inits (struct_init_anon or struct_init_anon_ref)
  • The number of those anonymous struct inits which would remain if #16512 (Expand RLS for ref result type) were implemented

| Source | Total Struct Inits | Anonymous Struct Inits | Would Remain With Better RLS |
|---|---|---|---|
| Sema.zig | 1199 | 4 | 0 |
| std/array_list.zig | 29 | 1 | 1 (but removing improves code!) |
| std/mem.zig | 47 | 1 | 0 |
| std/Build.zig | 38 | 0 | 0 |
| Bun: js_parser.zig | 814 | 0 | 0 |
| Bun: js_ast.zig | 278 | 0 | 0 |

You can see from these numbers that untyped inits rarely happen, and when they do, the proposed RLS improvements would eliminate them. Note that if you try, you can find some files which do genuinely use a lot of anonymous inits right now - for instance arch/x86_64/Lower.zig in the compiler has 107 at the time of writing - but as far as I can tell from a quick glance every single one of those would be eliminated by #16512. That proposal can essentially be considered a prerequisite of this one.

Proposal

Eliminate anonymous struct types from the language. Untyped struct initializations are still permitted - they are useful for metaprogramming (e.g. std.Build.dependency's args parameter) - but they return a "standard" struct type, with no extra allowed coercions etc.

There's not much else to say. This is a proposal to remove an unnecessary concept from Zig: simplifying the language, encouraging code readability, and making us less prone to issues such as #16862.

@rohlem
Contributor

rohlem commented Aug 17, 2023

It introduces flawed coercions

The coercion is only flawed for pointers to those types, for which I agree they should be disallowed.

I also agree that simplifying the language by making them the same as regular struct-s without extra coercions is desirable,
since I think the primary use cases should be solvable by a manual copy implemented using @typeInfo-reflection.
If that ends up being too hairy / suboptimal, perhaps a builtin @construct taking an anytype initialization expression (similar to @apply taking the argument tuple of a function) would be useful.
(I imagine that could also be used to implement @unionInit in userspace - or we keep them as separate @unionInit and @structInit for clarity.)

jacobly0 added the proposal and frontend labels Aug 17, 2023
jacobly0 added this to the 0.13.0 milestone Aug 17, 2023
jacobly0 removed the frontend label Aug 17, 2023
@Jarred-Sumner
Contributor

Jarred-Sumner commented Aug 22, 2023

Would this mean logging with named parameters now needs an explicit type?

Current:

std.debug.print("Hello {[name]s}. Welcome to {[project]s}.", .{ .name = "John", .project = "Boop" });

After:

const DoesThisMeanIHaveToWriteThisOutWhenLoggingNow = struct { name: []const u8, project: []const u8 };

std.debug.print("Hello {[name]s}. Welcome to {[project]s}.", DoesThisMeanIHaveToWriteThisOutWhenLoggingNow{ .name = "John", .project = "Boop" });

What are the implications for zon, which seems to rely a lot on anonymous types?

@mlugg
Member Author

mlugg commented Aug 22, 2023

No, it would not, as mentioned in the proposal:

Untyped struct initializations are still permitted - they are useful for metaprogramming (e.g. std.Build.dependency's args parameter)

The exact same logic applies to the args parameter of std.fmt.format. Pretty much the only thing this proposal changes about the language is disallowing certain coercions.

@mpfaff
Contributor

mpfaff commented Aug 29, 2023

Pretty much the only thing this proposal changes about the language is disallowing certain coercions.

The title suggests you are proposing to "remove anonymous struct types" from the language. If that is not an accurate summary of the proposal, should it be changed to reflect that?

@mlugg
Member Author

mlugg commented Aug 29, 2023

The title suggests you are proposing to "remove anonymous struct types" from the language. If that is not an accurate summary of the proposal, should it be changed to reflect that?

That is an accurate summary. The only user-facing change to the language caused by removing these types is disallowing certain coercions that are currently allowed, because said coercions are the only difference between "standard" and anonymous struct types.

@rohlem
Contributor

rohlem commented Aug 30, 2023

I think the source of confusion (and what tripped me up initially as well) is that "anonymous struct literal syntax" .{.a = 3} stays,
but instead of being of an ad-hoc created unnamed "anonymous struct type"
it is now of an ad-hoc created unnamed "regular struct type".

As "anonymous" has (afaik) never been well-defined, it can be misinterpreted to mean ad-hoc created, unnamed, or both -
in this case it instead means "special coercion rules" that have no real connection to the abstract concept of anonymity afaict.

An alternative formulation would be "remove special coercions from the type of untyped struct literals" because that's the minimal user-facing impact of the change.
IMO we should settle on a name like untyped, deduced-type, ad-hoc, or something not related to identification/naming
(as in Zig types are distinct => identifiable, and values including types are unnamed until they're assigned to a named location),
and consistently use that word for all similar features (in langref, code, discussion, etc.) to avoid confusion going forward.

@VortexCoyote

pardon if i have misunderstood, but does this mean that the syntax for assigning variables with a struct literal will disappear as well? will this affect anonymous tuples as well?

so for instance:

const extra_args = .{ 32, "some text" };
std.log.info("{any}, {s}, {any}, {s}", .{ 64, "some other text" } ++ extra_args);

being able to assign variables with anonymous structs/tuples for later parsing/deduction is a very nice feature,
as it provides convenient declarative syntax for initializing recursive data structures (like an UI tree) or constructing types. and being able to declare anonymous tuples provides good reusability, as you can concatenate them to other anonymous tuples later on, as shown above.

@mlugg
Member Author

mlugg commented Sep 10, 2023

As mentioned in both this comment and this statement in the original issue:

Untyped struct initializations are still permitted - they are useful for metaprogramming...

...no, that syntax remains, and the example you give will continue to work.

This change does not affect tuples at all, because tuple types already work on structural equivalence, which is one of their defining features. In essence, all tuple types are already anonymous, and that won't change.

As I said above: the only user-facing language change you will see from anonymous struct types being removed is certain coercions no longer working. These coercions are the defining property of anonymous struct types, and the only difference between them and concrete struct types.

@ikskuh
Contributor

ikskuh commented Sep 19, 2023

Had a bit of a panic, but after the clarification, I'm all in for the change. Luuk explained that feature to me once and it's really horrible and I think I only ever used it once with full intent.

So: Yeet that shit, make anonymous struct literals without result type just regular, implicitly declared struct values

mlugg added a commit to mlugg/zig that referenced this issue Sep 19, 2023
This commit introduces the new `ref_coerced_ty` result type into AstGen.
This represents an expression which we want to treat as an lvalue, and
the pointer will be coerced to a given type.

This change gives known result types to many expressions, in particular
struct and array initializations. This allows certain casts to work
which previously required explicitly specifying types via `@as`. It also
eliminates our dependence on anonymous struct types for expressions of
the form `&.{ ... }` - this paves the way for ziglang#16865, and also results
in less Sema magic happening for such initializations, also leading to
potentially better runtime code.

As part of these changes, this commit also implements ziglang#17194 by
disallowing RLS on explicitly-typed struct and array initializations.
Apologies for linking these changes - it seemed rather pointless to try
and separate them, since they both make big changes to struct and array
initializations in AstGen. The rationale for this change can be found in
the proposal - in essence, performing RLS whilst maintaining the
semantics of the intermediary type is a very difficult problem to solve.

This allowed the problematic `coerce_result_ptr` ZIR instruction to be
completely eliminated, which in turn also simplified the logic for
inferred allocations in Sema - thanks to this, we break even on line
count!

In doing this, the ZIR instructions surrounding these initializations
have been restructured - some have been added and removed, and others
renamed for clarity (and their semantics changed slightly). In order to
optimize ZIR tag count, the `struct_init_anon_ref` and
`array_init_anon_ref` instructions have been removed in favour of using
`ref` on a standard anonymous value initialization, since these
instructions are now virtually never used.

Resolves: ziglang#16512
Resolves: ziglang#17194
mlugg added a commit to mlugg/zig that referenced this issue Sep 20, 2023
mlugg added a commit to mlugg/zig that referenced this issue Sep 23, 2023
This commit introduces the new `ref_coerced_ty` result type into AstGen.
This represents an expression which we want to treat as an lvalue, and
the pointer will be coerced to a given type.

This change gives known result types to many expressions, in particular
struct and array initializations. This allows certain casts to work
which previously required explicitly specifying types via `@as`. It also
eliminates our dependence on anonymous struct types for expressions of
the form `&.{ ... }` - this paves the way for ziglang#16865, and also results
in less Sema magic happening for such initializations, also leading to
potentially better runtime code.

As part of these changes, this commit also implements ziglang#17194 by
disallowing RLS on explicitly-typed struct and array initializations.
Apologies for linking these changes - it seemed rather pointless to try
and separate them, since they both make big changes to struct and array
initializations in AstGen. The rationale for this change can be found in
the proposal - in essence, performing RLS whilst maintaining the
semantics of the intermediary type is a very difficult problem to solve.

This allowed the problematic `coerce_result_ptr` ZIR instruction to be
completely eliminated, which in turn also simplified the logic for
inferred allocations in Sema - thanks to this, we almost break even on
line count!

In doing this, the ZIR instructions surrounding these initializations
have been restructured - some have been added and removed, and others
renamed for clarity (and their semantics changed slightly). In order to
optimize ZIR tag count, the `struct_init_anon_ref` and
`array_init_anon_ref` instructions have been removed in favour of using
`ref` on a standard anonymous value initialization, since these
instructions are now virtually never used.

Lastly, it's worth noting that this commit introduces a slightly strange
source of generic poison types: in the expression `@as(*anyopaque, &x)`,
the sub-expression `x` has a generic poison result type, despite no
generic code being involved. This turns out to be a logical choice,
because we don't know the result type for `x`, and the generic poison
type represents precisely this case, providing the semantics we need.

Resolves: ziglang#16512
Resolves: ziglang#17194
@andrewrk
Member

I have the same question as @Jarred-Sumner - what about this example?

const foo: struct { ... } = @import("foo.zon");

@mlugg
Member Author

mlugg commented Sep 23, 2023

That's one example that would indeed cease to function without further work - the result of the @import would have a distinct struct type, and hence could not coerce to that struct type.

If we want that to work (which I do think makes sense), we could fix this in the compiler simply by making @import consider a result type. For .zig imports, we don't actually use this result type. However, the ZIR generated for a ZON import could take a result type from external code (which can just be generic poison if no result type is actually given), and construct a value of the relevant struct type. So, like with other anon structs, rather than relying on a deep coercion, we are instead relying on RLS to construct the value correctly in the first place.

andrewrk added the accepted label Sep 23, 2023
@mlugg
Member Author

mlugg commented Sep 23, 2023

Just to note, I have thought of another case this will break:

const x = if (runtime_condition) Foo{ .x = 123 } else .{ .x = 456 };

PTR (Peer Type Resolution) will fail on these types, as the anonymous literal cannot coerce to Foo.

I don't think this is really a loss - this code is improved by annotating the type of x instead. I just thought I'd mention it.

@notcancername
Contributor

notcancername commented Feb 19, 2024

By using @TypeOf and @compileLog, a programmer who is unaware of the distinction between anonymous struct types and regular struct types can be misled into concluding that structs with comptime fields only can coerce to other structs (see "Struct with comptime fields refuses to coerce to struct with runtime fields" on ziggit):

It seems like structs with only comptime struct fields can coerce to compatible structs:

@compileLog(@as(Rational(i32), .{.num = 6, .den = 9}));
@compileLog(@TypeOf(.{.num = 6, .den = 9}));
Compile Log Output:
@as(types.Rational(i32), .{.num = 6, .den = 9})
@as(type, struct{comptime num: comptime_int = 6, comptime den: comptime_int = 9})

Would these semantics be desirable? They would make anonymous struct initializations a special case of type coercion again, replacing anonymous structs with a less confusing, metaprogrammable alternative, and making the below examples equivalent again, and resolving some of the concerns expressed above:

const anon_struct_init: struct {foo: u8, bar: u8}  = .{.foo = 1, .bar = 2};
// Equivalent in status quo, wouldn't be in this issue, would again be with these semantics
const anon_struct = .{.foo = 1, .bar = 2};
const anon_struct_init: struct {foo: u8, bar: u8}  = anon_struct;

@rohlem
Contributor

rohlem commented Feb 19, 2024

@notcancername Are you suggesting special-casing structs which have only comptime fields, or structs which also have comptime fields?
Note that status-quo anonymous struct types can have a mixture of comptime and run-time fields:
initializers which are run-time-known, such as values from var or depending on run-time branching, result in run-time fields.
An anonymous struct type in status-quo doesn't have to hold any comptime fields at all, but they still coerce to other struct types in status-quo.

TUSF pushed a commit to TUSF/zig that referenced this issue May 9, 2024
@mlugg mlugg mentioned this issue Aug 6, 2024
4 tasks
mlugg added a commit to mlugg/zig that referenced this issue Oct 26, 2024
This commit reworks how anonymous struct literals and tuples work.

Previously, an untyped anonymous struct literal
(e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type",
which is a special kind of struct which coerces using structural
equivalence. This mechanism was a holdover from before we used
RLS / result types as the primary mechanism of type inference. This
commit changes the language so that the type assigned here is a "normal"
struct type. It uses a form of equivalence based on the AST node and the
type's structure, much like a reified (`@Type`) type.

Additionally, tuples have been simplified. The distinction between
"simple" and "complex" tuple types is eliminated. All tuples, even those
explicitly declared using `struct { ... }` syntax, use structural
equivalence, and do not undergo staged type resolution. Tuples are very
restricted: they cannot have non-`auto` layouts, cannot have aligned
fields, and cannot have default values with the exception of `comptime`
fields. Tuples currently do not have optimized layout, but this can be
changed in the future.

This change simplifies the language, and fixes some problematic
coercions through pointers which led to unintuitive behavior.

Resolves: ziglang#16865
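
To make the change described in this commit message concrete, here is a minimal sketch. The names `S`, `viaResultType` and `viaUntypedLiteral` are illustrative only, and the commented-out line marks the coercion that is expected to be rejected once untyped literals get a "normal" struct type; this is an assumption based on the commit message, not on compiler output.

```zig
const std = @import("std");

const S = struct { x: u32 };

// The literal has a result type (the return type S), so it is
// initialized directly as an S. This is the RLS-based path.
fn viaResultType() S {
    return .{ .x = 123 };
}

// The literal has no result type, so `anon` gets its own struct type.
fn viaUntypedLiteral() S {
    const anon = .{ .x = 123 };
    // return anon; // expected to be a compile error after the change
    // Copying the field explicitly works before and after the change,
    // because the comptime_int field value coerces to u32.
    return .{ .x = anon.x };
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ viaResultType().x, viaUntypedLiteral().x });
}
```

In practice the fix for affected code is usually to put the type annotation where the literal is written, so the initializer is typed from the start.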
mlugg added a commit to mlugg/zig that referenced this issue Oct 27, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 27, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
mlugg added a commit to mlugg/zig that referenced this issue Oct 31, 2024
@mlugg mlugg closed this as completed in d11bbde Nov 1, 2024
@steeve

steeve commented Nov 5, 2024

Hello,

Congratulations on landing the change.
Quick question: now that this is completed, how is @call supposed to work for structs?

For example:

const std = @import("std");

fn hello(conf: struct {
    prefix: []const u8 = "",
    name: []const u8,
}) void {
    std.debug.print("{s} {s}\n", .{ conf.prefix, conf.name });
}

pub fn main() !void {
    @call(.auto, hello, .{ .{
        .name = "steeve",
    } });
}

What would be the proper syntax for achieving that?

Thanks!

@mlugg
Member Author

mlugg commented Nov 5, 2024

@steeve Ah, that's an interesting point. The existence of comptime and anytype parameters means that @call can't provide a result type for its argument tuple.

The obvious solution here is to explicitly provide the type, e.g.:

const HelloConf = struct {
    prefix: []const u8 = "",
    name: []const u8,
};
fn hello(conf: HelloConf) void {
    std.debug.print("{s} {s}\n", .{ conf.prefix, conf.name });
}

pub fn main() !void {
    @call(.auto, hello, .{HelloConf{
        .name = "steeve",
    }});
}

If you're using @call because you're doing something generic-ey where the argument types might vary, then this is probably fine, since you're likely constructing the args list with something like inline for anyway. However, for the use case of using @call to override the call modifier, I don't consider this solution satisfactory; you should be able to write the arguments directly.

One possible solution would be to add a special case to the language such that when the args argument to @call is a tuple initializer, we forward result locations to the sub-expressions. Note that the C++ bootstrap compiler did actually do this:

zig/src/stage1/astgen.cpp

Lines 5150 to 5164 in 817cf6a

if (args_node->type == NodeTypeContainerInitExpr) {
    if (args_node->data.container_init_expr.kind == ContainerInitKindArray ||
        args_node->data.container_init_expr.entries.length == 0)
    {
        return astgen_fn_call_with_args(ag, scope, node,
                fn_ref_node, CallModifierNone, options,
                args_node->data.container_init_expr.entries.items,
                args_node->data.container_init_expr.entries.length,
                lval, result_loc);
    } else {
        exec_add_error_node(ag->codegen, ag->exec, args_node,
                buf_sprintf("TODO: @call with anon struct literal"));
        return ag->codegen->invalid_inst_src;
    }
} else {

So, the idea there is that @call(foo, bar, qux) wouldn't forward any result location to qux, but the specific syntax @call(foo, bar, .{ x, y }) would forward result locations to x and y.
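
Absent such a special case, a result type can be supplied to each element of the args tuple by hand with `@as`. The following is a minimal sketch reusing `HelloConf` and `hello` from the example above; it is one possible workaround under the assumption that the parameter type has a name, not a statement of what the language will end up doing.

```zig
const std = @import("std");

const HelloConf = struct {
    prefix: []const u8 = "",
    name: []const u8,
};

fn hello(conf: HelloConf) void {
    std.debug.print("{s} {s}\n", .{ conf.prefix, conf.name });
}

pub fn main() void {
    // @call gives the tuple no result type, so each element has to carry
    // its own: @as supplies HelloConf to the inner literal, which lets the
    // default value of `prefix` apply.
    @call(.auto, hello, .{@as(HelloConf, .{ .name = "steeve" })});
}
```

This is equivalent to writing `HelloConf{ .name = "steeve" }`, and it has the same drawback: the argument type must be spelled out at the call site.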

@andrewrk, do you have any thoughts here?

@steeve

steeve commented Nov 6, 2024

Thanks for responding so fast.

Oh, I always assumed (incorrectly, apparently) that @call was using result type information.

For context, this is where we'd see a regression: https://github.com/zml/zml/tree/master/async (stackful coroutines).
Perhaps the ability to reify a tuple for anytype arguments (kind of like how we can infer the return type via @TypeOf(@call(...))) would be ok.

@rohlem
Contributor

rohlem commented Nov 6, 2024

Perhaps the ability to reify a tuple for anytype arguments [...] would be ok.

@steeve To my understanding, anytype and "true" (mutable) comptime fields are not representable in the status-quo type system.
anytype fields were removed in #10705.
"true" (mutable) comptime fields are proposed in #5675.
This isn't just about whether users are "allowed to" create these types, they don't exist for the compiler to work with.

(Fwiw, if those proposals did get accepted (somehow),
I think everyone would agree that providing the result type to args would be the cleaner solution.
The proposed special casing of .{ ... } here is a local workaround to make this use case work without having to extend the type system.)

@steeve

steeve commented Nov 7, 2024

This isn't just about whether users are "allowed to" create these types, they don't exist for the compiler to work with.

What I meant was, let's take this example:

fn max(a: anytype, b: @TypeOf(a)) @TypeOf(a)

What I'd like is a way to get the ArgsTuple of that function. I can already get the result type via @TypeOf(@call(...)).

Because then I could use ArgsTuple to get the tuple type needed to use @call. Not sure if it's the right solution.

Another example I'd expect to work at first glance:

fn add(a: i32, b: i32) i32 {
    return a + b;
}

fn xx() !void {
    var a: i16 = 2;
    a += 1;
    // Call `add` (not the variable `a`) and discard the i32 result.
    _ = @call(.auto, add, .{ @intCast(a), @intCast(a) });
}

@rohlem
Contributor

rohlem commented Nov 7, 2024

fn max(a: anytype, b: @TypeOf(a)) @TypeOf(a)
What I'd like is to be able to have a way to get the ArgsTuple of that function.

In status-quo there is no ArgsTuple for that function.
The language has no concept of a type struct { anytype, anytype }, and introducing it would add a lot of complexity to it.
If such a type did exist, we could provide it as result type for args in @call.
The proposed workaround is simpler to implement, because it doesn't need to introduce such a type,
and instead provides the type for each member of .{ ... } in @call individually.
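
For non-generic functions the situation is different, since every parameter has a concrete type. A minimal sketch, assuming `std.meta.ArgsTuple` from the standard library (which, as far as I know, refuses generic functions such as `max` above with a compile error): annotating the args tuple with it gives each element a result type, which is enough for `@intCast` in the earlier `add` example. The helper name `addTwice` is made up for illustration.

```zig
const std = @import("std");

fn add(a: i32, b: i32) i32 {
    return a + b;
}

// Hypothetical helper: passes the same i16 value twice to `add` via @call.
fn addTwice(a: i16) i32 {
    // For the non-generic `add` this is a tuple type of two i32 fields.
    const Args = std.meta.ArgsTuple(@TypeOf(add));
    // The annotation provides the result type (i32) that each @intCast needs.
    const args: Args = .{ @intCast(a), @intCast(a) };
    return @call(.auto, add, args);
}

pub fn main() void {
    var a: i16 = 2;
    a += 1;
    std.debug.print("{d}\n", .{addTwice(a)});
}
```

The same annotation cannot be written for `max`, because the type of `b` is only known once a concrete `a` has been chosen.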

@steeve

steeve commented Nov 11, 2024

In status-quo there is no ArgsTuple for that function. The language has no concept of a type struct { anytype, anytype }, and introducing it would add a lot of complexity to it. If such a type did exist, we could provide it as result type for args in @call. The proposed workaround is simpler to implement, because it doesn't need to introduce such a type, and instead provides the type for each member of .{ ... } in @call individually.

It would be great to be able to get the type of b given the type of a. It seems that the ArgsTuple does exist in the end; we only need a in order to infer it?
