
remove T{} syntax in favor of type coercion #5038

Open
andrewrk opened this issue Apr 14, 2020 · 22 comments
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
@andrewrk
Member

The following are semantically equivalent:

const a = T{};
const a: T = .{};
const a = @as(T, .{});

I propose to remove the first possibility, and rely on the other two.

The problem with this proposal, as I see it, is inferred-size array literals:

const a = [_]u8{1, 2, 3};

There's no other way to write this. And if that works, then why wouldn't this work?

const a = [3]u8{1, 2, 3};

But now we're back to the proposal:

const a = [3]u8{1, 2, 3};
const a: [3]u8 = .{1, 2, 3};
const a = @as([3]u8, .{1, 2, 3});
@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Apr 14, 2020
@andrewrk andrewrk added this to the 0.7.0 milestone Apr 14, 2020
@JesseRMeyer

What problem does this solve?

const a = T{}; is the simplest expression of the three, in terms of the concepts needed to define the assignment semantics. .{} introduces anonymous structs, and @as() introduces a compiler builtin.

I'm not convinced reducing the number of ways to define an assignment is worth removing the simplest solution to the problem, if I understand this correctly.

@jakwings

How about this?

const a = T.{};

const a = [_]u8.{};
const a = [_]u8.{1, 2, 3};

@emekoi
Contributor

emekoi commented Apr 14, 2020

@iology see #760, specifically this part.

@mogud
Contributor

mogud commented Apr 15, 2020

// I prefer this
const a: [_]u8 = .{1, 2, 3};

@foobles
Contributor

foobles commented Apr 16, 2020

I would definitely be in favor of this, just as long as the . in .{} is removed at the same time, which seems to be in the works.

@jibal

jibal commented Jan 31, 2022

One could remove quoted string literal syntax since array literals could be used instead, or require all integer literals to be encoded in 0b<binary> form, but these are obviously carrying "only one way to do it" too far ... arguably this does too. And this proposal doesn't achieve the favored principle, since you still have

const foo: Foo = .{};

and

const foo = @as(Foo, .{});

And if there are any cases where @as would be required then I think that's a good reason to reject this proposal.

Speaking of which,

fn foo() !Foo {
    return .{ .field = try something() };
}

doesn't work because the compiler infers the type that isn't legal here (the error union), rather than the one that is (Foo), so with this proposal one must do either [first way to do it]

fn foo() !Foo {
    return @as(Foo, .{ .field = try something() });
}

or [second way to do it]

fn foo() !Foo {
    const result: Foo = .{ .field = try something() };
    return result;
}

Casts are to be avoided, temporary variables shouldn't be necessary, type inference is nice in some places but can be hard to read and understand in others. Foo{.field = ...} is clear and has none of the disadvantages of the other ways (its one disadvantage over the anonymous syntax is violation of DRY, but the anonymous syntax can be used when the tradeoffs favor it. Yes, tradeoffs ... sometimes there are good reasons to have more than one way to do things.)

@jibal

jibal commented Jan 31, 2022

@theInkSquid

I would definitely be in favor of this, just as long as the . in .{} is removed at the same time, which seems to be in the works.

It's not in the works ... #5039 was closed two days before your comment.

@MKRhere

MKRhere commented Feb 3, 2023

@jibal Both proposals are back on the table and I'm excited/hopeful for both again. :)

@kuon
Contributor

kuon commented Feb 4, 2023

I think it should stay because of:

var foo = T{};
...
foo = T{}; // makes it clear what we are assigning

I also do not see what we would gain from removing it.

@misanthrop

This will make usage of anytype arguments inconvenient.

Example:

pub const Vec3 = @Vector(3, f32);

pub fn dot(a: anytype, b: @TypeOf(a)) std.meta.Child(@TypeOf(a)) {
  return @reduce(.Add, a*b);
}

test {
  _ = dot(@as(Vec3, .{1, 2, 3}), .{3, 2, 1}); // Looks ugly
  const a: Vec3 = .{1, 2, 3}; // Vec3{1, 2, 3} looks better
  const b: Vec3 = .{3, 2, 1};
  _ = dot(a, b);
}

With this change, it would always be more convenient to use type arguments:

pub fn dot(comptime T: type, a: T, b: T) std.meta.Child(T) {
  return @reduce(.Add, a*b);
}

test {
  _ = dot(Vec3, .{1, 2, 3}, .{3, 2, 1}); // Looks OK?
  const a: Vec3 = .{1, 2, 3};
  const b: Vec3 = .{3, 2, 1};
  _ = dot(Vec3, a, b); // Vec3 here is redundant
}

@misanthrop

misanthrop commented Jun 24, 2023

Another problem is a tuple with typed elements:

const t = .{ .a = A{}, .b = B{} }; // works now, but will not work
const t = .{ .a: A = .{}, .b: B = .{} }; // syntax is not supported

 // that's how it will look
const t1 = .{ .a = @as(A, .{}), .b = @as(B, .{}) };
const t2 = .{ @as(A, .{}), @as(B, .{}) };

Maybe in some cases a tuple with tuple elements (just .{}) will coerce properly when used, but it might decrease readability, as @kuon pointed out earlier.

@andrewrk andrewrk removed this from the 0.11.0 milestone Jul 20, 2023
@mlugg
Member

mlugg commented Sep 23, 2023

Here's an idea of how to handle inferred-size array literals.

Currently, we have a special case to allow [_]T{ ... } syntax in array literals, where otherwise the [_]T part would be considered a normal type. I propose that we move this special case: rather than array literals, this syntactic form is permitted as a type annotation for a const or var decl (global or local). Like today, it is an exact syntactic form: for instance, const x: ([_]T) = ... is invalid just as ([_]T){ ... } is invalid today. When a const/var is marked with this "type", the initialization expression is given a new result location type. In terms of implementation, it will look something like this:

/// This expression is the initialization expression of a var decl whose type is an inferred-length array.
/// Every result sub-expression must use array initialization syntax. The array's length should be written
/// to `chosen_len` so the caller can retroactively set the array length.
inferred_len_array_ptr: struct {
    /// The array pointer to store results into.
    ptr: PtrResultLoc,
    /// This is initially `null`, and is set when an expression consumes this result location.
    /// If an expression has a length which does not match the currently-set one, it can use `src_node` to emit an error.
    chosen_len: *?struct {
        len: u32,
        src_node: Ast.Node.Index,
    },
},

The idea here is that every peer here must be an array initialization expression (.{ ... }), and their lengths must match. The var/const decl will create an alloc instruction for an array type whose length is rewritten to the correct value after lowering the init expression.

This result location type will trigger an error in all cases other than array initializers, such as struct inits and calls through to rvalue.

In practice, here's what this means:

// these are all valid
const x: [_]u8 = .{ 1, 2, 3 };
const y: [_]u8 = if (condition) .{ 1, 2 } else switch (x) {
    .foo => .{ 3, 4 },
    .bar => .{ 5, 6 },
    else => .{ 7, 8 },
};
const z: [_][]const u8 = blk: {
    if (foo) break :blk .{ "hello", "world" };
    break :blk .{ "foo", "bar" };
};

// this is invalid
// error: array length cannot be determined
// note: result must be array initialization expression
const a: [_]u8 = @as([3]u8, .{ 1, 2, 3 });
const b: [_]i16 = blk: {
    const result: [2]i16 = .{ 1, 2 };
    break :blk result;
};
const c: [_]u8 = if (cond) .{ 1, 2 } else something_else;

// this is also invalid
// error: array length '3' does not match array length '2'
// note: array with length '2' here
// note: inferred-length array must have a fixed length
const d: [_]u8 = if (cond) .{ 1, 2 } else .{ 3, 4, 5 };

@DerpMcDerp

cpp2 uses the following syntax:

name: type = expr;

and allows you to omit at most one:

name := expr; // define name, type is inferred
name: type;   // define name, initialized later
:type = expr; // create anonymous r-value

so if :type = expr syntax is borrowed from cpp2 you can remove both @as and T{} from Zig:

:[3]u8 = .{1, 2, 3}
@as([3]u8, .{1, 2, 3}) // removed
[3]u8{1, 2, 3} // removed

@arthurmelton

arthurmelton commented Apr 20, 2024

If you were to follow up on this and require the type annotation, what about types like ArrayList? In my mind these feel similar, with the type appearing on the right side.

const a = [0]T{};
const a = std.ArrayList(T).init(allocator);

If you want to try to remove the T{} syntax, the way to write the code would be the following:

const a: [0]T = .{};
const a = std.ArrayList(T).init(allocator);

To me, this just feels wrong, as the two declarations no longer share the same “way” of being written. I feel like if you want to remove the type from the right-hand side, then you should maybe try to remove it from all functions like ArrayList.init and alloc as well.

@castholm
Contributor

@arthurmelton See #9938 (decl literals), which would enable

const a: [0]T = .{};
const a: std.ArrayList(T) = .init(allocator);

@mlugg
Member

mlugg commented Apr 26, 2024

I'd like to make one more argument against T{ ... } syntax.

Nowadays, the only place I ever really write T{ ... } myself is for certain APIs in the compiler. When doing DOD tricks, we want to be able to unpack arbitrary structs into a big flat array of values stored elsewhere (generally named extra). We have helper functions called addExtra to do this; so, we write code like foo.addExtra(TypeToStore{ ... }).

The problem here is that this is actually kind of type-unsafe. What would happen if I instead wrote foo.addExtra(.{ ... }) by mistake? Well, the function would be passed an anonymous struct which has a potentially different field order. It would probably compile fine, but if the field order were different, this would cause bugs down the line, since we would unpack values from the extra array in the wrong order. After discussing this a little with @silversquirl, it led us to a key observation.

Any given API generally expects a typed struct or an anonymous struct, rather than accepting either. Passing a typed struct where an anonymous struct is expected, or vice versa, is likely to lead to bugs.

At first glance, it may seem that if anything, this is an argument in favor of T{ ... } syntax: it's a separate form for when we want to use typed structs. The problem is that this difference is completely superficial. There's nothing to actually force you to write one over the other, so -- unless the API is doing its own weird type checks of some kind -- it doesn't really prevent this bug from slipping in.

Now, suppose that this proposal was implemented, so that .{ ... } became the only syntax to directly initialize a struct. What would these APIs look like then? A function argument which expects an arbitrary "concrete" struct type would take parameters comptime T: type, x: T. So my call above would become foo.addExtra(TypeToStore, .{ ... }), which is impossible to mess up. An API which expects an anonymous struct (something like std.Build.dependency) continues to take anytype; you can't get the syntax wrong, because there's only one way to init the struct, and reaching for @as in this context should be a pretty solid sign that you're doing something wrong. In essence, under this proposal, the form of an API implicitly prevents you from using it incorrectly by passing the wrong "kind" of struct.

@Flaminator

@mlugg

I assume this addExtra API is using anytype as its parameter? I see this more as an issue with anytype and anonymous struct literal syntax than with T{ ... }. Even after removing the T{ ... } syntax, you will still run into the issue you described.

You can see that, for example, in the following code:

const S2 = struct{
    x: i32,
    y: i32,
};

fn what_is(x: anytype) void
{
//  do stuff here
    _ = x;
}

fn any_test(i: i32) void
{
    const s1 = S2{.x = i+0, .y = i+1};
    const s2 = .{.x = i+2, .y = i+3};

    what_is(s1);                 // 1
    what_is(s2);                 // 2
    what_is(S2{.x=i+4, .y=i+5}); // 3
    what_is(.{.x=i+6, .y=i+7});  // 4
    what_is(.{.y=i+8, .x=i+9});  // 5
}

Calls 1 and 3 are the same, calls 2 and 4 are the same, and call 5 is different. So the difference will still be there; the only thing that will probably change is that people are more likely to either use @as or just have their code accidentally do the wrong thing because they are playing around with anonymous structs instead of a real typed struct.

Removing T{ ... } syntax is also a problem in the following piece of code when switching the parameter type of p1 or p2 in abcdef to S2 or S1. Here it's actually the anonymous struct literal syntax that compiles fine even though it could result in bugs:

const S1 = struct{
    x: i32,
    y: i32,
};

const S2 = struct{
    x: i32,
    y: i32,
};

fn abcdef(p1: S1, p2: S2) void
{
//  do stuff here
    _ = p1;
    _ = p2;
}

// Today
fn ghi() void
{
    const s1 = S1{.x = 1, .y = 2};     // const s1: S1 = .{.x = 1, .y = 2};
    const s2: S2 = .{.x = 3, .y = 4};  // const s2 = S2{.x = 3, .y = 4};

    // Will silently compile parameter type change     | 1 | 2 |
    abcdef(S1{.x = 1, .y = 2}, S2{.x = 3, .y = 4}); // | N | N |
    abcdef(.{.x = 1, .y = 2}, .{.x = 3, .y = 4});   // | Y | Y |
    abcdef(.{.x = 1, .y = 2}, S2{.x = 3, .y = 4});  // | Y | N |
    abcdef(S1{.x = 1, .y = 2}, .{.x = 3, .y = 4});  // | N | Y | 
    abcdef(s1, S2{.x = 3, .y = 4});                 // | N | N |
    abcdef(s1, .{.x = 3, .y = 4});                  // | N | Y |
    abcdef(S1{.x = 1, .y = 2}, s2);                 // | N | N |
    abcdef(.{.x = 1, .y = 2}, s2);                  // | Y | N | 
    abcdef(s1, s2);                                 // | N | N |
}

// Tomorrow
fn jkl() void
{
    const s1: S1 = .{.x = 1, .y = 2};
    const s2: S2 = .{.x = 1, .y = 2};

    // Will silently compile parameter type change   | 1 | 2 |
    abcdef(.{.x = 1, .y = 2}, .{.x = 3, .y = 4}); // | Y | Y |
    abcdef(s1, .{.x = 3, .y = 4});                // | N | Y |
    abcdef(.{.x = 1, .y = 2}, s2);                // | Y | N |
    abcdef(s1, s2);                               // | N | N |
}

or you would have to start littering your code with @as calls every time, which doesn't improve readability at all, even if it does make your code "correct".

So imo your example is not really a valid argument against removing T{ ... } syntax.

@expikr
Contributor

expikr commented Jun 8, 2024

If @as is so indispensable as to displace existing language syntax, why not simply unify it as part of the language as opposed to being a builtin function?

I propose incorporating @as into the language itself via a streamlined T.{} dot syntax that is unified between primitive and compound types:

const a: i32 = 1;
const b = i32.{1};

const c: [3]i32 = .{1, 2, 3};
const d = [3]i32.{1, 2, 3}; // or [_]i32.{1, 2, 3};

const e: Vec2 = .{.x=1, .y=2};
const f = Vec2.{.x=1, .y=2};

const NativeFloat = if (8==@sizeOf(usize)) f64 else f32;
const g = NativeFloat.{0.25};

The three disparate syntaxes T{}, [n]T{}, and @as(T, .{}) would be removed and replaced with a single cohesive T.{} syntax.

@expikr
Contributor

expikr commented Jun 10, 2024

> Here's an idea of how to handle inferred-size array literals. [… @mlugg's full proposal, quoted above …]

I would advise making this suggestion into its own separate proposal to potentially expedite its acceptance. Regardless of whether the parent proposal will be accepted or not, I think moving the inferred array notation to type annotations is a significant improvement in its own right.

@Fri3dNstuff
Contributor

After originally being very opposed to the proposal I've come to like it quite a bit... Here is my argument in favour of this proposal, that I haven't seen mentioned so far:

Though using type coercion for aggregate types is quite alien at first (especially coming from the many languages that require the programmer to mention a type's name in order to create a value thereof), if you think about it for some time, you'll see that we are already using this coercion system quite a bit:

Integer literals have type comptime_int; we use coercion to convert them to usize, i32, etc.

Float literals have type comptime_float; we use coercion to convert them to f64, f32, etc.

String literals have type *const [N:0]u8 for some N; we use coercion to convert them to []const u8.

.{} can be thought of as the aggregate value literal.
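A small illustration of those existing coercions, plus the aggregate case (a sketch; the values are arbitrary):

```zig
test "literals are typed by coercion, not by naming the type" {
    const n: usize = 42;           // comptime_int   -> usize
    const x: f32 = 1.5;            // comptime_float -> f32
    const s: []const u8 = "hi";    // *const [2:0]u8 -> []const u8
    const v: [3]u8 = .{ 1, 2, 3 }; // anonymous literal -> [3]u8
    _ = .{ n, x, s, v };
}
```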

@16hournaps

I would like to point out that removing T{} would completely bust the barely working autocompletion. Currently, if ZLS is confused, T{} is always an option to help it produce field completions. Leaving only .{} at this stage of ZLS would make the dev experience much worse.

@mlugg
Member

mlugg commented Oct 28, 2024

Deficits in the functionality of third-party projects do not impact the design of Zig. A project like ZLS could handle inferred types / RLS properly with very little effort if it were appropriately designed for it.
