RFC: Generic call syntax, or: Wrath of the Turbofish #1

Closed
romdotdog opened this issue Sep 25, 2022 · 7 comments

Comments

@romdotdog
Owner

romdotdog commented Sep 25, 2022

Summary

Jake cannot support TypeScript's generic call syntax (id<u32>(1)) because it is ambiguous: when the parser reaches the <, it cannot tell whether it begins a type argument list or is a less-than comparison, and a conventional LR(1) parser cannot resolve this with one token of lookahead. I have come up with three solutions, and this RFC is for voicing approval of one of them or suggesting others.

Solution 1: the Turbofish

The turbofish id::<u32>(1) is present in Rust's syntax and is generally considered an eyesore. Rust usually handwaves this away as syntax that doesn't occur very frequently or that just has to be lived with. Furthermore, the :: doesn't really belong in Jake's syntax, seeing as it appears nowhere else. Replacing it with, for example, id.<u32>(1) may be awkward. A pro of the turbofish is that it follows the precedent of a major language, so knowledge of it transfers easily from Rust.

Example

function id<T>(x: T): T {
    return x;
}

id::<u32>(1);
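
For comparison, here is a minimal Rust sketch of the same call. The helper names size_with_turbofish and size_with_default are illustrative and do not come from the Jake spec. It uses u64 rather than u32 so that the effect of the explicit type argument is visible in the size of the result.

```rust
/// Identity function, generic over `T`.
fn id<T>(x: T) -> T {
    x
}

/// The turbofish names `T` explicitly, so the literal `1` becomes a `u64`.
fn size_with_turbofish() -> usize {
    let x = id::<u64>(1);
    std::mem::size_of_val(&x)
}

/// With no type argument and no other hint, Rust falls back to `i32`
/// for an integer literal.
fn size_with_default() -> usize {
    let x = id(1);
    std::mem::size_of_val(&x)
}
```

Calling size_with_turbofish() yields 8 and size_with_default() yields 4, which is the only observable difference the ::<u64> makes here.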

Solution 2: Dependent Types

Generics are in fact a special case of dependent types (the dependency being a kind, represented by *). Thus, this solution might make the most theoretical sense but feel completely foreign to a new user. In a nutshell, types are supplied as regular parameters to functions, and other parameters use that parameter's name to refer to the type. Dependent types will be implemented, but not any time soon, so generics may be delayed by the same amount of time.

Examples

function id(T: *, x: T): T {
    return x;
}

id(u32, 1);

Cons with this method include confusion if types look like terms. For example, if Jake had values-as-types:

id(4, 4); // identity takes two parameters?
// in reality, the first `4` is a type, while the second is the actual value

Solution 3: Type Ascription

This solution may be the simplest and easiest to understand; however, it may conflict with other parts of the language. The conflict is described after the following example. Type ascription was accepted as a Rust RFC over half a decade ago but has never been stabilized.

Examples

function id<T>(x: T): T {
    return x;
}

id(1): u32; // 1 is coerced to be u32 to fit T
// or
id(1: u32); // T is inferred to be u32 to fit 1
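
Stable Rust has no type ascription expression, but its closest analogues give a feel for both forms above. This is a sketch under that assumption, not Jake semantics: a let annotation stands in for ascribing the result, and a suffixed literal stands in for ascribing the argument. The function names are illustrative, and u64 is used instead of u32 so the result differs in size from the i32 default.

```rust
/// Identity function, generic over `T`.
fn id<T>(x: T) -> T {
    x
}

/// Analogue of `id(1): u64`: the expected result type flows back into
/// the call, so `T` is inferred as `u64` and the literal follows.
fn ascribe_result() -> usize {
    let r: u64 = id(1);
    std::mem::size_of_val(&r)
}

/// Analogue of `id(1: u64)`: the argument's type is fixed first,
/// and `T` is inferred from it.
fn ascribe_argument() -> usize {
    let r = id(1_u64);
    std::mem::size_of_val(&r)
}
```

Both functions return 8. In neither case does the call site need a type argument list, which is the property this solution relies on.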

Conflicts

Jake was initially intended to use canonical pattern matching in much the same way as Rust, but to be more familiar to TypeScript users and less noisy in general (i.e. no need for if let), the current Jake spec uses the same syntax as type ascription for "flow-sensitive" pattern matching. Should this solution be accepted, this conflict would have to be resolved in some way.

@jtenner

jtenner commented Sep 25, 2022

It feels like all these solutions are not ergonomic in some way. Personally I'm a big fan of solution 2, provided there are good ways of inferring types like strings, objects, maps, etc.

I disagree that it won't be intuitive, and this is my opinion after using AssemblyScript and TypeScript for years. If none of these solutions feel right, then it might be a good idea to come up with another solution outside of the box.

@romdotdog
Owner Author

romdotdog commented Sep 25, 2022

My sense is that solution 2 sort of negates the necessity of any angle bracket use in the syntax. For example:

type Option<T> = Some: T | None: []; // why do we have angle brackets here?
// to be consistent, it would require that this were a function taking a type that returns a type

let x: Option<u32> = Some(1); // now this would actually be Option(u32)
// lo and behold, the syntax is now unintuitive

I think it's a bit dangerous to do away with angle brackets or otherwise risk inconsistency.

Note
To add, I forgot to mention that angle brackets would still be ambiguous in calls

id(MyType<A, B>, x); // the parser doesn't know whether to parse `MyType < A` or `MyType<A, B>`

To give my opinion, I think the third solution is the best choice, considering that it kills two birds with one stone: it resolves this issue and also provides a way to coerce values like integers, instead of casting them, which is a lot stronger than simple coercion.

@spotandjake

spotandjake commented Sep 25, 2022

After talking it out with @romdotdog, type ascription seems to make the most sense: 99% of the time generics are linked to the parameter and return types, it doesn't add any edge-case syntax, and it is easy to understand. The other 1% of the time, type ascription doesn't work, as with size_of<T>(): u32, where there is no way to specify the type of T. I suggest implementing type ascription and opening a new issue to determine how to deal with the size_of example. If turbofish were added and Jake got normal TypeScript-style generic call syntax in the future, the ugly turbofish couldn't be phased out. One suggestion was to implement turbofish for every instance, but I would argue that causes confusion. If there is to be a generic syntax that is standardized everywhere with no special edge cases, my alternative suggestion is to require ::<T> everywhere. Even though it doesn't look as nice, it is standardized, which reduces confusion.
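
The size_of case can be seen in Rust itself. The sketch below uses the real std::mem::size_of and std::mem::size_of_val; the wrapper names are illustrative. Because T appears in neither the parameters nor the return type of size_of, there is nothing to ascribe, and the caller has to name the type directly.

```rust
use std::mem::{size_of, size_of_val};

/// `size_of` takes no arguments and always returns `usize`, so `T`
/// cannot be inferred and must be supplied with the turbofish.
fn bytes_of_u32() -> usize {
    size_of::<u32>()
}

/// With a value in hand, the type is inferred from the argument,
/// so no explicit type argument is needed.
fn bytes_of_value() -> usize {
    let v: u32 = 7;
    size_of_val(&v)
}
```

Both functions return 4. Only the first one needs an explicit type argument, which is the 1% case described above.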

@romdotdog
Owner Author

romdotdog commented Sep 25, 2022

The conflict described in the summary can be solved by determining whether the expression is a narrowable identifier. Identifiers are not coercible anyway.

As for the size_of case, @jtenner and I were thinking that it's best implemented as an operator:

let num = sizeof(u32);

@romdotdog
Owner Author

The consensus seems to be that type ascription is the best way forward. Thank you to all who participated.

@scottmcm

scottmcm commented Dec 2, 2022

I know this is long-closed, but I saw it today so figured I'd toss in some extra context.

For the other 1% of the time type ascription doesn't work such as size_of<T>(): u32.

Rust doing size-in-bytes that way is, IMHO, a wart that would definitely not have happened if it were invented today.

Remember that associated consts in rust are a relatively-new feature: https://rust-lang.github.io/rfcs/0195-associated-items.html. So size_of is plausibly just an old bit of cruft that was never important enough to clean up before 1.0, and now the churn would be way too high to make everyone switch.

Today it could (though isn't, in Rust) be spelled u32::SIZE_IN_BYTES via an associated const on the Sized trait, which I personally think is clearer than some random nullary function in mem anyway, and doesn't have turbofishing problems at all. (An operator is also a reasonable answer. I see people use that as shorthand in conversations about rust all the time anyway.)
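
A rough sketch of the associated-const spelling follows. Rust's Sized trait has no such constant, so this uses a hypothetical user-defined trait named SizeInBytes with a blanket implementation; the names are assumptions made for illustration.

```rust
/// Hypothetical trait exposing a type's size as an associated const.
trait SizeInBytes {
    const SIZE_IN_BYTES: usize;
}

/// Blanket implementation for every sized type. The turbofish appears
/// once here, inside the implementation, and never at call sites.
impl<T> SizeInBytes for T {
    const SIZE_IN_BYTES: usize = std::mem::size_of::<T>();
}

/// Call sites name the type as a path prefix instead of a type argument.
fn u32_size() -> usize {
    u32::SIZE_IN_BYTES
}

/// Non-path types need the `<...>::` qualified form.
fn byte_triple_size() -> usize {
    <[u8; 3]>::SIZE_IN_BYTES
}
```

Here u32_size() returns 4 and byte_triple_size() returns 3. The type is written before the constant, so no generic call syntax is involved for the caller.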

So 👍 to leaning on types as the way forward for this. As a bonus, turbofish in Rust makes changing your generic parameters on a function a breaking change, so by not having it you'll give library authors a bit of extra freedom in refactoring.
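
The breaking-change point can be illustrated with a small Rust sketch. The modules v1 and v2 stand for two hypothetical releases of a library function called first; none of these names come from a real crate.

```rust
mod v1 {
    /// Original signature: generic over the element type `T`.
    pub fn first<T: Clone>(xs: &[T]) -> Option<T> {
        xs.first().cloned()
    }
}

mod v2 {
    /// Refactored signature: generic over any iterable `I` instead.
    pub fn first<I: IntoIterator>(xs: I) -> Option<I::Item> {
        xs.into_iter().next()
    }
}

/// Callers that rely on inference work against both versions.
fn caller_inferred() -> (Option<u32>, Option<u32>) {
    let data = [7u32, 8];
    (v1::first(&data), v2::first(data))
}

/// A caller that wrote the turbofish is tied to the old parameter list.
fn caller_turbofish() -> Option<u32> {
    let data = [7u32, 8];
    v1::first::<u32>(&data)
    // `v2::first::<u32>(data)` would not compile: the type argument now
    // names the iterable, and `u32` does not implement `IntoIterator`.
}
```

The inferred caller keeps compiling across the refactor, while the turbofish caller would have to be rewritten.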

@romdotdog
Owner Author

Hi @scottmcm,

Thank you for providing your input. In the months that have passed, there have been some new developments. I have yet to reflect this in the spec, but there is a new set of rules that allows sizeof to become a function rather than a function-like hard-coded operator. Before I get to that, I would like to preface by stating that we have decided to allow unbounded lookahead solely to parse the grammar that was first thought too ambiguous in the original post.
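
To make the idea of unbounded lookahead concrete, here is a rough sketch in Rust of one possible disambiguation rule. It is not Jake's actual parser: the Tok enum, the function is_generic_call, and the rule itself (a < opens a type argument list only if its matching > is directly followed by an opening parenthesis) are all assumptions made for illustration.

```rust
/// Hypothetical token type for a tiny expression language.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    Num(u64),
    Lt,
    Gt,
    Comma,
    LParen,
    RParen,
}

/// Starting at the `<` at index `start`, scan ahead as far as needed.
/// Returns true if the `<` opens a bracketed list of identifiers and
/// commas whose closing `>` is immediately followed by `(`.
fn is_generic_call(toks: &[Tok], start: usize) -> bool {
    if toks.get(start) != Some(&Tok::Lt) {
        return false;
    }
    let mut depth = 0usize;
    let mut i = start;
    while let Some(tok) = toks.get(i) {
        match tok {
            Tok::Lt => depth += 1,
            Tok::Gt => {
                depth -= 1;
                if depth == 0 {
                    // Matching `>` found: decide based on the next token.
                    return toks.get(i + 1) == Some(&Tok::LParen);
                }
            }
            // Tokens allowed inside a type argument list.
            Tok::Ident(_) | Tok::Comma => {}
            // Anything else means this was a comparison after all.
            _ => return false,
        }
        i += 1;
    }
    // Ran out of tokens without closing the bracket.
    false
}
```

Under this rule the tokens of id<u32>(1) are classified as a generic call, while a < b and a < 1 are classified as comparisons. An input such as a < b > (c) is also treated as a generic call, so the rule picks one reading rather than removing the ambiguity.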

That said, type parameters that are used by a parameter will appear between angle brackets:

function id<T: *>(x: T): T {
    return x;
}

And type parameters that are not used by another parameter will appear between parentheses:

function sizeof(T: *): u32 {
    /* intrinsic */
}

There are three motivations for this:

  1. Angle brackets are friendlier to TypeScript developers.
  2. Type parameters that can be inferred will still be inferred (by contrast, the dependent types proposal above does not allow this without counterintuitive nonsense).
  3. One can partially apply type parameters:
id<u32> // u32 -> u32

Thank you again for taking the time to provide your thoughts to improve Jake.


4 participants