Is the list of disambiguating tokens in the constructor tearoffs spec too long? #1806

Closed
stereotype441 opened this issue Aug 13, 2021 · 6 comments
Labels
question Further information is requested

Comments

@stereotype441
Member

@eernstg raised a question in #1805, regarding the list of look-ahead tokens which force the prior tokens to be type arguments:

This whole list is somewhat dangerous: We're basically deciding that we will never have an expression that starts with :, ., &, |, ^, + (note that Kotlin uses unary plus a lot), *, %, and now as (which is already possible), or at least that those expressions will be 2nd class citizens because they must be enclosed in parentheses when they occur as the right operand in a greater-than expression.

Maybe we should think about this list one more time: It could be small (and sufficient, in some sense) rather than maximal (in terms of which tokens we currently know can't be the beginning of an expression).

How much should the rest of the language pay for the ability to write f<int, String>?

For reference, the list is currently:

( ) ] } : ; , . ? == != .. ?. ?? ?..

& | ^ + * % / ~/

And #1805 proposes to add is and as.

stereotype441 added the "bug" label (There is a mistake in the language specification or in an active document) on Aug 13, 2021
@stereotype441
Member Author

This whole list is somewhat dangerous: We're basically deciding that we will never have an expression that starts with :, ., &, |, ^, + (note that Kotlin uses unary plus a lot), *, %, and now as (which is already possible), or at least that those expressions will be 2nd class citizens because they must be enclosed in parentheses when they occur as the right operand in a greater-than expression.

Thinking about it a bit more, I don't think this is exactly correct. The disambiguation only occurs if the parser has found a < after an expression, followed by things that look like they could be type arguments, followed by >. So, for example, let's say that at some point in the future we decided to allow unary +, so that expressions could start with +. Assuming we did that, there would be no problem in handling expressions such as:

bool isGreaterThanZero = x > +0;

Even if there were a matching <, there would still be no ambiguity if the things in between didn't look like type arguments. So for example this would be ok:

f(x < 0, y > +0);

The only time you would need extra parentheses would be in code like this:

f(x < y, z > +0);

Personally I think that sort of thing is a rare enough corner case that it's not a problem to disambiguate in favor of type arguments.
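To make the three cases above concrete, here is a rough Python sketch of the check as described in this comment. It is only a toy model, not the actual parser: the token lists are split by hand, "looks like type arguments" is simplified to a comma-separated list of plain identifiers, the function names are made up for illustration, and the look-ahead set is the list quoted in the issue description (which includes `+`).

```python
import re

# Look-ahead tokens quoted in the issue description (the list under discussion).
ORIGINAL_LOOKAHEAD = {
    "(", ")", "]", "}", ":", ";", ",", ".", "?", "==", "!=", "..", "?.", "??", "?..",
    "&", "|", "^", "+", "*", "%", "/", "~/",
}

IDENT = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def index_after_type_args(tokens, lt_index):
    """Return the index just past the closing `>` if the tokens after
    tokens[lt_index] form `ident (, ident)* >`; otherwise return None.
    Simplification: real type arguments may be nested, nullable, prefixed, etc."""
    i = lt_index + 1
    while i + 1 < len(tokens) and IDENT.fullmatch(tokens[i]):
        if tokens[i + 1] == ">":
            return i + 2
        if tokens[i + 1] != ",":
            return None
        i += 2
    return None


def treated_as_type_args(tokens, lookahead=ORIGINAL_LOOKAHEAD):
    """True if the first `<` would be read as the start of type arguments."""
    if "<" not in tokens:
        return False
    after = index_after_type_args(tokens, tokens.index("<"))
    return after is not None and after < len(tokens) and tokens[after] in lookahead


# x > +0;               no `<` at all, so nothing to disambiguate
no_lt = ["x", ">", "+", "0", ";"]
# f(x < 0, y > +0);     `0, y` does not look like type arguments
not_types = ["f", "(", "x", "<", "0", ",", "y", ">", "+", "0", ")", ";"]
# f(x < y, z > +0);     `y, z` looks like type arguments and `+` is in the list
ambiguous = ["f", "(", "x", "<", "y", ",", "z", ">", "+", "0", ")", ";"]

for example in (no_lt, not_types, ambiguous):
    print(" ".join(example), "->", treated_as_type_args(example))
```

Only the last example comes out as type arguments in this model, which is the one case where extra parentheses would be needed.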

eernstg added the "question" label (Further information is requested) and removed the "bug" label (There is a mistake in the language specification or in an active document) on Aug 16, 2021
@stereotype441
Member Author

Based on an internal email discussion I believe we're currently leaning toward paring the list down to:

  • the "continuation tokens" `(`, `.`, `==`, and `!=`
  • the "stop tokens" `)`, `]`, `}`, `;`, `:`, and `,`

The names "continuation tokens" and "stop tokens" reflect the reason why we're proposing to choose these tokens:

  • Continuation tokens are tokens that we can reasonably imagine a programmer wanting to place after a type argument selector to continue the expression, so we accept type argument selectors that are followed by those tokens (rather than trying to treat the `<` as a relational operator). For example, `if (List<int> == T) ...` is allowed.
  • Stop tokens are tokens that can't possibly follow a `>` that is a relational operator, because they stop the expression that's in progress, so we accept type argument selectors that are followed by those tokens to avoid an inevitable syntax error. For example, `var x = List<int>;` is allowed.

Per the email discussion, we're proposing not to accept any tokens other than these after a type argument selector. That means that if someone wants to do something weird like `List<int> + 1` (which could technically be given meaningful semantics using an extension method on the type `Type`), they're going to have to use parentheses: `(List<int>) + 1`.
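As a rough illustration of the pared-down rule, here is a small Python sketch of the final decision step. It assumes the parser has already found a `<` ... `>` span that can be parsed as type arguments and only needs to look at the next token. The function and constant names are hypothetical and are not taken from the real implementation.

```python
# The two proposed token sets, as listed in this comment.
CONTINUATION_TOKENS = {"(", ".", "==", "!="}
STOP_TOKENS = {")", "]", "}", ";", ":", ","}


def classify_following_token(token):
    """Classify the token that follows a candidate `<...>` span.
    Returns "continuation", "stop", or None for any other token."""
    if token in CONTINUATION_TOKENS:
        return "continuation"
    if token in STOP_TOKENS:
        return "stop"
    return None


def reads_as_type_args(next_token):
    """True if the candidate span is kept as a type argument selector;
    False if the `<` falls back to being a relational operator."""
    return classify_following_token(next_token) is not None


# (source snippet, token that follows the closing `>`)
examples = [
    ("if (List<int> == T) ...", "=="),  # continuation token
    ("var x = List<int>;", ";"),        # stop token
    ("List<int> + 1", "+"),             # neither: `<` is relational
    ("(List<int>) + 1", ")"),           # parentheses supply a stop token
]

for source, next_token in examples:
    print(f"{source!r}: type args = {reads_as_type_args(next_token)}")
```

The last two rows show why the parenthesized form works in this model: the closing `)` is itself a stop token, so the selector is accepted before the `+` is ever considered.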

@eernstg
Member

eernstg commented Aug 26, 2021

Given that we've arrived at an answer that we are happy about and this issue is a 'question', I'll close this issue.

eernstg closed this as completed on Aug 26, 2021
@eernstg
Member

eernstg commented Aug 26, 2021

@stereotype441, is there an issue for the implementation? If not, maybe it isn't necessary?

@stereotype441
Member Author

@stereotype441, is there an issue for the implementation? If not, maybe it isn't necessary?

There's no issue that I'm aware of. The full implementation is in https://dart-review.googlesource.com/c/sdk/+/210941, which I hope to land today. If I can't land it today I'll create an implementation issue.

dart-bot pushed a commit to dart-lang/sdk that referenced this issue Aug 27, 2021
…iguities.

When the parser encounters a `<` after an expression, it must choose
whether to interpret it as a relational operator or a <typeArguments>
selector.  The disambiguation rule is: if the `<` and the tokens
following it *can* be parsed as <typeArguments>, and the token that
follows is a member of a privileged set of tokens, then it is treated
as a <typeArguments> selector; otherwise it is treated as a relational
operator.

This change reduces the privileged set of tokens to the following:

- the "continuation tokens" `(`, `.`, `==`, and `!=`
- the "stop tokens" `)`, `]`, `}`, `;`, `:`, and `,`

The names "continuation tokens" and "stop tokens" reflect the
rationale for choosing these tokens:

- Continuation tokens are tokens that we can reasonably imagine a
  programmer wanting to place after a type argument selector to
  *continue* the expression.  For example, `if (List<int> == T) ...`
  is allowed.

- Stop tokens are tokens that can't possibly follow a `>` that is a
  relational operator, because they *stop* the expression that's in
  progress.  For example, `var x = List<int>;` is allowed.

If a user wants to follow a <typeArguments> selector with a token
other than the ones above, they'll have to parenthesize the
expression.  So for example, if they want to do `List<int> + 1` (which
could be meaningful if an extension method defined `operator +` for
the type `Type`), they will have to use parentheses, and instead write
`(List<int>) + 1`.

Bug: dart-lang/language#1806
Change-Id: I2816cdac24e55eac3cb3e9920e276404c1228d46
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/210941
Commit-Queue: Paul Berry <paulberry@google.com>
Reviewed-by: Lasse R.H. Nielsen <lrn@google.com>
Reviewed-by: Konstantin Shcheglov <scheglov@google.com>
Reviewed-by: Johnni Winther <johnniwinther@google.com>
@eernstg
Member

eernstg commented Aug 27, 2021

Thanks!
