Proposed changes for deconstruction, declaration expressions, and discards #14794

gafter · 2016-10-28T18:18:49Z

This week's LDM meetings (2016-10-25 and 2016-10-26) proposed some possible changes to the handling of deconstruction, declaration expressions, and wildcards that, even if done later, would affect the shape of compiler APIs today. That suggests we would want to consider what to do today so that we are compatible with that possible future. This is a summary of the proposed changes and their API impact.

Wildcards

The LDM is proposing to change the character that we would use to represent a "wildcard" from * to _. We also considered some alternatives, but both * and ? have a logical (if not technical) ambiguity, because int? and int* are valid types. _ doesn't likely have the same kind of syntactic ambiguity, because it is already an identifier in all of the places where we want to support wildcards. But it may be a semantic ambiguity for that same reason. The reason we like _ is that users already introduce variables, for example parameters, named _ when their intention is to ignore them.

We want to support both the "short form" _ and the "long form" var _ just in case there is an ambiguity with, for example, a field in scope. We'll return to this later.

Declaration Expressions

We currently represent an out variable declaration using a "declaration expression" node in the syntax model. That was done because we were thinking that we may want to generalize declaration expressions in the future.

There is some slight discomfort with declaration expressions as they appear in the current API and proposed language spec, because while they are expressions in the syntax model, they are not expressions in the draft spec, and do not have types, and therefore there are special rules called out for them wherever they may appear in the specification. Besides the mismatch between the spec and model, we expect we may want to allow declaration expressions in more contexts in the future, in which case we will want them to be treated as expressions. In that case we will be better served by treating them as expressions (i.e. they can have a type) today.

Deconstruction

Possibly generalizing declaration expressions in the future makes us want to reconsider our syntax model for deconstruction today. For example, given the statement

(int x, int y) = e;

We can think of this as a special deconstruction declaration statement (as it is today), or alternatively we can think of it as a statement expressions containing an assignment expression with a tuple expression on the left-hand-side. In that case the tuple expression contains two declaration expressions. The latter formulation makes more sense in the possible future in which we generalize declaration expressions.

Similarly, given the statement

var (x, y) = e;

We can think of this as a special deconstruction declaration statement (as it is today), or alternatively we can think of it as a statement expressions containing an assignment expression with a declaration expression on the left-hand-side. The latter formulation makes more sense in the possible future in which we generalize declaration expressions.

This reformulation of deconstruction allows us to remove from the syntax model the new statement form for a deconstruction declaration. It also allows us to generalize what we allow in the future:

int x;
(x, int y) = e;

Here, the left-hand-side contains a mixture of already-existing variables (in this case x) and newly declared variables (int y). And it can be used in an expression context as well (e.g. as the body of an expression-bodied method).

Wildcards (revisited)

Given this new understanding of the direction of the syntax, there are four forms that wildcards can take. First, it can take the place of an identifier in a designator (i.e. in a declaration expression):

var (x, _) = e;
(int x, int _) = e;
M(out int _);

Since _ is already an identifier, no syntax model change is required. However, semantically we want this to create an anonymous variable, and shadow any true variable (e.g. parameter or field) from an enclosing scope named _. There is no name conflict error if wildcards are declared this way more than once in a scope.

Second, it can similarly be used to declare a pattern variable:

switch (o)
{
    case int _:
    case long _:
        Console.WriteLine("int or long");
        break;
}

Third, it can take the place of an identifier in a simple expression where an lvalue is expected and is used as a target in a deconstruction assignment or out parameter, but in that case its special behavior as a wildcard only occurs if looking up _ doesn't find a variable of that name

M(out _);
(x, _) = e;

This special name lookup is similar to the way we handle var.

Finally, it can be used where a parameter can be declared today. However, we relax the single-definition rule to allow multiple conflicting declarations (same scope or nested scopes), in which case the identifier _ binds as a wildcard.

Func<int, int, int> f = (_,_) => 3;

We have to be careful with these changes so that any program that uses _ as an identifier and is legal today continues to compile with the same meaning under these revised rules.

Syntax model changes

This allows us to simplify the handling of the for loop to handle deconstruction. Now the deconstruction is just one of the expressions in the expression list of the initializer part, and doesn't require its own placeholder in the syntax. That means that the syntax node for the for loop remains unchanged from the C# 6 version.

This requires a change to the way we handle the deconstruction form of the foreach loop. Because we want the left-hand-side to be capable of representing all of these forms

foreach ((int x, int y) in e) ...
foreach ((int x, _) in e) ...
foreach (var (x, _) in e) ...

we now need to use expression for the syntax node before the in keyword.

We can remove the syntax node for the deconstruction declaration statement, because that is just an assignment statement in this model.

Syntax.xml changes

The following changes are proposed compared to the current implementation in master. We remove

  <AbstractNode Name="VariableComponentSyntax" Base="CSharpSyntaxNode">
  </AbstractNode>
  <Node Name="TypedVariableComponentSyntax" Base="VariableComponentSyntax">
    <Kind Name="TypedVariableComponent"/>
    <Field Name="Type" Type="TypeSyntax"/>
    <Field Name="Designation" Type="VariableDesignationSyntax"/>
  </Node>
  <Node Name="ParenthesizedVariableComponentSyntax" Base="VariableComponentSyntax">
    <Kind Name="ParenthesizedVariableComponent"/>
    <Field Name="OpenParenToken" Type="SyntaxToken">
      <Kind Name="OpenParenToken"/>
    </Field>
    <Field Name="Variables" Type="SeparatedSyntaxList&lt;VariableComponentSyntax&gt;"/>
    <Field Name="CloseParenToken" Type="SyntaxToken">
      <Kind Name="CloseParenToken"/>
    </Field>
  </Node>
  <Node Name="DeconstructionDeclarationStatementSyntax" Base="StatementSyntax">
    <Kind Name="DeconstructionDeclarationStatement"/>
    <Field Name="Modifiers" Type="SyntaxList&lt;SyntaxToken&gt;"/>
    <Field Name="Assignment" Type="VariableComponentAssignmentSyntax"/>
    <Field Name="SemicolonToken" Type="SyntaxToken">
      <Kind Name="SemicolonToken"/>
    </Field>
  </Node>
  <Node Name="VariableComponentAssignmentSyntax" Base="CSharpSyntaxNode">
    <Kind Name="VariableComponentAssignment"/>
    <Field Name="VariableComponent" Type="VariableComponentSyntax"/>
    <Field Name="EqualsToken" Type="SyntaxToken">
      <Kind Name="EqualsToken"/>
    </Field>
    <Field Name="Value" Type="ExpressionSyntax"/>
  </Node>

and we remove the Deconstruction field from the ForStatementSyntax

We change the VariableComponent field of ForEachComponentStatementSyntax to be an ExpressionSyntax, and probably change the name of ForEachComponentStatementSyntax.

And we change

  <Node Name="DeclarationExpressionSyntax" Base="ExpressionSyntax">
    <Kind Name="DeclarationExpression"/>
    <Field Name="VariableComponent" Type="VariableComponentSyntax">
    </Field>
  </Node>

to

  <Node Name="DeclarationExpressionSyntax" Base="ExpressionSyntax">
    <Kind Name="DeclarationExpression"/>
    <Field Name="Type" Type="TypeSyntax"/>
    <Field Name="Designation" Type="VariableDesignationSyntax"/>
  </Node>

We leave unchanged

  <AbstractNode Name="VariableDesignationSyntax" Base="CSharpSyntaxNode">
  </AbstractNode>
  <Node Name="SingleVariableDesignationSyntax" Base="VariableDesignationSyntax">
    <Kind Name="SingleVariableDesignation"/>
    <Field Name="Identifier" Type="SyntaxToken">
      <Kind Name="IdentifierToken"/>
    </Field>
  </Node>
  <Node Name="ParenthesizedVariableDesignationSyntax" Base="VariableDesignationSyntax">
    <Kind Name="ParenthesizedVariableDesignation"/>
    <Field Name="OpenParenToken" Type="SyntaxToken">
      <Kind Name="OpenParenToken"/>
    </Field>
    <Field Name="Variables" Type="SeparatedSyntaxList&lt;VariableDesignationSyntax&gt;"/>
    <Field Name="CloseParenToken" Type="SyntaxToken">
      <Kind Name="CloseParenToken"/>
    </Field>
  </Node>

SemanticModel changes

We may want to change the behavior of GetTypeInfo on a declaration expression, depending on how the shape of the specification evolves.

We probably need to consider what the behavior of SemanticModel APIs should be on wildcards.

Summary

The changes to declaration expressions and deconstruction should be done today so that we don't have an incompatible change later.

Wildcards are an interesting problem. Even if we don't want to implement them for C# 7, we want to wall off the semantic space so that valid C# 7 programs don't change meaning or become invalid in a later language version. I suspect the simplest way to do that is to implement wildcards today.

/cc @dotnet/ldm @dotnet/roslyn-compiler

The text was updated successfully, but these errors were encountered:

gafter · 2016-10-28T21:16:42Z

There is an interesting syntactic ambiguity arising from declaration expressions. Consider the code

(A < B > C, X) = e;

Here the "tuple expression" on the left has a declaration for a variable C of type A<B>. However, the same "tuple expression" elsewhere

e = (A < B > C, X);

has a first subexpression equivalent to (A < B) > C. Writing the specification (and implementing the parser) to distinguish these two contexts will be fun.

gafter · 2016-10-28T21:21:09Z

Similarly, consider the code

(A < B, D > C, X) = e;

Here the "tuple expression" on the left has a declaration for a variable C of type A<B,D>. However, the same "tuple expression" elsewhere

e = (A < B, D > C, X);

has a first subexpression equivalent to (A < B) and a second equivalent to (D > C).

gafter · 2016-10-28T22:22:20Z

Another advantage of adding support for wildcards today rather than later is that we can warn for unused pattern and out variables.

HaloFour · 2016-10-28T23:11:55Z

While I understand the ambiguity issues of * it still seems like a better choice than _. I can just imagine the massive pit of failure that will be people accidentally deconstructing into legally declared and used variables.

CyrusNajmabadi · 2016-10-28T23:17:02Z

@gafter Can you show hte changes you intend to make to the foreach-syntax model?

Thanks!

CyrusNajmabadi · 2016-10-28T23:18:13Z

Actually, n/m. I can see you describe it as:

We change the VariableComponent field of ForEachComponentStatementSyntax to be an ExpressionSyntax, and probably change the name of ForEachComponentStatementSyntax.

alrz · 2016-10-28T23:20:26Z

@HaloFour You can't accidentally do anything. because basically any mistake is a compile-time error. And more to the point of this thread, I can understand how it's easier to represent in AST because it would be merely a semantical change. In fact, just because it's an identifier, it allows you to use T _ to represent type-check patterns or _ to ignore lamba args without introducing any new syntax and further ambiguities.

alrz · 2016-10-28T23:31:25Z

@gafter Regarding declaration expression ambiguities, I think it's better to require them to always be initialized. Any use case involving an uninitialized declaration expression can use a pattern instead.

gafter · 2016-10-29T00:04:22Z

@gafter Regarding declaration expression ambiguities, I think it's better to require them to always be initialized. Any use case involving an uninitialized declaration expression can use a pattern instead.

That was our intent. Declaration expressions can appear in an out argument, or in a tuple expression that is in a deconstruction context such as an out argument or the left-hand-side of an assignment expression or the target of a foreach loop.

HaloFour · 2016-10-29T00:09:38Z

@alrz

Sure you can. It would be impossible for the compiler to tell if you intend to deconstruct into a local variable or field in scope that happens to be named _ or if you intend to use a wildcard:

int _ = 123;
var pt = new Point(0, 456);
int y = 789;
// later
(_, y) = pt; // oops, what did I really intend to do here?

Is this necessarily a common scenario? Probably not. But as a legal identifier it could never be ruled out.

vbcodec · 2016-10-29T00:19:12Z

@HaloFour

for pt as 2D point
(_, y) = pt;
change _ for 0, but for 3D point, variable _ will remain unchanged

HaloFour · 2016-10-29T00:34:29Z

@vbcodec

Considering that parameters of deconstructor methods are always out it doesn't matter what method is resolved the _ variable would always be overwritten. I imagine that Point3D won't have a two parameter deconstructor, though.

vbcodec · 2016-10-29T00:45:24Z

@HaloFour
I meant that Point3D has only one method with three outs. If it has also second method with two outs, then compiler probably will pick that method, and variable _ will be changed. But method with two outs (for Point3D), will lead to messy results.

HaloFour · 2016-10-29T00:47:43Z

@vbcodec

I'm not entirely sure what that has to do with this discussion. Either the compiler will find a suitable deconstructor method and _ will be overwritten, or the compiler will not find a suitable deconstructor method and that will result in a compiler error.

vbcodec · 2016-10-29T00:51:32Z

@HaloFour
Wildcards allow to use methods with higher number outs than target need (two in your example). So without support for wildcards compiler generate error, but with support for wildcards, there won't be error if last (or first) parameter will be _.

HaloFour · 2016-10-29T00:57:44Z

@vbcodec

Wildcards allow to use methods with higher number outs than targed need

I'm quite sure that this is not the case. If you wanted to deconstruct Point3D and it only offered a 3 parameter deconstructor you'd be required to specify a combination of 3 legal identifiers and wildcards combined:

(x, _) = pt; // illegal
(x, _, _) = pt; // legal
(_, y, _) = pt; // legal
(_, _, z) = pt; // legal

vbcodec · 2016-10-29T01:11:18Z

@HaloFour
Ok, I got it,, concluded my version from https://blogs.msdn.microsoft.com/dotnet/2016/08/24/whats-new-in-csharp-7-0/
If it do not allow to skip unwanted data (in term of quantity, not names), then there is low benefit.

alrz · 2016-10-29T06:30:06Z

@gafter I'm saying that in all those places, instead of a declaration expression we use a pattern so that the following would be possible,

F(out {X:var x});
foreach(let {Result:var result} in tasks)

Or something like that. Perhaps, only complete patterns would be allowed in these contexts.

@HaloFour This is also applicable to var. That's a legal identifier and it could never be ruled out, right? But you'd never use a lowercase identifier for class name — ok, you shouldn't but you can, and when you do you should be aware of the consequences. Your example uses a new feature (deconstruction) so we'd never have to deal with such ambiguity. In all other existing code, backward compatibility is a must and as mentioned in OP the code using _ would never stop compiling. In fact, this is just another reason to implement wildcards now, otherwise we could never use _ for wildcards.

orthoxerox · 2016-10-29T09:20:37Z

@HaloFour The easiest way out is to forbid (or warn against) tuple deconstruction, out var and pattern matching into variables named _ now in v7, before wildcards are implemented. Then only out _ remains syntactically valid and the vNext compiler can emit a warning if variable _ is shadowed by a wildcard of a different type.

HaloFour · 2016-10-29T13:29:22Z

@alrz

The difference would be much more subtle. The declaration of var as a valid type only prevents use of var as a contextual keyword, which at worst would lead to a compiler error when trying to use them both together. That's not the case here. Accidental use of both together could easily lead to perfectly legal code that results in unexpected overwriting of existing values in variables. Types also have much simpler scopes than variables. Because _ is used as both a wildcard and as a property selector shorthand relatively commonly today there are more scenarios where this will collide with wildcards.

@orthoxerox

That's a pretty reasonable suggestion but how much it limit where wildcards could be used within the language? For example there are proposals to allow wildcards to ignore lambda arguments as well as to declare ignored variables. In both of those cases it's already perfectly legal today to use _ as the identifier name.

@gafter @MadsTorgersen

What I don't get is where the ambiguity lies with *. Yes, there is the potential collision with pointer types, but you can't use pointer types in type switch (is int* has always been illegal syntax). And in out var the lack of a following identifier would immediately disambiguate it.

M(out int *) // wildcard out declaration
M(out int * x) // pointer out declaration

if (o is int *) // illegal in C# 1.0 - 6.0, wildcard in C# 7.0.

Is it because that last case would make it seem like is supported pointers?

AdamSpeight2008 · 2016-11-09T17:17:21Z

Another alternative is ... (triple dot) .

var (x, ...) = e;
(int x, int ...) = e;
M(out int ...);
switch (o)
{
    case int  ...:
    case long ...:
        Console.WriteLine("int or long");
        break;
}

gafter · 2016-11-17T00:21:16Z

This work is being placed into the branch https://github.com/dotnet/roslyn/tree/features/wildcard during development.

aluanhaddad · 2016-11-17T14:19:40Z

Actually, in your example the _ field wouldn't be affected because you're using the "long form" of wildcards where you're declaring a new scope. In that case the compiler can unambiguously determine that _ is supposed to be a wildcard. I do believe this to be the prototypical case of using wildcards with deconstruction. The ambiguities only arise from the "short form" where the syntax might allow deconstruction or assignment into an existing identifier called _ if it happens to exist.

@HaloFour Thank you for explaining this.

ErikSchierboom · 2016-11-25T16:08:36Z

@gafter The link to the wildcard branch is incorrect. It should be: https://github.com/dotnet/roslyn/tree/features/wildcard

…ds. (#15548) * Combine deconstruction assignment and declaration, and support discards. - Combine deconstruction assignment and declaration, and support discards. - Add wildcards.work.md to track outstanding work. - Bind each type syntax once in a deconstruction. - Because tuples may contain declarations, adjust lambda disambiguation and adjust parsing of argument lists. - Diagnose tuple element names on the left of a deconstruction. - Add relational operators to disambiguating tokens in 7.5.4.2 * Disallow deconstruction declarations except at the statement level. This is now a semantic restriction (until we decide to remove it). * Revise logic to detect `var` in a declaration expression. Plus other changes per code review. * Add a test that GetTypeInfo on a discard expression doesn't crash. * Small changes per code review. * Add (skipped) test for var invocation in parens. * Rename "Discarded" to "Discard" * Changes recommended via code review. * Minor changes to the handling of declaration expressions per code review. * Addressing blocking feedback while Neal is OOF Fixes #14794 Fixes #14832

gafter · 2017-01-14T21:13:16Z

This work has been completed; closing.

dsaf · 2017-03-10T09:21:39Z

scala/scala3#2041

gafter · 2017-03-27T21:51:23Z

Issue moved to dotnet/csharplang #365 via ZenHub

gafter added Language-C# Design Notes Area-Language Design Feature Specification New Language Feature - Pattern Matching Pattern Matching New Language Feature - Tuples Tuples New Language Feature - Out Variable Declaration Out Variable Declaration labels Oct 28, 2016

gafter added this to the 2.0 (RC.2) milestone Oct 28, 2016

gafter assigned gafter and MadsTorgersen Oct 28, 2016

gulshan mentioned this issue Nov 10, 2016

Proposal: Allow using discard for throwaway variables #8074

Closed

jaredpar changed the title ~~Proposed changes for deconstruction, declaration expressions, and wildcards~~ Proposed changes for deconstruction, declaration expressions, and discards Nov 16, 2016

jcouv mentioned this issue Nov 16, 2016

Support discards in deconstructions, out var and patterns #12619

Closed

jcouv self-assigned this Nov 16, 2016

jcouv modified the milestones: 2.0 (RC.3), 2.0 (RC.2) Nov 22, 2016

DavidArno mentioned this issue Dec 2, 2016

Add "any" option to tuple matching DavidArno/SuccincT#8

Closed

gafter modified the milestones: 2.0 (RTM), 2.0 (RC.3) Dec 2, 2016

jcouv mentioned this issue Dec 12, 2016

Combine deconstruction assignment and declaration, and support discards #15826

Closed

gafter mentioned this issue Dec 12, 2016

Combine deconstruction assignment and declaration, and support discards. #15548

Merged

jcouv mentioned this issue Dec 12, 2016

Combine deconstruction assignment and declaration, and support discards (merging to master) #15842

Merged

jcouv closed this as completed in #15842 Dec 13, 2016

jcouv reopened this Dec 16, 2016

jcouv removed their assignment Dec 18, 2016

bondsbw mentioned this issue Jan 5, 2017

Permit discards in lambda parameters #16255

Closed

gafter closed this as completed Jan 14, 2017

HaloFour mentioned this issue Oct 31, 2017

Champion "deprecate _ (underbar) as an identifier" dotnet/csharplang#1064

Closed

5 tasks

HaloFour mentioned this issue May 24, 2018

Why doesn't use an invalid name in discards？ #27101

Closed

HaloFour mentioned this issue Feb 24, 2020

Question: Use of discard in deconstruction syntax dotnet/csharplang#3229

Closed

This was referenced Apr 30, 2021

Name Lookup for Discards (C# 7.0) dotnet/csharpstandard#284

Closed

Proposed changes for deconstruction, declaration expressions, and discards (C# 7.0) dotnet/csharpstandard#285

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposed changes for deconstruction, declaration expressions, and discards #14794

Proposed changes for deconstruction, declaration expressions, and discards #14794

gafter commented Oct 28, 2016 •

edited

Loading

gafter commented Oct 28, 2016

gafter commented Oct 28, 2016

gafter commented Oct 28, 2016

HaloFour commented Oct 28, 2016

CyrusNajmabadi commented Oct 28, 2016

CyrusNajmabadi commented Oct 28, 2016

alrz commented Oct 28, 2016 •

edited

Loading

alrz commented Oct 28, 2016

gafter commented Oct 29, 2016 •

edited

Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 •

edited

Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 •

edited

Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 •

edited

Loading

alrz commented Oct 29, 2016 •

edited

Loading

orthoxerox commented Oct 29, 2016

HaloFour commented Oct 29, 2016

AdamSpeight2008 commented Nov 9, 2016

gafter commented Nov 17, 2016 •

edited

Loading

aluanhaddad commented Nov 17, 2016

ErikSchierboom commented Nov 25, 2016

gafter commented Jan 14, 2017

dsaf commented Mar 10, 2017

gafter commented Mar 27, 2017

Proposed changes for deconstruction, declaration expressions, and discards #14794

Proposed changes for deconstruction, declaration expressions, and discards #14794

Comments

gafter commented Oct 28, 2016 • edited Loading

Wildcards

Declaration Expressions

Deconstruction

Wildcards (revisited)

Syntax model changes

Syntax.xml changes

SemanticModel changes

Summary

gafter commented Oct 28, 2016

gafter commented Oct 28, 2016

gafter commented Oct 28, 2016

HaloFour commented Oct 28, 2016

CyrusNajmabadi commented Oct 28, 2016

CyrusNajmabadi commented Oct 28, 2016

alrz commented Oct 28, 2016 • edited Loading

alrz commented Oct 28, 2016

gafter commented Oct 29, 2016 • edited Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 • edited Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 • edited Loading

HaloFour commented Oct 29, 2016

vbcodec commented Oct 29, 2016 • edited Loading

alrz commented Oct 29, 2016 • edited Loading

orthoxerox commented Oct 29, 2016

HaloFour commented Oct 29, 2016

AdamSpeight2008 commented Nov 9, 2016

gafter commented Nov 17, 2016 • edited Loading

aluanhaddad commented Nov 17, 2016

ErikSchierboom commented Nov 25, 2016

gafter commented Jan 14, 2017

dsaf commented Mar 10, 2017

gafter commented Mar 27, 2017

gafter commented Oct 28, 2016 •

edited

Loading

alrz commented Oct 28, 2016 •

edited

Loading

gafter commented Oct 29, 2016 •

edited

Loading

vbcodec commented Oct 29, 2016 •

edited

Loading

vbcodec commented Oct 29, 2016 •

edited

Loading

vbcodec commented Oct 29, 2016 •

edited

Loading

alrz commented Oct 29, 2016 •

edited

Loading

gafter commented Nov 17, 2016 •

edited

Loading