Open LDM Issues in Pattern-Matching #1054

Closed
gafter opened this issue Oct 28, 2017 · 312 comments

@gafter
Member

gafter commented Oct 28, 2017

Here are my top open issues for pattern matching after C# 7.

Open LDM Issues in Pattern-Matching

Recursive pattern forms

The following is the working grammar for recursive patterns. It requires LDM review.

pattern
	: declaration_pattern
	| constant_pattern
	| deconstruction_pattern
	| property_pattern
	;
declaration_pattern
	: type identifier
	;
constant_pattern
	: expression
	;
deconstruction_pattern
	: type? '(' subpatterns? ')' property_subpattern? simple_designation?
	;
subpatterns
	: subpattern
	| subpattern ',' subpatterns
	;
subpattern
	: pattern
	| identifier ':' pattern
	;
property_subpattern
	: '{' subpatterns? '}'
	;
property_pattern
	: type? property_subpattern simple_designation?
	;

simple_designation
	: single_variable_designation
	| discard_designation
	;

It is a semantic error if any subpattern of a property_pattern does not contain an identifier (it must be of the second form, which has an identifier).

Note that a null-checking pattern falls out of a trivial property pattern. To check if the string s is non-null, you can write any of the following forms:

if (s is object o) ... // o is of type object
if (s is string x) ... // x is of type string
if (s is {} x) ... // x is of type string
if (s is {}) ...
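
As a rough sketch of the four forms above, assuming the property-pattern syntax in the form that eventually shipped in C# 8 (the class and method names here are hypothetical, chosen only for illustration):

```csharp
using System;

// Hypothetical helper class; each method returns true only when s is non-null.
static class NullCheckSketch
{
    // Type pattern against object: o is of type object.
    public static bool ViaObject(string s) => s is object o;

    // Type pattern against string: x is of type string.
    public static bool ViaString(string s) => s is string x;

    // Empty property pattern without a designation.
    public static bool ViaEmptyPropertyPattern(string s) => s is {};

    // Empty property pattern with a designation: x is of type string,
    // so its members can be used once the match has succeeded.
    public static bool ViaEmptyPropertyPatternNamed(string s) => s is {} x && x.Length >= 0;
}
```

Each of these should return false when passed null and true for any non-null string, including the empty string.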

I suspect that the LDM will prefer is instead of : in the property_subpattern.

Resolution 2017-11-20 LDM: we want to forbid a deconstruction pattern with a single subpattern but an omitted type. We may consider ways to relax this later. The identifiers in the syntax above should be designations.

Matching via ITuple

We need to specify/decide under what conditions the compiler will attempt to match a positional pattern using ITuple.

Resolution 2017-11-20 LDM: permit matching via ITuple only for object, ITuple, and types that are declared to implement ITuple but contain no Deconstruct methods.
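
A minimal sketch of what this resolution permits, assuming the positional pattern syntax as it later shipped in C# 8 and a runtime that provides System.Runtime.CompilerServices.ITuple (the class and method names are hypothetical):

```csharp
using System;

// Hypothetical helper class for illustration only.
static class ITupleSketch
{
    // The input type is object, which has no Deconstruct method, so under the
    // resolution above the positional pattern is matched dynamically via ITuple.
    public static bool IsThreeFour(object o) => o is (3, 4);
}
```

Under these assumptions, `ITupleSketch.IsThreeFour((3, 4))` should be true because the boxed tuple implements ITuple, while a string or a three-element tuple should not match.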

Parenthesized expression vs tuple pattern

There is an ambiguity between a constant pattern expressed using a parenthesized expression, and a “positional” (Deconstruct) recursive pattern with one element:

case (2):

Here are the two possible meanings:

switch (2)
{
    case (2): // a parenthesized expression

and

class C { public void Deconstruct(out int x) => throw null; }

switch (new C())
{
    case (2): // Call Deconstruct once.

While it is possible for the programmer to disambiguate (e.g. case +(2): or case (2) {}), we still need a policy for the compiler to disambiguate when the programmer has not done so. The proposed policy is that this will be parsed as a parenthesized expression. However, if the type of the parenthesized expression is not suitable given the switch expression, semantic analysis will notice the set of parentheses and fall back to trying Deconstruct. That will allow both of the above examples to “work” as written.

It does this recursively, so this will work too:

class C { public void Deconstruct(out D d) => throw null; }
class D { public void Deconstruct(out int x) => throw null; }

switch (new C())
{
    case ((2)): // Call Deconstruct twice.

A semantic ambiguity arises between these two interpretations when the switched type is, for example, object (we deconstruct using ITuple). Another way the ambiguity could arise would be if someone were to write an (extension) Deconstruct method for one of the built-in types. In either case we will just treat it as a parenthesized expression (i.e. the simplest interpretation that is semantically meaningful), and require explicit disambiguation if another interpretation is intended.

Resolution 2017-11-20 LDM: disallow a deconstruction pattern that contains a single subpattern but for which the type is omitted. We'll look at other ways of disambiguating later, such as perhaps permitting var to infer the type, or a trailing comma. This also keeps the design space open for using parens for grouping patterns in the future, e.g. if we introduce or and and patterns.

Cast expression ambiguity

There is a syntactic ambiguity between a cast expression and a positional pattern. The switch case

case (A)B:

could be either a cast (i.e. existing code, if A is the name of an enum type, Int32 or a using-alias)

case (Int32)MyConstantValue:

or a single-element deconstruction pattern with the constant subpattern A being named B.

case (MyConstantValue) x:

When such an ambiguity occurs, the proposal is to parse it as a cast expression (for compatibility with existing code). To disambiguate, you can add an empty property pattern part:

case (MyConstantValue) {} x:

Resolution 2017-11-20 LDM: this issue is moot given the resolution to the previous issue. It is a cast if it can be a cast, otherwise it is an error.

Short discard

I think that in recursive patterns the syntax for a discard var _ will be a bit too verbose. I'd rather see _ used as an alternative, e.g. o is Point(3, _). But that conflicts with its use as a constant pattern. That also gets mixed up with the parenthesized expression ambiguity described above. My proposal would be to say that _ in a pattern is always a discard, but with an error reported if there is a constant named _ in scope. That error prevents a silent change to existing code, where x is _ may treat the _ as a constant.

This was part of the motivation for #1064, but that may be going too far.

Resolution 2017-11-20 LDM: we want to permit the short discard. We don't want to break existing code that tests for a type, e.g. expr is _ where there is a type named _. So that will continue to work. However, we do not want to permit a short discard at the top level in an is-pattern expression. We also want to forbid matching a named constant named _. The latter may be a breaking change, but only from a recently permitted coding pattern, so it is probably a break we can take.
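
To illustrate the short discard nested inside a recursive pattern (the o is Point(3, _) case mentioned above), here is a small sketch assuming the C# 8 syntax as shipped. The Point class and the DiscardSketch helper are hypothetical and exist only for this example:

```csharp
using System;

// Hypothetical type with a two-element Deconstruct method.
class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public void Deconstruct(out int x, out int y) { x = X; y = Y; }
}

static class DiscardSketch
{
    // The _ is nested inside the positional pattern, not at the top level
    // of the is-pattern expression, so it acts as a discard for Y.
    public static bool HasXOfThree(object o) => o is Point(3, _);
}
```

Here `HasXOfThree(new Point(3, 99))` should be true and `HasXOfThree(new Point(4, 3))` should be false, since only the first element is constrained.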

Match expression

We need to define the syntax and semantics of the match expression.

There are many syntax options listed below. Here is one proposal to get the discussion started:

    state = (state, action) switch {
        (DoorState.Closed, Action.Open) => DoorState.Opened,
        (DoorState.Opened, Action.Close) => DoorState.Closed,
        (DoorState.Closed, Action.Lock) => DoorState.Locked,
        (DoorState.Locked, Action.Unlock) => DoorState.Closed,
        _ => state};

This is based on the following grammar:

switch_expression
    : null_coalescing_expression 'switch' '{' switch_expression_case_list '}'
    ;
switch_expression_case_list
    : switch_expression_case
    | switch_expression_case_list ',' switch_expression_case
    ;
switch_expression_case
    : pattern when_clause? '=>' expression
    ;

switch_expression is added at roughly the same precedence as a conditional expression, but is left-associative:

non_assignment_expression
    : conditional_expression
    | lambda_expression
    | query_expression
    | switch_expression
    ;

The main semantic question is what should happen when the set of patterns is incomplete: should the compiler silently accept the code (or perhaps with a warning) and allow it to throw a runtime exception when the input is unmatched, or should it be a compile-time error?

Discussion 2018-04-04 LDM: Syntax OK for now. We will probably revisit. Warning and exception OK.
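
The door-state proposal above can be sketched as a complete method, assuming the switch expression syntax as it later shipped in C# 8. The enum definitions are assumptions (the original does not show them), and DoorAction is a hypothetical stand-in for the Action type in the example, renamed to avoid confusion with System.Action:

```csharp
using System;

// Hypothetical enums; the issue text does not define them.
enum DoorState { Opened, Closed, Locked }
enum DoorAction { Open, Close, Lock, Unlock }

static class DoorSketch
{
    // Computes the next state from the current state and the requested action.
    public static DoorState Next(DoorState state, DoorAction action) =>
        (state, action) switch
        {
            (DoorState.Closed, DoorAction.Open) => DoorState.Opened,
            (DoorState.Opened, DoorAction.Close) => DoorState.Closed,
            (DoorState.Closed, DoorAction.Lock) => DoorState.Locked,
            (DoorState.Locked, DoorAction.Unlock) => DoorState.Closed,
            // The discard arm makes the set of patterns complete:
            // any other combination leaves the state unchanged.
            _ => state
        };
}
```

Without the final discard arm, the set of patterns would be incomplete, which is the situation the warning-and-exception decision above is about.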

Scope of expression variables in initializers

Resolved. See #32 for details.

Order of evaluation in pattern-matching

Giving the compiler flexibility to reorder the operations executed during pattern-matching can be used to improve the efficiency of pattern-matching. The (unenforced) requirement would be that properties accessed in a pattern, and the Deconstruct methods, are "pure" (side-effect free, idempotent, etc). That doesn't mean that we would add purity as a language concept, only that we would allow the compiler flexibility in reordering operations.

Resolution 2018-04-04 LDM: confirmed: the compiler is permitted to reorder calls to Deconstruct, property accesses, and invocations of methods in ITuple, and may assume that returned values are the same from multiple calls. The compiler should not invoke functions that cannot affect the result, and we will be very careful before making any changes to the compiler-generated order of evaluation in the future.

default pattern

Once we start having a match expression that doesn't use the keyword case, a constant pattern that uses the simple literal default will look like a default branch of the switch expression. To avoid any confusion, I would like to disallow a simple expression default as a constant pattern. We could leave it as a warning, as today, or make it an error in a switch statement. Note that you can always write either null, 0, or '\0' instead of default.

Resolution: See dotnet/roslyn#23499 which documents our tentative decision, taken via email, to forbid default as a pattern.

Range Pattern

If we have a range operator 1..10, would we similarly have a range pattern? How would it work?

    if (ch is in 'a' to 'z')
    switch (ch) {
        case in 'a' to 'z':

Discussion 2018-04-04 LDM: Keep this on the back burner. It is not clear how important it is. We would want to discuss this in the context of also extending other constructs:

from i in 1 to 100 select ...

foreach (var i in 0 to s.Length-1) ...

var deconstruct pattern

It would be nice if there were a way to pattern-match a tuple (or Deconstructable) into a set of variables declared only by their designator, e.g. the last line in this match expression

    var newState = (GetState(), action, hasKey) switch {
        (DoorState.Closed, Action.Open, _) => DoorState.Opened,
        (DoorState.Opened, Action.Close, _) => DoorState.Closed,
        (DoorState.Closed, Action.Lock, true) => DoorState.Locked,
        (DoorState.Locked, Action.Unlock, true) => DoorState.Closed,
        var (state, _, _) => state };

(Perhaps not the best example since it only declares one thing on the last line)

This would be based on some grammar like this

var_pattern
    : 'var' variable_designation
    ;

where the variable_designation could be a parenthesized_variable_designation, i.e. generalizing the current construct.

To make this syntactically unambiguous, we would no longer allow var to bind to a user-declared type in a pattern. Forbidding it from binding to a constant would also simplify things, but probably isn't strictly necessary.

I’m suggesting this now because at a recent LDM it was suggested that var could perhaps be used as a placeholder for an unknown type taken from context. But that would conflict with this usage because this usage changes the syntax of what is permitted between the parens (designators vs patterns).

Resolution 2018-04-04 LDM: YES! Approved.
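
A small sketch of the approved var deconstruct pattern, assuming the C# 8 syntax as shipped. The class and method names are hypothetical, and the example is chosen so that the var pattern declares more than one variable:

```csharp
using System;

// Hypothetical helper class for illustration only.
static class VarPatternSketch
{
    public static string Describe((int X, int Y) point) => point switch
    {
        // Ordinary positional pattern with constant subpatterns.
        (0, 0) => "origin",
        // var pattern: x and y are declared only by their designators.
        var (x, y) when x == y => $"diagonal at {x}",
        var (x, y) => $"({x}, {y})"
    };
}
```

Note that between the parens of a var pattern only designators are permitted, not nested patterns, which is the syntactic difference from a positional pattern that the text above describes.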

ref/lvalue-producing pattern switch expression

As currently designed, the “switch expression” yields an rvalue.

    e switch { p1 when c1 => v1, p2 when c2 => v2 }

@agocke pointed out that it might be valuable for there to be a variant that produces a ref or an lvalue.

  1. Should we pursue this?
  2. What would the syntax be?
    e switch { p1 when c1 => ref v1, p2 when c2 => ref v2 }
  3. Would the same syntax be used to produce a ref and an lvalue?

Discussion 2018-04-04 LDM: Let's keep this on the back burner and see if there are requests based on actual use cases.

switching on a tuple literal

In order to switch on a tuple literal, you have to write what appear to be redundant parens

switch ((a, b))
{

It has been proposed that we permit

switch (a, b)
{

Resolution 2018-04-04 LDM: Yes. We discussed a couple of ways of doing this, and settled on making the parens of the switch statement optional when the expression being switched on is a tuple literal.
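
As an illustration of this resolution, the following sketch switches directly on a tuple literal without the extra parens, assuming the C# 8 behavior as shipped (the class and method names are hypothetical):

```csharp
using System;

// Hypothetical helper class for illustration only.
static class TupleSwitchSketch
{
    public static string Classify(int a, int b)
    {
        // Equivalent to switch ((a, b)): the statement's own parens
        // double as the parens of the tuple literal.
        switch (a, b)
        {
            case (0, 0):
                return "both zero";
            case (0, _):
            case (_, 0):
                return "one zero";
            default:
                return "neither zero";
        }
    }
}
```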

Short discard diagnostics

Regarding the previous short discard diagnostics approved by the LDM (see above), there are two possibly breaking situations.

case _ where there is a named constant _ is now an error.

e is _ is proposed to be a warning: The name '_' refers to the type '{0}', not the discard pattern. Use '@_' for the type, or 'var _' to discard.

Nullable Reference Types vs switch analysis

We give a warning when a switch expression does not exhaustively handle all possible inputs. What if the input is a non-nullable string s, in the code

s switch { string t => t }

Is a warning deserved because the switch expression does not handle all possible inputs (it doesn't handle a possible null)? Note that although we think s cannot be null, we still generate code for a null check and throw an exception when the null input is not handled.

If we decide not to generate a warning here, we may have to carefully structure the phases of the compiler so we do this exhaustiveness analysis after the nullable pass of the compiler, which will now have to record its conclusions in a rewritten bound tree.

Similarly, in the code

int i;
switch (s)
{
    case string t: i = 0; break;
}
Console.WriteLine(i); // is i definitely assigned?

Is the last line an error because i is not definitely assigned?

Resolution 2018-10-10: The diagnostics should take the nullable analysis into account. In practice that means that the initial binding pass will assume that reference nulls can't appear in pattern inputs, and the nullable analysis will only warn for branches of the decision tree containing matches for reference nulls. We should make sure to test for situations involving unconstrained generics.

Use type unification to rule out some pattern-matching situations?

See dotnet/roslyn#26100. The possible new warnings for the is-type operator correspond to possible errors for the is-pattern operator. We could use type unification subroutines, which are already in the compiler, to detect some situations and make them errors. Should we? Sooner is better.

Permit optionally omitting the pattern on the last branch of the switch expression

We wonder if it would be helpful to sometimes omit the pattern on the last branch of a switch expression, in cases where the pattern would be _.

Resolution 2018-10-10: No.

Should exhaustiveness affect definite assignment

In a non-pattern switch statement, we do not currently (in any previous C# version) do "exhaustiveness" analysis:

    int M(bool b)
    {
        int result;
        switch (b)
        {
            case false:
                result = 0;
                break;
            case true:
                result = 1;
                break;
        }
        return result; // error: not definitely assigned
    }

But in a pattern-based switch statement we do:

    int M(bool b)
    {
        int result;
        switch (b)
        {
            case false when true:
                result = 0;
                break;
            case true:
                result = 1;
                break;
        }
        return result; // ok: definitely assigned
    }

We should decide and confirm the intended behavior for both cases. This also includes reachability of statements. The current prototype does exhaustiveness analysis for switch statements based on a previous informal decision. We should confirm that in the LDM.

Resolution 2018-10-10: The current behavior is confirmed as intended.

Switch expression as a statement expression

It has been requested that we permit a switch expression to be used as a statement expression. In order to make this work, we'd

  1. Permit its type to be void.
  2. Permit a switch expression to be used as a statement expression if all of the switch arm expressions are statement expressions.
  3. Adjust type inference so that void is more specific than any other type, so it can be inferred as the result type.

Resolution 2018-10-10: Yes! This requires a more precise spec, and it needs to handle a?.M() where M() returns an unconstrained T.

Single-element positional deconstruction

The current LDM decision on single-element positional deconstruction pattern is that the type is required.

This is particularly inconvenient for tuple types and other generic types, and perhaps impossible for values deconstructed by a dynamic check via ITuple.

We should consider if other disambiguation should be permitted (e.g. the presence of a designation).

if (o is (3) _)

Resolution 2018-10-10: Discussed but not resolved.

Disambiguating deconstruction using parameter names

We could permit more than one Deconstruct method if we permit disambiguating by parameter names.

class Point
{
    public void Deconstruct(out double Angle, out double Length) => ...;
    public void Deconstruct(out int X, out int Y) => ...;
}

Point p = ...;
if (p is (X: 3, Y: 4)) ...;

Should we do this?

var pattern for 0 and 1 elements

The var pattern currently requires 2 or more elements because it inherits the grammar for variable_designation. However, both 0-element and 1-element var patterns would make sense and are syntactically unambiguous.

public class C {
    public void Deconstruct() => throw null;
    public void Deconstruct(out int i) => throw null;
    public void Deconstruct(out int i, out int j) => throw null;
    void M() {
        if (this is var ()) { }          // error
        if (this is var (x1)) { }        // error
        if (this is var (x2, y2)) { }    // ok
    }
}

I propose we relax the grammar to permit 0-element and 1-element var patterns.

We could do the same for deconstruction.

Syntax model for var patterns

In C# 7, we treat both Type identifier and var identifier as a DeclarationPatternSyntax, even though they have quite different semantics.

In C# 8, we introduce a more general var pattern that accepts any designation on the right-hand-side.

We would prefer to use the new node for var identifier even though those were represented by a declaration pattern previously. Clients of the syntax API would see a change (to using a new node) when parsing existing code.

Is this an acceptable change to make?

Kind of member in a property pattern

A property pattern { PropName : Pattern } is used to check if the value of a property or field matches the given pattern.

Besides readable properties and fields, what other kinds of members should be permitted here?

  • An indexed property? Possibly indexed with a constant?
  • An event reference?
  • Anything else?

Restricted types

A recent bug (dotnet/roslyn#27803) revealed how pattern matching interacts with restricted types in interesting ways (when doing type checks).
We should discuss with LDM whether this should be banned (like pointer types), and if there are actual use cases that need it.

Matching using ITuple

I assume that we only intend to permit matching using ITuple when the type is omitted?

IComparable x = ...;
if (x is ITuple(3, 4)) // (1) permitted?
if (x is object(3, 4)) // (2) permitted?
if (x is SomeTypeThatImplementsITuple(3, 4)) // (3) permitted?

Matching using ITuple in the presence of an extension Deconstruct

When an extension Deconstruct method is present and a type also implements the ITuple interface, it is not clear which should take priority. I believe the current LDM position (probably not intentional) is that the extension method is used for the deconstruction declaration or deconstruction assignment, and ITuple is used for pattern-matching. We probably want to reconcile these to be the same.

public class C : ITuple
{
    int ITuple.Length => 3;
    object ITuple.this[int i] => i + 3;
    public static void Main()
    {
        var t = new C();
        Console.WriteLine(t is (3, 4)); // true? or a compile-time error?
        Console.WriteLine(t is (3, 4, 5)); // true? or a compile-time error?
    }
}
static class Extensions
{
    public static void Deconstruct(this C c, out int X, out int Y) => (X, Y) = (3, 4);
}

Discussion 2018-10-24: No conclusion. We need a proposal to consider.

ITuple vs unconstrained type parameter

Our current rules require an error in the following, and it will not attempt to match using ITuple. Is that what we want?

    public static void M<T>(T t)
    {
        Console.WriteLine(t is (3, 4)); // error: no Deconstruct
    }

Resolution 2018-10-24: Let's keep it an error for now but be open to relaxing it in the future.

Discard pattern in switch

Is the discard pattern permitted at the top level of a switch statement? If so, it could change the meaning of existing code. I suggest we require you write case var _ to avoid any ambiguity.

Resolution 2018-10-10: A discard pattern in the presence of a constant will bind to the constant only in a switch statement case, with a warning given under a warning wave. Elsewhere in a pattern it is a discard pattern. A discard pattern in an is-expression in the presence of a type will bind to the type, with a warning under a warning wave.

Is a pattern permitted to match a pointer type?

Is a var pattern such as var x permitted to match an input that is of a pointer type? It would not be permitted with an explicitly provided pattern type, as the syntax doesn't allow that.

Resolution 2018-10-24: A pointer type can match a var pattern (e.g. var x), a discard pattern (e.g. _), and a constant pattern with the value null (e.g. null). No other pattern forms would permit an input of a pointer type.

@gafter gafter self-assigned this Oct 28, 2017
@gafter gafter added this to the 8.0 candidate milestone Oct 28, 2017
@alrz
Member

alrz commented Oct 28, 2017

Please don't give (<pattern>) a semantic meaning. IMO parens (alone) should be insignificant whether it's a pattern or expression. Making ((2)) call Deconstruct twice based on target type is terribly confusing. I'd suggest we require a type name when we want to do a single out deconstruction (as a disambiguation mechanism, not a semantic restriction).

PS: plus, this will retain the possibility of introducing parenthesized patterns once we have pattern operators like OR-patterns to determine precedence. Yes, the compiler could infer types and all, but conciseness wouldn't always improve code readability/productivity.

@iam3yal
Contributor

iam3yal commented Oct 28, 2017

The main semantic question is what should happen when the set of patterns is incomplete

My opinion about it is if possible it should be a compile-time error; otherwise, if it poses a problem it should issue a warning and throw a run-time error.

@gafter
Member Author

gafter commented Oct 30, 2017

Some options for the syntax of the match expression

All of these assume the introduction of a new match keyword or contextual keyword.

Infix Lambda-like option (👍)

var result = ComputeSomething() match (string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0);
var result = ComputeSomething() match (
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0);

Switch-like option (😄)

var result = match(ComputeSomething()) {case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0};
var result = match(ComputeSomething()) {
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0};

Invocation-lambda-like (🎉)

var result = match (ComputeSomething(), string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0);
var result = match (ComputeSomething(),
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0);

Infix Switch-like option ( ❤️ )

var result = ComputeSomething() match (case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0);
var result = ComputeSomething() match (
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0);

@HaloFour
Contributor

@gafter

Back to a match keyword?

My vote would probably be for either of the infix options, probably leaning more towards "lambda-like" as I like the ability to omit the case keyword and not treat the patterns like they're labels.

@VisualMelon

VisualMelon commented Oct 30, 2017

I think the 'lambda-like' style options would be better, because the => symbol is already associated with an expression returning a value. The colon of the switch/case statement denotes a target for flow control (switch or goto), and cases require an explicit escape statement (where a return returns from the whole method rather than some local functional scope).

Looking at 'multi-statement' cases (which I'm naively assuming will be supported, but the consistency is valuable regardless), the lambda => is again more familiar:

string s => { /*some code*/; return s.Length; }

It is not immediately apparent how the switch style would be written: if a multi-statement case contained a return then it would look like a return from the method. Simply, cases contain statements, the right hand side of a lambda 'feels' like an expression.

@orthoxerox

Invocation style match is probably the weirdest.

I used to like the switch-like match and it has some spiritual brothers ({,,,}) in array initializers, anonymous classes and hopefully with. I wonder if it will look better with pseudo lambdas:

var result = match (ComputeSomething()) {string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0};
var result = match (ComputeSomething()) {
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0, //optional trailing comma
};

Both => and case : require a single shift-register key, but => is shorter.

@orthoxerox

Oh, and pseudo-lambdas will work well with deconstruction of parameters in real lambdas and method headers. 😉.

@mungojam

Having got very used to piping and method chaining with Linq and with the R dplyr package, I would definitely want one of the infix options. I'm wondering if it could be implemented in some clever way as an extension method?

@gafter
Member Author

gafter commented Oct 30, 2017

Looking at 'multi-statement' cases (which I'm naively assuming will be supported

The multi-statement cases will work the same way that they work in every other context where a subexpression appears in the language. I would recommend using a local function, but see also #72 and #377.

@iam3yal
Contributor

iam3yal commented Oct 30, 2017

I like either of the infix options.

The switch-like option is appealing only due to consistency with the switch statement but I really like the lambda-like option more, the syntax feels different (in a good way) and consistent with the language.

@HaloFour
Contributor

@gafter

That's probably the one big disadvantage to the lambda-like syntax. It would be inconsistent with the rest of the language if you weren't able to follow the '=>' with an anonymous_function_body containing a block. If you are required to have an expression there and have to resort to something like sequence expressions or local functions for anything more complicated I would say that it's probably better to go with the "infix switch-like" option.

@gafter
Member Author

gafter commented Oct 31, 2017

It would be inconsistent with the rest of the language if you weren't able to follow the '=>' with an anonymous_function_body containing a block.

Huh?

class A
{
    public int X => { return 3; } // syntax error
}

@iam3yal
Contributor

iam3yal commented Oct 31, 2017

@gafter

I thought that the reason you added local functions is so it would be crystal clear for the developer that he/she can only call a local function from within the method that defines it but now the only user of that function would be the poor branch inside the match expression? and the justification for this local function would be because some branch of the match expression spans over multiple lines of code whereas the reason should be because it needs to be called multiple times or/and because it documents better the intention behind the code.

It would be an unfortunate design decision and one that in my opinion abuses local functions because if anything it wouldn't be a solution but a workaround of a shortcoming to a design decision, people would end up wondering why, why they can't have the logic as part of the branch that it belongs to whereas with a switch statement they can.

I don't think people would like to see caseXA(...) { }, caseXB(...) { }, ... in their methods.. it feels wrong and incomplete.

So all in all if the infix switch-like option allows to have a block without having to resort to something like local functions or things like #377 that we don't have a clue of when you're going to add it then yeah I'd love to have the infix switch-like option over the infix lambda-like option.

@alrz
Member

alrz commented Oct 31, 2017

I'd go with switch-like "header" match (expr) {} instead of infix form if we want to permit usages as a statement,

match (e)
{
  string s => expr, ...
} // ideally, no semicolon, so effectively, not an expression_statement

Other than that, pushing match to the end of the target expression might look like this;

var result = match (longExpression) // we know that the next line is a match body
{ ... };
var result = longExpression match // `match` could get out of sight!
{ ... };

I also liked the lambda-like sections because:

  • case is redundant after all since there are no ambiguities
  • it's a lot less noise in one-liners
  • multi-case sections aren't required if we go with OR-patterns

Also it would eliminate an expectation of default: since we can use _ => .. as with other languages.

PS: but as others mentioned, I think I'd go with another token as => is too loaded to be used here.

@orthoxerox

orthoxerox commented Oct 31, 2017

An extra-large dose of 🍝. Practically all implementations of match expressions in other languages are included, plus their permutations and some more. The only not-really-handled question is the question of commas or other clause terminators. Some styles, like those beginning with case, do not really require comma terminators. On the other hand, both languages using bar-lambdas do not require them, but this is ambiguous in C# with its | operator.

I personally like Rust/Kotlin's approach. Nemerle/Reason's is clean, but doesn't work well with existing C# syntax.

(practically) All possible permutations

Switch-like, pseudo-lambdas (Rust, Kotlin)

var result = match (ComputeSomething()) {string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0};
var result = match (ComputeSomething()) {
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0,
};

Switch-like, cases

var result = match (ComputeSomething()) {case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0};
var result = match (ComputeSomething()) {
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0,
};

Switch-like, case-lambdas

var result = match (ComputeSomething()) {case string s => s.Length, case int i => i, case Point(var x, var y) when x > y => x*x + y*y, case var _ => 0};
var result = match (ComputeSomething()) {
    case string s => s.Length,
    case int i => i,
    case Point(var x, var y) when x > y => x*x + y*y,
    case var _ => 0,
};

Switch-like, bar-lambdas (Nemerle, Reason)

var result = match (ComputeSomething()) {| string s => s.Length, | int i => i, | Point(var x, var y) when x > y => x*x + y*y, | var _ => 0};
var result = match (ComputeSomething()) {
    | string s => s.Length,
    | int i => i,
    | Point(var x, var y) when x > y => x*x + y*y,
    | var _ => 0,
};

Switch-like, colons

var result = match (ComputeSomething()) {string s: s.Length, int i: i, Point(var x, var y) when x > y: x*x + y*y, var _: 0};
var result = match (ComputeSomething()) {
    string s: s.Length,
    int i: i,
    Point(var x, var y) when x > y: x*x + y*y,
    var _: 0,
};

Infix, pseudo-lambdas, braces

var result = ComputeSomething() match {string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0};
var result = ComputeSomething() match {
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0,
};

Infix, cases, braces

var result = ComputeSomething() match {case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0};
var result = ComputeSomething() match {
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0,
};

Infix, case-lambdas, braces (Scala)

var result = ComputeSomething() match {case string s => s.Length, case int i => i, case Point(var x, var y) when x > y => x*x + y*y, case var _ => 0};
var result = ComputeSomething() match {
    case string s => s.Length,
    case int i => i,
    case Point(var x, var y) when x > y => x*x + y*y,
    case var _ => 0,
};

Infix, bar-lambdas, braces

var result = ComputeSomething() match {| string s => s.Length, | int i => i, | Point(var x, var y) when x > y => x*x + y*y, | var _ => 0};
var result = ComputeSomething() match {
    | string s => s.Length,
    | int i => i,
    | Point(var x, var y) when x > y => x*x + y*y,
    | var _ => 0,
};

Infix, colons, braces

var result = ComputeSomething() match {string s: s.Length, int i: i, Point(var x, var y) when x > y: x*x + y*y, var _: 0};
var result = ComputeSomething() match {
    string s: s.Length,
    int i: i,
    Point(var x, var y) when x > y: x*x + y*y,
    var _: 0,
};

Infix, pseudo-lambdas, parens

var result = ComputeSomething() match (string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0);
var result = ComputeSomething() match (
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0);

Infix, cases, parens

var result = ComputeSomething() match (case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0);
var result = ComputeSomething() match (
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0);

Infix, case-lambdas, parens

var result = ComputeSomething() match (case string s => s.Length, case int i => i, case Point(var x, var y) when x > y => x*x + y*y, case var _ => 0);
var result = ComputeSomething() match (
    case string s => s.Length,
    case int i => i,
    case Point(var x, var y) when x > y => x*x + y*y,
    case var _ => 0,
);

Infix, bar-lambdas, parens

var result = ComputeSomething() match (| string s => s.Length, | int i => i, | Point(var x, var y) when x > y => x*x + y*y, | var _ => 0);
var result = ComputeSomething() match (
    | string s => s.Length,
    | int i => i,
    | Point(var x, var y) when x > y => x*x + y*y,
    | var _ => 0,
);

Infix, colons, parens

var result = ComputeSomething() match (string s: s.Length, int i: i, Point(var x, var y) when x > y: x*x + y*y, var _: 0);
var result = ComputeSomething() match (
    string s: s.Length,
    int i: i,
    Point(var x, var y) when x > y: x*x + y*y,
    var _: 0);

Invocation-like, pseudo-lambdas

var result = match(ComputeSomething(), string s => s.Length, int i => i, Point(var x, var y) when x > y => x*x + y*y, var _ => 0);
var result = match(ComputeSomething(),
    string s => s.Length,
    int i => i,
    Point(var x, var y) when x > y => x*x + y*y,
    var _ => 0);

Invocation-like, cases

var result = match(ComputeSomething(), case string s: s.Length, case int i: i, case Point(var x, var y) when x > y: x*x + y*y, case var _: 0);
var result = match(ComputeSomething(),
    case string s: s.Length,
    case int i: i,
    case Point(var x, var y) when x > y: x*x + y*y,
    case var _: 0);

Invocation-like, case-lambdas

var result = match(ComputeSomething(), case string s => s.Length, case int i => i, case Point(var x, var y) when x > y => x*x + y*y, case var _ => 0);
var result = match(ComputeSomething(),
    case string s => s.Length,
    case int i => i,
    case Point(var x, var y) when x > y => x*x + y*y,
    case var _ => 0);

Invocation-like, bar-lambdas

var result = match(ComputeSomething(), | string s => s.Length, | int i => i, | Point(var x, var y) when x > y => x*x + y*y, | var _ => 0);
var result = match(ComputeSomething(),
    | string s => s.Length,
    | int i => i,
    | Point(var x, var y) when x > y => x*x + y*y,
    | var _ => 0);

Invocation-like, colons

var result = match(ComputeSomething(), string s: s.Length, int i: i, Point(var x, var y) when x > y: x*x + y*y, var _: 0);
var result = match(ComputeSomething(),
    string s: s.Length,
    int i: i,
    Point(var x, var y) when x > y: x*x + y*y,
    var _: 0);

@VisualMelon

@gafter thanks for pointing me at those issues. Personally, I am less than keen on the idea of having non-functional multi-statement expression things (I'm less than sold on match without discriminated unions).

Nonetheless, case and : still just say "Follow these instructions, and find your own way out" in the language at the moment. => says "I'm going to transform this input", which is what the match is doing.

@alrz
Member

alrz commented Oct 31, 2017

@gafter

Shouldn't we accept type { identifier : pattern } under property-pattern (without parens)?

Also the grammar doesn't seem to accept var (x, (y, z)) under deconstruction-pattern (edit: actually it does but identifiers are being interpreted as constant patterns).

@iam3yal
Contributor

iam3yal commented Oct 31, 2017

@VisualMelon

I am less than keen on the idea of having non-functional multi-statement expression things.

Yes because all of the C# features need to satisfy the functional crowd. :)

At times people want to go from declarative to imperative programming without sacrificing code readability, I mean, people might want to change a LINQ expression into a loop for say performance or whatever and at the end return a value so why would they need to go out of their ways and resort to a local function where the logic should be contained in a branch of the match expression? I really don't understand it.

I love the fact that C# is inspired by functional languages but even so it should allow for it and not be rigid/strict about it.

@HaloFour, @alrz, @orthoxerox

Currently, the lambda expression allows us to have the lambda operator followed by a block so why is this a problem here? excuse my ignorance.

I mean just by looking at the grammar:

lambda_expression
    : anonymous_function_signature '=>' anonymous_function_body

But I'm not sure why it poses a problem in this case.

@HaloFour
Contributor

@gafter

You're right, it doesn't apply to expression-bodied members. But it does apply to lambdas which people are likely a bit more familiar with. Anywho, I thought it was worth a mention as it already immediately generated questions as to whether you can follow the '=>' with a block of statements that return a value.

@HaloFour
Contributor

@eyalsk

These wouldn't be lambda expressions, they would just borrow the syntax. They would behave more like expression-bodied members where you're required to follow the '=>' operator with an expression. So the only way you'd be able to execute multiple statements within the case of a match expression would be to invoke some function (local or otherwise) or rely on something like sequence expressions.

@jnm2
Contributor

jnm2 commented Oct 31, 2017

An extra-large dose of 🍝

I needed to Google that emoji to figure out what it was :-/
I'll just leave it at that 😆

@gafter
Member Author

gafter commented Oct 31, 2017

@eyalsk

I thought that the reason you added local functions is so it would be crystal clear for the developer that he/she can only call a local function from within the method that defines it but now the only user of that function would be the poor branch inside the match expression? and the justification for this local function would be because some branch of the match expression spans over multiple lines of code whereas the reason should be because it needs to be called multiple times or/and because it documents better the intention behind the code.

Local functions are a really nice way to name and concisely use any complex (multi-statement) logic that you need to compute some expression. So they are really useful, even if used only once. There is nothing special about the match statement that makes it a more important context for this use.

@alrz

Shouldn't we accept type { identifier : pattern } under property-pattern (without parens)?

Yes, fixed.

Also the grammar doesn't seem to accept var (x, (y, z)) under deconstruction-pattern (edit: actually it does but identifiers are being interpreted as constant patterns).

Correct, it wasn't intended to do that. Inferred var will not be permitted for positional or property patterns, so it is not impossible that we'd support that in the future.

@alrz
Member

alrz commented Nov 5, 2017

The multi-statement cases will work the same way that they work in every other context where a subexpression appears in the language. I would recommend using a local function

I suppose this works for any multi-statement match arm without using local functions,

var x = match (e) {
  _ => ((Func<int>)(() => { return 1; }))(),
};

Just a tiny bit of allocation there 💀
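For comparison, here is a minimal sketch of the local-function approach recommended earlier in the thread. It is written with the `switch` expression syntax that eventually shipped in C# 8 rather than the `match` form being debated here. The names `MatchArmSketch`, `Classify`, and `SumDigits` are hypothetical and chosen only for illustration.

```csharp
using System;

static class MatchArmSketch
{
    // Hypothetical helper: the arms below are illustrative, not from the thread.
    public static int Classify(object e)
    {
        // The multi-statement logic for one arm lives in a local function,
        // so the arm itself stays a single expression and no delegate is created.
        int SumDigits(int n)
        {
            var total = 0;
            while (n != 0)
            {
                total += Math.Abs(n % 10);
                n /= 10;
            }
            return total;
        }

        return e switch
        {
            int i => SumDigits(i),
            string s => s.Length,
            _ => 0,
        };
    }
}
```

Because `SumDigits` captures nothing and is called directly, this shape avoids the delegate cast used in the lambda version above.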

@ufcpp

ufcpp commented Jun 14, 2018

class X
{
    // auto-implemented event generates
    // event: public event Action E { get; set; }
    // field: private Action E;
    public event Action E;
}

This private field?

@Happypig375
Member

@ufcpp But then you are assuming that the event is auto-implemented, which may not be the case. The event can possibly direct the add/remove to another field, which the compiler cannot determine without reflection into the code, which will not be consistent.

@svick
Contributor

svick commented Jun 14, 2018

@Happypig375 Only code that's within the same type could access that field, which means it will also know whether the event is auto-implemented (or rather, field-like). I don't think anyone is suggesting that pattern matching should bypass accessibility checks or use reflection.

@gafter
Member Author

gafter commented Jun 14, 2018

Added

Restricted types

A recent bug (dotnet/roslyn#27803) revealed how pattern matching interacts with restricted types in interesting ways (when doing type checks).
We should discuss with LDM on whether this should be banned (like pointer types), and if there are actual use cases that need it.

@gafter
Member Author

gafter commented Jun 18, 2018

Added

Matching using ITuple

I assume that we only intend to permit matching using ITuple when the type is omitted?

IComparable x = ...;
if (x is ITuple(3, 4)) // (1) permitted?
if (x is object(3, 4)) // (2) permitted?
if (x is SomeTypeThatImplementsITuple(3, 4)) // (3) permitted?
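As a point of reference, the type-omitted case can be sketched as follows. This assumes a C# 8 compiler and a runtime where `System.ValueTuple` implements `System.Runtime.CompilerServices.ITuple`. The names `ITupleSketch` and `IsThreeFour` are hypothetical. The sketch does not settle forms (1) to (3) above, which are the open question.

```csharp
using System;

static class ITupleSketch
{
    // Hypothetical helper. The static type of x is object, so there is no
    // Deconstruct method to bind and no tuple type known at compile time;
    // with the type omitted, the positional pattern is matched through ITuple.
    public static bool IsThreeFour(object x)
    {
        return x is (3, 4);
    }
}
```

Called with a boxed `(3, 4)` this is expected to succeed, while a three-element tuple or a non-tuple object should fail the length or type check.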

@gafter
Member Author

gafter commented Jul 25, 2018

Added

Matching using ITuple in the presence of an extension Deconstruct

When an extension Deconstruct method is present and a type also implements the ITuple interface, it is not clear which should take priority. I believe the current LDM position (probably not intentional) is that the extension method is used for the deconstruction declaration or deconstruction assignment, and ITuple is used for pattern-matching. We probably want to reconcile these to be the same.

public class C : ITuple
{
    int ITuple.Length => 3;
    object ITuple.this[int i] => i + 3;
    public static void Main()
    {
        var t = new C();
        Console.WriteLine(t is (3, 4)); // true? or a compile-time error?
        Console.WriteLine(t is (3, 4, 5)); // true? or a compile-time error?
    }
}
static class Extensions
{
    public static void Deconstruct(this C c, out int X, out int Y) => (X, Y) = (3, 4);
}

@gafter
Member Author

gafter commented Aug 1, 2018

Added

ITuple vs unconstrained type parameter

Our current rules require an error in the following, and it will not attempt to match using ITuple. Is that what we want?

    public static void M<T>(T t)
    {
        Console.WriteLine(t is (3, 4)); // error: no Deconstruct
    }

@gafter
Member Author

gafter commented Aug 15, 2018

added

Discard pattern in switch

Is the discard pattern permitted at the top level of a switch? If so, it could change the meaning of existing code. I suggest we require you write var _ to avoid any ambiguity.

@gafter
Copy link
Member Author

gafter commented Sep 11, 2018

added

Is a pattern permitted to match a pointer type?

Is a var pattern such as var x permitted to match an input that is of a pointer type? It would not be permitted with an explicitly provided pattern type, as the syntax doesn't allow that.

@keithn

keithn commented Oct 14, 2018

I really would like this syntax for tuples -

    var newState = (GetState(), action, hasKey) switch {
        (DoorState.Closed, Action.Open, _) => DoorState.Opened,
        (DoorState.Opened, Action.Close, _) => DoorState.Closed,
        (DoorState.Closed, Action.Lock, true) => DoorState.Locked,
        (DoorState.Locked, Action.Unlock, true) => DoorState.Closed,
        var (state, _, _) => state };

but just curious, there's a bunch of cases I'd like it to handle, but not sure if it's possible with this syntax -

(DoorState.Closed, Action.Open, var blah) => DoorState.Opened   // match the two values, but capture the any variable

or would it work

 var (DoorState.Closed, Action.Open, blah)

or would be useful...

(DoorState.Closed or DoorState.Ajar, _, _) => DoorState.Opened   //  or in more values

This one is a bit complicated and could be done as separate switch statements, but is actually logically one case. I really don't know what would be good syntax, but I want to match either situation, and project which one matched into a variable.

var ((Player.Winner, _ ) as "One"  or (_, Player.Winner)  as "Two") player  =>    Console.WriteLine($"Player {player} won");

@HaloFour
Contributor

@keithn

You'd use var in the recursive pattern, so like this:

    var newState = (GetState(), action, hasKey) switch {
        (DoorState.Closed, Action.Open, _) => DoorState.Opened,
        (DoorState.Opened, Action.Close, _) => DoorState.Closed,
        (DoorState.Closed, Action.Lock, true) => DoorState.Locked,
        (DoorState.Locked, Action.Unlock, true) => DoorState.Closed,
        (var state, _, _) => state };

"or" patterns are championed here: #1350

As for your final comment, the closest thing I can think of would be declaration expressions:

(var player = players switch { (Player.Winner, _) => "One", (_, Player.Winner) => "Two" }; Console.WriteLine($"Player {player} won"));

I don't know that it buys you anything though. Both your example and this example seem unnecessarily difficult to read.
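A self-contained sketch of the door example, using the switch expression syntax as it shipped in C# 8, might look like this. The enum members are inferred from the snippets above. `Action` is renamed to the hypothetical `DoorAction` to avoid confusion with `System.Action`, and `DoorSketch`, `Next`, and `Describe` are likewise illustrative names.

```csharp
using System;

enum DoorState { Closed, Opened, Locked }
enum DoorAction { Open, Close, Lock, Unlock }

static class DoorSketch
{
    // Constants and discards are mixed inside one positional pattern;
    // the last arm uses var in a single position to capture the state.
    public static DoorState Next(DoorState current, DoorAction action, bool hasKey) =>
        (current, action, hasKey) switch
        {
            (DoorState.Closed, DoorAction.Open, _) => DoorState.Opened,
            (DoorState.Opened, DoorAction.Close, _) => DoorState.Closed,
            (DoorState.Closed, DoorAction.Lock, true) => DoorState.Locked,
            (DoorState.Locked, DoorAction.Unlock, true) => DoorState.Closed,
            (var state, _, _) => state,
        };

    // Matching two constants while capturing the third value in a variable.
    public static string Describe(DoorState current, DoorAction action, bool hasKey) =>
        (current, action, hasKey) switch
        {
            (DoorState.Closed, DoorAction.Open, var key) => $"opening, key present: {key}",
            _ => "no change",
        };
}
```

For instance, `DoorSketch.Next(DoorState.Closed, DoorAction.Lock, false)` falls through to the last arm and leaves the door closed.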

@gafter
Member Author

gafter commented Oct 22, 2018

I have updated the original post with resolutions from 2018-10-10. Search for that date to find the changes.

@gafter
Member Author

gafter commented Oct 26, 2018

I have updated the original post with resolutions from 2018-10-24. Search for that date to find the changes.

@gafter
Member Author

gafter commented Nov 5, 2018

Notes from today's LDM meeting

Deconstruction vs ITuple

Proposed:

  1. If the type is a tuple type (any arity >=0; see below), then use the tuple semantics
  2. If the type has no accessible instance Deconstruct, and satisfies the ITuple deconstruct constraints, use ITuple semantics
  3. Otherwise attempt Deconstruct semantics (instance or extension)

Alternative

  1. If the type is a tuple type (any arity >= 0; see below), then use the tuple semantics
  2. If "binding" a Deconstruct invocation would find one or more applicable methods, use Deconstruct.
  3. If the type satisfies the ITuple deconstruct constraints, use ITuple semantics

In both cases, 4. Error

Decision: Alternative
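The difference between the two orderings only shows up when a type implements ITuple and has an extension (not instance) Deconstruct. The following is a minimal sketch of that situation, assuming a C# 8 compiler that follows the Alternative ordering. `Pair`, `PairExtensions`, and `PrioritySketch` are hypothetical names, and the two views are deliberately given different contents so the chosen path is observable.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type: its ITuple view is (-1, -1).
class Pair : ITuple
{
    int ITuple.Length => 2;
    object ITuple.this[int index] => -1;
}

static class PairExtensions
{
    // The extension Deconstruct view is (3, 4).
    public static void Deconstruct(this Pair p, out int x, out int y) => (x, y) = (3, 4);
}

static class PrioritySketch
{
    // Under the Alternative ordering, binding Deconstruct finds the extension
    // method, so the pattern is checked against (3, 4) and ITuple is not consulted.
    // Under the first proposal, ITuple would win and the match would fail.
    public static bool MatchesThreeFour(Pair p) => p is (3, 4);
}
```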

Proposed:

  1. Permit pattern-matching tuple patterns with 0 and 1 elements (appropriately disambiguated as previously decided)
if (e is ()) ...

if (e is (1) _) ...
if (e is (x: 1)) ...

if (e is (1, 2)) ...

Conclusion: Approved

Proposed:

As part of this, I propose that we consider System.ValueTuple<T> instantiations to be considered tuple types. I do not propose any syntax changes related to this.

As part of this, I propose that we consider System.ValueTuple to be considered a tuple type. I do not propose any syntax changes related to this.

Conclusion: Approved

@CyrusNajmabadi
Member

CyrusNajmabadi commented Nov 5, 2018

As part of this, I propose that we consider System.ValueTuple<T> instantiations to be considered tuple types. I do not propose any syntax changes related to this.

Interesting. Do they need to be valid instantiations? i.e. does the 'Rest' type parameter need to be a ValueTuple itself?

@gafter
Member Author

gafter commented Nov 5, 2018

@CyrusNajmabadi You did not quote me correctly. I said that we would consider System.ValueTuple<T> instantiations to be considered tuple types (one element). That type has a single type argument. We would also consider the non-generic System.ValueTuple to be a (zero element) tuple type.

As for System.ValueTuple<T1, T2, T3, T4, T5, T6, T7, TRest>, see dotnet/roslyn#20648 (comment):

  • Instances of System.ValueTuple<T1, T2, T3, T4, T5, T6, T7, TRest> is a tuple type if TRest is a tuple type with more than one element. The 0-element tuple type does not satisfy the requirement for TRest to permit the instantiation to be considered a tuple type.

@CyrusNajmabadi
Member

@CyrusNajmabadi You did not quote me correctly.

Sorry, I wasn't trying to do that. I just copied/pasted the text, forgetting that GitHub doesn't keep code quoted in that case. Sorry for misinterpreting. I read your post as applying to all ValueTuple instantiations, not just the single-type-parameter case. Thanks for clearing that up!

@gafter
Member Author

gafter commented Dec 20, 2018

Closing this issue, and moving remaining open issues to #2095
