Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pattern matching as an alternative to if-variables #1977

Closed
leafpetersen opened this issue Nov 19, 2021 · 23 comments
Closed

Pattern matching as an alternative to if-variables #1977

leafpetersen opened this issue Nov 19, 2021 · 23 comments
Labels
field-promotion Issues related to addressing the lack of field promotion

Comments

@leafpetersen
Copy link
Member

leafpetersen commented Nov 19, 2021

In this issue, I proposed an alternative syntax for the if-variable feature described in #1201 . Discussion in #1975 suggests that the two alternative syntax proposal have problems on orthogonal axes. In the original syntax, it is not obvious where variables are being defined but in so far as one correctly understands where variables are bound, it is clear what they are bound to; and in the alternative syntax, it is obvious where variables are being defined, but it is not necessarily obvious to what they are bound. This proposal explores using a pattern matching syntax as a solution to make it clear(er) both where variables are bound, and to what they are bound.

This isn't a full syntactic proposal: this is just intended to explore whether a path forward might lie via the larger feature of pattern matching that we have discussed separately. For the purposes of this exploration, let's assume the ability to write a pattern of the form var _.<identifier> <whereclause>? = <expression> (and similarly for final) where identifier is an identifier name, expression is a standard Dart expression, and whereclause is of the form where <expression> where expression is a boolean valued expression. The variable identifier is bound in the where clause.

A pattern with a where clause is refutable, and is considered not to match if the boolean expression evaluates to false. A refutable pattern may be used as the condition in an if statement: if the pattern is not refuted the if branch is evaluated, and all of the variables bound by the pattern are in scope in the if statement. Otherwise, the failure branch is taken (i.e. either the else clause or the next statement of the enclosing block).

As a syntactic shorthand, we might consider allowing the form Type _.identifier = expression to stand for var _.identifier where identifier is Type = expression.

As another syntactic shorthand, we might consider allowing the form var! _.identifier = expression to stand for var _.identifier where identifier != null = expression.

Example:

void test(List<int> l) {
  if(var _.length where length > 0 = l) {
    print("Non-empty list of length $length");
  }
}

We might also consider allowing negative conditions to bind variables in the continuation as discussed in the original if-variable proposal, but I'm not sure it's clear what the set of negative conditions would be. As an alternative, we could add a guard syntax which binds in the continuation and requires the body of the if to throw, e.g.:

if! (var _.x where x == null = obj) return;
x.isEven;

Simple Example

Continuing by example, using the examples from the previous proposals:

class C {
  Object obj;
  test() {
    if (var _.obj where obj is int = this) obj + 1; // OK!
    if (int _.obj = this) obj + 1; // OK if we allow the shorthands.
  }
}

Promoting on null checks

class C {
  int? n;
  test() {
    if (var _.n where n != null = this) n + 1; // OK!
    if (var! _.n = this) n + 1; // OK if we allow the shorthands!
  }
}

Promoting on getters

class C {
  List<Point<num>> points = [Point(1, 2)];
  test() {
    if (var _.x where x is int = points[0]) x.isEven; // OK!
    if (int _.x = points[0]) x.isEven; // OK if we allow the shorthands!
  }
}

Negative if-vars

class C {
  Object obj;
  test() {
    if! (var _.obj where obj is! int = this) return;
    obj + 1; // OK if we allow the negative if-variables!
  }
}

Worked example

New syntax for the worked example from here:

while (true) {
  comp = _compare(current.key, key);
  if (comp > 0) {
    if! (var _.left where left == null = current) break;  // add "var", negative form.
    comp = _compare(left.key, key);       // "current.left!" -> "left"
    if (comp > 0) {
      // Rotate right.                    // remove "tmp" (use "left" as temp)
      current.left = left.right;          // "tmp" -> "left"
      left.right = current;               // "tmp" -> "left"
      current = left;                     // "tmp" -> "left"
      if (current.left == null) break;
    }
    // Link right.
    right.left = current;
    right = current;
    current = current.left!;              // unchanged, "left" could be stale
  } else if (comp < 0) {
    if! (var _.right where right == null = current) break; // add "var", negative form.
    comp = _compare(right.key, key);      // "current.right!" -> "right"
    if (comp < 0) {
      // Rotate left.                     // remove "tmp" (use "right" as temp)
      current.right = right.left;         // "tmp" -> "right"
      right.left = current;               // "tmp" -> "right"
      current = right;                    // "tmp" -> "right"
      if (current.right == null) break;
    }
    // Link left.
    left.right = current;
    left = current;
    current = current.right!;             // unchanged, "right" could be stale
  } else {
    break;
  }
}

New syntax from the worked example from here.

    if (var! _.extendsClause = node) {
      record('Uses', 'extend');
      record('Superclass names', extendsClause.superclass.toString());
    }

    if (var! _.withClause = node) {
      for (var mixin in withClause.mixinTypes) {
        record('Uses', 'mixin');
        record('Mixin names', mixin.toString());
      }
    }

    if (var! _.implementsClause = node) {
      for (var type in implementsClause.interfaces) {
        record('Uses', 'implement');
        record('Superinterface names', type.toString());
      }
    }

cc @munificent @lrhn @eernstg @jakemac53 @natebosch @stereotype441 @mit-mit

@munificent
Copy link
Member

A refutable pattern may be used as the condition in an if statement: if the pattern is not refuted the if branch is evaluated, and all of the variables bound by the pattern are in scope in the if statement. Otherwise, the failure branch is taken (i.e. either the else clause or the next statement of the enclosing block).

I really like this approach. When I was first working on patterns, I spent some time trying to come up with a nice syntax for a single-pattern if statement form and nothing stuck, so I omitted it for now. But I am 100% on board with adding that if we can find a syntax we like.

I see the proposal you have here as three subcomponents:

  • if (var ... = e) ... to define an if statement that applies an expression to a refutable pattern.
  • An _.identifier pattern that invokes a getter on the value, and binds a variable with the result.
  • A where e guard clause to refute a pattern based on an arbitrary predicate.

For what it's worth, the pattern matching proposal does already define syntax for the last two. If we were to use those, you get:

if (var (length: length) if (length > 0) = list) {
  print("Non-empty list of length $length");
}

Guard syntax

I think using if as the guard syntax looks funny inside an if statement. I picked if instead of where because its meaning is pretty clear and at the time, I only had it used inside switches. If we want to add an if statement matching form, it's probably worth revisiting that and considering where. The parentheses are an interesting question. If we omit them, that makes guards syntactically different from most other places where conditions appear—if, while, do-while—where parentheses are required.

Field destructuring

Most languages that have some kind of record-like destructuring have sugar for binding a variable with the same name as the field. For example, JavaScript allows both:

var {a: x, b: y} = {a: 1, b: 2}; // x = 1, y = 2.
var {a, b} = {a: 1, b: 2}; // a = 1, b = 2.

Rust allows both:

let Point { x, y } = p;
let Point { x: a, y: b } = p;

I didn't propose syntax for this for Dart yet. It's harder for us because we use () for tuples and records, so (a, b) looks like positional fields, not inferred named fields. I've been mulling over (a:, b:). Sugar here is particularly valuable because it benefits not just the field promotion kind of use cases but any place where you want to destructure a record or call getters on an object.

Another nice thing about (field: name) patterns is that they nest, which I don't think your proposed syntax does. You can do:

if (var (name: name, phone: (ext: ext, digits: digits))) { ... }

Or with some sugar for cases where the subpattern is a variable with the same name as the field:

if (var (name:, phone: (ext:, digits:))) { ... }

Matching if statements

That leaves the if (var ... = ...) part. I'm not immediately loving it. But I'm not super opposed either. It's similar to if-let in Rust and Swift. (Gonna just pat myself on the back here and point out a blog post I wrote proposing something similar for a hobby language of mine that predates both of those languages' entire existence.)

It's a little odd because var and final don't signal "refutable" to me. But I guess in the context of if, maybe it does?

If we want to consider alternatives, one I toyed with is:

if (expr case pattern) { ... }

This has the benefit of putting the expression before the pattern, which follows other uses of refutable patterns in switch statements where the value comes first. Another is:

if (case pattern = expr) { ... }

With both of these, I chose case to mirror the "main" refutable pattern matching construct with is switch cases. I don't know if that's a great idea or not.

Combinations

So, trying out some combinations of those suggestions gives:

// Original proposal here:
if (var _.length where length > 0 = list) { ... }

// Current patterns proposal:
if (var (length: length) if (length > 0) = list) { ... }

// Proposal here but using record patterns for field access:
if (var (length:) where length > 0 = list) { ... }

// Patterns proposal but using "if (var" from here:
if (var (length:) if (length > 0) = list) { ... }

// Patterns proposal with "if ... case" and "where" for guards:
if (list case (length:) where length > 0) { ... }

// Patterns proposal with "if ... case" and "where" for guards:
if (list case (length:) where length > 0) { ... }

Thoughts?

The main missing piece is some kind of null-checking pattern. I have a TODO for that in the pattern matching proposal because I haven't come up with a syntax I like yet.

@lrhn
Copy link
Member

lrhn commented Nov 21, 2021

I don't think (var _.n where n != null = this) reads very well. The spaces breaks up the "expression" and the logically connected chunks are not easily spotted. I'd prefer forcing some parentheses, at least around the where condition, to stop me seeing null = this as an assignment.

In general, I need more separation and grouping to be able to easily read these run-on expressions.

As for pattern syntax, I'd prefer something more like C#'s object matchers:

if (_ is List{length: var length} list && length > 0) ...

That pattern is-check follows the format:
is TypeBasedPattern name, and the type pattern is TypeName{getterName: Pattern}.

(We can probably abbreviate length: var length into just var length.)

Introducing arbitrary where patterns could be useful too.
Maybe:

if (_ is List{length: if it > 0} list) ...

(using implicit it for the name, or if you need to name itList{length: var l if l > 0}.)

@munificent
Copy link
Member

As for pattern syntax, I'd prefer something more like C#'s object matchers:

if (_ is List{length: var length} list && length > 0) ...

That pattern is-check follows the format: is TypeBasedPattern name, and the type pattern is TypeName{getterName: Pattern}.

The braces don't do much for me aesthetically. Note that because Dart has map literals (unlike C#), it's harder to use those since they can collide with map patterns. Even if they don't technically collide in the grammar because of the leading type name, they may look confusing to a reader who expects { ... } to signal "map".

In general, pattern syntax mirrors the syntax used to create the object being matched. For instances of classes, that's constructor calls, which is why I used parentheses. The proposal already has "extractor" patterns which let you combine type testing and destructuring. They look like:

List(length: var length)

In the context of a matching-if statement, it would be something like:

if (var List(length: var length) where length > 0 = list) ...

@lrhn
Copy link
Member

lrhn commented Nov 23, 2021

The = list at the end really doesn't work for me. I can't see how you assign to 0, and I can't see what else you assign to. If the thing to the left of = is not an identifier or some kind of end-brace, I'm going to have a hard time reading it.

Parentheses in the extractor pattern are good too, I was considering those as the next option if braces won't work. It does look like a constructor - but it isn't related to constructors if all it does getter accesses, which is what worries me a little.

Anyway, then it can be written as:

if (list is List(length: var length) && length > 0) ...

without the where. This promotes the list to List and extracts its List.length to length.
Or

if (this.list is List(length: var length) list && length > 0) ...

if you want to bind a value which cannot be promoted.
With where, I'd write it as

if (this.list is List(length: var length if (length > 0)) list) ...

(Using a keyword to make it clear where the condition starts, and parentheses to make it work inside an expression:

  if (this.list is List(length: var length) list if (length > 0) && somethingUnrelated) ...

Not sure where/if really need to be inside the patterns, but I guess it has its uses (like case patterns, where you can't easily add extra conditions, but then we might just allow extra conditions on cases.).

@munificent
Copy link
Member

@munificent:
could you please explain why you prefer this:
if (var List(length: var length) where length > 0 = list) ...
over a more straightforward
if (list matches List(length: var length) && length > 0) ...

I don't think I prefer either of those. :) My preference would maybe be:

if (list case List(length: var length) where (length > 0)) ...

matches looks nice too, but I'd rather reuse an existing reserved word if possible. Using && instead of some explicit guard clause like where or if is harder because it raises questions about what the actual syntax inside the surrounding if represents. In my suggestion, the grammar is:

'if' '(' expression 'case' pattern ')'

So the thing after case is a pattern and only a pattern, and patterns have explicit support for guard clauses. && is an expression, so it's not clear to me what it would mean to allow one here or what it means to have a pattern as a subexpression of it.

I'm still wondering whether we really know a massive use case for this device in dart to justify the pursuit.
(There are languages totally based on pattern matching, but in dart, it looks more like a shiny toy to me :-).

Yes, pattern matching is mostly syntactic sugar. However:

  • When it comes to working with tuples (and to a lesser extent lists and maps), it's really cumbersome to not have that sugar. It's great that you can quickly bundle a couple of return values using (123, "blah"). It's less great if the caller doesn't have an equally easy way to extract those fields back out into separate variables.
  • Users have long requested the ability to write algebraic datatype-style destructuring matches. You can accomplish the same thing using the Visitor pattern or a series of if (... is ...) statements, but it's pretty cumbersome.
  • Combining type testing and variable binding gives you a safer way of ensuring a variable has some expected type. Dart gets you pretty far with type promotion but (as in this issue here), it doesn't always work.

@leafpetersen
Copy link
Member Author

@munificent

I don't hate the (length : length) syntax, but I don't love it either. Maybe it would become familiar if we use that syntax with tuples as well, but in general, there's a very long-standing syntax for these sorts of things that uses braces (C, JS, SML, OCaml, probably many others) that makes it easy to see {length : length} as a record with a field named length in a way that the parenthesized version doesn't for me. This is especially true because in general parentheses are usually used in the grammar just to delineate things and not to structure actual values (e.g. (3 + 4)*5 is just a grouping thing, and f(3) is just about delineating the arguments and not creating a structure to pass to f). This carries over pretty strongly to the `(length:) syntax. It's a weird combination of punctuation. Again, could maybe learn to live with this but I don't love it.

Another nice thing about (field: name) patterns is that they nest, which I don't think your proposed syntax does. You can do:

if (var (name: name, phone: (ext: ext, digits: digits))) { ... }

My syntax is definitely more limited (it can only handle a single field per object) but it does nest. Example:

if (var _.phone.ext where ext != null)  {
  print("$phone is in scope here");
  print("$ext is also in scope here");
}

It's possible we could include ?. in the path syntax, but there are multiple interpretations of what that could mean.

If we want to consider alternatives, one I toyed with is:

if (expr case pattern) { ... }

This really doesn't work to me. case e discriminates on e in other languages, so e case pattern reads quite oddly.

if (case pattern = expr) { ... }

This works ok for me.

I do wonder whether having the _.x syntax as a shorthand for (x : x) might not be worthwhile even so?

@leafpetersen
Copy link
Member Author

leafpetersen commented Nov 24, 2021

The = list at the end really doesn't work for me. I can't see how you assign to 0, and I can't see what else you assign to. If the thing to the left of = is not an identifier or some kind of end-brace, I'm going to have a hard time reading it.

Agreed, this felt like it got lost.

As for pattern syntax, I'd prefer something more like C#'s object matchers:

if (_ is List{length: var length} list && length > 0) ...

As a data point, it took me a long time to figure out what (I think) this is supposed to mean. I think list is the thing being discriminated on? It's weird having it just hang out there.

I don't like having to write var length. Why do you feel you need to use var there instead of just {length : length}

I also really dislike having to repeat the field name as the variable name. The key piece of doing anything here is to be able to pun on the field name to get the variable name. Otherwise I don't see much point.

@leafpetersen
Copy link
Member Author

@munificent

It's a little odd because var and final don't signal "refutable" to me. But I guess in the context of if, maybe it does?

I can definitely see a use for refutable patterns on ordinary variable bindings as well, where you have invariants that can't be captured in the type system that establish properties that you just want to assert, in much the same way that e! asserts that "e is not null, just trust me". But I suppose you would probably want a different syntax for that.

@lrhn
Copy link
Member

lrhn commented Nov 24, 2021

As a data point, it took me a long time to figure out what (I think) this is supposed to mean. I think list is the thing being discriminated on? It's weird having it just hang out there.

The list is a new variable bound to the result of the match.
It's the generalization of the type-match pattern something is List list which binds list of type List to the value on a successful pattern match. That one was introduced by itself in C# ... 8 I think: expr is Type id.
The Type id is a type-checking and binding pattern, just like when you declare a variable using the irrefutable pattern List list = x; (irrefutable if x has static type List).
The _ here is the value being tested (which I used because I misread the other examples using _ for something else and thought they were testing a _ variable).

I don't like having to write var length. Why do you feel you need to use var there instead of just {length : length}

I'd expect a pattern to be able to match values, so:

  if (list is List{length: 8}) ...

checks whether the list has length 8 (it checks the value of the length getter against the value pattern 8, which matches if they are ==).
I'd also want to be able to do:

  var expectedLength = 8;
  if (list is List{length: expectedLength}) ...

and use a variable as the variable pattern. If that meant "bind the value to the new variable expectedLength", I wouldn't have a good way to check against a variable. (I guess I can use length: (expectedLength), but it's not going to be clear to users why that's needed.)

That means that if I want to bind at that position, I need to write something more, which is why I add var:

  if (list is List{length: var length}) ...

Here var length is not a value pattern, it's an (irrefutable) binding pattern.

Also, I like having var or some other binding-specific syntax around anywhere a variable is introduced. Parameter lists and catch clauses are currently the ones which don't, and their context usually makes it clear what's going on.

I'd be fine with allowing var id as a special pattern form inside a type-property-checking pattern, meaning id: var id. If every other type of pattern starts with id: or [expression]:, then var id is not in conflict with any of those.

@leafpetersen
Copy link
Member Author

As a data point, it took me a long time to figure out what (I think) this is supposed to mean. I think list is the thing being discriminated on? It's weird having it just hang out there.

The list is a new variable bound to the result of the match.

Wait, what? Then what is being discriminated on? There are no other candidates to be expressions in that syntax.

@leafpetersen
Copy link
Member Author

I'd expect a pattern to be able to match values, so:

  if (list is List{length: 8}) ...

checks whether the list has length 8 (it checks the value of the length getter against the value pattern 8, which matches if they are ==). I'd also want to be able to do:

This is not a general pattern matching proposal, this is a proposal for a specific small syntax that might be a subset of a pattern matching proposal (if we do one).

I'd also want to be able to do:

  var expectedLength = 8;
  if (list is List{length: expectedLength}) ...

I'm not very keen on this. Patterns aren't expressions, and it gets weird if you allow them to be.

That means that if I want to bind at that position, I need to write something more, which is why I add var:

  if (list is List{length: var length}) ...

Here var length is not a value pattern, it's an (irrefutable) binding pattern.

This is a good reason not to allow expressions there. :)

Also, I like having var or some other binding-specific syntax around anywhere a variable is introduced. Parameter lists and catch clauses are currently the ones which don't, and their context usually makes it clear what's going on.

Right, this is provided by the case in one of the proposals from @munificent , or the leading var. In other words, you have syntax that says "This is a pattern", and then you know what every variable in the pattern is a binding occurrence, and you don't have sprinkle var all through the pattern.

@lrhn
Copy link
Member

lrhn commented Nov 24, 2021

I admit that C# does not allow general expressions, only constants, but they do allow named constants, so

const int expectedLength = 8; // assuming you can declare a local const in C#
if (something is List{Length: expectedLength) list) { ... }

matches lists with length 8.

They have relational patterns too < constant, > constant, but only for the relational operators, not ==. Using == constant instead of just constant as a pattern seems like an obvious workaround if you don't want to hog the "single identifier pattern" syntax for values.

That does mean that they have to introduce new syntax to combine patterns (and and or), so:

const int expectedLength = 8; // assuming you can declare a local const in C#
if (something is List{Length: > 4 and < 8) list) { ... }

rather than use normal expression syntax (say Length: where it > 4 && it < 8).

I'm not sure why allowing general expressions inside patterns is bad. They contain values, and the way to express values is expressions. I can see why making some values necessarily constant can allow some optimizations, like;

case < 4: ...
case >= 4 && < 8: ...
case >= 8: ....

can be recognized as distinct ranges and optimized in some numerical way.

@leafpetersen
Copy link
Member Author

I'm not sure why allowing general expressions inside patterns is bad.

One reason is grammatical ambiguity. Consider an imaginary syntax: case e of (foo : Foo(3)) . Does Foo(3) represent a pattern, or an expression? Either could be syntactically and semantically valid. Interpreting it as a pattern means "destructure the foo field from e, check that it is a Foo containing 3". Interpreting it as an expression means "destructure the foo field from e, and check that it is equal (or identical, depending on how you define matching semantics) to the value that results from evaluating the expression Foo(3)". So you have to either avoid ambiguities like that, or resolve them in a default direction, which feels tricky to make understandable to the user. Forcing all binders in a pattern to explicitly say var is one way to help avoid ambiguities, but it doesn't solve all the problems (e.g. the above case) and it's very verbose.

Another reason is, as you mention, efficiency of compilation. For complex patterns, languages that rely on pattern matching use various strategies to pick the order in which to do tests. For a simple example, given this code:

case (fst, snd)
 of (Foo(Bar(Baz([]))), true) => something1
  | (_, true) => something2
  |  (_, false) => something3

You'd like to compile the complex match to a simple match of the form:

if (snd) {
    // Complex destructuring of fst here to decide whether you're in the first or second clause
else {
  // Definitely in the second clause
  something3;
}

This avoid doing the complex, expensive destructuring of fst unless it is necessary to decide the match. If your patterns have side effects, then this gets weird. This is already a concern for me with the pattern proposals we have: if we make destructuring user programmable (and it seems hard to avoid), then we have to decide which pieces get executed and in which order (or say that it is explicitly undefined) which limits the compiler's ability to re-order the tests. Expressions in patterns have the same problem (though again, maybe this is already lost for us).

Standard ML of NJ used this approach, I think, for building decision trees from patterns.

Lennart Augustsson had a paper on doing this for lazy languages where left to right order had to be preserved (since termination is an observable side effect).

@lrhn
Copy link
Member

lrhn commented Nov 25, 2021

Taking up grammar space and being ambiguous is a good argument.

As for using constants to do extra-efficient compilation, that's a fine feature, but it doesn't prevent you from doing non-constant values too. That would have to have a specific evaluation order, which makes it harder to optimize, but you still get exactly what you ask for. Something like:

case < start: ...
case >= start && < end: ...
case > end: ...

where start and end are not constants may still be preferable to writing an if/else chain, even if that's what it's going to work like at runtime anyway.

I also like var in binding patterns for its consistency. Every binding looks approximately the same, whether it's a declaration or a pattern binding, it's one of: var x, final x, final type x or type x.

The last one is the known pattern for doing binding type checks, if (e is String s) .... The other ones are also binding patterns, and you can actually do if (e is var s) to introduce a new local variable unconditionally.
C# is not completely orthogonal, you can write e is List<int>{Count: var len} list, but not use List<int>{Count: var len} list = e; as declaration. The property checks only works in tests, probably because they can be refutable.

I'm a little worried about inventing too much new syntax just for patterns, like relationalOperator expression. It's very narrowly tailored for a specific use. You can't use 42 > as pattern, only < 42, which means you don't get to decide the receiver of the call (not a problem in C#, might matter in Dart). Having too many little languages inside the language, rather than use the existing general expression syntax, makes things less flexible and orthogonal.
It's also true that using the full existing expression syntax makes syntax ambiguities much more likely.

@munificent
Copy link
Member

munificent commented Nov 30, 2021

I don't hate the (length : length) syntax, but I don't love it either. Maybe it would become familiar if we use that syntax with tuples as well, but in general, there's a very long-standing syntax for these sorts of things that uses braces (C, JS, SML, OCaml, probably many others) that makes it easy to see {length : length} as a record with a field named length in a way that the parenthesized version doesn't for me.

I hear you. If we were designing Dart from scratch, I'd probably go in that direction too. But I think we are already in a position where curly braces mean "map"—a homogeneous data structure where all key-value pairs have the same type. If you want a thing that exposes named properties whose individual types are known to the type system... that's kind of what an instance of a class is. And the way you construct instances of classes is Foo(field: 123, other: "value").

The mental model I have is:

  • [] List, all elements have same type.
  • {} Map, all key-value pairs have the same type.
  • () Class instance, tuple, or record. Structured object whose properties are individually known to the type system.

I look at tuples and records as basically just sugar for classes. And, in particular, I want to allow destructuring pattern matching on instances of user-defined classes so that algebraic datatype-style code is easy to write. The syntax for that in most languages I know is Foo(var1, var2). In Dart, since classes tend to be more open to extension than sum types, I expect users might want to be able to destructure just some of a class's fields. That implies something like Foo(thingIWant: var1, otherThing: var2). If we're gonna have that, it seems natural to me to use (thingIWant: var1, otherThing: var2) for records.

It also nicely allows mixing positional and named destructuring, which mirrors function calls in Dart which can mix positional and named arguments:

var call      = foo(1, 2, named: 3);
var construct = Foo(1, 2, named: 3);
var record    =    (1, 2, named: 3);
var                (a, b, named: c) = ... // Pattern.
if (expr case pattern) { ... }

This really doesn't work to me. case e discriminates on e in other languages, so e case pattern reads quite oddly.

Yeah, in SML in friends you have case e of patterns.... But in Dart, the thing we already have is:

switch (expr) { case pattern: ... }
//      expr    case pattern

So I don't think it would be too crazy to do if (expr case pattern) { ... }.

@leafpetersen
Copy link
Member Author

That implies something like Foo(thingIWant: var1, otherThing: var2). If we're gonna have that, it seems natural to me to use (thingIWant: var1, otherThing: var2) for records.

So returning to your original counter-proposal from this thread then, how does this match up?

// Patterns proposal with "if ... case" and "where" for guards:
if (list case (length:) where length > 0) { ... }

The above looks like it's using the record syntax. Give what you say above, I'd expect something like:

// Patterns proposal with "if ... case" and "where" for guards:
if (list case List(length:) where length > 0) { ... }

That feels like it is getting a bit verbose though. Is your original counter-proposal assuming some inference? I think it might be doable, but there's definitely a lot of room for ambiguity to creep in.

@munificent
Copy link
Member

munificent commented Nov 30, 2021

So returning to your original counter-proposal from this thread then, how does this match up?

// Patterns proposal with "if ... case" and "where" for guards:
if (list case (length:) where length > 0) { ... }

The above looks like it's using the record syntax.

The way to think about it is that every pattern syntax is essentially sugar for accessing a certain protocol, not just matching over a specific concrete type. So:

  • [] - Matches over any type that implements List<T>, including but not limited to the built-in List class.
  • {} - Matches over any type that implements Map<K, V>, including but not limited to the built-in Map class.
  • () - Matches over any type that implements a couple of tuple/record protocols, including but not limited to the built-in objects created by record literals. The protocol here is a little more complex than simply "implements this core library interface" because we need something more structural, but the basic concept is the same. You can use (...) patterns for any object that is "record-like", not just actual record objects.

In the proposal, the protocol for named field access in record patterns is literally just "the type has a getter with the corresponding name", so you can use (name: ...) patterns to call any getter on any object. It's most natural and idiomatic to do so for field-like getters that feel like you're destructuring an object, but it's mechically just a "getter pattern". In other words, every Dart object is record-like (though some only have a hashCode getter...).

Give what you say above, I'd expect something like:

// Patterns proposal with "if ... case" and "where" for guards:
if (list case List(length:) where length > 0) { ... }

You can do this too, though it's redundant here since we know the matched value list has static type List. The difference between (length:) and List(length:) is that the former just invokes the getter on the matched object without caring what type it has (as long as the type does have a length getter). The latter is an "extractor" pattern that combines both a type check / downcast with recursive destructuring. It matches if the object is a List and if so, then calls length to access the getter. It's sort of the OOP analog to matching on a specific constructor in a sum type.

That feels like it is getting a bit verbose though. Is your original counter-proposal assuming some inference? I think it might be doable, but there's definitely a lot of room for ambiguity to creep in.

Yes, the patterns proposal specifies how to take the static type of the matched value and use that to infer a type for the pattern. It's similar to how upwards inference works for local variables. (You can of course think of local variables as simply a non-refutable pattern match of a simple variable pattern to the initializer's value.) I'm definitely not an expert at this stuff, but I did my best to work it out in some detail.

I think type inference here is necessary for usability. For example:

var tuple = (1, 'a string');
var (a, b) = tuple;

Users would be unpleasantly surprised if a didn't end up having static type int and b type String.

@leafpetersen
Copy link
Member Author

You can do this too, though it's redundant here since we know the matched value list has static type List. The difference between (length:) and List(length:) is that the former just invokes the getter on the matched object without caring what type it has (as long as the type does have a length getter).

Ok, this makes sense.

Yes, the patterns proposal specifies how to take the static type of the matched value and use that to infer a type for the pattern.

Sorry, I wasn't specific enough. I'd assumed that there would be inference to infer the types of the bound variables from the pattern, but I was wondering more about how (name: ) and Foo(name:) related, and whether the former was just using type inference to fill in the missing Foo. I think from your explanation, there is no reason to do that.

@TimWhiting
Copy link

Out of all of the proposals to address non-null promotion of fields (stable getters, shadow) and the various if-variables / guard proposals, I feel like pattern matching has the least noise, and is more compatible with the features that users would like to see in dart in the near future (patterns & related features, data classes, destructuring, tuples)...

I would rather have the language move this way, where the changes all seem like a unified flow, rather than a bunch of random syntax noise like stable getters / shadow vars / let vars / naming subexpressions using '@'. It would also keep dart easily able to learn / pick up with more people coming from modern languages that have pattern related features where they have proved their usefulness and clarity, rather than forcing people to learn about oddities around stable getters, etc (sorry to pick on stable getters).

It also seems like this discussion has converged more easily to a point of view where most people agree it is worth doing with only slight variation of opinion on syntax. I feel like @munificent's proposal on patterns addresses how patterns might differ in dart as compared to other languages and the reasons, and I feel like it has very sound arguments in favor of doing it with the syntax he proposes.

@Levi-Lesches
Copy link

I agree. I feel like there's a lot of fragmented discussion on what's the best sugar for a few specific cases, but that's just going to end up with a language with a lot of very niche corners. One feature that covers them all -- and leaves room for expansion -- would be a real step up.

(Also for the record I'd still like to see stable getters because it's not just sugar for if-statements, it's an API design feature that will help with making dataclasses more immutable, allowing more expressions to be const, and of course, promotion.)

@TimWhiting
Copy link

Indeed. Rather than writing boring if (n != null) n + 1; people will write the same in a more modern notation: if (var _.n where n != null = this) n + 1;. Problem solved! 😄

Valid point, while pattern matching is generally useful in many cases for if-vars + more complex things, it does not make the null-check common case short. However, for your example I would argue we don't need if-vars at all. n ?+ 1 (I think this might have been proposed somewhere, or you could make an extension method n?.add(1). 😄

The real value of this proposal is in the destructuring of regular dart objects using getters (you can get multiple fields into a local var) and guarding against conditions in an if statement.

@leafpetersen leafpetersen added the field-promotion Issues related to addressing the lack of field promotion label Feb 23, 2022
@munificent
Copy link
Member

We've shipped patterns now and with if-case and null-check patterns, we do now have a pattern-based idiom for covering some field promotion cases. It may still be the case that the syntax doesn't cover all of those cases. In particular, we don't have a "guard-let" like form #2537 to bind variables in the rest of the block. And it may be that the pattern syntax we have isn't great for this idiom.

But there is something there. Do we want to keep this issue open?

@leafpetersen
Copy link
Member Author

Closing, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
field-promotion Issues related to addressing the lack of field promotion
Projects
None yet
Development

No branches or pull requests

5 participants