Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pattern-matching in let clauses in Linq #6877

Closed
gafter opened this issue Nov 19, 2015 · 88 comments
Closed

Support pattern-matching in let clauses in Linq #6877

gafter opened this issue Nov 19, 2015 · 88 comments

Comments

@gafter
Copy link
Member

gafter commented Nov 19, 2015

Given the let-statement proposed in the draft pattern-matching specification for #206, it has been proposed by @alrz to extend the let clause in Linq to support pattern-matching as well:

because of the similarities with let statement I think that would be reasonable to support patterns in it.

from person in list
let Student { Grade is var grade, Name is var name } = person
where grade > 65
select new { Grade = grade, Name = name };
...
// equivalent to
...
from person in list
let $x = person switch(
  case Student { Grade is var grade, Name is var name }: new {grade, name},
  case *: null
)
where $x?.grade > 65        
select new { Grade = $x.grade, Name = $x.name }   
...
// complete patterns
...
from tuple in list
let (var x, var y) = tuple
where x > y
select x + y
...
// equivalent to
...
from tuple in list
let $x = tuple case (var x, var y) : new {x,y}
where $x.x > $x.y
select $x.x + $x.y

etc.

It isn't clear how this would be expected to work when the pattern is fallible.

@mattwar
Copy link
Contributor

mattwar commented Nov 19, 2015

You could make it also where as a where clause and omit anything that doesn't match.

@HaloFour
Copy link

@gafter

I agree, letting let do double-duty here seems intuitive. But I also agree that failed patterns pose a potential problem. Having the failed patterns omitted automatically seems a little too implicit. Maybe support something like else continue?

from person in list
let Student { Grade is var grade, Name is var name } = person else continue
where grade > 65
select new { Grade = grade, Name = name };

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@mattwar I agree, but which Linq method would it translate into? .Where doesn't have a mechanism for adding new variables, and .Let doesn't have a mechanism for filtering.

@alrz
Copy link
Member

alrz commented Nov 19, 2015

@HaloFour that's exactly the issue I was thinking about, it implicitly skips failed patterns. but else continue seems fair.

@HaloFour
Copy link

With let already providing a bit of syntactic voodoo in LINQ I think that it's reasonable that it would be translated out into multiple extension method calls.

Going with implicit filtering:

from person in list
let Student { Grade is var grade, Name is var name } = person
where grade > 65
select new { Grade = grade, Name = name };

converted into:

list
    // beginning of let expansion
    .Select(person => person switch (
        case Student { Grade is var grade, Name is var Name } :
            (true, person, grade, name)
        case * :
            (false, person, default(int), default(string))
    ))
    .Where(tuple => tuple.Item1)
    .Select(tuple => new { person = tuple.Item2, grade = tuple.Item3, name = tuple.Item4 })
    // end of let expansion
    .Where(projected => projected.grade > 65)
    .Select(projected => new { Grade = projected.grade, Name = projected.name });

@alrz
Copy link
Member

alrz commented Nov 19, 2015

I'd say it could be Let:

list
.Let(person => person switch( ... ))
.Where($x => $x.grade > 65)
.Select($x => new { Grade = $x.grade, Name = $x.name });

public static IEnumerable<TResult> Let<T, TResult>(
  this IEnumerable<T> source,
  Func<T, TResult> selector)
{
  foreach(var item in source) {
    var result = selector(item);
    if (result != null) yield return result;
  }
}

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@alrz What does result != null mean when you don't know if it is a reference type or not?

@alrz
Copy link
Member

alrz commented Nov 19, 2015

@gafter It will be an anonymous type anyway, I forgot to put the class constraint.

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@alrz No, it is not required to be an anonymous type. We were hoping to start using tuples in many cases for translating let to reduce GC pressure.

@HaloFour
Copy link

@gafter Great point. The compiler could spit out a private static function that accepts the original range variable (or a tuple of all current range variables) and returns an IEnumerable<tuple> of the original range variable(s) plus the variable patterns, then yield return a single result on a match.

Going with implicit filtering:

from person in list
let Student { Grade is var grade, Name is var name } = person
where grade > 65
select new { Grade = grade, Name = name };

converted (somewhat) into:

IEnumerable<(Person, int, string)> $match(Person person) {
    switch (person) {
        case Student { Grade is var grade, Name is var Name }:
            yield return (person, grade, name);
        default: yield break;
    };
};

list
    .SelectMany($match)
    .Where(projected => projected.Item2 > 65)
    .Select(projected => new { Grade = projected.Item2, Name = projected.Item3 });

@alrz
Copy link
Member

alrz commented Nov 19, 2015

@gafter I was assuming the current implementation, I did mention this in the tuple issue, too. By the way, that's good to know that it _is_ going to use tuples. 👍

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@HaloFour The Linq pattern is not tied to IEnumerable... it would be a shame if this pattern-matching support worked for IEnumerable but no other types.

@HaloFour
Copy link

@gafter Apologies for botching the example. I didn't imply that it would be tied to IEnumerable, aside the fact that let would use it to return either 0 or 1 results per attempted match, which I thought was what you were going for.

@HaloFour
Copy link

@gafter Wait, my example was correct. I'm not sure where the confusion is?

@bbarry
Copy link

bbarry commented Nov 19, 2015

from person in list
let Student { Grade is var grade, Name is var name } = person
where grade > 65
select new { Grade = grade, Name = name };

into

list
    .SelectMany(person => person switch (
        case Student { Grade is var grade, Name is var name } : 
            new [] { (person, grade, name) },
        case * : 
            Enumerable.Empty<...>()
    ))
    .Where(projected => projected.Item2 > 65)
    .Select(projected => new { Grade = projected.Item2, Name = projected.Item3 });

?

Is it possible to avoid the allocation for that array?

@HaloFour
Copy link

@bbarry

The compiler wouldn't be forced to impose the restriction of closures not being iterators on itself. I assume that the compiler would emit a private static iterator function and then pass a delegate to that function to SelectMany.

Your example using arrays is much clearer than mine in illustrating the concept, I think.

@orthoxerox
Copy link
Contributor

What about this approach?

var students =
from person in people
where person is Student student
select student;
//is transformed into
var students = people
.Select(person => (person is Student student)?(true, student):(false, default(Student)))
.Where(tuple => tuple.Item1 = true)
.Select(tuple => tuple.Item2);

@HaloFour
Copy link

@orthoxerox Looks pretty similar to my first suggestion for an expansion, although you're using is with a ternary rather than the switch expression.

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

Why are all you folks tying this to Enumerable (e.g. using Enumerable.Empty or having the compiler generate a method that performs yield return)? The Linq feature can be used with any type that fits the API pattern (e.g. PLINQ, Linq to SQL, IObservable, etc). If we have this construct force the result to be Enumerable, then we've broken one of the important features of Linq.

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@HaloFour Your first proposed expansion (also @orthoxerox 's most recent proposed expansion) would work.

@bbarry
Copy link

bbarry commented Nov 19, 2015

@gafter I thought you said SelectMany above; I was trying to figure out how that would have worked. I think @HaloFour's tuple expansion #6877 (comment) is better. It seems odd to have let reduce the number of items in the statement...

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@bbarry I thought SelectMany could be used, but I tried and now I don't think so.

@HaloFour
Copy link

@gafter Indeed, implementing this via SelectMany would lock it to IEnumerable<T> under the hood as the mechanism to return zero or one value. It was your idea, we were following it. If this tie to IEnumerable<T> is unacceptable then that's fine, but you forgot to mention that before chastising us for chasing your concept.

Anyway, that is all implementation details. I'd be more concerned with hammering down the semantics and syntax (and variations thereof) at this stage and worrying about how exactly that could be expanded later. Particularly of interest would be any capability of this feature functioning as a filter as well as a projection, either by default or through additional syntax.

Aside that, it is nice to hear that LINQ may take advantage of tuples for let projections.

@alrz
Copy link
Member

alrz commented Nov 19, 2015

@HaloFour SelectMany is not tied to IEnumerable<T>, implement it for your desired type and it will work

static M<U> SelectMany<T, U>(this M<T> m, Func<T, M<U>> k)

No need to mention, it's the bind operator.

@gafter
Copy link
Member Author

gafter commented Nov 19, 2015

@alrz How would you use SelectMany to implement a pattern-based let clause, i.e. so that it can return one result or none, without tying it to IEnumerable?

@alrz
Copy link
Member

alrz commented Nov 24, 2015

Nevertheless, I think for

from item in db.Products
select (item.Name, item.Model)

it would be reasonable that it be translated as

db.Products
.Select(item => new {item.Name,item.Model})
.Select(t=>(t.Name,t.Model))

Otherwise, if you accidentally use a tuple for projecting IQueryable<T> you are just accidentally screwed.

@HaloFour
Copy link

@alrz

You'd be opting-into tuples explicitly in that case which is very different from whatever let automagickally does under the hood. Note that your translation would suffer the same amount of screwed-ness given both Selects would be in the expression tree and would have to be translated by Entity Framework or whatever queryable provider.

@alrz
Copy link
Member

alrz commented Nov 24, 2015

@HaloFour I'm thinking that EF guys should be loud about this and write a big bold statement somewhere that states DON'T USE TUPLES, while the main motivation behind implementing the same behavior as anonymous types (inferring item/property names from the input expression) is using it with EF (at least this is what it would be expected).

@bbarry
Copy link

bbarry commented Nov 24, 2015

There is nothing stating that linq expression tree visitor code couldn't be modified to support tuples, that is an enhancement on those things.

I think it would be useful for the expression tree representing that tuple to know the compile time names used, but I don't see why a person who "accidentally" projects an IQueryable<T> into a tuple shouldn't be permitted to get exactly the thing they expressed in code.

@HaloFour
Copy link

@alrz I bet they add support for projecting to tuples long before any of this ships.

@alrz
Copy link
Member

alrz commented Nov 24, 2015

@bbarry I would totally love that. Expression trees are generated in compile-time, so are tuple names. nothing should stop you from being able to get those sweet names from an expression tree (#2060).

@paulomorgado
Copy link

@gafter,

Not "oneply", "twoply" or "threeply" in particular. Just "tuply" in general. 😄

/cc @HaloFour

@alrz
Copy link
Member

alrz commented Jan 26, 2016

If we unify nullables, with help of higher kinded types I think we can implement generic LINQ operations for nullables and utilize it for incomplete let (because that's what it is! it changes the type of the collection to nullable).

Besides, although we have nullable reference types, T? meaning two things, depending on the type, is just awful.

@HaloFour
Copy link

HaloFour commented Feb 3, 2016

I'm going to posit that by default a failed pattern should result in a run-time exception. As with pattern matching in any other context the compiler should go to whatever lengths it can to prevent that. Where the pattern may be fallible I think that reusing the else keyword with an embedded statement is useful and should follow very similar rules to that of using let for destructuring outside of LINQ. The one difference is that the embedded-statement cannot contain a return statement since that wouldn't make a lot of sense in the LINQ query. But since the LINQ query can be considered a looping construct I think that continue and break should both be acceptable.

In the case of continue the LINQ query will omit the elements that do not match the pattern:

from person in list
let Student { Grade is var grade, Name is var name } = person else continue
where grade > 65
select new { Grade = grade, Name = name };

// translated into

list
    // beginning of let expansion
    .Select(person => person switch (
        case Student { Grade is var grade, Name is var Name } :
            (true, person, grade, name)
        case * :
            (false, person, default(int), default(string))
    ))
    .Where(tuple => tuple.Item1)
    .Select(tuple => new { person = tuple.Item2, grade = tuple.Item3, name = tuple.Item4 })
    // end of let expansion
    .Where(projected => projected.grade > 65)
    .Select(projected => new { Grade = projected.grade, Name = projected.name });

In the case of break the LINQ query will terminate the sequence:

from person in list
let Student { Grade is var grade, Name is var name } = person else break
where grade > 65
select new { Grade = grade, Name = name };

// translated into

list
    // beginning of let expansion
    .Select(person => person switch (
        case Student { Grade is var grade, Name is var Name } :
            (true, person, grade, name)
        case * :
            (false, person, default(int), default(string))
    ))
    .TakeWhile(tuple => tuple.Item1)
    .Select(tuple => new { person = tuple.Item2, grade = tuple.Item3, name = tuple.Item4 })
    // end of let expansion
    .Where(projected => projected.grade > 65)
    .Select(projected => new { Grade = projected.grade, Name = projected.name });

And, for completeness, support throw:

from person in list
let Student { Grade is var grade, Name is var name } = person else throw new InvalidOperationException("oops!")
where grade > 65
select new { Grade = grade, Name = name };

// translated into

list
    // beginning of let expansion
    .Select(person => person switch (
        case Student { Grade is var grade, Name is var Name } :
            (person, grade, name)
        case * :
            throw new InvalidOperationException("oops!")
    ))
    .Select(tuple => new { person = tuple.Item1, grade = tuple.Item2, name = tuple.Item3 })
    // end of let expansion
    .Where(projected => projected.grade > 65)
    .Select(projected => new { Grade = projected.grade, Name = projected.name });

And, of course, multiline embedded statements:

from person in list
let Student { Grade is var grade, Name is var name } = person else {
    Console.WriteLine("whoa, unexpected!");
    continue;
}
where grade > 65
select new { Grade = grade, Name = name };

// translated into

list
    // beginning of let expansion
    .Select(person => {
        switch (person) {
            case Student { Grade is var grade, Name is var Name } :
                return (true, person, grade, name);
            case * :
                Console.WriteLine("whoa, unexpected!");
                return (false, person, default(int), default(string));
        }
    })
    .Where(tuple => tuple.Item1)
    .Select(tuple => new { person = tuple.Item2, grade = tuple.Item3, name = tuple.Item4 })
    // end of let expansion
    .Where(projected => projected.grade > 65)
    .Select(projected => new { Grade = projected.grade, Name = projected.name });

@alrz
Copy link
Member

alrz commented Feb 3, 2016

@HaloFour Still, you are translating let to more than one LINQ operator (and worse, depending on the else clause). Supporting break, continue or throw in query expressions just for let doesn't make much sense, and also, you assumed that continue and break are expressions, otherwise the else clause should have some weird production rules to support them besides of throw. However, I believe they should be expressions to be able to write something like #5143 comment.

My two cents,

extension<T> IEnumerable<T> {
  public IEnumerable<TResult> Select<TResult>(Func<T, Option<TResult>> selector) {
    foreach(var item in this) {
      let Some(var value) = selector(item) else continue;
      yield return value;
    }
  }
}

from obj in list
let Foo foo = obj
select foo;

list.Select(obj => obj is Foo foo ? Some(foo) : None());

from person in list
let Student { Grade is var grade, Name is var name } = person
where grade > 65
select new { Grade = grade, Name = name };

list.Select(person => person is Student { Grade is var grade, Name is var name } ?
  Some((grade,name)) : None())
.Where(t => t.grade > 65)
.Select(t => new { Grade = t.grade, Name = t.name});

Then you might think of else continue to make it explicit.

Using tuples for fallible patterns is not intuitive at all, because still you need to allocate default values which totally loses the point. I think an option type is the best solution to this kind of problems, regardless of what language you are coding in.

@HaloFour
Copy link

HaloFour commented Feb 3, 2016

@alrz

Still, you are translating let to more than one LINQ operator (and worse, depending on the else clause).

There is nothing that states that a LINQ clause must be directly translated to a single extension method. I am working with what I have. .NET 3.5 LINQ has no single function which performs both a map and a filter. If a future version of the BCL does include such a function then that would be a good target for optimization, but as it stands I see no reason to limit the feature to some not-yet-released CLR.

I am aware that I am using tuples in my example translation which would not exist until a new BCL is released. When targeting a currently released framework the translation would use anonymous types instead, just as it does today with let range variables.

Supporting break, continue or throw in query expressions just for let doesn't make much sense, and also, you assumed that continue and break are expressions, otherwise the else clause should have some weird production rules to support them besides of throw.

I think it makes a lot of sense and also provides good symmetry for the non-LINQ destructuring form of let. And none of those statements need to be treated as expressions as the else clause cannot produce a value. What follows the else clause is an embedded-statement which must not reach the end of the block. This would be the exact same syntax and behavior that would be expected of the non-LINQ destructuring statement when used within a loop.

Using tuples for fallible patterns is not intuitive at all, because still you need to allocate default values which totally loses the point.

The developer doesn't have to do anything. The use of anonymous types, tuples or even Option<T> would be strictly an implementation detail. And the compiler using default(T) to fill in those empty anonymous type or tuple properties is effectively free.

Also, I don't think that going for an "overload" of Select is a good idea for your proposed extension method. There are valid reasons to want to include the None results in the projected sequence.

@alrz
Copy link
Member

alrz commented Feb 4, 2016

@HaloFour

There is nothing that states that a LINQ clause must be directly translated to a single extension method.

I think that was the first thing that came up and lead to SelectMany etc. Otherwise an additional Where would just work.

I see no reason to limit the feature to some not-yet-released CLR.

Same was true for await and it is true for tuple types now, but they probably can be supported through some libraries in earlier versions.

What follows the else clause is an embedded-statement

So, perhaps you need to end your embedded statement with a semicolon? Furthermore, embedding an embedded statement in a query expression doesn't seem like a good idea.

The developer doesn't have to do anything.

True, but those values will be allocated eventually right?

@HaloFour
Copy link

HaloFour commented Feb 4, 2016

@alrz

I think that was the first thing that came up and lead to SelectMany etc. Otherwise an additional Where would just work.

That was a conceptual implementation that would have avoided the multiple extension methods but would have created an unwanted dependency on IEnumerable<T>. If you read back @gafter was satisfied with my original transformation which I simply repeated here.

Same was true for await and it is true for tuple types now, but they probably can be supported through some libraries in earlier versions.

Actually await follows a convention and has no dependency on Task<T>. async does, however. It has been mentioned that the language team wanted to avoid this dependency but didn't have a good way to do so. There's no reason to create a dependency when one doesn't have to, and my transformations demonstrates that it's not necessary. That's not to mean that a newer better implementation can't be used when targeting a newer framework.

So, perhaps you need to end your embedded statement with a semicolon? Furthermore, embedding an embedded statement in a query expression doesn't seem like a good idea.

Why? It's not the end of the expression. I don't see any problem with embedding a statement or a block in a query expression. However if that would be considered confusing I personally would be fine with that statement being limited to break, continue or throw. In either case the clause does not need to be an expression since no value is being returned.

True, but those values will be allocated eventually right?

No, default(string) is literally the same as null, and default(int) is literally the same as 0. Zero allocations are required. I could have used null in their place in my example translations but then I'd probably have to include type-casts to keep the pseudo-compiler happy.

@alrz
Copy link
Member

alrz commented Feb 4, 2016

@HaloFour "Why?" because the semicolon is a part of an embedded statement. "the clause does not need to be an expression since no value is being returned." That is also true for throw, then I don't see why continue or break can't be an expression to be used in switch in the example that I mentioned above. "No, default(string) is literally the same as null" I meant the additional wasted pointer, no allocations. PS: At first your comment looked like the unfinished portrait of Franklin Roosvelt 😄

@HaloFour
Copy link

HaloFour commented Feb 4, 2016

@alrz

because the semicolon is a part of an embedded statement.

embedded-statement is a list of other potential statements. Some of those statements do have semicolon requirements, although block doesn't. I'd be fine if the spec for let in this context differed syntactically as obviously a semicolon at that part of the query makes no sense.

That is also true for throw, then I don't see why continue or break can't be an expression to be used in switch in the example that I mentioned above.

Probably because you're using the expression form of switch and it was mentioned that it was a bad idea to let the middle of an expression use control statements. That's a different proposal altogether, though. The non-LINQ let would permit continue or break in it's else clause so this would provide a nice symmetry, and these control statements can only affect the query, not the declaring method.

I meant the additional wasted pointer, no allocation.

64-bits of empty space, no big deal. I'm all for going with a more optimal route given a newer BCL but I think it's ridiculous to not permit a language feature to work on the older frameworks when there is clearly an easy way to implement it, even if it might be slightly less efficient.

PS: At first your comment looked like the unfinished portrait of Franklin Roosevelt

I managed to accidentally find some kind of keyboard combination which submits the comment immediately. I don't even know what I hit 😦

@alrz
Copy link
Member

alrz commented Feb 4, 2016

@HaloFour As the last inquire, would it make sense to turn let clause to let statement in case of presence of do as the last part of the query (#1938) so it doesn't take the unpleasant translation of an incomplete let to tuples with an additional bool?

from obj in list
let Foo foo = obj else continue
do WriteLine(foo.Bar);

foreach(var obj in list) {
  let Foo foo = obj else continue;
  WriteLine(foo.Bar);
}

@HaloFour
Copy link

HaloFour commented Feb 4, 2016

@alrz Works for me. I'd want the compiler to use the most efficient implementation that it could.

@orthoxerox
Copy link
Contributor

@alrz LINQ from-syntax is always rewritten into LINQ functional syntax. You would need to add a lookahead translation rule like the one for from-from-select, for example.

from x1 in e1
let P1 x2 = e2 else continue
do s3;

would be rewritten into:

e1.Select(x1 => {
    let P1 x2 = e2 else return;
    { s3; }
})

@alrz
Copy link
Member

alrz commented Feb 4, 2016

@orthoxerox

Per @gafter's comment, it will be a new statement form, not on top of the existing query-expression, but I'm not sure about let translation. And your example will be translated to foreach. See #1938.

@gafter
Copy link
Member Author

gafter commented Mar 27, 2017

Issue moved to dotnet/csharplang #355 via ZenHub

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants