Extend LINQ syntax #100

Closed
scalablecory opened this issue Jan 28, 2015 · 83 comments

Comments

@scalablecory

I can't count the number of times I've written something like this:

var x = (from item in collection
         where item.foo
         select item.bar).Single();

Readability takes a nose dive, especially with complex or nested queries. There's a lot we can do to improve what is one of C#'s most powerful features. From low-hanging fruit that are compatible with existing infrastructure like ORMs:

from item in collection
select single item.foo;

from item in collection
select first item.foo or default;

from item in collection
select sum of item.foo;

from item in collection
skip 5
select top 3 item.foo;

from item in collection
left join item2 in collection2 on item.foo equals item2.foo
select new { item, item2 };

To bits which are currently only usable by LINQ to Objects:

from item in collection
group item.foo by item.bar into grp using StringComparer.InvariantCulture
select grp;

from item in collection by idx
select "item " + item + " at index " + idx;
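
For comparison, each proposed clause above corresponds to an existing `System.Linq.Enumerable` call. The following is only a sketch of how those queries can be written in today's C#. The `Item` class, its `Foo`/`Bar` members, and the method names are hypothetical stand-ins for the `item.foo`/`item.bar` placeholders in the proposal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for the proposal's `item`.
public sealed class Item
{
    public int Foo { get; set; }
    public string Bar { get; set; } = "";
}

public static class TodayEquivalents
{
    // Proposed: select single item.foo
    public static int SingleFoo(IEnumerable<Item> collection) =>
        collection.Select(item => item.Foo).Single();

    // Proposed: select first item.foo or default
    public static int FirstFooOrDefault(IEnumerable<Item> collection) =>
        collection.Select(item => item.Foo).FirstOrDefault();

    // Proposed: select sum of item.foo
    public static int SumOfFoo(IEnumerable<Item> collection) =>
        collection.Sum(item => item.Foo);

    // Proposed: skip 5 ... select top 3 item.foo
    public static List<int> SkipThenTop(IEnumerable<Item> collection) =>
        collection.Skip(5).Take(3).Select(item => item.Foo).ToList();

    // Proposed: left join. Today this needs a group join plus DefaultIfEmpty,
    // so unmatched left items are paired with null.
    public static List<(Item Left, Item Right)> LeftJoin(
        IEnumerable<Item> collection, IEnumerable<Item> collection2) =>
        (from item in collection
         join item2 in collection2 on item.Foo equals item2.Foo into matches
         from match in matches.DefaultIfEmpty()
         select (Left: item, Right: match)).ToList();

    // Proposed: group ... using StringComparer.InvariantCulture
    public static IEnumerable<IGrouping<string, int>> GroupWithComparer(
        IEnumerable<Item> collection) =>
        collection.GroupBy(item => item.Bar, item => item.Foo,
                           StringComparer.InvariantCulture);

    // Proposed: from item in collection by idx
    public static IEnumerable<string> WithIndex(IEnumerable<Item> collection) =>
        collection.Select((item, idx) => "item " + item.Bar + " at index " + idx);
}
```

The last two overloads (a comparer argument and an indexed selector) have no query-syntax form at all, which is why the proposal groups them separately.
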
@ErikSchierboom

I really like this idea. Having to add the parentheses around the expression just to use the Single() call always seemed a bit unnecessary.

@svick
Contributor

svick commented Jan 28, 2015

I agree that something like this would be very useful, but I don't like that you're proposing 4 different syntaxes for 4 different, but very similar, operations.

What I would prefer is a single syntax that can be used to call any "finishing" method. VB.NET already has something like this, though I think its syntax is not ideal.

@eklam

eklam commented Jan 28, 2015

I like the idea as well, but your first query (and most of the others) could be made simpler just writing like this:
var x = collection.Single(item => item.foo);
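
One caveat with this shorthand: `Single(predicate)` returns the matching element itself, whereas the query in the opening post projects `item.bar`. A minimal sketch of forms that do produce the same value, assuming a hypothetical `Entry` type with `Foo` and `Bar` members:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; Foo and Bar mirror item.foo and item.bar.
public sealed class Entry
{
    public bool Foo { get; set; }
    public string Bar { get; set; } = "";
}

public static class SingleForms
{
    // Query syntax from the opening post, wrapped in parentheses.
    public static string QueryForm(IEnumerable<Entry> collection) =>
        (from item in collection
         where item.Foo
         select item.Bar).Single();

    // Direct method-syntax translation of the same query.
    public static string MethodForm(IEnumerable<Entry> collection) =>
        collection.Where(item => item.Foo).Select(item => item.Bar).Single();

    // The predicate overload, with the projection applied afterwards.
    public static string PredicateForm(IEnumerable<Entry> collection) =>
        collection.Single(item => item.Foo).Bar;
}
```
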

@DustinCampbell
Member

How about this?

Aggregate item In collection
Where item.foo
Into Sum

😄

@giggio

giggio commented Jan 28, 2015

I like this idea a lot.

@scalablecory
Author

@svick @DustinCampbell On the surface I do like the idea of a more generic syntax. If we consider "from" to be a multi-result reduce, a single-result reduce keyword may be well anchored. I worry that readability will be sacrificed as it breaks out of the current Plain English syntax.

@jwooley

jwooley commented Jan 28, 2015

While we're at it, go ahead and add Distinct, Skip and Take. Just steal the code from the VB code base.

@axel-habermaier
Contributor

@scalablecory: Agreed - this is a common thing to do and has always bothered me. So much, even, that I usually prefer the method-style syntax.

On the other hand, I like how F# supports most LINQ operations quite nicely. The underlying mechanism (computation expressions) even allows for extensibility, so in F# you can easily define your own LINQ variant with all the operators you could ever want. That's probably not necessary for C# as long as the most common operators are supported out-of-the-box.

@AdamSpeight2008
Contributor

Extend LINQ query syntax.
Simple Sequence

From y In 1 To 10
From x In 10 To 1 Step -1

Syntax is close to what is already used by VB's for loop.

It could be extended to include datetimes.

hours = From hr In #2015/01/01# To #2015/12/31# Step #01:00:00#

@metatron-the-chronicler

I think that this could be solved in a much more comprehensive way if C# allowed functions to be used as infix operators. Rather than extending the Linq syntax, C#'s syntax should be extended to take care of this without it being limited to just what the compiler team has the time to provide.

@gafter
Member

gafter commented Jun 23, 2015

@metatron-the-chronicler Do you have a specific proposal to offer?

@metatron-the-chronicler

I actually wanted to make this a specific proposal but I haven't had the time to sit down and write it.

At any rate I think it would be a good idea to simply allow functions to be used in the infix position.

One might write something like

from c in collection
select somefield ToList()

The important thing is that ToList isn't some special syntax or expression. The compiler simply resolves it to the method we all know and love based on its position (though I can imagine that this might make parsing more complicated).

Of course the same infix notation would ideally be available outside the context of a LINQ expression. The main benefit of allowing any function to be used in the infix position, other than resolving this particular issue, would be allowing for cleaner DSLs.

I've always found it to be a real shame that the query syntax wasn't implemented in this sort of way from the beginning. Though I suppose that doing so has other technical implications that I don't have the knowledge to dispute.

@gafter
Member

gafter commented Jun 24, 2015

@metatron-the-chronicler I'd be interested in seeing your proposal if you ever get around to it.

@scalablecory
Author

@metatron-the-chronicler Maybe use attributes to define new linq keywords, so it doesn't have to break out of the Plain English look.

@GSPP

GSPP commented Jun 24, 2015

select top 3 item.foo; really is not in the spirit of LINQ. The design principle is that you chain separate operations. select and top should not be mixed. SQL certainly is not a nice language and we do not strive to make LINQ more SQL like. LINQ is better.

What about:

from x in list
orderby x.Prop
skip 20
take 10
select x.Something

Regarding the materialization clause, let's make it support any collection at all:

from x in list
...
select x.SomeInt
as List<int>

or:

as int[]

or:

as HashSet<int>

Maybe we can shortcut to

as HashSet<>, List<>, []

We need some way to make the compiler pick up the appropriate conversion. Maybe look for a constructor with an IEnumerable signature.


Also, distinctness would be useful:

from x in list
distinct x by x.SomeKey
select x

We need new methods in the BCL: DistinctBy, ExceptBy, IntersectBy, UnionBy, etc.


Also, there is one helper method that comes up again and again on Stack Overflow: A method that can be used to partition a sequence into chunks of fixed length. This really should be added to the Enumerable class.


I work a lot with data, often even big, nasty ETL-style queries. These features would be extremely handy, especially the materialization feature, because you often need to materialize for debuggability. This forces an ugly syntax: you need to wrap the query in parentheses and add this dangling .ToList() at the end. Code formatters have trouble with that.
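
The two library requests above (key-based distinctness and fixed-length chunking) can be sketched as ordinary extension methods. The names `DistinctByKey` and `InChunksOf` are hypothetical and chosen only for illustration. .NET 6 later added `Enumerable.DistinctBy` and `Enumerable.Chunk`, which cover the same ground.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceSketches
{
    // Yields the first element seen for each distinct key, in source order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet.Add returns false when the key was already present.
            if (seen.Add(keySelector(element)))
                yield return element;
        }
    }

    // Splits the source into lists of `size` elements; the last list may be shorter.
    // Because this is an iterator, the argument check runs on first enumeration.
    public static IEnumerable<List<T>> InChunksOf<T>(
        this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var chunk = new List<T>(size);
        foreach (var element in source)
        {
            chunk.Add(element);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }
        if (chunk.Count > 0)
            yield return chunk;
    }
}
```
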

@svick
Contributor

svick commented Jun 24, 2015

@GSPP

Also, there is one helper method that comes up again and again on Stack Overflow: A method that can be used to partition a sequence into chunks of fixed length. This really should be added to the Enumerable class.

You might want to propose that at the corefx repo, since Enumerable is not part of Roslyn.

@GSPP

GSPP commented Jun 24, 2015

@HaloFour

@GSPP I'm sure you meant it as example syntax, but select <expression> as <type> is already valid LINQ syntax for attempting to type cast the result of the expression over each element in the sequence, e.g.:

var managers = from employee in employees
               where employee.IsManager
               select employee as Manager;

@jnm2
Contributor

jnm2 commented Jun 24, 2015

Single, SingleOrDefault, FirstOrDefault or ToList is on just about every LINQ expression I write. This would be much appreciated.

Would Sum, Min and Max be able to be used this way?

Also, Entity Framework has SingleAsync, ToListAsync, etc. I don't suppose extension methods could be included? @metatron-the-chronicler's method would allow that.

@GSPP

GSPP commented Jun 24, 2015

@HaloFour true. Suggestions?

from x in list
where x != null
select x into List<>

Maybe? Should be unambiguous. But the pattern doesn't work for Single etc. Probably this should not be based on type but on (extension) methods.

@jwooley

jwooley commented Jun 24, 2015

VB already supports the extra LINQ methods for Aggregate, Skip, Take, Distinct, Sum, Count. I have wondered what things would look like if you could take an arbitrary extension method and have the compiler turn it into a LINQ operator. I suspect this is along the same lines as the infix recommendation. Consider the following:

<Extension>
Public Function PutMeInline(Of T)(
    items As IEnumerable(Of T),
    predicate As Func(Of T, Boolean)) As IEnumerable(Of T)

    ' Do something and yield results
End Function

Public Sub Main
    Dim nums = {1, 2, 3}
    Dim test =
        From num In nums
        PutMeInline num Mod 2 = 0
        Select num
        Distinct
        ToListAsync
End Sub

While this is an intriguing possibility, I’m sure the number of edge cases where the compiler would choke would make such a feature unwieldy.

@HaloFour

@GSPP Unfortunately into is also valid there in projecting the partial result into a new named range variable:

from x in list
where x != null
select x into List
where List == null
select List
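
To make the conflict concrete, here is a small sketch of what `as` and `into` already mean inside a query today. The `Employee` and `Manager` types and the method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration.
public class Employee { public string Name { get; set; } = ""; }
public sealed class Manager : Employee { }

public static class ExistingKeywords
{
    // `as` after select is an ordinary per-element cast:
    // elements that are not a Manager become null.
    public static List<Manager> CastEach(IEnumerable<Employee> employees) =>
        (from employee in employees
         select employee as Manager).ToList();

    // `into` starts a query continuation: `square` is a new range
    // variable, and `n` is no longer in scope after it.
    public static List<int> Continue(IEnumerable<int> numbers) =>
        (from n in numbers
         where n > 0
         select n * n into square
         where square > 10
         select square).ToList();
}
```

Any terminating clause spelled `as` or `into` would have to be distinguishable from both of these existing forms.
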

I'm not sure what would work best. While as and into both make sense, they're both already used, and they also imply how the query would be terminated, which is not necessarily correct. Neither of those terms seems to make sense if you wanted to terminate with Any or All.

🍝

I might start by borrowing the existing LINQ clauses from VB.NET (distinct, aggregate, skip and take), which are probably all common use cases. Beyond that, is it worthwhile to have a separate clause that would invoke a final method on the entire query?

from x in list
where x != null
then ToList();  // or apply, or terminate, or frob

Or a way to mark extension methods as extending the LINQ syntax with a well defined behavior as to how parameters are treated?

[LinqClause("tolist")]
public static List<T> ToList<T>(this IEnumerable<T> source) {
    // redundant much?
    return source.ToList();
}

from x in list
where x != null
tolist;

@weitzhandler
Contributor

I strongly vote for this one.

I never use LINQ query syntax; my coding guidelines say to avoid it and always call the extension methods directly. That's because of the ugly parentheses every query has to be wrapped in to materialize it (ToArray, ToList, Sum, SingleOrDefault etc.).

Until this issue is addressed, the built-in LINQ query syntax is practically useless.
I really hope to see this implemented soon, maybe to a more limited extent (avoiding the introduction of a gazillion new language keywords).

I'd say the syntax should provide an operator that takes a shorthand call to an extension method available for IEnumerable<TElement>, for instance:

//parents is Parent[]
var parents = from student in students
              where student.Age < 18
              select student.Parent
              call ToArray()

//student is Student
var student = from st in students
              call SingleOrDefault(st => st.Id == id);

Asynchronous methods should also be supported:

var student = from st in students
              call await SingleOrDefaultAsync(st => st.Id == id);

Maybe there should be a verbose LINQ fashioned way to pass the arguments and completely avoid the use of parenthesis, but I personally don't see it as necessary.

Anyway this feature is crucial for the completeness of the LINQ syntax.

Some suggestions above proposed putting ToList at the beginning of the query, but that's a no-go: we want to be able to process the query after the selection, and we don't want to be limited to parameterless extension methods. What if we want to call ToLookup with a key selector? Bottom line: we can't tie this to specific language keywords. We just need a way to call any Enumerable extension method on the current query state and treat its result as the final type of the query.

@metatron-the-chronicler

If the compiler does not do it already would it be impossible to just require the type to be given to the identifier being assigned to and have the compiler do the conversion itself?

@weitzhandler
Contributor

There are so many open issues all addressing the same idea.
Wasn't sure which one is the actual discussion.
I'll happily delete all of them but one if I know there is a central active place.

@AdamSpeight2008
Contributor

What would be a good addition, is to also have the item index as well.

@alrz
Member

alrz commented Dec 17, 2015

As @AdamSpeight2008 said, Aggregate in VB can already do this, but it's somewhat limited; e.g. it can't be used with the ToDictionary overload that takes two parameters. It could be a great addition to C#, though. I also suggest that Aggregate support Partition By and Order By for windowing.

@Eirenarch

Adding terminating operators as keywords is probably not a good idea because there are too many of them. Single, SingleOrDefault, SingleAsync, SingleOrDefaultAsync, First, FirstOrDefault, FirstAsync, FirstOrDefaultAsync, Count, CountAsync, ToList, ToListAsync, ToDictionary, ToDictionaryAsync, ToArray, ToArrayAsync... where does it end?

On the other hand I think it is a very good idea to support non-terminating operators specifically Skip, Take, Distinct.

@MadsTorgersen
Contributor

We'll keep this on the backlog as a reminder to consider something here.

The best proposal I've seen was

var x = 
    from item in collection
    where item.foo
    select item.bar
    do Single();

(Or some other keyword). The idea is to add a query operator that is like a . but with different precedence.
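
The precedence point can be illustrated with today's syntax: a `.` written after `select` binds to the select expression, so reaching the whole query requires parentheses. This is only a sketch with hypothetical method names. The proposed `do` keyword does not exist in C#.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PrecedenceDemo
{
    // The call binds to the select expression `w`:
    // the result is the first character of every word.
    public static List<char> DotInsideSelect(IEnumerable<string> words) =>
        (from w in words
         select w.First()).ToList();

    // The call binds to the whole query: the result is the first word.
    // A `do First()` clause would presumably express this without parentheses.
    public static string DotOnWholeQuery(IEnumerable<string> words) =>
        (from w in words
         select w).First();
}
```
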

@alrz
Member

alrz commented Aug 15, 2016

Would like to see if it can handle something like ToDictionary,

var x = 
    from item in collection
    where item.foo
    do ToDictionary(item.Key, item.Value);

i.e. the range variable is available in the do clause.
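
For contrast, a sketch of how the same result is written today, where the range variable is not visible to the trailing call and each selector has to reintroduce it as a lambda parameter. The `Pair` type and its members are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; Foo, Key and Value mirror the members used above.
public sealed class Pair
{
    public bool Foo { get; set; }
    public string Key { get; set; } = "";
    public int Value { get; set; }
}

public static class ToDictionaryToday
{
    // The range variable `item` ends with the query expression, so both
    // ToDictionary selectors declare their own `item` parameter.
    public static Dictionary<string, int> Build(IEnumerable<Pair> collection) =>
        (from item in collection
         where item.Foo
         select item)
        .ToDictionary(item => item.Key, item => item.Value);
}
```
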

@TonyValenti

Grr... there was a minor typo....

I would say completely eliminate the variable declaration, since I think the goal is to make LINQ a replacement for standard loops. Perhaps, since that is the goal, select isn't even needed.

from Parent in Parents
from Child in Parent.Children
let Age = DateTime.Now.Year - Child.DateOfBirth.Year  // approximate age in years
where Age <= 13
do {
    // I should have access to all variables in scope here: Parent, Child, and Age.
    Console.WriteLine("Hi {0}, because you are {1} years old, you should ask {2} for permission.", Child.Name, Age, Parent.Name);
};


@alrz
Member

alrz commented Aug 15, 2016

@TonyValenti That would be a "LINQ statement" which is already proposed in #1938.

@svick
Contributor

svick commented Aug 15, 2016

@TonyValenti

I think the goal is to make Linq a replacement for standard loops

No, that is not the goal of this proposal. The goal is to make queries that end with calls to methods like Single() or ToList() easier to write. I don't think your suggestion does that.

@scalablecory
Author

scalablecory commented Aug 15, 2016

@TonyValenti the goal of this proposal is to make LINQ syntax align more with the standard feature set of LINQ methods, to reduce the need to fall back to that. It is not intended to replace loops.

@TonyValenti

@MadsTorgersen I really like your proposal and it would be really nice if there was some flexibility on what follows the DO statement.

For example:
I should be able to write all of the following:

var items = new[]{1,2,3,4,5};

//This is a statement block that is essentially a "For Each" loop.
//For this loop, the statement has a void return type.
from x in items
select x
do {
    Console.WriteLine(x.ToString());
}

//This is a statement block that returns a List.
var list = 
from x in items
select x
do Skip(1).ToList();

//This might not be as useful but I figured I'd include it.
//This is a statement block that returns an iterator that does extra processing on each element.
//For this, the statement returns an IEnumerable<int>.
var Squares = 
from x in items
select x
do {
    yield return x * x;
}
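
For reference, the three forms above can be approximated in current C# roughly as follows. This is a sketch under the assumption that `items` is a sequence of `int`. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DoBlockEquivalents
{
    // First form: a void "for each" over the query results.
    public static void PrintEach(IEnumerable<int> items)
    {
        foreach (var x in from item in items select item)
        {
            Console.WriteLine(x.ToString());
        }
    }

    // Second form: chain further calls on the parenthesized query.
    public static List<int> SkipFirst(IEnumerable<int> items) =>
        (from x in items
         select x).Skip(1).ToList();

    // Third form: `yield return` is only legal in an iterator method,
    // so the block becomes a method of its own.
    public static IEnumerable<int> Squares(IEnumerable<int> items)
    {
        foreach (var x in items)
        {
            yield return x * x;
        }
    }
}
```
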


@gafter
Member

gafter commented Mar 20, 2017

We are now taking language feature discussion on https://github.com/dotnet/csharplang for C# specific issues, https://github.com/dotnet/vblang for VB-specific features, and https://github.com/dotnet/csharplang for features that affect both languages.

@scalablecory
Author

This is largely continued in csharplang #101.
