Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C# Feature] Convenient combination of foreach and LINQ #1938

Closed
dsaf opened this issue Apr 13, 2015 · 53 comments
Closed

[C# Feature] Convenient combination of foreach and LINQ #1938

dsaf opened this issue Apr 13, 2015 · 53 comments

Comments

@dsaf
Copy link

dsaf commented Apr 13, 2015

I would argue that embedded is a better term for how the syntax feels at the moment.

Currently:

foreach (var item in from xItem in new[] {"One", "Two", "Three" } 
                     where xItem == "Two"
                     select xItem)
{
    Debug.WriteLine(item);
}

Suggested:

foreach (from item in new[] {"One", "Two", "Three" } 
         where item == "Two"
         select item)
{
    Debug.WriteLine(item.ToString());
}
@HaloFour
Copy link

How about:

from item in new[] {"One", "Two", "Three" } 
where item == "Two"
foreach {
   Debug.WriteLine(item.ToString());
}

@bondsbw
Copy link

bondsbw commented Apr 13, 2015

Nice suggestion @dsaf. And thank you @HaloFour, I love that syntax!

@alexpolozov
Copy link

OK, if we are opening the Pandora box of syntactic sugar now, let's not stop at the frosting level:

foreach item in new[] {"One", "Two", "Three" } 
  where item == "Two"
{
   Debug.WriteLine(item.ToString());
}

@dsaf
Copy link
Author

dsaf commented Apr 14, 2015

Both options look like extending LINQ rather than integrating it. To me having two foreach's - with and without round brackets is weird. But to be fair your suggestions are quite close to what X++ is already using and it seems to be working: https://msdn.microsoft.com/en-us/library/aa558063.aspx

@bondsbw
Copy link

bondsbw commented Apr 15, 2015

Actually @dsaf, I'd say that @apskim's approach is merely your original proposal, coupled with two syntax changes.

The first change is to remove the select item portion. I feel this is necessary anyway, because the name item is already in scope from the from clause and the select portion does nothing.

The second change removes the parentheses. This could be a change to the general case of foreach, allowing

foreach (var x in myCollection)
{
   // ...
}

to be rewritten as

foreach var x in myCollection
{
   // ...
}

I think removing the parentheses would require a braced body, otherwise perhaps the syntax would become ambiguous.

@bondsbw
Copy link

bondsbw commented Apr 15, 2015

Oops, I missed that the from is removed in the shortened syntax. For that matter... why do we need the var in the existing foreach clause? That could be implicit too

foreach x in myCollection
{
    //...
}

And where (and join, orderby, let, and the rest) would just be a variation of the existing syntax

foreach x in myCollection
where x < 7
{
    //...
}

@alexpolozov
Copy link

Just to clarify: my proposal was mostly sarcastic :)
I personally think that such syntactic sugar for the sake of sugaring merely complicates the language and breaks the consistent interplay of various language constructs.

@bondsbw
Copy link

bondsbw commented Apr 15, 2015

I don't think it's just for "the sake of sugaring". I feel there are clear advantage to reducing:

var filteredCollection =
    from t in myCollection
    where t < 7
    select t;

foreach (var x in filteredCollection)
{
    //...
}

to

foreach x in myCollection
where x < 7
{
    //...
}

Besides it obviously reducing the amount of code, it reduces the number of variables by removing t and filteredCollection.

Also imagine if it were more complicated (example based on MSDN LINQ samples):

var query = 
    from c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
    select new
    {
        Contact = new Contact
        {
            FirstName = c.FirstName,
            LastName = c.LastName,
        },
        Account = new Account
        {
            Address1_Telephone1 = a.Address1_Telephone1
        }
    };
foreach (var record in query)
{
    Console.WriteLine("Contact Name: {0} {1}",
        record.Contact.FirstName, record.Contact.LastName);
    Console.WriteLine("Account Phone: {0}",
        record.Account.Address1_Telephone1);
}

could become more succinctly

foreach c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
{
    Console.WriteLine("Contact Name: {0} {1}", c.FirstName, c.LastName);
    Console.WriteLine("Account Phone: {0}", a.Account.Address1_Telephone1);
}

because (in this example) it eliminates the need for temporary variables holding anonymous types.

@aluanhaddad
Copy link

@bondsbw I don't think this adds much.

The code

var filteredCollection =
    from t in myCollection
    where t < 7
    select t;

foreach (var x in filteredCollection)
{
    //...
}

Can already be shortened to:

foreach(var x in myCollection.Where(x => x < 7))
{
    //...
}

Also the code:

var query = 
    from c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
    select new
    {
        Contact = new Contact
        {
            FirstName = c.FirstName,
            LastName = c.LastName,
        },
        Account = new Account
        {
            Address1_Telephone1 = a.Address1_Telephone1
        }
    };

can already be shortened to:

var query = 
    from c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
    select new
    {
        c.FirstName,
        c.LastName,
        a.Address1_Telephone1
    };

and if you really want to avoid the anonymous type projection entirely you can write:

var query = 
    from c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
    select (Action)(()=>
    {
       Console.WriteLine("Contact Name: {0} {1}", c.FirstName, c.LastName);
       Console.WriteLine("Account Phone: {0}", a.Account.Address1_Telephone1);
    });
foreach(var a in query)
{
    a();
}

Also the proposed example:

foreach c in svcContext.ContactSet
    join a in svcContext.AccountSet
        on c.ContactId equals a.PrimaryContactId.Id
    where c.LastName == "Wilcox" || c.LastName == "Andrews"
    where a.Address1_Telephone1.Contains("(206)")
        || a.Address1_Telephone1.Contains("(425)")
{
    Console.WriteLine("Contact Name: {0} {1}", c.FirstName, c.LastName);
    Console.WriteLine("Account Phone: {0}", a.Account.Address1_Telephone1);
}

is confusing because the scoping is, at least to me, non obvious. Since c is the range variable following foreach, I would not expect a to be in scope. It would only make sense if foreach were the last clause.

Other proposals for expanding LINQ syntax, such as the one proposing the ability to specify an IEqualityComparer inline (#100) add real value.

@bondsbw
Copy link

bondsbw commented Apr 16, 2015

@aluanhaddad

Since c is the range variable following foreach, I would not expect a to be in scope. It would only make sense if foreach was the last clause.

Keeping range variables in scope is, in my mind, a natural part of the proposal. I don't really see the confusion, since range variables are similarly in scope for any select including the select (Action)... workaround you mentioned.

@HaloFour
Copy link

The problem with expressing the foreach clause at the beginning of the query is that the range variables can go in and out of scope. For example:

var q = from x in Enumerable.Range(0, 10)
        select new String('X', x) into letters
        from y in letters
        select x; // compiler error, x is not in scope

Also:

var q = from x in Enumerable.Range(0, 10)
        select new String('X', x) into letters // here x is an int
        from x in letters
        select x; // here x is a char

This is why I would prefer that terminating clauses like foreach occur at the end of the query.

@aluanhaddad
Copy link

@bondsbw The confusion I was referring to was precisely due to the foreach clause being at the beginning.
As HaloFour concisely stated:

The problem with expressing the foreach clause at the beginning of the query is that the range variables can go in and out of scope.

Personally I feel that LINQ expression are properly language integrated for the following reasons:

  1. They can be used anywhere an expression of the type they evaluate to can be used.
  2. They are extensible in that it is easy to enable query comprehensions that manipulate data types other than IEnumerable<out T> and IQueryable<out T>. For example I was able to build an Option library that allows the Option (Maybe) monad to be used in query expressions. This allowed my Option type to be used in query expressions just as Scala's Option type can be used in for comprehensions.

@gafter gafter changed the title [C# Feature] Make LINQ query syntax properly language-intergated [C# Feature] Convenient combination of foreach and LINQ Sep 10, 2015
@alrz
Copy link
Member

alrz commented Nov 13, 2015

Basically, what @HaloFour has suggested, but with a do and an embedded-statement following.

from item in list do WriteLine(item);

@aluanhaddad
Copy link

@alrz Under peril of being accused of bikeshedding, I want to say that foreach would be far superior to do.
Edit:
I changed my mind. do reads much better in this context.

@gafter
Copy link
Member

gafter commented Nov 14, 2015

So are you folks all fine with not being able to return/break from a loop?

@HaloFour
Copy link

@gafter Why wouldn't that be possible? If like @alrz suggested this keyword was followed by an embedded-statement it would seem reasonable that said statement, like any for/foreach/while/do loop would be capable of using continue, break or return.

from number in Enumerable.Range(0, 10)
foreach {
    if (number < 5) continue;
    else if (number == 8) break;
    else if (number == 9) return "oops";
};

Or is this a question of implementation, e.g. passing embedded-statement to an extension method like ForEach?

@alrz
Copy link
Member

alrz commented Nov 14, 2015

The reason that I chose do is that it's already used in do .. while loop with an embedded-statement so it doesn't look "weird" in this context.

@HaloFour

e.g. passing embedded-statement to an extension method like ForEach?

Actually, this proposal suggests to make query-expression a statement. So do-clause here, shouldn't translate to an extension method call, rather, it should translate to a regular foreach statement (one per each from-clause). So

from foo in foos
from bar in foo.Bars
where bar > 0
do WriteLine($"{foo}:{bar}");

would translate to

foreach(var foo in foos)
    foreach(var bar in foo.Bars.Where(bar => bar > 0))
        WriteLine($"{foo}:{bar}");

Normally, query expressions must end with a select- or group-clause. In the statement context, they would have to end with do-clause or whatever. And as you said,

from item in list do {
    // continue;
    // break;
    // return;
};

This doesn't look magical, compared to:

do {
    // continue;
    // break;
    // return;
} while( ... );

@gafter
Copy link
Member

gafter commented Nov 14, 2015

@HaloFour There is no efficient way to pass am embedded statement to an extension method.

Its not that I'm against that sort of thing... see my talks about "closures" for Java circa 2006-2007 where I showed exactly how to do it and provided a language extension and implementation. I just don't know how to do it without a thousand-fold performance penalty on the CLR.

@HaloFour
Copy link

@alrz I wouldn't recommend going the route of trying to use an extension method anyway. I was just fishing for what @gafter was intending by his question regarding not being able to break/return/etc from the loop. I agree that if such a clause were to exist that it should just translate into a foreach loop:

from x in ... foreach embedded-statement

into:

foreach (var x in ...) embedded-statement

@gafter
Copy link
Member

gafter commented Nov 14, 2015

@HaloFour I see now what you had in mind. That does make sense to me. However I think I prefer the syntax @alrz suggests

from item in new[] {"One", "Two", "Three" } 
        where item == "Two"
        do
{
    Debug.WriteLine(item.ToString());
}

I do believe, however, that this should be done using the Linq methods, such as Where for this case.

@gafter gafter self-assigned this Nov 18, 2015
@gafter gafter added this to the C# 7 and VB 15 milestone Nov 18, 2015
@HaloFour
Copy link

@gafter Works for me. I agree that the rest of the LINQ query still using the LINQ extension methods:

from item in new[] { "One", "Two", "Three" }
where item ==  "Two"
do Debug.WriteLine(item.ToString());

// equivalent to
var $temp = (new [] { "One", "Two", "Three" }).Where(item => item == "Two");
foreach (var item in $temp) Debug.WriteLine(item.ToString());

@aluanhaddad
Copy link

I like this idea. Placing the do clause at the end is nicely readable and keeps the scoping straightforward.

One thing that occurs to me is that this could be quite useful as a more general kind of side effecting interjective operator, in the vein of Observable.Do.
However, that would seem to require that the do clause be implemented as a standard sequence operator, ruling out the use of break, continue, return, and so on, but personally I think that would be worth the trade off.

@paulomorgado
Copy link

So, you're operating in the assumption of infinite resources to implement zero-cost features. is it?

@HaloFour
Copy link

@paulomorgado I'd say it's up to the language design team to both determine whether or not a change like this fits within the philosophy of what LINQ is supposed to be (and whether or not that still applies) as to the priority of each of these proposals. I don't think that anyone assumes that every proposal will be immediately considered and implemented.

As for do over an item projected by select, you'd need to use select into in order to project it into a separate range variable:

from item in new[] {"One", "Two", "Three" } 
        where item == "Two"
        select new Something(item) into item
        do
{
    Undo(item);
}

@aluanhaddad @TonyValenti As of now this do embedded-statement is translated directly into a foreach loop around the query. So it would be sequential but also allow for control statements like return, break or continue which would not be possible if it translated into an extension method which accepted an Action<T>.

@aluanhaddad
Copy link

@HaloFour yes, it is a tradeoff. Personally, I would prefer the benefits of an extension method implementation, so I made some arguments for that, but it would still be useful if implemented as a direct translation to foreach, so I'm in favor of it either way.

@TonyValenti
Copy link

Here is an idea:
What if we got a new operator? I propose |. (Pipe dot).

var items = from x in source select x |.ToList();

On Tuesday, December 15, 2015, Shimmy notifications@github.com wrote:

#100 (comment)
#100 (comment)

I strongly vote for this one.

I never use LINQ language queries and instead my coding guidelines is
avoid using them and always use the extension methods directly, and that's
because the ugly parentheses each query has to be wrapped in order to
materialize it (ToArray, ToList, Sum, SingleOrDefault etc.).

Until this issue is addressed, language-build-in LINQ is merely useless.
I really hope to see this implemented soon. Maybe to a more limited extent
(avoiding the introduction of a gazillion new language keywords.

I'd say the syntax should provide an operator that expects a shortcutted
extension method available for IEnumerable, for instance:

//parents is Parent[]
var parents = from student in students
where student.Age < 18
select student.Parent
call ToArray()

//student is Student
var student = from st in students
call SingleOrDefault(st => st.Id == id);

Asynchronous methods should also be supported:

var student = from st in students
call await SingleOrDefaultAsync(st => st.Id == id);

Maybe there should be a verbose LINQ fashioned way to pass the arguments
and completely avoid the use of parenthesis, but I personally don't see it
as necessary.

Anyway this feature is crucial for the completeness of the LINQ syntax.

Some suggestions above proposed the ToList at the beginning of the query,
but that's a no go, since we want to be able to process it after the
selection, and we don't want to be limited to parameterless ex. methods
only. What if we wanna call ToLookup with a key selector, bottom line we
can't tie it up to the language, we just need to find a way to call any
Enumerable ex. methods on the current query state and treat its result as
the final type of the query.


Reply to this email directly or view it on GitHub
#1938 (comment).

Tony Valenti

@alrz
Copy link
Member

alrz commented Dec 16, 2015

@weitzhandler @TonyValenti Not related here. Probably #100, #3571, #3486, #6489. All duplicates though.

@dsaf
Copy link
Author

dsaf commented Dec 16, 2015

@TonyValenti @alrz actually sounds like forward pipe with a special case for LINQ to prevent parentheses #5445. Not sure an operator would be idiomatic, since they have already introduced equals in addition to ==.

@alrz
Copy link
Member

alrz commented Dec 16, 2015

@dsaf on that idea, see #100 (comment).

@alrz
Copy link
Member

alrz commented Feb 4, 2016

It'd be nice to differ the semicolon requirement for the embedded-statement in do clause like

F(list => from foo in list do WriteLine(foo.Bar));

because query-expression is an expression afterall.

Also I think it makes sense to turn let clauses to let statements in a query that ends with do,

from item in list
let foo = item.ComputeFoo()
do { ... }

foreach(var item in list) {
  let foo = item.ComputeFoo();
  // ...
}

This'll be a great win if we were using patterns in let (#6877).

@gafter
Copy link
Member

gafter commented Feb 4, 2016

@alrz I think this proposal is for a new statement form, not a new expression form.

My working vision of this is in #1938 (comment)

@MovGP0
Copy link

MovGP0 commented Feb 27, 2017

instead of

foreach (var item in from xItem in new[] {"One", "Two", "Three" } 
                     where xItem == "Two"
                     select xItem)
{
    Debug.WriteLine(item);
}

you should use MoreLINQ and Command Query Separation (CQS) and write it like this:

// Query
var items = new[] {"One", "Two", "Three" }.Select(i => i == "Two");

// Command 
items.ForEach(Debug.WriteLine);

@aluanhaddad
Copy link

@MovGP0 you mean to get

public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
    foreach(var element in source)
    {
        action(source);
    }
}

I've tried it, added it to many projects, but I do not like how it feels as compared to foreach. Basically, I always end up wanting to write a chainable version of it, like Observable.Do, and here be dragons.

Also, that assumes you only need to support method syntax. I prefer query syntax where available, but I like both.

@MovGP0
Copy link

MovGP0 commented Feb 27, 2017

@aluanhaddad doesn't really matter which syntax you use, as long as you stick to CQS. Using another syntax only means that you have more lines of code to type:

// query
var items = from item in new[] {"One", "Two", "Three" } 
            where item == "Two"
            select item;

// command
foreach (var item in items)
{
    Debug.WriteLine(item);
}

@bondsbw
Copy link

bondsbw commented Feb 27, 2017

@MovGP0 That depends on the syntax; several examples here would have reduced the number of lines.

@gafter
Copy link
Member

gafter commented Feb 28, 2017

@MovGP0 How do I break, continue, etc?

@MovGP0
Copy link

MovGP0 commented Feb 28, 2017

@gafter by putting the query into a separate (extension) method, which returns an IEnumerable, and using yield/break/continue statements in an foreach loop there.

Do you have a code example?

@TonyValenti
Copy link

Related to "How do I break or continue?" I think it is worth deciding what the return value from "do" is. Is a linq statement with "do" a statement or an expression?

My preference is that "do" act like either an Action<> or a Func<> and return either:
IEnumerable when "yield returns/Yield Breaks" are used in the DO.
T or Void when "Return or Breaks".

Here are a few different ways the syntax could look for different things:

//This returns a List<String>
var FirstNotTwo =
from item in new[] {"One", "Two", "Three" } 
where item != "Two"
do ToList();

//This emulates a FOR loop and returns Void:
from item in new[] {"One", "Two", "Three" } 
do (var Index) {
  Console.WriteLine($"Item {Index}: {item}");
}


//This returns void:
from item in new[] {"One", "Two", "Three" } 
where item == "Two"
do {
    Debug.WriteLine(item.ToString());
}

//This returns an IEnumerable<String>
var NotTwo =
from item in new[] {"One", "Two", "Three" } 
do {
    if(item == "Two") {
        yield break;
    } else {
        yield return item;
    }
}

//This returns an IEnumerable<String>
var NotTwo =
from item in new[] {"One", "Two", "Three" } 
do {
    if(item == "Two") {
        yield break;
    } else {
        yield return item;
    }
}

//This returns a String
var FirstNotTwo =
from item in new[] {"One", "Two", "Three" } 
do {
    if(item != "Two") {
        return item;
    }
}


@MovGP0
Copy link

MovGP0 commented Feb 28, 2017

@TonyValenti That from ... do syntax merges the Query with the Command (and thus violates the CQS principle), and in many cases is more to write. I find it cleaner this way:

var ott = new[] {"One", "Two", "Three" };
//This returns a List<String>
var firstNotTwo = ott.Where(item => item != "Two").ToList();
//This emulates a FOR loop and returns void:
var indexedItems = ott.Select((item, index) => new { Item = item, Index = index }); // query
indexedItems.ForEach(i => Console.WriteLine($"Item {i.Index}: {i.Item}"); // command
//This returns void:
var two = ott.Single(item => item == "Two"); // query
Debug.WriteLine(two); // command
//This returns an IEnumerable<String>
var notTwo = ott.Where(item => item != "Two");
//This returns an IEnumerable<String>
var notTwo = ott.TakeWhile(item => item != "Two");
//This returns a String
var firstNotTwo = ott.First(item => item != "Two");

@bondsbw
Copy link

bondsbw commented Feb 28, 2017

@MovGP0 There may always be a great debate about whether query syntax is better than method chaining, but this proposal is to make query syntax better for those who prefer it. (Though I would love to see a ForEach(...) method ship in the BCL.)

Besides, I don't think executing two statements directly in succession (in the same block of code) is really any more/less compliant with the spirit of CQS than a single statement would be. Just my 2 cents.

@jnm2
Copy link
Contributor

jnm2 commented Feb 28, 2017

@bondsbw You're completely right. I wasn't going to say anything but the CQS thing is being taken way out of context in my opinion.

@gafter
Copy link
Member

gafter commented Feb 28, 2017

I think it is worth deciding what the return value from "do" is.

The proposed construct is intended to be a statement form.

@gafter
Copy link
Member

gafter commented May 8, 2018

Now tracked at dotnet/csharplang#101

@gafter gafter closed this as completed May 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests