-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removes the outer loop for CountBy #106111
Conversation
maybe better with a benchmark comparison |
Where is the source code for Linq benchmarks? |
Co-authored-by: Weihan Li <weihanli@outlook.com>
|
return ((IEnumerable<KeyValuePair<TKey, int>>)[]).GetEnumerator(); | ||
} | ||
|
||
return BuildCountDictionary(enumerator, keySelector, keyComparer).GetEnumerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One issue I'm finding with this approach is that this is returning an enumerator whose implementation is public. Couldn't this result users taking a dependency on that implementation details which might prevent us from evolving the method in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A dictionary enumerator does not have any useful method or property, and a dictionary enumerator is hidden by the IEnumerator interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow, wouldn't this cast work?
(Dictionary<T, int>.Enumerator)source.CountBy(x => x).GetEnumerator();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow, wouldn't this cast work?
Yes, it will work. But Dictionary.Enumerator
does not have any specific method or property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the implementation to not use dictionary would result in this cast failing for anybody who might depend on this. This is why we never return enumerable/enumerator implementations that are public unless they are specifically present in the type signature of the method (as is the case with ToList()
or the inner array in Chunk()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen a code analyzer that would check this rule, I doubt it is so, why isn't there a code analyzer for this rule?
|
||
Assert.Throws<InvalidOperationException>(() => enumerator.MoveNext()); | ||
Assert.Throws<InvalidOperationException>(() => countBy.GetEnumerator().MoveNext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing you meant to assert this?
Assert.Throws<InvalidOperationException>(() => countBy.GetEnumerator().MoveNext()); | |
Assert.Throws<InvalidOperationException>(() => countBy.GetEnumerator()); |
I don't think this is an acceptable behavioural change. Generally speaking GetEnumerator()
calls don't do any meaningful work other than instantiate the enumerator; everything else should be reserved for the first MoveNext()
call, including any nested GetEnumerator()
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think countBy.GetEnumerator().MoveNext()
is independent of how the implementation is done; If the implementation changes, the test will not have to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MoveNext()
part is irrelevant in this case, it is countBy.GetEnumerator()
that is throwing when it previously wasn't. GetEnumerator
shouldn't be doing any work or throwing exceptions, however with this change it is clear that GetEnumerator
is doing all the work (since it is this method that is enumerating the entire source enumerable to build the dictionary).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, in the IQueryable
interface it is the GetEnumerator
method that does the work of initializing the query, creating an SQL query, sending the SQL query to the server, passing the query parameters, creating a cursor to obtain the result using the MoveNext
method, etc. And it can also throw exceptions.
Therefore, I think that the IEnumerable.GetEnumerator
method can work even before the first call to MoveNext
, including throwing exceptions.
I think that the work can be done both as in the GetEnumerator
method and as in the first call to MoveNext
. These are already implementation details, so I wrote the tests so that they do not depend on the subtleties of the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not an apples-to-apples comparison. IQueryable
uses a very different execution model, especially when building SQL queries. But even if we look at how the IEnumerable
implementation of built-in query provider works the same invariant holds:
IQueryable<int> queryable = new ThrowingGetEnumerator().AsQueryable()
.Select(x => x + 1)
.Where(x => x > 0);
IEnumerator<int> enumerator = queryable.GetEnumerator(); // No exception
enumerator.MoveNext(); // Exception
public class ThrowingGetEnumerator : IEnumerable<int>
{
public IEnumerator<int> GetEnumerator() => throw new NotSupportedException();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
These are already implementation details, so I wrote the tests so that they do not depend on the subtleties of the implementation.
I strongly disagree that that it's an implementation detail. Changing the behavior can break user code that depends on this method not enumerating the source or throwing exceptions; it might not be important if you just foreach over the result, but it is in code that explicitly handles enumerators in bespoke ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add GroupBy(x => x)
to query!
GroupBy
is analog to how CountBy
works and queryable.GetEnumerator()
will throw an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that this indicates an issue with the GroupBy
implementation. GetEnumerator()
should not be incurring side-effects in principle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess GetEnumerator
can start to execute work before the first call MoveNext
and it depends on implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe the benefits of this approach outweigh its trade-offs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe we should take this change, on account of the behavioural breaking changes and leaking of implementation detail. Thank you for your contribution regardless.
Closing per #106111 (review). Thanks, though! |
Close #106110