-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Implement & expose SkipLast & TakeLast for Enumerable & Queryable #14186
Conversation
nameof(Enumerable.Append), | ||
nameof(Enumerable.Prepend), | ||
"ToHashSet", | ||
"SkipLast", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally (maybe we'll have to do it in phases, though I think it should be avoidable by having a P2P to queryable from system.linq.tests), this shouldn't list SkipLast and TakeLast as queryable has those.
@@ -126,6 +126,7 @@ public static partial class Queryable | |||
public static TSource SingleOrDefault<TSource>(this System.Linq.IQueryable<TSource> source) { throw null; } | |||
public static TSource SingleOrDefault<TSource>(this System.Linq.IQueryable<TSource> source, System.Linq.Expressions.Expression<System.Func<TSource, bool>> predicate) { throw null; } | |||
public static System.Linq.IQueryable<TSource> Skip<TSource>(this System.Linq.IQueryable<TSource> source, int count) { throw null; } | |||
public static System.Linq.IQueryable<TSource> SkipLast<TSource>(this System.Linq.IQueryable<TSource> source, int count) { throw null; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I left out all versioning concerns in my commit. This will need to be #if netcoreapp11
@@ -53,9 +53,11 @@ | |||
<Compile Include="SequenceEqualTests.cs" /> | |||
<Compile Include="SingleOrDefaultTests.cs" /> | |||
<Compile Include="SingleTests.cs" /> | |||
<Compile Include="SkipLastTests.netcoreapp1.1.cs" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, this is being included for all versions, but should be netcoreapp1.1 only.
if (queue.Count == count) | ||
{ | ||
yield return item; | ||
queue.Dequeue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it not yield the dequeued item rather than the current?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a quick Linqpad run shows that this is being a Skip()
, not a SkipLast()
. In which case there's something wrong with the tests that let this pass, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna CI is failing... I mentioned in https://github.com/dotnet/corefx/issues/13842#issuecomment-264613570 that I was having trouble to get the tests to run (or even compile) locally, since all of the netcoreapp1.1 tests were being excluded. Luckily however they seem to be running in CI, and I can see the stack traces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: e.Current
should be evaluated as lazily as possible here.
@@ -70,6 +72,11 @@ | |||
<Name>System.Linq.Queryable</Name> | |||
<Private>true</Private> | |||
</ProjectReference> | |||
<ProjectReference Include="..\..\System.Linq\src\System.Linq.csproj"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Highlighting this bit of mine, because I'm not sure I did the right thing with how this reference is in here, and if I didn't I don't want it to go unnoticed.
(I should probably forego commenting on optimisation for now when you're forgoing doing it for now! Deleting those comments to concentrate on correctness). |
|
||
private static IEnumerable<TSource> SkipLastIterator<TSource>(IEnumerable<TSource> source, int count) | ||
{ | ||
var queue = new Queue<TSource>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about specifying capacity
for queue?
var queue = new Queue<TSource>(count);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if that's done, it should be clamped to a maximum. Otherwise new []{1,2}.SkipLast(10000)
would create a 10000-element capacity queue to which it only adds two elements. (There are cases where we can be more confident in the count, in which case we should go ahead and use that count, though many of those would offer further optimisations akin to those currently in place for Skip()
anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, nice point! I agree, it can be better.
In other hand we can try to test source (for ICollection<>
or IListProvider<>
, as other Enumerable
methods do) for count.
Also, in my practice, items.SkipLast(1)
used often and we can use single TSource
instead of new Queue<TSource>
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna @ViIvanov In the future I am planning to use a LargeArrayBuilder
here, which since #14020 supporting limiting how many elements are allocated. Then we can call ToArray()
on it and wrap a struct circular buffer around the resulting array.
Also, in my practice,
items.SkipLast(1)
used often and we can use single TSource instead ofnew Queue<TSource>
in this case.
👍 Noted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this code will not work in case count
is zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best thing when count is zero or less would be to pass to Skip()
anyway; at best it passes back the source directly if the source came from Linq (this was deemed a relatively safe case for doing so in such cases, whereas with collections there could be complications due to two enumerables expected to be distinct being the safe) and otherwise would still have a lighter implementation of the same response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna Good idea; I've forwarded to Skip(0)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ViIvanov Thanks for noticing that; it was a problem with TakeLast
as well. Fixed for both methods.
public static IQueryable<TSource> SkipLast<TSource>(this IQueryable<TSource> source, int count) | ||
{ | ||
if (source == null) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: For consistency in this file, maybe get rid of {
and }
and the additional whitespace below?
"Prepend", | ||
"ToHashSet" | ||
nameof(Enumerable.Append), | ||
nameof(Enumerable.Prepend), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why were Append
and Prepend
not added to the Queryable
set of operators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nobody had the idea of doing so. At the time I was thinking they wouldn't be a good idea to add, but I think now I was wrong on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna @bartdesmet Should I open an API proposal for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad idea.
|
||
if (count <= 0) | ||
{ | ||
return source.Skip(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the choice of Skip(0)
here arbitrary (as opposed to any other operator which would return the original sequence)? Do we refrain from returning source
because we want some form of hiding of the source here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The policy is that it's considered safe to return source
if source
comes from another linq operation, but not otherwise. Skip
already has logic to determine that, so this can piggy-back. Skip
is also naturally lighter than SkipLast
, since it doesn't need a buffer queue, so this still wins in the case where it doesn't just return source
.
@jamesqo just curious if it is still WIP. When do you expect it to be ready for final round of review? |
using (IEnumerator<TSource> e = source.GetEnumerator()) | ||
{ | ||
while (e.MoveNext()) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you use using
and while
instead of foreach
here and foreach
in TakeLastIterator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ViIvanov foreach
evaluates Current
right away, at the beginning of the loop. We want to evaluate it as late as possible so we don't pay an extra virtual call if iteration stops after the next yield.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I see: this is for case when caller stops enumerating items. Nice thing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't make much difference there, but it does mean that queue
will only ever be count
items in size rather than count + 1
which if count
happened to be a growth threshold would be a benefit.
Having the foreach
broken down allows for a further optimisation of having the loop as:
while (e.MoveNext())
{
if (count == 0)
{
do
{
yield return queue.Dequeue();
queue.Enqueue(e.Current);
} while (e.MoveNext());
break;
}
count--;
queue.Enqueue(e.Current);
}
Which would avoid testing the count once it had been reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna You are correct, that seems like it would allow for a tighter loop (and a slightly faster comparison too, because now 0 is a comparand). I'll use your implementation in the iterator.
@karelz Not sure yet... I hit a roadblock during this PR because I was unable to build the netcoreapp1.1 tests locally, and I wasn't sure where to go from there. I will try in the next week to get back to this and either resolve those problems or ask for help from someone more knowledgeable about the build system. |
OK, take your time, there's no rush - I just wanted to make sure we didn't drop the ball by accident here ... |
@dotnet-bot test this please (404 on everything) |
@karelz Alright, I finally figured out after https://github.com/dotnet/corefx/issues/14423 why I wasn't able to run the tests. Now I can actually compile and execute them on my machine without wasting CI builds (woot!), so I should be able to move forward on this when I have free time. |
Great to hear. As I said, no rush, I just wanted to make sure we or the infra is not blocking you ;-) |
@dotnet-bot skip ci please (I need to cleanup my commits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating @jamesqo the build files look correct to me. I didn't look very closely at the actual code I will leave that for others to sign-off on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra break
in for SkipLast
and the removal of "TakeLast" and "SkipLast" is from the list of excluded methods in the consistency test is the only thing in my comments I think should definitely be done. The rest are just suggestions that may not pan out.
yield return queue.Dequeue(); | ||
queue.Enqueue(e.Current); | ||
} | ||
while (e.MoveNext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a break;
here and avoid another hit on e.MoveNext()
. Aside from that being slightly wasteful, I've seen buggy enumerators that didn't behave correctly on MoveNext()
following the call that returned false, so it adds a bit of defensiveness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. I'll modify the tests so they catch this if it's run twice.
nameof(Enumerable.Append), | ||
nameof(Enumerable.Prepend), | ||
"ToHashSet", | ||
"SkipLast", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the recent changes, I think this should definitely work without "SkipLast" and "TakeLast" here?
while (queue.Count > 0) | ||
{ | ||
yield return queue.Dequeue(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this was foreach(TSource item in queue) yield return queue;
then the lack of updating of queue
on dequeuing might pay off.
Perhaps even:
private class TakeLastIterator<TSource> : IEnumerable<TSource>
{
private readonly IEnumerable<TSource> _source;
private readonly int _count;
public TakeLastIterator(IEnumerable<TSource> source, int count)
{
_source = source;
_count = count;
}
public IEnumerator<TSource> GetEnumerator()
{
var queue = new Queue<TSource>();
int count = _count;
using (IEnumerator<TSource> e = _source.GetEnumerator())
{
while (e.MoveNext())
{
if (count-- != 0)
{
queue.Enqueue(e.Current);
}
else
{
do
{
queue.Dequeue();
queue.Enqueue(e.Current);
}
while (e.MoveNext());
break;
}
}
}
return queue.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
And then a level of indirection is removed entirely, but the implementing queue
is still not exposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The class idea shifts the point of obtaining results from the first MoveNext()
to the GetEnumerator()
which differs from other methods, though perhaps acceptable since it doesn't break anything being new. Otherwise I'd still wonder about foreach
ing the queue rather than removing from it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna I am trying to forego non-trivial optimizations in this PR because it should be a no-brainer to merge. I have already written an optimized version of SkipLast
with a more lightweight circular buffer: master...jamesqo:optimized.skiptake.last But I am saving that for later.
Alright, CI is green again. @VSadov feel free to merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!!
🎉 |
Hello! I saw this PR was merged and I would like to know when SkipLast and TakeLast will become avaible for use ? In the next release of the .net framework? sorry for the ignorance. |
Next release of .NET Core - the 2.0 milestone. The big .NET Framework may eventually get it (if and when is unclear), but the issue is not marked as netfx-port-consider, so it won't be priority - it will have to wait for larger API surface comparison between .NET Core and .NET Framework. |
I just noticed these new methods (I actually thought they were our own extensions when I first found them via intellisense), but I noticed no XML comments are available to them? Could we please ensure the new methods have proper documentation to match other LINQ methods? |
@julealgon updated XML docs are a bit painpoint for us :(, esp. for new APIs - @wtgodbe @ericstj do you know what are our options? |
@karelz Have you considered enforcing the presence of summary tags for public-facing APIs using build-time StyleCop analyzer checks or something similar? IMHO comments should be part of the definition of done for any contribution on standard libraries like this. |
@julealgon sadly it is more involved than it seems:
|
That said, we are discussing how to make it easier in future releases and more reliable. I hope we will solve the problem by next release with new APIs (3.0). |
Implement & expose SkipLast & TakeLast for Enumerable & Queryable Commit migrated from dotnet/corefx@b72def2
Fixes https://github.com/dotnet/corefx/issues/13842.
Queryable
implementation is courtesy of @JonHanna. The initial implementation forgoes optimizations and usesQueue
as the circular buffer.I am not quite sure what I'm doing, but I followed @JonHanna's example for #13726.
cc @stephentoub, @VSadov, @bartdesmet, @svick