Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Implement & expose SkipLast & TakeLast for Enumerable & Queryable #14186

Merged
merged 11 commits into from
Jan 27, 2017

Conversation

jamesqo
Copy link
Contributor

@jamesqo jamesqo commented Dec 4, 2016

Fixes https://github.com/dotnet/corefx/issues/13842. Queryable implementation is courtesy of @JonHanna. The initial implementation forgoes optimizations and uses Queue as the circular buffer.

I am not quite sure what I'm doing, but I followed @JonHanna's example for #13726.

cc @stephentoub, @VSadov, @bartdesmet, @svick

nameof(Enumerable.Append),
nameof(Enumerable.Prepend),
"ToHashSet",
"SkipLast",
Copy link
Contributor

@JonHanna JonHanna Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally (maybe we'll have to do it in phases, though I think it should be avoidable by having a P2P to queryable from system.linq.tests), this shouldn't list SkipLast and TakeLast as queryable has those.

@@ -126,6 +126,7 @@ public static partial class Queryable
public static TSource SingleOrDefault<TSource>(this System.Linq.IQueryable<TSource> source) { throw null; }
public static TSource SingleOrDefault<TSource>(this System.Linq.IQueryable<TSource> source, System.Linq.Expressions.Expression<System.Func<TSource, bool>> predicate) { throw null; }
public static System.Linq.IQueryable<TSource> Skip<TSource>(this System.Linq.IQueryable<TSource> source, int count) { throw null; }
public static System.Linq.IQueryable<TSource> SkipLast<TSource>(this System.Linq.IQueryable<TSource> source, int count) { throw null; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I left out all versioning concerns in my commit. This will need to be #if netcoreapp11

@@ -53,9 +53,11 @@
<Compile Include="SequenceEqualTests.cs" />
<Compile Include="SingleOrDefaultTests.cs" />
<Compile Include="SingleTests.cs" />
<Compile Include="SkipLastTests.netcoreapp1.1.cs" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, this is being included for all versions, but should be netcoreapp1.1 only.

if (queue.Count == count)
{
yield return item;
queue.Dequeue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it not yield the dequeued item rather than the current?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a quick Linqpad run shows that this is being a Skip(), not a SkipLast(). In which case there's something wrong with the tests that let this pass, too.

Copy link
Contributor Author

@jamesqo jamesqo Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna CI is failing... I mentioned in https://github.com/dotnet/corefx/issues/13842#issuecomment-264613570 that I was having trouble to get the tests to run (or even compile) locally, since all of the netcoreapp1.1 tests were being excluded. Luckily however they seem to be running in CI, and I can see the stack traces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: e.Current should be evaluated as lazily as possible here.

@@ -70,6 +72,11 @@
<Name>System.Linq.Queryable</Name>
<Private>true</Private>
</ProjectReference>
<ProjectReference Include="..\..\System.Linq\src\System.Linq.csproj">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Highlighting this bit of mine, because I'm not sure I did the right thing with how this reference is in here, and if I didn't I don't want it to go unnoticed.

@JonHanna
Copy link
Contributor

JonHanna commented Dec 4, 2016

(I should probably forego commenting on optimisation for now when you're forgoing doing it for now! Deleting those comments to concentrate on correctness).


private static IEnumerable<TSource> SkipLastIterator<TSource>(IEnumerable<TSource> source, int count)
{
var queue = new Queue<TSource>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about specifying capacity for queue?

var queue = new Queue<TSource>(count);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if that's done, it should be clamped to a maximum. Otherwise new []{1,2}.SkipLast(10000) would create a 10000-element capacity queue to which it only adds two elements. (There are cases where we can be more confident in the count, in which case we should go ahead and use that count, though many of those would offer further optimisations akin to those currently in place for Skip() anyway).

Copy link
Contributor

@ViIvanov ViIvanov Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice point! I agree, it can be better.

In other hand we can try to test source (for ICollection<> or IListProvider<>, as other Enumerable methods do) for count.

Also, in my practice, items.SkipLast(1) used often and we can use single TSource instead of new Queue<TSource> in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna @ViIvanov In the future I am planning to use a LargeArrayBuilder here, which since #14020 supporting limiting how many elements are allocated. Then we can call ToArray() on it and wrap a struct circular buffer around the resulting array.

@ViIvanov

Also, in my practice, items.SkipLast(1) used often and we can use single TSource instead of new Queue<TSource> in this case.

👍 Noted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this code will not work in case count is zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best thing when count is zero or less would be to pass to Skip() anyway; at best it passes back the source directly if the source came from Linq (this was deemed a relatively safe case for doing so in such cases, whereas with collections there could be complications due to two enumerables expected to be distinct being the safe) and otherwise would still have a lighter implementation of the same response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna Good idea; I've forwarded to Skip(0).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ViIvanov Thanks for noticing that; it was a problem with TakeLast as well. Fixed for both methods.

@karelz karelz added this to the 1.2.0 milestone Dec 4, 2016
public static IQueryable<TSource> SkipLast<TSource>(this IQueryable<TSource> source, int count)
{
if (source == null)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: For consistency in this file, maybe get rid of { and } and the additional whitespace below?

"Prepend",
"ToHashSet"
nameof(Enumerable.Append),
nameof(Enumerable.Prepend),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why were Append and Prepend not added to the Queryable set of operators?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody had the idea of doing so. At the time I was thinking they wouldn't be a good idea to add, but I think now I was wrong on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna @bartdesmet Should I open an API proposal for that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bad idea.


if (count <= 0)
{
return source.Skip(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the choice of Skip(0) here arbitrary (as opposed to any other operator which would return the original sequence)? Do we refrain from returning source because we want some form of hiding of the source here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The policy is that it's considered safe to return source if source comes from another linq operation, but not otherwise. Skip already has logic to determine that, so this can piggy-back. Skip is also naturally lighter than SkipLast, since it doesn't need a buffer queue, so this still wins in the case where it doesn't just return source.

@jamesqo jamesqo changed the title Implement & expose SkipLast & TakeLast for Enumerable & Queryable. [WIP] Implement & expose SkipLast & TakeLast for Enumerable & Queryable. Dec 20, 2016
@karelz
Copy link
Member

karelz commented Jan 4, 2017

@jamesqo just curious if it is still WIP. When do you expect it to be ready for final round of review?

using (IEnumerator<TSource> e = source.GetEnumerator())
{
while (e.MoveNext())
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you use using and while instead of foreach here and foreach in TakeLastIterator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ViIvanov foreach evaluates Current right away, at the beginning of the loop. We want to evaluate it as late as possible so we don't pay an extra virtual call if iteration stops after the next yield.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see: this is for case when caller stops enumerating items. Nice thing!

Copy link
Contributor

@JonHanna JonHanna Jan 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make much difference there, but it does mean that queue will only ever be count items in size rather than count + 1 which if count happened to be a growth threshold would be a benefit.

Having the foreach broken down allows for a further optimisation of having the loop as:

while (e.MoveNext())
{
    if (count == 0)
    {
        do
        {
            yield return queue.Dequeue();
            queue.Enqueue(e.Current);
        } while (e.MoveNext());
        break;
    }
    
    count--;
    queue.Enqueue(e.Current);
}

Which would avoid testing the count once it had been reached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna You are correct, that seems like it would allow for a tighter loop (and a slightly faster comparison too, because now 0 is a comparand). I'll use your implementation in the iterator.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 5, 2017

@karelz Not sure yet... I hit a roadblock during this PR because I was unable to build the netcoreapp1.1 tests locally, and I wasn't sure where to go from there. I will try in the next week to get back to this and either resolve those problems or ask for help from someone more knowledgeable about the build system.

@karelz
Copy link
Member

karelz commented Jan 5, 2017

OK, take your time, there's no rush - I just wanted to make sure we didn't drop the ball by accident here ...

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 8, 2017

@dotnet-bot test this please (404 on everything)

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 8, 2017

@karelz Alright, I finally figured out after https://github.com/dotnet/corefx/issues/14423 why I wasn't able to run the tests. Now I can actually compile and execute them on my machine without wasting CI builds (woot!), so I should be able to move forward on this when I have free time.

@karelz
Copy link
Member

karelz commented Jan 8, 2017

Great to hear. As I said, no rush, I just wanted to make sure we or the infra is not blocking you ;-)

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 8, 2017

@dotnet-bot skip ci please (I need to cleanup my commits)

Copy link
Member

@weshaggard weshaggard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating @jamesqo the build files look correct to me. I didn't look very closely at the actual code I will leave that for others to sign-off on.

Copy link
Contributor

@JonHanna JonHanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra break in for SkipLast and the removal of "TakeLast" and "SkipLast" is from the list of excluded methods in the consistency test is the only thing in my comments I think should definitely be done. The rest are just suggestions that may not pan out.

yield return queue.Dequeue();
queue.Enqueue(e.Current);
}
while (e.MoveNext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a break; here and avoid another hit on e.MoveNext(). Aside from that being slightly wasteful, I've seen buggy enumerators that didn't behave correctly on MoveNext() following the call that returned false, so it adds a bit of defensiveness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. I'll modify the tests so they catch this if it's run twice.

nameof(Enumerable.Append),
nameof(Enumerable.Prepend),
"ToHashSet",
"SkipLast",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the recent changes, I think this should definitely work without "SkipLast" and "TakeLast" here?

while (queue.Count > 0)
{
yield return queue.Dequeue();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was foreach(TSource item in queue) yield return queue; then the lack of updating of queue on dequeuing might pay off.

Perhaps even:

private class TakeLastIterator<TSource> : IEnumerable<TSource>
{
	private readonly IEnumerable<TSource> _source;
	private readonly int _count;
	
	public TakeLastIterator(IEnumerable<TSource> source, int count)
	{
		_source = source;
		_count = count;
	}
	
	public IEnumerator<TSource> GetEnumerator()
	{
	    var queue = new Queue<TSource>();
		int count = _count;
	
	    using (IEnumerator<TSource> e = _source.GetEnumerator())
	    {
	        while (e.MoveNext())
	        {
	            if (count-- != 0)
	            {
	                queue.Enqueue(e.Current);
	            }
	            else
	            {
	                do
	                {
	                    queue.Dequeue();
	                    queue.Enqueue(e.Current);
	                }
	                while (e.MoveNext());
	                break;
	            }
	        }
		}
		
		return queue.GetEnumerator();
	}
	
	IEnumerator IEnumerable.GetEnumerator()
	{
		return GetEnumerator();
	}
}

And then a level of indirection is removed entirely, but the implementing queue is still not exposed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The class idea shifts the point of obtaining results from the first MoveNext() to the GetEnumerator() which differs from other methods, though perhaps acceptable since it doesn't break anything being new. Otherwise I'd still wonder about foreaching the queue rather than removing from it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonHanna I am trying to forego non-trivial optimizations in this PR because it should be a no-brainer to merge. I have already written an optimized version of SkipLast with a more lightweight circular buffer: master...jamesqo:optimized.skiptake.last But I am saving that for later.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 27, 2017

Alright, CI is green again. @VSadov feel free to merge

Copy link
Member

@VSadov VSadov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!!

@VSadov VSadov merged commit b72def2 into dotnet:master Jan 27, 2017
@jamesqo jamesqo deleted the st-last branch January 27, 2017 23:10
@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 27, 2017

🎉

@leandromoh
Copy link

leandromoh commented Feb 25, 2017

Hello!

I saw this PR was merged and I would like to know when SkipLast and TakeLast will become avaible for use ?

In the next release of the .net framework?

sorry for the ignorance.

@karelz
Copy link
Member

karelz commented Feb 25, 2017

Next release of .NET Core - the 2.0 milestone.

The big .NET Framework may eventually get it (if and when is unclear), but the issue is not marked as netfx-port-consider, so it won't be priority - it will have to wait for larger API surface comparison between .NET Core and .NET Framework.

@julealgon
Copy link

I just noticed these new methods (I actually thought they were our own extensions when I first found them via intellisense), but I noticed no XML comments are available to them?

Could we please ensure the new methods have proper documentation to match other LINQ methods?

@karelz
Copy link
Member

karelz commented Nov 13, 2018

@julealgon updated XML docs are a bit painpoint for us :(, esp. for new APIs - @wtgodbe @ericstj do you know what are our options?

@julealgon
Copy link

@julealgon updated XML docs are a bit painpoint for us :(, esp. for new APIs - @wtgodbe @ericstj do you know what are our options?

@karelz Have you considered enforcing the presence of summary tags for public-facing APIs using build-time StyleCop analyzer checks or something similar?

IMHO comments should be part of the definition of done for any contribution on standard libraries like this.

@karelz
Copy link
Member

karelz commented Nov 13, 2018

@julealgon sadly it is more involved than it seems:

  1. We have multiple implementations of some APIs. We would have to have XML comments in ref assemblies source code (which would make them harder to read).
  2. Many APIs do not have XML comments - we would have to baseline them.

@karelz
Copy link
Member

karelz commented Nov 13, 2018

That said, we are discussing how to make it easier in future releases and more reliable. I hope we will solve the problem by next release with new APIs (3.0).

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Implement & expose SkipLast & TakeLast for Enumerable & Queryable

Commit migrated from dotnet/corefx@b72def2
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.