Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialise count, last and nth for Cloned and Map iterators #28125

Closed
wants to merge 1 commit into from

Conversation

nagisa
Copy link
Member

@nagisa nagisa commented Aug 31, 2015

Map/Cloned does not call the mapping/cloning function unnecessarily anymore for count, last and nth, which has a serious benefit to performance, especially with expensive mapping functions.

For example

(1..10).map(|x| {
    for y in 1..50 { black_box(y); }
    black_box(x)
}).last()

runs in 25 ns/iter (+/- 9) compared to 238 ns/iter (+/- 21) with nightly.

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@nagisa
Copy link
Member Author

nagisa commented Aug 31, 2015

Thinking now, this might need an RFC since it breaks people’s assumptions with their side-effectful Clone impls and mapping functions.

Please advise.

@bluss
Copy link
Member

bluss commented Aug 31, 2015

My idea has been that these can only be specialized if there is no user-visible difference from the default Iterator-trait provided implementation of last, nth, count. We can change, good to discuss explicitly.

@bluss
Copy link
Member

bluss commented Aug 31, 2015

I believe these specializations aren't of very high importance. Better support for bidirectional or random access would be bigger.

#[inline]
fn last(self) -> Option<B> {
let f = self.f;
self.iter.last().map(f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just nitpicking... is there a reason why last does not use the same syntax as next and nth? (i.e. self.iter.last().map(|a| (self.f)(a)))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it doesn’t compile this way. self is by value here and is therefore partially moved by self.iter which does not allow capturing self into the closure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right, I did not realise that the signature was different... but then self.iter.last().map(self.f) should work just fine, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it works.

@nagisa
Copy link
Member Author

nagisa commented Aug 31, 2015

I think they are important. If another method is added to Iterator in future releases, there could be incentive to make it run clone/mapping over all elements (because that’s what other functions do currently) even if not doing so would be very advantageous to performance of said method on Map-like iterators.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Aug 31, 2015
@alexcrichton
Copy link
Member

I'm totally fine with this, pedantically this is a breaking change but practically I highly doubt that it will be. I'll tag this with T-libs so we can discuss but our conclusion may just be to clarify the documentation of the methods on the Iterator trait to indicate that the entire iterator may not be exhausted.

@bluss
Copy link
Member

bluss commented Aug 31, 2015

@alexcrichton It's not about breaking changes but about how we want iterators to behave

@bluss
Copy link
Member

bluss commented Sep 1, 2015

@alexcrichton It breaks existing code. Examples: (1), (2), (3), (4)

.count() is a method known to consume the iterator, with side effects. I think we should maintain it can only be specialized if the difference is invisible (for example: the slice iterator).

@nikomatsakis
Copy link
Contributor

r? @alexcrichton

@nagisa
Copy link
Member Author

nagisa commented Sep 5, 2015

I’m willing to limit the scope to Cloned only, as suggested by @bluss on IRC.

While it is still theoretically breaking, I think the impact should be non-existent in the real world.

@Stebalien
Copy link
Contributor

Definitely not map. The last documentation even explicitly says "loops through the entire iterator". I'm hesitant about cloned but, if you do want to do that, you should change the documentation to specify that it only clones when necessary.

@ranma42
Copy link
Contributor

ranma42 commented Sep 8, 2015

If I am not mistaken, this would impact both side-effectful Clone and Drop implementations (it might be obvious to most devs, but the Drop part has not yet been mentioned so far).

Has there been any investigation on what implementations of the Iterator trait need an explicit specialisation for count, last and nth and which ones can be automatically optimised by LLVM?
This would be more fragile, because updates to the code generation logic and to LLVM might cause regressions, but we might try to make it more robust by extending the codegen tests to also verify that (when optimisations are enabled), rustc generates the desired code.

@aturon aturon added the I-needs-decision Issue: In need of a decision. label Sep 16, 2015
@Kimundi
Copy link
Member

Kimundi commented Sep 23, 2015

The count()-not-exhausting issue might be an argument for just giving in and adding an .drain() method to iterators that runs them to exhaustion, and is defined as such.

This would address the fair-to-common usecase where someone just wants to chain a method to run the iterator chain, instead of using the for syntax with a _ pattern, and where the best current workaround is calling count() and ignoring the result.

@alexcrichton
Copy link
Member

Just as an update, the libs team has had quite a full agenda the past few weeks (dealing with FCP both ending and starting anew), but we'll be sure to get around to this next week!

@bluss
Copy link
Member

bluss commented Sep 26, 2015

@Kimundi I think an actual .foreach(f) is the strictly more versatile alternative; it can serve as both. It's requested all the time by, I guess, a section of programmers used to that kind of style from other languages. Either way, we can't change the semantics of .count().

@alexcrichton
Copy link
Member

Ok, thanks for the patience here! The libs team talked about this and the conclusion was that these iterator adaptor methods should always preserve the same semantics as the current default implementations, specifically with respect to side effects that may be seen.

In light of that I'm going to close this as "unfortunately we can't do this", and otherwise I've opened a tracking issue for auditing those changes we've already made in the standard library.

@alexcrichton alexcrichton removed the I-needs-decision Issue: In need of a decision. label Oct 2, 2015
@bluss
Copy link
Member

bluss commented Oct 2, 2015

Thank you, that sounds great for iterators.

krobelus added a commit to fish-shell/fish-shell that referenced this pull request Jan 7, 2024
…).last()

Iterator::last() consumes the entire iterator, even for DoubleEndedIterator,
see rust-lang/rust#28125 (comment)

Because of this, "at_line_start()" took 90% of

    fish_indent share/completions/git.fish

making it take 1000ms instead of 30 ms. Fix that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants