Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make iter::Fuse truly zero cost #70374

Closed
wants to merge 4 commits into from
Closed

Make iter::Fuse truly zero cost #70374

wants to merge 4 commits into from

Conversation

AnthonyMikh
Copy link
Contributor

@AnthonyMikh AnthonyMikh commented Mar 24, 2020

As documentation for std::iter::FusedIterator says:

Note: In general, you should not use FusedIterator in generic bounds if you need a fused iterator. Instead, you should just call Iterator::fuse on the iterator. If the iterator is already fused, the additional Fuse wrapper will be a no-op with no performance penalty.

However, the "no performance penalty" part seems a bit questionable: Fuse always keeps done flag no matter if wrapped iterator is fused or not. For a fused iterator it makes no sense to actually keep this flag, yet Fuse blindly attaches it to everything.

This PR resolves the issue: the new implementation guarantess that Fuse on fused iterator just forwards methods to wrapped iterator and carries no additional data. This may bring benefit for code that operates on generic iterator and immediately calls .fuse() on it.

Possible reasons not to merge this PR (in no particular order):

  1. By always keeping a flag we can report correct size_hint for exhausted iterator even if internal one has improper size_hint implementation. However, now this opportunity is not used, and Fuse::<I: FusedIterator>::size_hint just forwards method to wrapped iterator.
  2. This PR adds one more use of specialization which is still incomplete.
  3. This PR changes Debug output of Fuse. However, it can be easily mitigated with custom Debug impl.
  4. This change can break crates that relies on iterator-contained types to have some specific concrete sizes.
  5. Actual performance win is likely to be negligible.
  6. This change relies on trick, and it is unclear if this trick will work with possible future methods of Iterator.

r? @SimonSapin

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 24, 2020
@timvermeulen
Copy link
Contributor

FWIW, a similar optimization to Zip was later reverted because it caused variance issues.

@SimonSapin
Copy link
Contributor

Sorry, I don’t have much bandwidth available to review Rust PRs. Please consider letting the highfive bot pick a reviewer, or picking one based on https://github.com/rust-lang/highfive/blob/master/highfive/configs/rust-lang/rust.json

Per url -s https://raw.githubusercontent.com/rust-lang/highfive/master/highfive/configs/rust-lang/rust.json|jq .groups.libs|grep @|shuf -n1:

r? @joshtriplett

@WaffleLapkin
Copy link
Member

Btw, I think this PR conflicts with #70366 which also rewrites Fuse

@Centril
Copy link
Contributor

Centril commented Mar 25, 2020

r? @matthewjasper (due the new specialization moratorium) and cc @cuviper

@scottmcm
Copy link
Member

I think the two approaches could be combined, by having Fuse be Result<I, Tag>, where Tag is () by default, but ! when I: Fused. Layout optimizations already have size(Result<T, !>) = size(T).

@AnthonyMikh
Copy link
Contributor Author

I think the two approaches could be combined, by having Fuse be Result<I, Tag>, where Tag is () by default, but ! when I: Fused. Layout optimizations already have size(Result<T, !>) = size(T).

I don't think that it is desirable, AFAIK iterator adapters don't drop internal iterators. Restructuring Fuse in such a way would change it semantics and almost definetely break someone's code.

@cuviper
Copy link
Member

cuviper commented Mar 25, 2020

I've noted that #31844 still has unresolved questions about associated types, including whether they should allow specialization at all.

@scottmcm

I think the two approaches could be combined, by having Fuse be Result<I, Tag>, where Tag is () by default, but ! when I: Fused.

I like this idea, if we do allow associated type specialization, and assuming we also decide early drops are ok.

@AnthonyMikh

I don't think that it is desirable, AFAIK iterator adapters don't drop internal iterators.

In general they don't, because it would require more code and extra wrappers to drop early, adding unnecessary cost. Some do drop necessarily, like FlatMap.

Restructuring Fuse in such a way would change it semantics and almost definetely break someone's code.

It is an observable change, but that doesn't necessarily mean it's part of the API contract. I think someone would be on shaky ground to rely on exactly when Fuse drops.

@scottmcm
Copy link
Member

scottmcm commented Mar 25, 2020

FYI, the other PR has been r+'d.

I do like this PR, though, assuming that T-compiler is ok with using specialization to pick a type like this. I think having only one implementation of Iterator for Fuse sounds great, and a nice demonstration of type-based design to have the None branches just automatically and no-unsafe-needed disappear thanks to an uninhabited type.

You might even be able to do this by just changing .iter.as_mut()? to .iter.as_mut().ok()? (and similarly for as_ref and such, of course) after the other PR.

@bors
Copy link
Contributor

bors commented Mar 25, 2020

☔ The latest upstream changes (presumably #70404) made this pull request unmergeable. Please resolve the merge conflicts.

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 27, 2020
@AnthonyMikh
Copy link
Contributor Author

AnthonyMikh commented Mar 27, 2020

I think the two approaches could be combined, by having Fuse be Result<I, Tag>, where Tag is () by default, but ! when I: Fused.

Okay, I tried to implement this approach, but it turns that it is impossible (at least without negative trait bounds). In order to utilize layout optimizations, we need to keep internal iterator in Fuse as Result<I, StopState<I>> (or some other two-variants enum) where StopState is an alias for associated type of I which is, for example, () for regular iterators and core::convert::Infallible for fused iterators. Now take a look at default implementation of Iterator::next from #70366:

    #[inline]
    default fn next(&mut self) -> Option<<I as Iterator>::Item> {
        let next = self.iter.as_mut()?.next();
        if next.is_none() {
            self.iter = None;
        }
        next
    }

Remaking this method in desired requires us to write in the following way:

    default fn next(&mut self) -> Option<<I as Iterator>::Item> {
        let next = self.iter.as_mut()?.next(); // assuming we already rewrote `as_mut`
        if next.is_none() {
            self.iter = Err(???);
        }
        next
    }

And here comes the question: what should we put in place of ???. Since the code should work both with regular and fused iterators, this should be an expression which returns a value of StopValue<I> type. However, for fused iterator StopValue<I> is Infallible (or some other uninhabitated type), and the whole point of uninhabitated types is that they are, well, uninhabitated, so it is impossible to produce a value of this type. This issue can be fixed by somehow stating that this is implementation only for those Is which are not FusedIterator, but this requires negative trait bounds which are not even implemented.

@cuviper
Copy link
Member

cuviper commented Mar 27, 2020

Right now you have Flag::set(&mut self) which either to sets true or does nothing. I think in the new idea, you could instead have Flag::set<T>(iter: &mut Result<T, Self>), which sets Err(()) or again does nothing for the FusedIterator type of Flag.

@AnthonyMikh
Copy link
Contributor Author

Right now you have Flag::set(&mut self) which either to sets true or does nothing. I think in the new idea, you could instead have Flag::set<T>(iter: &mut Result<T, Self>), which sets Err(()) or again does nothing for the FusedIterator type of Flag.

It still doesn't solve the problem since unspecialized version still has to work with all iterators and thus make a value of uninhabitated type.

@cuviper
Copy link
Member

cuviper commented Mar 27, 2020

I meant something like:

    if next.is_none() {
        Flag::set(&mut self.iter);
    }

This can be a no-op for FusedIterator.

@AnthonyMikh
Copy link
Contributor Author

@cuviper thanks for suggesting this approach. I implemented it in #70502.

@AnthonyMikh
Copy link
Contributor Author

Since libs team seems to agree with change in dropping semantics, this PR is unlikely to be merged, so I close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.