Skip to content

iter.map() should not guarantee that the closure is executed for all the iterated elements. #1757

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ticki opened this issue Sep 25, 2016 · 16 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@ticki
Copy link
Contributor

ticki commented Sep 25, 2016

This would allow things like optimizing iter::Map's implementation of skip.

@ticki
Copy link
Contributor Author

ticki commented Sep 25, 2016

It doesn't guarantee it right now, but the implementation is limited, as if it was necessarily true.

@bluss
Copy link
Member

bluss commented Sep 26, 2016

To be more specific, executed or not for all elements when? Since Iterator::map does not call the closure a single time unless something more happens to the iterator.

I associate to special-casing the method count for map which I still believe is a significant breaking change if it would be done so that it does not call the mapping function. Previously discussed in rust-lang/rust#28125

@Stebalien
Copy link
Contributor

Alternatively, we could introduce a notion of purity (this has been discussed in the past but I don't remember where). That is, add a marker trait trait PureFn: Fn {}, automatically infer it for applicable closures/functions, and then add a #[pure] attribute for telling the compiler that a function has no important side effects (even if it looks like it might, e.g., panic, increment some global counter, etc...). This would allow optimizations based on purity. IMO, these optimizations should be done at the type level through specialization, not by the compiler itself.

Unfortunately, 99% of methods in debug mode wouldn't be pure because integer overflow panics in debug mode. This would make debug/release behavior differ significantly (although, arguably, it already does...).

@Amanieu
Copy link
Member

Amanieu commented Sep 26, 2016

If the map function is pure then the compiler can already optimize it away.

@eddyb
Copy link
Member

eddyb commented Sep 26, 2016

@Stebalien But if it's pure, then LLVM not being able to optimize it is more or less a bug.

@Stebalien
Copy link
Contributor

Sorry, "pure", not pure. My point is that most non-trivial functions in rust aren't pure (there's going to be an assert/unwrap/allocation etc. somewhere). However, many of these side effects aren't really important and it would be nice to be able to say "this function has no important/noticeable side effects, you can avoid calling it if you don't use the result". Basically, make this kind of optimization (this RFC issue) opt-in.

@bluss
Copy link
Member

bluss commented Sep 26, 2016

To defend the cause for some kind of special casing of the counting, the iterator might not be so simple so that a plain .count() without side effects can be reduced to a number (BTreeMap iterators and so on, maybe not even the VecDeque one).

@eddyb
Copy link
Member

eddyb commented Sep 26, 2016

@bluss That's different, in that the data structure may have some simpler way to get the value.
Or do you mean that kind of special-casing combined with @Stebalien's "pure"?

@bluss
Copy link
Member

bluss commented Sep 26, 2016

sure, the latter.

@durka
Copy link
Contributor

durka commented Sep 28, 2016

What about putting the guarantee not on map, but on count? That is, document count as "consumes all items from the iterator and returns the count". Because, for better or for worse, that's what its purpose has become. We could add another method that's documented to be specializable.

@Stebalien
Copy link
Contributor

@durka Once specialization lands, external crate could actually provide that using ExactSizeIterator:

#![feature(specialization)]

pub trait Count {
    fn count(self) -> usize;
}

impl<I> Count for I where I: Iterator {
    default fn count(self) -> usize {
        Iterator::count(self)
    }
}

impl<I> Count for I where I: ExactSizeIterator {
    fn count(mut self) -> usize {
        ExactSizeIterator::len(&mut self)
    }
}

pub fn len<I: IntoIterator>(iter: I) -> usize {
    Count::count(iter.into_iter())
}

@bluss
Copy link
Member

bluss commented Sep 28, 2016

@durka That's more or less what count already says Iterator::count in rustdoc

@durka
Copy link
Contributor

durka commented Sep 28, 2016

@Stebalien not sure what that adds over len anyway.

@Stebalien
Copy link
Contributor

@durka len only exists on iterators that implement ExactSizeIterator. This way, you can accept any Iterator (or IntoIterator) and get its length.

@bluss
Copy link
Member

bluss commented Sep 28, 2016

There's an intermediate way there too, let (lo, hi) = iter.size_hint(); if Some(lo) == hi { lo } else { iter.count() }

@bluss
Copy link
Member

bluss commented Sep 28, 2016

Famous example: chain will often have a dynamically exact size hint.

@nrc nrc added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Sep 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

7 participants