-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators #60130
Conversation
…eEndedIterators. r?Manishearth
(rust_highfive has picked a reviewer for you, use r? to override) |
Maybe the docs should be adjusted to mention that this only applies to the default implementation. |
r? @sfackler |
I think the behavior we want is that the method should behave as if the iterator was advanced until it returns |
That's somewhat unfortunate because it means this optimization can't apply to any combinator. |
It would also be unfortunate to break code that is using these methods as documented. |
Indeed. |
Could we just lint the use of |
To raise just one issue, in generic code you may not even know that your |
@khuey My experience in C++ is that having things (such as C++'s |
@scottmcm on the other hand you might end up wasting joules of the many to reduce surprises of the few. |
I find this uncompelling for all the obvious reasons, so I'll just say that method specialization for performance has prior art in stdlib. e.g. every specialization of |
Really, I'm kinda surprised that |
Ping? |
Oops, lost track of this! @bors r+ |
📌 Commit 3e86cf3 has been approved by |
🌲 The tree is currently closed for pull requests below priority 500, this pull request will be tested once the tree is reopened |
ta |
Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators Provided a `DoubleEndedIterator` has finite length, `Iterator::last` is equivalent to `DoubleEndedIterator::next_back`. But searching forwards through the iterator when it's unnecessary is obviously not good for performance. I ran into this on one of the collection iterators. I tried adding appropriate overloads for a bunch of the iterator adapters like filter, map, etc, but I ran into a lot of type inference failures after doing so. The other interesting case is what to do with `Repeat`. Do we consider it part of the contract that `Iterator::last` will loop forever on it? The docs do say that the iterator will be evaluated until it returns None. This is also relevant for the adapters, it's trivially easy to observe whether a `Map` adapter invoked its closure a zillion times or just once for the last element.
Rollup of 9 pull requests Successful merges: - #60130 (Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators) - #60443 (as_ptr returns a read-only pointer) - #60444 (forego caching for all participants in cycles, apart from root node) - #60719 (Allow subdirectories to be tested by x.py test) - #60780 (fix Miri) - #60788 (default to $ARCH-apple-macosx10.7.0 LLVM triple for darwin targets) - #60799 (Allow late-bound regions in existential types) - #60808 (Improve the "must use" lint for `Future`) - #60819 (submodules: update clippy from 3710ec5 to ad3269c) Failed merges: r? @ghost
Why was this merged when it breaks the documented contract for the function, as noted above? The documentation hasn't even been updated to reflect the change! Here it is:
Having used rust for many years now, I'm shocked to learn that the team believes that documentation of "provided" trait methods only applies to the default implementation. That does not at all align with... well, any other trait in the library. To be very clear, this change did not only change iterators where the difference cannot be observed. To wit: impl<I> Iterator for Rev<I> where I: DoubleEndedIterator {
...
#[inline]
fn last(mut self) -> Option<Self::Item> {
self.next_back()
}
...
} This change breaks the documented contract in an observable way. The following program produces different side-effects before and after this change: fn main() {
let mut iter = [1, 2, 3, 4, 5].iter().inspect(|_| { println!("next()") }).rev();
let _ = iter.last();
} (@sfackler who reviewed because I'm not sure whether anyone will see this otherwise.) |
Oh, oops. I didn't notice that the Rev implementation was still in this change. That part definitely needs to be reverted. |
Actually, all of the slice overrides are also observable, since they call the user-provided closure. I'm just going to revert the whole PR. |
@sfackler, is there documentation on adding unit tests to Of course, I guess it has to be added separately for each iterator type so it may not really make sense; maybe you can provide guidance there too. Maybe a standard suite of tests for each iterator impl? |
Arguably this is a flaw in the iterator API design. The java stream APIs mandate statelessness and side-effect freedom for most of their intermediate operations and by default don't guarantee intermediate operations to be invoked in any particular order, or even at all. Exceptions to that general policy are spelled out on individual operations. https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/package-summary.html This allows fairly aggressive optimization of pipelines by fusing intermediate ops to produce the same end results. |
Adding tests to std is basically the same as adding them to any other crate. For example, btree iterator tests are here: https://github.com/rust-lang/rust/blob/master/src/liballoc/tests/btree/map.rs It seems like it should be possible to make a standard iterator test suite by having a function that compares an iterable to a slice of expected values. |
@the8472 that doesn't really seem actionable in any way? |
Wait, the change that's observable here are the things that invoke user defined callbacks, which, fair enough. But what's wrong with |
Rev can wrap an iterator with side effects. |
|
@sfackler at least it is something to consider when designing adding new methods on Iterator. Keep the guarantees to an absolute minimum. |
Ah, right, |
I wasn't talking about |
@the8472 unfortunately rust doesn't have a |
@sfackler, wow, yeah, I missed that. It's not 100% clear that |
@khuey I don't think pure functions are necessary for most optimizations even though they are sufficient. Making explicit non-guarantees about how/when they are invoked gets you most of the way. |
|
@khuey Did you have any specific use case in mind here? I haven't been able to come up with examples of any useful calls to |
FWIW, liballoc tests are split across the liballoc crate itself (like normal unit tests) and the dedicated test crate at tests. |
When possible without changing semantics, implement Iterator::last in terms of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto This is attempt 2 of rust-lang#60130. r? @sfackler
When possible without changing semantics, implement Iterator::last in terms of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto This is attempt 2 of rust-lang#60130. r? @sfackler
When possible without changing semantics, implement Iterator::last in terms of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto This is attempt 2 of rust-lang#60130. r? @sfackler
When possible without changing semantics, implement Iterator::last in terms of DoubleEndedIterator::next_back for types in liballoc and libcore. Provided that the iterator has finite length and does not trigger user-provided code, this is safe. What follows is a full list of the DoubleEndedIterators in liballoc/libcore and whether this optimization is safe, and if not, why not. src/liballoc/boxed.rs Box: Pass through to avoid defeating optimization of the underlying DoubleIterator implementation. This has no correctness impact. src/liballoc/collections/binary_heap.rs Iter: Pass through to avoid defeating optimizations on slice::Iter IntoIter: Not safe, changes Drop order Drain: Not safe, changes Drop order src/liballoc/collections/btree/map.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Keys: Safe to call next_back, invokes no user defined code. Values: ditto ValuesMut: ditto Range: ditto RangeMut: ditto src/liballoc/collections/btree/set.rs Iter: Safe to call next_back, invokes no user defined code. IntoIter: Not safe, changes Drop order Range: Safe to call next_back, invokes no user defined code. src/liballoc/collections/linked_list.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order src/liballoc/collections/vec_deque.rs Iter: Safe to call next_back, invokes no user defined code. IterMut: ditto IntoIter: Not safe, changes Drop order Drain: ditto src/liballoc/string.rs Drain: Safe because return type is a primitive (char) src/liballoc/vec.rs IntoIter: Not safe, changes Drop order Drain: ditto Splice: ditto src/libcore/ascii.rs EscapeDefault: Safe because return type is a primitive (u8) src/libcore/iter/adapters/chain.rs Chain: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/adapters/flatten.rs FlatMap: Not safe, invokes user defined code (Iterator impl) Flatten: ditto FlattenCompat: ditto src/libcore/iter/adapters/mod.rs Rev: Not safe, invokes user defined code (Iterator impl) Copied: ditto Cloned: Not safe, invokes user defined code (Iterator impl and T::clone) Map: Not safe, invokes user defined code (Iterator impl + closure) Filter: ditto FilterMap: ditto Enumerate: Not safe, invokes user defined code (Iterator impl) Skip: ditto Fuse: ditto Inspect: ditto src/libcore/iter/adapters/zip.rs Zip: Not safe, invokes user defined code (Iterator impl) src/libcore/iter/range.rs ops::Range: Not safe, changes Drop order, but ALREADY HAS SPECIALIZATION ops::RangeInclusive: ditto src/libcore/iter/sources.rs Repeat: Not safe, calling last should iloop. Empty: No point, iterator is at most one item long. Once: ditto OnceWith: ditto src/libcore/option.rs Item: No point, iterator is at most one item long. Iter: ditto IterMut: ditto IntoIter: ditto src/libcore/result.rs Iter: No point, iterator is at most one item long IterMut: ditto IntoIter: ditto src/libcore/slice/mod.rs Split: Not safe, invokes user defined closure SplitMut: ditto RSplit: ditto RSplitMut: ditto Windows: Safe, already has specialization Chunks: ditto ChunksMut: ditto ChunksExact: ditto ChunksExactMut: ditto RChunks: ditto RChunksMut: ditto RChunksExact: ditto RChunksExactMut: ditto src/libcore/str/mod.rs Chars: Safe, already has specialization CharIndices: ditto Bytes: ditto Lines: Safe to call next_back, invokes no user defined code. LinesAny: Deprecated Everything that is generic over P: Pattern: Not safe because Pattern invokes user defined code. SplitWhitespace: Safe to call next_back, invokes no user defined code. SplitAsciiWhitespace: ditto This is attempt 2 of rust-lang#60130. r? @sfackler
Provided a
DoubleEndedIterator
has finite length,Iterator::last
is equivalent toDoubleEndedIterator::next_back
. But searching forwards through the iterator when it's unnecessary is obviously not good for performance. I ran into this on one of the collection iterators.I tried adding appropriate overloads for a bunch of the iterator adapters like filter, map, etc, but I ran into a lot of type inference failures after doing so.
The other interesting case is what to do with
Repeat
. Do we consider it part of the contract thatIterator::last
will loop forever on it? The docs do say that the iterator will be evaluated until it returns None. This is also relevant for the adapters, it's trivially easy to observe whether aMap
adapter invoked its closure a zillion times or just once for the last element.