-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Iterator::array_chunks
(take N+1)
#100026
Merged
Merged
Changes from all commits
Commits
Show all changes
12 commits
Select commit
Hold shift + click to select a range
ca3d101
Add `Iterator::array_chunks()`
rossmacarthur f548518
Use `array::IntoIter` for the `ArrayChunks` remainder
rossmacarthur ef72349
Remove `array::IntoIter::with_partial` -- an artifact of the past, on…
WaffleLapkin b8b1486
Forward `ArrayChunks::next{,_back}` to `try_{for_each,rfold}`
WaffleLapkin 4db628a
Remove incorrect impl `TrustedLen` for `ArrayChunks`
WaffleLapkin 3102b39
Use `#[track_caller]` to make panic in `Iterator::array_chunks` nicer
WaffleLapkin 37dfb04
Remove `Fuse` from `ArrayChunks` implementation
WaffleLapkin 4c0292c
Simplify `ArrayChunks::is_empty`
WaffleLapkin 475e4ba
Simplify `ArrayChunks::{,r}fold` impls
WaffleLapkin 756bd6e
Use `next_chunk` in `ArrayChunks` impl
WaffleLapkin eb6b729
address review comments
WaffleLapkin 5fbcde1
fill-in tracking issue for `feature(iter_array_chunks)`
WaffleLapkin File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,182 @@ | ||
use crate::array; | ||
use crate::iter::{ByRefSized, FusedIterator, Iterator}; | ||
use crate::ops::{ControlFlow, NeverShortCircuit, Try}; | ||
|
||
/// An iterator over `N` elements of the iterator at a time. | ||
/// | ||
/// The chunks do not overlap. If `N` does not divide the length of the | ||
/// iterator, then the last up to `N-1` elements will be omitted. | ||
/// | ||
/// This `struct` is created by the [`array_chunks`][Iterator::array_chunks] | ||
/// method on [`Iterator`]. See its documentation for more. | ||
#[derive(Debug, Clone)] | ||
#[must_use = "iterators are lazy and do nothing unless consumed"] | ||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
pub struct ArrayChunks<I: Iterator, const N: usize> { | ||
iter: I, | ||
remainder: Option<array::IntoIter<I::Item, N>>, | ||
} | ||
|
||
impl<I, const N: usize> ArrayChunks<I, N> | ||
where | ||
I: Iterator, | ||
{ | ||
#[track_caller] | ||
pub(in crate::iter) fn new(iter: I) -> Self { | ||
assert!(N != 0, "chunk size must be non-zero"); | ||
Self { iter, remainder: None } | ||
} | ||
|
||
/// Returns an iterator over the remaining elements of the original iterator | ||
/// that are not going to be returned by this iterator. The returned | ||
/// iterator will yield at most `N-1` elements. | ||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
#[inline] | ||
pub fn into_remainder(self) -> Option<array::IntoIter<I::Item, N>> { | ||
self.remainder | ||
} | ||
} | ||
|
||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
impl<I, const N: usize> Iterator for ArrayChunks<I, N> | ||
where | ||
I: Iterator, | ||
{ | ||
type Item = [I::Item; N]; | ||
|
||
#[inline] | ||
fn next(&mut self) -> Option<Self::Item> { | ||
self.try_for_each(ControlFlow::Break).break_value() | ||
} | ||
|
||
#[inline] | ||
fn size_hint(&self) -> (usize, Option<usize>) { | ||
let (lower, upper) = self.iter.size_hint(); | ||
|
||
(lower / N, upper.map(|n| n / N)) | ||
} | ||
|
||
#[inline] | ||
fn count(self) -> usize { | ||
self.iter.count() / N | ||
} | ||
|
||
fn try_fold<B, F, R>(&mut self, init: B, mut f: F) -> R | ||
where | ||
Self: Sized, | ||
F: FnMut(B, Self::Item) -> R, | ||
R: Try<Output = B>, | ||
{ | ||
let mut acc = init; | ||
loop { | ||
match self.iter.next_chunk() { | ||
Ok(chunk) => acc = f(acc, chunk)?, | ||
Err(remainder) => { | ||
// Make sure to not override `self.remainder` with an empty array | ||
// when `next` is called after `ArrayChunks` exhaustion. | ||
self.remainder.get_or_insert(remainder); | ||
|
||
break try { acc }; | ||
} | ||
} | ||
} | ||
} | ||
|
||
fn fold<B, F>(mut self, init: B, f: F) -> B | ||
where | ||
Self: Sized, | ||
F: FnMut(B, Self::Item) -> B, | ||
{ | ||
self.try_fold(init, NeverShortCircuit::wrap_mut_2(f)).0 | ||
} | ||
} | ||
|
||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
impl<I, const N: usize> DoubleEndedIterator for ArrayChunks<I, N> | ||
where | ||
I: DoubleEndedIterator + ExactSizeIterator, | ||
{ | ||
#[inline] | ||
fn next_back(&mut self) -> Option<Self::Item> { | ||
self.try_rfold((), |(), x| ControlFlow::Break(x)).break_value() | ||
} | ||
|
||
fn try_rfold<B, F, R>(&mut self, init: B, mut f: F) -> R | ||
where | ||
Self: Sized, | ||
F: FnMut(B, Self::Item) -> R, | ||
R: Try<Output = B>, | ||
{ | ||
// We are iterating from the back we need to first handle the remainder. | ||
self.next_back_remainder(); | ||
|
||
let mut acc = init; | ||
let mut iter = ByRefSized(&mut self.iter).rev(); | ||
|
||
// NB remainder is handled by `next_back_remainder`, so | ||
// `next_chunk` can't return `Err` with non-empty remainder | ||
// (assuming correct `I as ExactSizeIterator` impl). | ||
while let Ok(mut chunk) = iter.next_chunk() { | ||
// FIXME: do not do double reverse | ||
// (we could instead add `next_chunk_back` for example) | ||
chunk.reverse(); | ||
acc = f(acc, chunk)? | ||
} | ||
|
||
try { acc } | ||
} | ||
|
||
fn rfold<B, F>(mut self, init: B, f: F) -> B | ||
where | ||
Self: Sized, | ||
F: FnMut(B, Self::Item) -> B, | ||
{ | ||
self.try_rfold(init, NeverShortCircuit::wrap_mut_2(f)).0 | ||
} | ||
} | ||
|
||
impl<I, const N: usize> ArrayChunks<I, N> | ||
where | ||
I: DoubleEndedIterator + ExactSizeIterator, | ||
{ | ||
/// Updates `self.remainder` such that `self.iter.len` is divisible by `N`. | ||
fn next_back_remainder(&mut self) { | ||
// Make sure to not override `self.remainder` with an empty array | ||
// when `next_back` is called after `ArrayChunks` exhaustion. | ||
if self.remainder.is_some() { | ||
return; | ||
} | ||
|
||
// We use the `ExactSizeIterator` implementation of the underlying | ||
// iterator to know how many remaining elements there are. | ||
let rem = self.iter.len() % N; | ||
|
||
// Take the last `rem` elements out of `self.iter`. | ||
let mut remainder = | ||
// SAFETY: `unwrap_err` always succeeds because x % N < N for all x. | ||
unsafe { self.iter.by_ref().rev().take(rem).next_chunk().unwrap_err_unchecked() }; | ||
|
||
// We used `.rev()` above, so we need to re-reverse the reminder | ||
remainder.as_mut_slice().reverse(); | ||
self.remainder = Some(remainder); | ||
} | ||
} | ||
|
||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
impl<I, const N: usize> FusedIterator for ArrayChunks<I, N> where I: FusedIterator {} | ||
|
||
#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "100450")] | ||
impl<I, const N: usize> ExactSizeIterator for ArrayChunks<I, N> | ||
where | ||
I: ExactSizeIterator, | ||
{ | ||
#[inline] | ||
fn len(&self) -> usize { | ||
self.iter.len() / N | ||
} | ||
|
||
#[inline] | ||
fn is_empty(&self) -> bool { | ||
self.iter.len() < N | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I'll block the PR on this, but
unwrap_err_unchecked
here scares me -- I added a note to the tracking issue.My instinct for now is that
%N
is definitely trustworthy, so that combined withtake
is sufficient to say this is ok. But I'm still afraid, since it's notTrustedLen
, and wonder if someone can come up with an evil example here where it somehow gets a full chunk and thus is UB in safe code.(At least the obvious things, like a
len()
that always saysusize
, is protected against because of the modulo.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As already noted in another comment I don't like this entire method. Either we should add
next_chunk_back
or remove the DEI impl.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, an evil impl can return less than
rem
elements, butactual <= rem < N
still holds (I don't think there is any way to trickTake
into returning more thanrem
elements) so the chunk will be an error no matter what.An evil impl can just return a wrong length, so after this iterator returns a number of elements not divisible by
N
. But this is fine too, anErr
will just be ignored:rust/library/core/src/iter/adapters/array_chunks.rs
Lines 116 to 119 in 5fbcde1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said, as @the8472 already said, we should probably rewrite the DEI impl anyway (I'll try to do this, after this lands).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the
%
is certainly fine. What I'm not yet convinced of is that there's no way to maketake
misbehave somehow -- for example,Take::try_fold
uses the innertry_fold
, so maybe there'd be a way for that to be implemented wrongly-but-not-UB that could result in too many things getting put in the array somehow.But I agree that just making a nicer implementation without the
unsafe{}
is the right plan.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you are right 😨
Overriding
try_fold
with some nasty unsafe to skip overBreak(_)
allows you to trickTake
into returning more elements that it should (playground).n
can underflow, if a bad iterator impl skips overBreak(_)
:rust/library/core/src/iter/adapters/take.rs
Lines 87 to 89 in 2fbc08e
The fact that unsafe code can't trust
Take
is scary.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem there is still in the
unsafe
inMisbehave
-- it protects against double-drop, but duplicating arbitrary values can violate safety invariants in general. (For example, if the type is!Clone + !Default
, I can use you moving it into me to track resource consumption, like a ZST tracker for a global semaphore.) So that specific example is just "well unsoundunsafe
-using code breaks everything".But yeah, even though I've not been able to come up with a safe trick that would do it, things along those lines are what make me worried about it. (And certainly if specialization existed then it would be possible to do particularly weird things in safe code because it can violate parametricity even worse than in normal code.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example could check, that
type_id::<B>() == type_id::<usize>()
or something similar, which would IMO make the code sound.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I thought about that one, but
TypeId::of::<B>()
needsB: 'static
, which we don't have here. Regardless, it's yet another chink in the armour that makes be scared we'll break through eventually.