Add Iterator::collect_into and Iterator::collect_with #92982

Closed (proposed 3 commits)
148 changes: 148 additions & 0 deletions library/core/src/iter/traits/iterator.rs
@@ -1741,6 +1741,154 @@ pub trait Iterator {
FromIterator::from_iter(self)
}

/// Collects all the items from an iterator into a collection.
///
/// This method consumes the iterator and adds all its items to the
/// passed collection. The collection is then returned, so the call chain
/// can be continued.
///
/// The collection is passed and returned by mutable reference.
/// To pass it by value, use [`collect_with`].
///
/// [`collect_with`]: Iterator::collect_with
///
/// This is useful when you already have a collection and want to add
/// the iterator items to it.
///
/// This method is a convenience method to call [Extend::extend](trait.Extend.html),
/// but instead of being called on a collection, it's called on an iterator.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let a = [1, 2, 3];
/// let mut vec: Vec::<i32> = vec![0, 1];
///
/// a.iter().map(|&x| x * 2).collect_into(&mut vec);
/// a.iter().map(|&x| x * 10).collect_into(&mut vec);
///
/// assert_eq!(vec![0, 1, 2, 4, 6, 10, 20, 30], vec);
/// ```
///
/// A `Vec` can be given a manually set capacity to avoid reallocating it:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let a = [1, 2, 3];
/// let mut vec: Vec::<i32> = Vec::with_capacity(6);
///
/// a.iter().map(|&x| x * 2).collect_into(&mut vec);
/// a.iter().map(|&x| x * 10).collect_into(&mut vec);
///
/// assert_eq!(6, vec.capacity());
/// ```
///
/// The returned mutable reference can be used to continue the call chain:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let a = [1, 2, 3];
/// let mut vec: Vec::<i32> = Vec::new();
///
/// let count = a.iter().collect_into(&mut vec).iter().count();
///
/// assert_eq!(count, vec.len());
/// println!("Vec len is {}", count);
///
/// let count = a.iter().collect_into(&mut vec).iter().count();
///
/// assert_eq!(count, vec.len());
/// println!("Vec len now is {}", count);
/// ```
// must_use not added here since collect_into takes a (mutable) reference
#[inline]
#[unstable(feature = "iter_more_collects", reason = "new API", issue = "none")]
fn collect_into<E: Extend<Self::Item>>(self, collection: &mut E) -> &mut E
where
Self: Sized,
{
collection.extend(self);
collection
}

/// Collects all the items from an iterator with a collection.
///
/// This method consumes the iterator and adds all its items to the
/// passed collection. The collection is then returned, so the call chain
/// can be continued.
///
/// The collection is passed and returned by value. To pass it by mutable
/// reference, use [`collect_into`].
///
/// [`collect_into`]: Iterator::collect_into
///
/// This is useful when you want to pre-allocate memory for the collection
/// that will contain the iterator items.
///
/// This method is a convenience method to call [Extend::extend](trait.Extend.html),
/// but instead of being called on a collection, it's called on an iterator.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let a = [1, 2, 3];
///
/// let doubled = a.iter()
/// .map(|&x| x * 2)
/// .collect_with(Vec::new());
Member

I think this sets a bad example since it uses Extend and we have separate code paths for FromIterator which are optimized for the fact that we're starting with an empty vec. It's kind of like doing

let mut vec = Vec::new();
vec.extend(iter);
vec

Which is a silly thing to do when you can just use iter.collect()

For that reason I think the name also is suboptimal since it associates itself with collect() and not extend().
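To make the comparison concrete, here is a minimal sketch of the two routes to the same `Vec`. The function names are hypothetical and chosen only for illustration. Both produce identical contents; the point raised above is that only the first goes through `FromIterator`, which can assume it starts from an empty collection.

```rust
/// Builds the vector through `FromIterator`, the path `collect()` uses.
/// (Hypothetical helper name, for illustration only.)
fn doubled_via_collect(input: &[i32]) -> Vec<i32> {
    input.iter().map(|&x| x * 2).collect()
}

/// Builds the same vector through `Extend`, starting from an empty `Vec`.
/// This mirrors what `collect_with(Vec::new())` would do in the proposed API.
fn doubled_via_extend(input: &[i32]) -> Vec<i32> {
    let mut vec = Vec::new();
    vec.extend(input.iter().map(|&x| x * 2));
    vec
}
```

The results are equal, so any difference between the two is in performance, not in behaviour.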

Contributor Author

Should I put Vec::with_capacity(3) instead?

Member

size_hint already takes care of that in many cases, so it would only help in cases where the size hint of the iterator isn't accurate (e.g. when it contains a filter). Or maybe when collecting into a collection with a custom allocator.
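As a small illustration of the `filter` case, the sketch below compares the size hint of a plain range with the same range behind a filter. The helper name `size_hints` is hypothetical. The filter cannot know how many items will pass, so its lower bound drops to zero.

```rust
/// Returns the size hint of a plain range and of the same range behind a
/// filter. (Hypothetical helper, for illustration only.)
fn size_hints() -> ((usize, Option<usize>), (usize, Option<usize>)) {
    // A range knows its exact length up front.
    let exact = (0..50).size_hint();
    // A filter can only promise "between 0 and 50 items".
    let filtered = (0..50).filter(|x| x % 2 == 0).size_hint();
    (exact, filtered)
}
```

With only a lower bound of zero to go on, a collection cannot reserve the right amount up front, which is the situation where a manually chosen capacity may help.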

Contributor Author

Ok, so if I understood correctly, I should use a filter (or some other adapter) in the examples that makes size_hint not provide any useful info for iter.collect() right?
I could also mention that usually iter.collect() is preferred when size_hint is accurate

Member

Yes

Contributor (@cormacrelf, Jan 17, 2022)

This lack of optimisation for the empty case is a bit of a worry in terms of whether collect_with is a good API at all. Wouldn't all of the good usages be better covered by the mutable collect_into? I had envisioned we could use .collect_with(Vec::new()) to replace the turbofish, but it seems that is almost never a good idea, or at least only applicable in certain circumstances. Now we would be creating a more ergonomic API whose use we would have to immediately discourage.

Contributor Author

collect_with can be very useful in order to avoid reallocation(s) when size_hint returns the default value.

> I had envisioned we could use .collect_with(Vec::new()) to replace the turbofish, but it seems that is almost never a good idea, or at least only applicable in certain circumstances.

I agree that it is almost never a good idea. I also prefer the turbofish syntax where applicable.

My idea for collect_with was a method to be used when size_hint is not helpful and the collection (into which the items are collected) does not already exist. This allows shorter code:

let len1 = vec![1, 2, 3]
    .into_iter()
    .filter(|&x| x < 3)
    .collect_with(Vec::with_capacity(2))
    .len();

// Instead of
let mut vec = Vec::with_capacity(2);
let len2 = vec![1, 2, 3]
    .into_iter()
    .filter(|&x| x < 3)
    .collect_into(&mut vec)
    .len();

I think we could specify this in the docs. However, someone could still misuse the method. Calling Extend::extend_reserve should at least close the speed gap between collect() and a misused collect_with (not implemented yet).

Also, I realised that collect_with can be implemented with a call to collect_into. Would that be a better implementation?
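A rough sketch of that delegation, written as an extension trait so it compiles on stable Rust. The trait name `ExtendExt` and the method names `extend_into` and `extend_with` are hypothetical stand-ins for the proposed `collect_into` and `collect_with`; this is not the code in the PR.

```rust
/// Hypothetical extension trait; the names are illustrative, not the PR's API.
trait ExtendExt: Iterator + Sized {
    /// By-reference variant: extends `collection` and hands the reference back.
    fn extend_into<E: Extend<Self::Item>>(self, collection: &mut E) -> &mut E {
        collection.extend(self);
        collection
    }

    /// By-value variant, written as a thin wrapper over `extend_into`.
    fn extend_with<E: Extend<Self::Item>>(self, mut collection: E) -> E {
        self.extend_into(&mut collection);
        collection
    }
}

// Blanket implementation so every iterator gets both methods.
impl<I: Iterator> ExtendExt for I {}
```

Under this sketch the by-value method adds no logic of its own, so any future change to the by-reference method (such as reserving capacity) would apply to both.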

///
/// assert_eq!(vec![2, 4, 6], doubled);
/// ```
///
/// Collecting an iterator into a `Vec` with manually set capacity in order
/// to avoid reallocating it:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let doubled = (0..50).map(|x| x * 2)
/// .collect_with(Vec::with_capacity(50));
///
/// assert_eq!(50, doubled.capacity());
/// ```
///
/// Passing to `collect_with` a collection with less capacity than necessary
/// can lead to less performant code:
///
/// ```
/// #![feature(iter_more_collects)]
///
/// let chars = ['g', 'd', 'k', 'k', 'n'];
///
/// let hello = chars.iter()
/// .map(|&x| x as u8)
/// .map(|x| (x + 1) as char)
/// .collect_with(String::with_capacity(2));
///
/// assert_eq!("hello", hello);
/// assert!(5 <= hello.capacity()); // At least one reallocation happened
/// ```
#[inline]
#[unstable(feature = "iter_more_collects", reason = "new API", issue = "none")]
#[must_use = "if you really need to exhaust the iterator, consider `.for_each(drop)` instead"]
fn collect_with<E: Extend<Self::Item>>(self, mut collection: E) -> E
where
Self: Sized,
{
collection.extend(self);
collection
}

/// Consumes an iterator, creating two collections from it.
///
/// The predicate passed to `partition()` can return `true`, or `false`.
15 changes: 15 additions & 0 deletions library/core/tests/iter/traits/iterator.rs
@@ -496,3 +496,18 @@ fn test_collect() {
let b: Vec<isize> = a.iter().cloned().collect();
assert!(a == b);
}

#[test]
fn test_collect_into() {
let a = vec![1, 2, 3, 4, 5];
let mut b = Vec::new();
a.iter().cloned().collect_into(&mut b);
assert!(a == b);
}

#[test]
fn test_collect_with() {
let a = vec![1, 2, 3, 4, 5];
let b = a.iter().cloned().collect_with(Vec::with_capacity(5));
assert!(a == b);
}
1 change: 1 addition & 0 deletions library/core/tests/lib.rs
@@ -60,6 +60,7 @@
#![feature(slice_partition_dedup)]
#![feature(int_log)]
#![feature(iter_advance_by)]
#![feature(iter_more_collects)]
#![feature(iter_partition_in_place)]
#![feature(iter_intersperse)]
#![feature(iter_is_partitioned)]