chain() make collect very slow

While working on a [SO](https://stackoverflow.com/questions/57378606/how-can-i-ensure-that-a-rust-vector-only-contains-alternating-types/57378944#57378944) question.

We was wondering if `chain()` would produce an acceptable speed, after some digging and benchmark, we come to the conclusion that `collect()` is slow because it use [`while let`](https://doc.rust-lang.org/src/alloc/vec.rs.html#1939). Unfortunately, this make collect very slow, I don't really understand why but that a fact.

But we saw that `for_each()` (probably thank to [`fold()`](https://doc.rust-lang.org/src/core/iter/adapters/chain.rs.html#101)) implementation of `chain()` don't have this problem and produce something a lot faster.

```rust
#![feature(test)]
extern crate test;

use either::Either; // 1.5.2
use std::iter;

#[derive(Debug, Default)]
pub struct Data<X, Y> {
    head: Option<Y>,
    pairs: Vec<(X, Y)>,
    tail: Option<X>,
}

impl<X, Y> Data<X, Y> {
    pub fn iter(&self) -> impl Iterator<Item = Either<&X, &Y>> {
        let head = self.head.iter().map(Either::Right);

        let pairs = self.pairs.iter().flat_map(|(a, b)| {
            let a = iter::once(Either::Left(a));
            let b = iter::once(Either::Right(b));
            a.chain(b)
        });

        let tail = self.tail.iter().map(Either::Left);

        head.chain(pairs).chain(tail)
    }
}

#[derive(Debug)]
struct AData(usize);
#[derive(Debug)]
struct BData(usize);

#[cfg(test)]
mod tests {
    use crate::{AData, BData, Data};
    use test::Bencher; // 1.5.2

    #[bench]
    fn test_for_each(b: &mut Bencher) {
        b.iter(|| {
            let data = Data {
                head: Some(BData(84)),
                pairs: std::iter::repeat_with(|| (AData(42), BData(84)))
                    .take(20998)
                    .collect(),
                tail: Some(AData(42)),
            };

            let mut data_bis = Vec::with_capacity(21000);
            data.iter().for_each(|x| data_bis.push(x));
        });
    }

    #[bench]
    fn test_collect(b: &mut Bencher) {
        b.iter(|| {
            let data = Data {
                head: Some(BData(84)),
                pairs: std::iter::repeat_with(|| (AData(42), BData(84)))
                    .take(20998)
                    .collect(),
                tail: Some(AData(42)),
            };

            let _: Vec<_> = data.iter().collect();
        });
    }
}
```

```none
test tests::test_collect  ... bench:   1,682,529 ns/iter (+/- 2,157,023)
test tests::test_for_each ... bench:     609,031 ns/iter (+/- 750,944)
```

So, should we change implementation of collect to use `for_each()` ? Note that a for loop doesn't solve the problem. For this to be optimized we need to use `for_each()`.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chain() make collect very slow #63340

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

chain() make collect very slow #63340

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions