
Perf regression in Vec<u8> Write impl? #24095

Closed · Fixed by #24120
frankmcsherry opened this issue Apr 5, 2015 · 3 comments

@frankmcsherry (Contributor)

Hi folks. A bunch of my serialization code recently got a lot slower, and I think I've tracked it down to writing binary data into a Vec<u8>. This used to be quite fast, well above the 8-10GB/s range. It is now about 1GB/s for me. The following main.rs demos the performance I'm seeing.

#![feature(test)]
extern crate test;

use test::Bencher;
use std::io::Write;

#[bench] 
fn bench(bencher: &mut Bencher) {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());
    bencher.bytes = data.len() as u64;
    bencher.iter(|| {
        buffer.clear();
        buffer.write_all(data).unwrap();
    });
}

fn main() {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());

    // writes 4GB, takes .. 4s+
    for _ in 0..(1 << 20) {
        buffer.clear();
        buffer.write_all(data).unwrap();
    };
}

The perf numbers look like (where main writes 4GB in 4KB chunks):

Echidnatron% cargo bench; time cargo run --release 
     Running target/release/bench-5e9b1b37cda85a22

running 1 test
test bench ... bench:      4086 ns/iter (+/- 596) = 1002 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured

     Running `target/release/bench`
cargo run --release  4.06s user 0.03s system 99% cpu 4.094 total
Echidnatron% 

Sorry if this is old news, or a misdiagnosis. Something is a bit slower now, though. The comments in the source (Vec<T>::push_all) do suggest it isn't stable yet because the impl might get faster; I didn't expect it to get 10x slower on me though :).

Echidnatron% cargo --version
cargo 0.0.1-pre-nightly (d71f748 2015-04-03) (built 2015-04-04)
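
For context, a rough sketch of the code path in question (an illustration, not the actual libstd source): the Write impl for Vec<u8> forwards to a push_all-style slice copy, and that copy is driven by a 0..len range, so the speed of the range iterator directly sets the throughput of write_all.

// Hypothetical sketch, not the real libstd code: write_all into a Vec<u8>
// bottoms out in an element-by-element copy driven by `0..src.len()`.
use std::io::{self, Write};

fn push_all_sketch(dst: &mut Vec<u8>, src: &[u8]) {
    dst.reserve(src.len());
    // Index-driven on purpose, to mirror push_all: each step of this loop
    // is one `next()` call on a range iterator.
    for i in 0..src.len() {
        dst.push(src[i]);
    }
}

fn write_all_sketch(dst: &mut Vec<u8>, src: &[u8]) -> io::Result<()> {
    push_all_sketch(dst, src);
    Ok(())
}

fn main() {
    let data = [0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());
    write_all_sketch(&mut buffer, &data[..]).unwrap();

    // Sanity check against the real Write impl.
    let mut real = Vec::with_capacity(data.len());
    real.write_all(&data[..]).unwrap();
    assert_eq!(buffer, real);
}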
@alexcrichton (Member)

I think this has the same cause as #24014; see in particular the comment there about the performance of Iterator for ops::Range.

Nominating for 1.0 as this is quite a critical iterator to have perform well.

triage: I-nominated

@aturon (Member) commented Apr 6, 2015

@alexcrichton I will try to fix this today.

aturon added a commit to aturon/rust that referenced this issue Apr 6, 2015
A recent change to the implementation of range iterators meant that,
even when stepping by 1, the iterators *always* involved checked
arithmetic.

This commit reverts to the earlier behavior (while retaining the
refactoring into traits).

Fixes rust-lang#24095
cc rust-lang#24014
aturon added a commit to aturon/rust that referenced this issue Apr 7, 2015, with the same commit message.
bors added a commit that referenced this issue Apr 8, 2015
(Same commit message as above.)

Fixes #24095
Closes #24119
cc #24014 

r? @alexcrichton
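
To make the commit message above concrete, here is a minimal toy illustration (not the real ops::Range code) of the two shapes of next() being compared: the regressed shape pays for a checked_add on every step, while the reverted shape relies on the start < end guard and uses a plain increment.

// Toy range type for illustration only; the real iterator lives in libcore.
struct ToyRange {
    start: u64,
    end: u64,
}

impl ToyRange {
    // Regressed shape: checked_add inserts an overflow branch on every
    // iteration, even though `start < end` already rules out overflow.
    fn next_checked(&mut self) -> Option<u64> {
        if self.start < self.end {
            let n = self.start;
            self.start = self.start.checked_add(1)?;
            Some(n)
        } else {
            None
        }
    }

    // Reverted shape: a plain increment guarded only by the comparison,
    // which the optimizer can turn into a tight, vectorizable loop.
    fn next_plain(&mut self) -> Option<u64> {
        if self.start < self.end {
            let n = self.start;
            self.start += 1;
            Some(n)
        } else {
            None
        }
    }
}

fn main() {
    // Both variants yield the same sequence; only the generated code differs.
    let mut a = ToyRange { start: 0, end: 5 };
    let mut b = ToyRange { start: 0, end: 5 };
    while let (Some(x), Some(y)) = (a.next_checked(), b.next_plain()) {
        assert_eq!(x, y);
    }
}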
@frankmcsherry (Contributor, Author)

Can confirm with the new nightly that the #[bench] is reporting 50GB/s. <3
