
Perf regression in Vec<u8> Write impl? #24095

Closed · Fixed by #24120
frankmcsherry opened this issue Apr 5, 2015 · 3 comments

@frankmcsherry (Contributor)

Hi folks. A bunch of my serialization code recently got a lot slower, and I think I've tracked it down to writing binary data into a Vec<u8>. This used to be quite fast, well above the 8-10GB/s range. It is now about 1GB/s for me. The following main.rs demos the performance I'm seeing.

#![feature(test)]
extern crate test;

use test::Bencher;
use std::io::Write;

#[bench] 
fn bench(bencher: &mut Bencher) {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());
    bencher.bytes = data.len() as u64;
    bencher.iter(|| {
        buffer.clear();
        buffer.write_all(data).unwrap();
    });
}

fn main() {
    let data = &[0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());

    // writes 4GB, takes .. 4s+
    for _ in 0..(1 << 20) {
        buffer.clear();
        buffer.write_all(data).unwrap();
    };
}

The perf numbers look like (where main writes 4GB in 4KB chunks):

Echidnatron% cargo bench; time cargo run --release 
     Running target/release/bench-5e9b1b37cda85a22

running 1 test
test bench ... bench:      4086 ns/iter (+/- 596) = 1002 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured

     Running `target/release/bench`
cargo run --release  4.06s user 0.03s system 99% cpu 4.094 total
Echidnatron% 

Sorry if this is old news, or a misdiagnosis. Something is a bit slower now, though. The comments in the source (Vec<T>::push_all) do suggest it isn't stable yet because the impl might get faster; I didn't expect it to get 10x slower on me though :).

Echidnatron% cargo --version
cargo 0.0.1-pre-nightly (d71f748 2015-04-03) (built 2015-04-04)
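
For context, a rough sketch of the code path in question (an illustration, not the actual libstd source): the Write impl for Vec<u8> forwards to a push_all-style slice copy, and that copy is driven by a 0..len range, so the speed of the range iterator directly sets the throughput of write_all.

// Hypothetical sketch, not the real libstd code: write_all into a Vec<u8>
// bottoms out in an element-by-element copy driven by `0..src.len()`.
use std::io::{self, Write};

fn push_all_sketch(dst: &mut Vec<u8>, src: &[u8]) {
    dst.reserve(src.len());
    // Index-driven on purpose, to mirror push_all: each step of this loop
    // is one `next()` call on a range iterator.
    for i in 0..src.len() {
        dst.push(src[i]);
    }
}

fn write_all_sketch(dst: &mut Vec<u8>, src: &[u8]) -> io::Result<()> {
    push_all_sketch(dst, src);
    Ok(())
}

fn main() {
    let data = [0u8; 4096];
    let mut buffer = Vec::with_capacity(data.len());
    write_all_sketch(&mut buffer, &data[..]).unwrap();

    // Sanity check against the real Write impl.
    let mut real = Vec::with_capacity(data.len());
    real.write_all(&data[..]).unwrap();
    assert_eq!(buffer, real);
}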
@alexcrichton (Member)

I think this has the same cause as #24014; see in particular the comment there about the performance of Iterator for ops::Range.

Nominating for 1.0 as this is quite a critical iterator to have perform well.

triage: I-nominated

@aturon (Member) commented Apr 6, 2015

@alexcrichton I will try to fix this today.

aturon added a commit to aturon/rust that referenced this issue Apr 6, 2015
A recent change to the implementation of range iterators meant that,
even when stepping by 1, the iterators *always* involved checked
arithmetic.

This commit reverts to the earlier behavior (while retaining the
refactoring into traits).

Fixes rust-lang#24095
cc rust-lang#24014
aturon added a commit to aturon/rust that referenced this issue Apr 7, 2015, with the same commit message.
bors added a commit that referenced this issue Apr 8, 2015
(Same commit message as above.)

Fixes #24095
Closes #24119
cc #24014 

r? @alexcrichton
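
To make the commit message above concrete, here is a minimal toy illustration (not the real ops::Range code) of the two shapes of next() being compared: the regressed shape pays for a checked_add on every step, while the reverted shape relies on the start < end guard and uses a plain increment.

// Toy range type for illustration only; the real iterator lives in libcore.
struct ToyRange {
    start: u64,
    end: u64,
}

impl ToyRange {
    // Regressed shape: checked_add inserts an overflow branch on every
    // iteration, even though `start < end` already rules out overflow.
    fn next_checked(&mut self) -> Option<u64> {
        if self.start < self.end {
            let n = self.start;
            self.start = self.start.checked_add(1)?;
            Some(n)
        } else {
            None
        }
    }

    // Reverted shape: a plain increment guarded only by the comparison,
    // which the optimizer can turn into a tight, vectorizable loop.
    fn next_plain(&mut self) -> Option<u64> {
        if self.start < self.end {
            let n = self.start;
            self.start += 1;
            Some(n)
        } else {
            None
        }
    }
}

fn main() {
    // Both variants yield the same sequence; only the generated code differs.
    let mut a = ToyRange { start: 0, end: 5 };
    let mut b = ToyRange { start: 0, end: 5 };
    while let (Some(x), Some(y)) = (a.next_checked(), b.next_plain()) {
        assert_eq!(x, y);
    }
}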
@frankmcsherry (Contributor, Author)

Can confirm with the new nightly that the #[bench] is reporting 50GB/s. <3
