Skip to content

Conversation

@sfackler
Copy link
Member

It's passed to the underlying reader, so uninitialized memory == sad
times.

We might want to shrink the default buffer size as well. 64k is pretty
huge. Java uses 8k by default, and Go uses 4k for reference.

r? @alexcrichton

It's passed to the underlying reader, so uninitialized memory == sad
times.

We might want to shrink the default buffer size as well. 64k is pretty
huge. Java uses 8k by default, and Go uses 4k for reference.
@sfackler
Copy link
Member Author

The unsafe equivalent of the buffer initialization is a bit faster, not sure if we'd want to use that instead:

#![allow(unstable)]
extern crate test;

use std::iter::repeat;
use std::slice::bytes::MutableByteVector;

#[bench]
fn normal(b: &mut test::Bencher) {
    b.iter(|| {
        repeat(0u8).take(64 * 1024).collect::<Vec<_>>()
    })
}

#[bench]
fn fast(b: &mut test::Bencher) {
    b.iter(|| {
        let mut v = Vec::with_capacity(64 * 1024);
        unsafe {
            v.set_len(64 * 1024);
        }
        v.set_memory(0);
        v
    })
}
test fast   ... bench:      9238 ns/iter (+/- 1808)
test normal ... bench:     46661 ns/iter (+/- 7560)

@Gankra
Copy link
Contributor

Gankra commented Jan 13, 2015

The fact that repeat(0).take(n).collect() doesn't compile down to equivalent code is simply a perf bug for iterators. See http://discuss.rust-lang.org/t/the-trusted-iterator-length-problem/1302 for details.

bors added a commit that referenced this pull request Jan 14, 2015
It's passed to the underlying reader, so uninitialized memory == sad
times.

We might want to shrink the default buffer size as well. 64k is pretty
huge. Java uses 8k by default, and Go uses 4k for reference.

r? @alexcrichton
@bors bors merged commit 89f1848 into rust-lang:master Jan 14, 2015
@lilyball
Copy link
Contributor

We might want to shrink the default buffer size as well. 64k is pretty
huge. Java uses 8k by default, and Go uses 4k for reference.

My dim recollection is that it's 64k because the libuv project determined that, when reading from a local file, reading in 64k chunks provided the fastest throughput in the general case.

That said, if BufferedReader is being used for reading things that aren't files, it does seem perhaps overly large.

Perhaps it makes sense to split new() into two functions, new_large() and new_small(), where new_large() is 64k and new_small() is 4k. I'm proposing this rather than keeping the name new() because I think it makes sense to require the user to think about which size is more appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants