Add Read::size_hint and pre-allocate in read_to_end #45928

Closed · wants to merge 6 commits

Conversation

@SimonSapin (Contributor, author)

Many FromIterator implementations rely on Iterator::size_hint to pre-allocate memory. This is a prototype of what something equivalent for std::io::Read could look like. An alternative would be not to add any public API, but to use a private specialization trait, like ZipImpl in libcore.
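
A sketch of the proposed shape, written here as an extension trait so that it compiles outside std (illustrative only; the PR itself adds the method directly to std::io::Read):

use std::io::Read;

/// Stand-in for the method this PR adds to std::io::Read.
trait ReadSizeHint: Read {
    /// A hint of how many bytes are available to read, so that read_to_end
    /// can pre-allocate. Like the lower bound of Iterator::size_hint, it
    /// defaults to zero, and underestimating is always acceptable.
    fn size_hint(&self) -> usize {
        0
    }
}

/// Blanket impl giving every reader the zero default. As a real trait
/// method in std, File would instead override it with an fstat-based value.
impl<R: Read> ReadSizeHint for R {}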

This came up in #45837 in the case of reading from a file. I’ve measured File::read_to_end with a vector created with Vec::with_capacity(file.metadata().len() as usize), compared to Vec::new(). On my Linux desktop with an SSD (though everything is probably in the filesystem cache here), pre-allocating plus reading takes between 3% more time and 43% less time, depending on file size.

     Running target/release/deps/read_to_end_bench-9122fdf004b0166c

running 24 tests
test a_read_128::vec_new            ... bench:       1,380 ns/iter (+/- 49) = 92 MB/s
test a_read_128::vec_with_capacity  ... bench:       1,416 ns/iter (+/- 107) = 90 MB/s
test b_read_512::vec_new            ... bench:       1,690 ns/iter (+/- 78) = 302 MB/s
test b_read_512::vec_with_capacity  ... bench:       1,736 ns/iter (+/- 52) = 294 MB/s
test c_read_2k::vec_new             ... bench:       2,085 ns/iter (+/- 113) = 982 MB/s
test c_read_2k::vec_with_capacity   ... bench:       2,015 ns/iter (+/- 72) = 1016 MB/s
test d_read_8k::vec_new             ... bench:       2,778 ns/iter (+/- 131) = 2948 MB/s
test d_read_8k::vec_with_capacity   ... bench:       2,516 ns/iter (+/- 89) = 3255 MB/s
test e_read_32k::vec_new            ... bench:       5,107 ns/iter (+/- 103) = 6416 MB/s
test e_read_32k::vec_with_capacity  ... bench:       4,404 ns/iter (+/- 81) = 7440 MB/s
test f_read_128k::vec_new           ... bench:      16,232 ns/iter (+/- 106) = 8074 MB/s
test f_read_128k::vec_with_capacity ... bench:      12,223 ns/iter (+/- 103) = 10723 MB/s
test g_read_512k::vec_new           ... bench:      39,179 ns/iter (+/- 210) = 13381 MB/s
test g_read_512k::vec_with_capacity ... bench:      31,704 ns/iter (+/- 98) = 16536 MB/s
test h_read_1m::vec_new             ... bench:     463,119 ns/iter (+/- 2,354) = 2264 MB/s
test h_read_1m::vec_with_capacity   ... bench:     251,983 ns/iter (+/- 3,072) = 4161 MB/s
test i_read_2m::vec_new             ... bench:     668,742 ns/iter (+/- 8,317) = 3135 MB/s
test i_read_2m::vec_with_capacity   ... bench:     383,879 ns/iter (+/- 1,269) = 5463 MB/s
test j_read_4m::vec_new             ... bench:   1,188,553 ns/iter (+/- 13,409) = 3528 MB/s
test j_read_4m::vec_with_capacity   ... bench:     857,714 ns/iter (+/- 8,254) = 4890 MB/s
test k_read_8m::vec_new             ... bench:   2,302,201 ns/iter (+/- 15,746) = 3643 MB/s
test k_read_8m::vec_with_capacity   ... bench:   1,947,089 ns/iter (+/- 10,942) = 4308 MB/s
test l_read_32m::vec_new            ... bench:   8,990,230 ns/iter (+/- 18,718) = 3732 MB/s
test l_read_32m::vec_with_capacity  ... bench:   8,648,669 ns/iter (+/- 23,157) = 3879 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 24 measured; 0 filtered out

The benchmark source:
#![feature(test)]
extern crate test;
extern crate tempdir;

use std::fs::File;
use std::io::{Write, Read};
use test::Bencher;

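// Write a `size`-byte file of 42s, then benchmark reading it back into a
// Vec<u8> created by `new_buffer` (which may or may not pre-allocate).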
fn run<F>(bencher: &mut Bencher, size: usize, new_buffer: F)
    where F: Fn(&File) -> Vec<u8>
{
    let dir = tempdir::TempDir::new("bench").unwrap();
    let path = dir.path().join("something");
    File::create(&path).unwrap().write_all(&vec![42; size]).unwrap();
    bencher.bytes = size as u64;
    bencher.iter(|| {
        let mut file = File::open(&path).unwrap();
        let mut buffer = new_buffer(&file);
        file.read_to_end(&mut buffer).unwrap();
    })
}

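// For each (name, size) pair below, generate a module containing two
// benchmarks: one starting from Vec::new() and one from a Vec pre-allocated
// to the file's size.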
macro_rules! sizes {
    ($( $name: ident  $size: expr )+) => {
        $(
            mod $name {
                use super::*;

                #[bench]
                fn vec_new(bencher: &mut Bencher) {
                    run(bencher, $size, |_| Vec::new())
                }

                #[bench]
                fn vec_with_capacity(bencher: &mut Bencher) {
                    run(bencher, $size, |file| {
                        Vec::with_capacity(file.metadata().unwrap().len() as usize)
                    })
                }
            }
        )+
    }
}

sizes! {
    a_read_128 128
    b_read_512 512
    c_read_2k 2 * 1024
    d_read_8k 8 * 1024
    e_read_32k 32 * 1024
    f_read_128k 128 * 1024
    g_read_512k 512 * 1024
    h_read_1m 1024 * 1024
    i_read_2m 2 * 1024 * 1024
    j_read_4m 4 * 1024 * 1024
    k_read_8m 8 * 1024 * 1024
    l_read_32m 32 * 1024 * 1024
}

@kennytm added the S-waiting-on-review and T-libs-api labels, Nov 11, 2017
@bluss (Member) commented Nov 11, 2017

If this adds a new system call when calling read_to_end on a File: that has been discussed before, and putting a system call in size_hint (via metadata) is not ideal. (I can’t find the earlier discussion, though.)

@SimonSapin (Contributor, author)

Yes, File::size_hint as implemented in the current PR does make a system call, and I agree this is not ideal. I don’t expect this PR to be merged as-is; I submitted it to have a starting point for discussing what, if anything, should be done here.

Inline review on src/libstd/fs.rs (outdated):
@@ -449,6 +449,13 @@ impl Read for File {
         self.inner.read(buf)
     }

+    fn size_hint(&self) -> usize {
+        match self.metadata() {
+            Ok(meta) => meta.len() as usize,
Member: It's a minor point, but this should probably use a saturating cast rather than as.

SimonSapin: Is there an API for doing saturating integer casts? Writing #[cfg(target_pointer_width = …)] code here seems tedious.

Member: Sadly not yet (rust-lang/rfcs#1218). Something like cmp::min(meta.len(), usize::max_value() as u64) as usize should work, though.
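
As a standalone helper, the suggested workaround looks like this (hypothetical function name, not an API from this PR):

use std::cmp;

// Clamp a u64 length to usize without wrapping, pending a dedicated
// saturating-cast API (rust-lang/rfcs#1218).
fn saturating_u64_to_usize(n: u64) -> usize {
    cmp::min(n, usize::max_value() as u64) as usize
}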

SimonSapin: On the other hand, this is a hint: the default impl returns zero without necessarily being at EOF, so underestimating should be OK. And if you’re reading a file larger than your address space, you’re gonna have a hard time in read_to_end anyway.

Member: This also needs to check its current position and subtract that off, right?

SimonSapin: Right. However, the obvious fix doesn’t work because <File as Seek>::seek takes &mut self, even with SeekFrom::Current(0). I wonder if there should be some API like File::position(&self) -> io::Result<u64>.

SimonSapin: Never mind: <std::fs::File as Seek>::seek calls std::sys::fs::File::seek, which takes &self, so size_hint can call the latter directly.
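
Outside libstd, the same position-adjusted hint can be sketched using the public impl Seek for &File (a sketch of the idea, not this PR's exact code):

use std::fs::File;
use std::io::{Seek, SeekFrom};

// Bytes remaining between the current position and the end of the file,
// falling back to 0 on any error; a hint may always underestimate.
fn file_size_hint(mut file: &File) -> usize {
    let len = match file.metadata() {
        Ok(meta) => meta.len(),
        Err(_) => return 0,
    };
    // `impl Seek for &File` allows querying the position with shared
    // access, mirroring the &self-taking seek inside std::sys::fs.
    match file.seek(SeekFrom::Current(0)) {
        Ok(pos) => len.saturating_sub(pos) as usize,
        Err(_) => 0,
    }
}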

@mbrubeck (Contributor) commented Nov 13, 2017

read_to_end still reads at most 8 KB at a time, even when the entire buffer is allocated up-front. The pre-allocated case could be optimized further by reading up to the available capacity in a single read syscall.

Update: This was fixed by #46050.
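
A simplified sketch of that idea (not the code that landed in #46050; the zero-fill here keeps the sketch safe, at some cost):

use std::io::{self, Read};

// Read directly into the vector's spare capacity, so a pre-allocated buffer
// is filled with large reads rather than fixed 8 KB chunks.
fn read_to_end_using_capacity<R: Read>(r: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start = buf.len();
    loop {
        if buf.len() == buf.capacity() {
            buf.reserve(32); // out of room: grow (std uses a smarter policy)
        }
        let old_len = buf.len();
        let capacity = buf.capacity();
        buf.resize(capacity, 0); // expose the spare capacity, zero-filled
        match r.read(&mut buf[old_len..]) {
            Ok(0) => {
                buf.truncate(old_len); // EOF: drop the unused tail
                return Ok(old_len - start);
            }
            Ok(n) => buf.truncate(old_len + n),
            Err(e) => {
                buf.truncate(old_len);
                return Err(e);
            }
        }
    }
}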

@shepmaster (Member)

random-assigning to...

r? @sfackler

@sfackler (Member)

Does #46050 give the same improvement?

@sunfishcode (Member)

A non-obvious detail of read_to_end’s loop is that it always has to allocate at least one byte more than the file size: it reads until read returns 0, and that last read needs to be passed some space that it won’t use.

If you're calling metadata() and it provides a size, can you trust that size (rather than taking it as a hint)? If so, you could add alternate logic in read_to_end that just reads until it gets that many bytes. That would avoid the need to allocate extra space, and the extra read syscall at the end. In the common case you could do one fstat and one read, rather than two reads.
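
A sketch of that strategy as a standalone helper (hypothetical, not what std implements; see the race caveat discussed next):

use std::fs::File;
use std::io::{self, Read};

// Trust the fstat size: reserve it, read up to that many bytes, then keep
// reading in case the file grew between the stat and the reads.
fn read_file_trusting_metadata(file: &mut File, buf: &mut Vec<u8>) -> io::Result<usize> {
    let size = file.metadata()?.len();
    buf.reserve(size as usize);
    let mut total = file.by_ref().take(size).read_to_end(buf)?;
    // Drain anything appended since the stat; returns immediately at EOF.
    total += file.read_to_end(buf)?;
    Ok(total)
}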

@mbrubeck (Contributor)

You can't trust the size from metadata because the file may change between the metadata call and the read call.

@sunfishcode (Member)

You always have to be prepared to get fewer bytes than expected, or errors. But if there are more bytes than expected, it’s no different from someone appending bytes after your last read returned 0: you’ll miss the newly appended bytes, but it was a race anyway.

That assumes that metadata() isn't memoized, but it doesn't currently appear to be.

@bors (Contributor) commented Nov 21, 2017

☔ The latest upstream changes (presumably #46166) made this pull request unmergeable. Please resolve the merge conflicts.

@carols10cents (Member)

What's the status of this, @SimonSapin? I'm having a hard time telling :)

@SimonSapin (Contributor, author)

> Does #46050 give the same improvement?

I’ve run the benchmark again with rustc 1.24.0-nightly (560a5da 2017-11-27), which includes #46050. Previous results ranged from “3% more time to 43% less time”; now the time with Vec::with_capacity is 4% to 48% less than with Vec::new.

So this PR’s improvement seems independent of #46050’s improvement.

More important IMO is whether Read::size_hint is the kind of public API we want to stabilize eventually.

test a_read_128::vec_new            ... bench:       1,362 ns/iter (+/- 200) = 93 MB/s
test a_read_128::vec_with_capacity  ... bench:       1,158 ns/iter (+/- 199) = 110 MB/s
test b_read_512::vec_new            ... bench:       1,705 ns/iter (+/- 203) = 300 MB/s
test b_read_512::vec_with_capacity  ... bench:       1,172 ns/iter (+/- 301) = 436 MB/s
test c_read_2k::vec_new             ... bench:       2,102 ns/iter (+/- 188) = 974 MB/s
test c_read_2k::vec_with_capacity   ... bench:       1,193 ns/iter (+/- 327) = 1716 MB/s
test d_read_8k::vec_new             ... bench:       2,759 ns/iter (+/- 271) = 2969 MB/s
test d_read_8k::vec_with_capacity   ... bench:       1,412 ns/iter (+/- 187) = 5801 MB/s
test e_read_32k::vec_new            ... bench:       4,957 ns/iter (+/- 93) = 6610 MB/s
test e_read_32k::vec_with_capacity  ... bench:       2,878 ns/iter (+/- 205) = 11385 MB/s
test f_read_128k::vec_new           ... bench:      14,690 ns/iter (+/- 135) = 8922 MB/s
test f_read_128k::vec_with_capacity ... bench:       9,101 ns/iter (+/- 160) = 14401 MB/s
test g_read_512k::vec_new           ... bench:      30,788 ns/iter (+/- 284) = 17028 MB/s
test g_read_512k::vec_with_capacity ... bench:      21,640 ns/iter (+/- 42) = 24227 MB/s
test h_read_1m::vec_new             ... bench:     444,013 ns/iter (+/- 3,361) = 2361 MB/s
test h_read_1m::vec_with_capacity   ... bench:     232,138 ns/iter (+/- 2,474) = 4517 MB/s
test i_read_2m::vec_new             ... bench:     633,958 ns/iter (+/- 3,849) = 3308 MB/s
test i_read_2m::vec_with_capacity   ... bench:     343,065 ns/iter (+/- 813) = 6112 MB/s
test j_read_4m::vec_new             ... bench:   1,114,472 ns/iter (+/- 12,811) = 3763 MB/s
test j_read_4m::vec_with_capacity   ... bench:     773,090 ns/iter (+/- 2,065) = 5425 MB/s
test k_read_8m::vec_new             ... bench:   2,156,756 ns/iter (+/- 7,510) = 3889 MB/s
test k_read_8m::vec_with_capacity   ... bench:   1,792,980 ns/iter (+/- 5,539) = 4678 MB/s
test l_read_32m::vec_new            ... bench:   8,499,907 ns/iter (+/- 24,512) = 3947 MB/s
test l_read_32m::vec_with_capacity  ... bench:   8,129,977 ns/iter (+/- 22,403) = 4127 MB/s

@sunfishcode mentioned this pull request Nov 28, 2017
@sunfishcode (Member)

Reserving size_hint bytes sets read_to_end up to reallocate in the common case, because it needs extra space for the final read that detects EOF. This can create 2x or more slowdowns in some cases, and the resulting buffers are twice as large as they need to be due to Vec’s doubling growth.

This can be fixed either by reserving size_hint + 1 bytes rather than size_hint bytes, or by #46340.

A commit was then pushed with the message: “This one byte of extra capacity is necessary for the final zero-size read that indicates EOF.”
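
In code terms, the fix amounts to something like this (schematic; `hint` stands for the value the proposed size_hint would return):

use std::io::{self, Read};

// Reserve one byte beyond the hint so the final zero-length read that
// signals EOF has spare capacity to be issued into, avoiding a
// capacity-doubling reallocation at the very end.
fn read_to_end_hinted<R: Read>(r: &mut R, hint: usize, buf: &mut Vec<u8>) -> io::Result<usize> {
    buf.reserve(hint + 1);
    r.read_to_end(buf)
}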
@SimonSapin (Contributor, author)

Good point; I’ve added the + 1. As mbrubeck mentioned above, the approach in #46340 is racy, since the file could grow in the meantime.

@sunfishcode (Member)

To be sure, read_to_end is already racy: if someone appends to a file while read_to_end is reading it, read_to_end may or may not see the appended bytes, because if it reaches EOF before the write happens, it stops reading.

@sfackler (Member)

Cool, something like this seems worthwhile then. Should it return an io::Result<usize>? You can always return 0 if something went wrong, but it seems nicer not to throw away those errors. I kind of doubt that reads would succeed if you can’t stat the file anymore.

@SimonSapin (Contributor, author)

@sfackler Good point. Changed to io::Result<usize>.

@SimonSapin (Contributor, author)

I don’t have a strong opinion on the strategy of this PR versus that of #46340. A key difference is the semantics of size_hint / size_snapshot in the general Read case, not just for File: size_hint is intended to be an estimate and defaults to zero (like the first component of Iterator::size_hint), while size_snapshot is optional and, when Some(_), is assumed to be exact.

@bors (Contributor) commented Dec 4, 2017

☔ The latest upstream changes (presumably #46485) made this pull request unmergeable. Please resolve the merge conflicts.

@sfackler (Member) commented Dec 4, 2017

Sorry for the delay here. Should we maybe open an issue to figure out the approach to take? I’d like to avoid splitting the discussion across two PRs.

@shepmaster (Member)

I'm going to make an executive decision and mark this and #46340 as waiting on a team decision as it seems that they cannot both exist together. Let me know if you disagree.

@shepmaster added the S-waiting-on-team label and removed the S-waiting-on-review label, Dec 9, 2017
@sfackler (Member)

The @rust-lang/libs team talked about this and #46340 during our triage meeting and decided that we’re not looking to expand the surface area of Read in this way just yet. Instead, we can modify the implementation of fs::read to do a length lookup, which should cover 99% of the use cases for this kind of thing.
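
For reference, the chosen direction looks roughly like this (a sketch of the approach, not the exact implementation that later landed):

use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// fs::read can stat the file once and pre-allocate, falling back to an
// empty Vec if metadata is unavailable; read_to_end still copes with a
// file whose size changed in the meantime.
fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let initial = file.metadata().map(|m| m.len() as usize).unwrap_or(0);
    let mut bytes = Vec::with_capacity(initial);
    file.read_to_end(&mut bytes)?;
    Ok(bytes)
}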

@sfackler closed this Jan 10, 2018
kennytm added commits to kennytm/rust referencing this pull request, Jan 11–12, 2018:

Pre-allocate in fs::read and fs::read_string

This is a simpler alternative to rust-lang#46340 and rust-lang#45928, as requested by the libs team.