Improved IO Bytes Size Hint #81136

Merged: 21 commits, Mar 5, 2021

Xavientois
Contributor

After trying to implement better size_hint() return values for File in this PR (#81044) and then switching to implementing it for BufReader in this PR (#81052), I have arrived at this implementation, which provides tighter bounds for the Bytes iterator of various readers, including BufReader, Empty, and Chain.

Unfortunately, for BufReader, the size_hint only improves after calling fill_buf, because the hint is derived from the contents of the buffer. Nevertheless, the tighter bounds should result in better pre-allocation of space to handle the contents of the Bytes iterator.

Closes #81052
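A minimal sketch of what the description implies for callers, assuming a toolchain that includes this change (the milestone recorded on this PR is 1.52.0). The helper names `hint_before_fill` and `hint_after_fill` are made up for illustration. The hint before the buffer is filled is printed rather than assumed, since it depends on what the inner reader reports.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Size hint of `Bytes` over a fresh `BufReader` whose buffer is still empty.
fn hint_before_fill(data: &[u8]) -> (usize, Option<usize>) {
    BufReader::new(data).bytes().size_hint()
}

/// Size hint of `Bytes` after `fill_buf` has pulled data into the buffer.
fn hint_after_fill(data: &[u8]) -> io::Result<(usize, Option<usize>)> {
    let mut reader = BufReader::new(data);
    // `fill_buf` loads the internal buffer without consuming anything.
    reader.fill_buf()?;
    Ok(reader.bytes().size_hint())
}

fn main() -> io::Result<()> {
    let data = b"ABCDEFGHIJKL";
    println!("empty:       {:?}", io::empty().bytes().size_hint());
    println!("before fill: {:?}", hint_before_fill(data));
    println!("after fill:  {:?}", hint_after_fill(data)?);
    Ok(())
}
```

With 12 bytes of input and the default buffer capacity, the lower bound after `fill_buf` should equal the number of buffered bytes, which mirrors the check described for the PR's unit tests.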

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive added the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) Jan 17, 2021
@bors
Contributor

bors commented Jan 31, 2021

☔ The latest upstream changes (presumably #81578) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
configure: rust.channel         := nightly
configure: rust.debug-assertions := True
configure: llvm.assertions      := True
configure: dist.missing-tools   := True
configure: build.configure-args := ['--enable-sccache', '--disable-manage-submodu ...
configure: writing `config.toml` in current directory
configure: 
configure: run `python /checkout/x.py --help`
configure: 
---
skip untracked path cpu-usage.csv during rustfmt invocations
skip untracked path src/doc/book/ during rustfmt invocations
skip untracked path src/doc/rust-by-example/ during rustfmt invocations
skip untracked path src/llvm-project/ during rustfmt invocations
Diff in /checkout/library/std/src/io/util.rs at line 4:
Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/rustfmt" "--config-path" "/checkout" "--edition" "2018" "--unstable-features" "--skip-children" "--check" "/checkout/library/std/src/io/util.rs"` failed.
If you're running `tidy`, try again with `--bless`. Or, if you just want to format code, run `./x.py fmt` instead.
 mod tests;
 use crate::fmt;
 use crate::fmt;
-use crate::io::{self, BufRead, Initializer, IoSlice, IoSliceMut, Read, Seek, SeekFrom, SizeHint, Write};
+use crate::io::{
+    self, BufRead, Initializer, IoSlice, IoSliceMut, Read, Seek, SeekFrom, SizeHint, Write,
 
 
 /// A reader which is always at EOF.
failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test --stage 2 src/tools/tidy
Build completed unsuccessfully in 0:00:16

@Dylan-DPC-zz

@cramertj this is ready for review

@chadbrewbaker

Sorry if I am blind, but I didn't see an implementation. A benchmark would be nice. BufReader is a pain point for me right now in several ways.

Chiefly, read_until doesn't support stream processing. You should be able to pass in a function that does a left fold over fixed-size buffer reads until it hits the end of the line or some other delimiter. I'm agnostic about whether the buffer stays a Vec that doesn't grow or is a u8 array.

fn read_until_streaming<S, F>(&mut self, byte: u8, buf: &mut Vec<u8>, left_fold_func: F, init_state: S)
    -> (io::Result<usize>, S)
where
    F: FnMut(&mut S, &[u8], u8); // called once per fixed-size buffer read

A "hello world" might perform an XOR over the bytes of each line.

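#[derive(Default)]
struct XorState {
    sum: u8, // starts at 0 via Default
}

fn xor_fold(state: &mut XorState, buf: &[u8], byte: u8) {
    // XOR each byte into the running sum until the delimiter `byte` is reached.
    for &b in buf.iter().take_while(|&&b| b != byte) {
        state.sum ^= b;
    }
}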
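The streaming idea above can be sketched on top of the existing `BufRead::fill_buf` and `BufRead::consume` methods. `fold_until` is a hypothetical free function, not part of std and not part of this PR. It folds a caller-supplied closure over each buffered chunk up to the delimiter, without growing a `Vec`. Unlike std's `read_until`, this sketch does not retry on `ErrorKind::Interrupted`.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: folds `f` over buffered chunks until `delim` or EOF.
/// Returns the number of bytes consumed (delimiter included) and the final state.
fn fold_until<R, S, F>(reader: &mut R, delim: u8, mut state: S, mut f: F) -> io::Result<(usize, S)>
where
    R: BufRead + ?Sized,
    F: FnMut(S, &[u8]) -> S,
{
    let mut total = 0;
    loop {
        let (done, used) = {
            let available = reader.fill_buf()?;
            if available.is_empty() {
                // EOF before the delimiter was found.
                return Ok((total, state));
            }
            match available.iter().position(|&b| b == delim) {
                Some(i) => {
                    // Fold only the bytes before the delimiter, then consume it too.
                    state = f(state, &available[..i]);
                    (true, i + 1)
                }
                None => {
                    state = f(state, available);
                    (false, available.len())
                }
            }
        };
        reader.consume(used);
        total += used;
        if done {
            return Ok((total, state));
        }
    }
}

fn main() -> io::Result<()> {
    let mut input: &[u8] = b"ab\ncd";
    // XOR the bytes of the first line, as in the "hello world" above.
    let (n, sum) = fold_until(&mut input, b'\n', 0u8, |acc, chunk| {
        chunk.iter().fold(acc, |a, &b| a ^ b)
    })?;
    println!("consumed {} bytes, xor = {:#04x}", n, sum);
    Ok(())
}
```

Passing the state by value and returning it keeps the closure signature simple. A `&mut S` variant would work equally well.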

@Xavientois
Contributor Author

@chadbrewbaker I'm not sure I understand your question. If you're wondering where the updated size hint is implemented for BufReader, it is on line 441 of library/std/src/io/buffered/bufreader.rs.

Are you asking for benchmarks to show the performance boost from this improved size hint? I am not sure what I would be measuring, since the goal of this PR is to provide tighter bounds from the size hint when possible, which is shown by the unit tests I included.
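To illustrate what "tighter bounds when possible" could mean for composed readers, here is a standalone sketch. Everything in it is an assumption for illustration: the `ByteHint` trait and the `Unknown`, `AtEof`, `Buffered`, and `Chained` types are hypothetical stand-ins, and the combining rules are one plausible choice rather than the code merged in this PR.

```rust
/// Hypothetical trait: a reader's best guess at how many bytes remain.
trait ByteHint {
    fn lower_bound(&self) -> usize {
        0
    }
    fn upper_bound(&self) -> Option<usize> {
        None
    }
    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.lower_bound(), self.upper_bound())
    }
}

/// A reader that knows nothing about its remaining length.
struct Unknown;
/// A reader that is always at EOF.
struct AtEof;
/// A buffered reader over an unknown source with `buffered` bytes ready.
struct Buffered {
    buffered: usize,
}
/// Two readers read back to back.
struct Chained<A, B>(A, B);

impl ByteHint for Unknown {}

impl ByteHint for AtEof {
    fn upper_bound(&self) -> Option<usize> {
        Some(0)
    }
}

impl ByteHint for Buffered {
    // At least the buffered bytes are available; the source may hold more.
    fn lower_bound(&self) -> usize {
        self.buffered
    }
}

impl<A: ByteHint, B: ByteHint> ByteHint for Chained<A, B> {
    fn lower_bound(&self) -> usize {
        self.0.lower_bound().saturating_add(self.1.lower_bound())
    }
    // The upper bound is known only if both halves know theirs and the sum fits.
    fn upper_bound(&self) -> Option<usize> {
        self.0.upper_bound()?.checked_add(self.1.upper_bound()?)
    }
}

fn main() {
    println!("{:?}", Unknown.size_hint());
    println!("{:?}", Chained(Buffered { buffered: 5 }, AtEof).size_hint());
    println!("{:?}", Chained(AtEof, AtEof).size_hint());
}
```

A unit test for such a design would assert the hint directly, which is why a benchmark is not the natural way to validate it.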

@chadbrewbaker

Sorry, missed the unit test. Ok.

@cramertj
Member

cramertj commented Mar 2, 2021

@bors r+

@bors
Contributor

bors commented Mar 2, 2021

📌 Commit 7674ae1 has been approved by cramertj

@bors added the S-waiting-on-bors label (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed the S-waiting-on-review label Mar 2, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 3, 2021
…ramertj

Improved IO Bytes Size Hint

After trying to implement better `size_hint()` return values for `File` in [this PR](rust-lang#81044) and changing to implementing it for `BufReader` in [this PR](rust-lang#81052), I have arrived at this implementation that provides tighter bounds for the `Bytes` iterator of various readers including `BufReader`, `Empty`, and `Chain`.

Unfortunately, for `BufReader`, the `size_hint` only improves after calling `fill_buf`, because the hint is derived from the contents of the buffer. Nevertheless, the tighter bounds should result in better pre-allocation of space to handle the contents of the `Bytes` iterator.

Closes rust-lang#81052
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 4, 2021
@bors
Contributor

bors commented Mar 4, 2021

⌛ Testing commit 7674ae1 with merge b347d73d836ef2d36e2e8160b567c5b3edeee259...

@rust-log-analyzer
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
Removing intermediate container 93ce468f24a5
 ---> 7ae884a133af
Step 5/10 : RUN npm install es-check -g
 ---> Running in 440ac6ccb27e
/node-v14.4.0-linux-x64/bin/es-check -> /node-v14.4.0-linux-x64/lib/node_modules/es-check/index.js

> spawn-sync@1.0.15 postinstall /node-v14.4.0-linux-x64/lib/node_modules/es-check/node_modules/spawn-sync
> node postinstall
+ es-check@5.2.1
added 95 packages from 44 contributors in 3.766s
Removing intermediate container 440ac6ccb27e
 ---> b4d6f0a1ef82
---
Cloning into 'rust-toolstate'...
<Nothing changed>
+ es-check es5 ../src/librustdoc/html/static/main.js ../src/librustdoc/html/static/settings.js ../src/librustdoc/html/static/source-script.js ../src/librustdoc/html/static/storage.js

Cannot read property 'includes' of undefined

@bors
Contributor

bors commented Mar 4, 2021

💔 Test failed - checks-actions

@bors added the S-waiting-on-review label and removed the S-waiting-on-bors label Mar 4, 2021
@JohnTitor
Member

Unrelated CI failure, @bors retry

@bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Mar 4, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 5, 2021
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Mar 5, 2021
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Mar 5, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 5, 2021
Rollup of 10 pull requests

Successful merges:

 - rust-lang#80723 (Implement NOOP_METHOD_CALL lint)
 - rust-lang#80763 (resolve: Reduce scope of `pub_use_of_private_extern_crate` deprecation lint)
 - rust-lang#81136 (Improved IO Bytes Size Hint)
 - rust-lang#81939 (Add suggestion `.collect()` for iterators in iterators)
 - rust-lang#82289 (Fix underflow in specialized ZipImpl::size_hint)
 - rust-lang#82728 (Avoid unnecessary Vec construction in BufReader)
 - rust-lang#82764 (Add {BTreeMap,HashMap}::try_insert)
 - rust-lang#82770 (Add assert_matches macro.)
 - rust-lang#82773 (Add diagnostic item to `Default` trait)
 - rust-lang#82787 (Remove unused code from main.js)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors merged commit 6013811 into rust-lang:master Mar 5, 2021
@rustbot added this to the 1.52.0 milestone Mar 5, 2021
@Xavientois deleted the io_reader_size_hint branch December 9, 2021 14:54
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.