Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid repeated re-initialization of the BufReader buffer #102760

Merged
merged 1 commit into from
Oct 7, 2022

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Oct 7, 2022

Fixes #102727

We accidentally removed this in #98748. It looks so redundant. But it isn't.

The default Read::read_buf will defensively initialize the whole buffer, if any of it is indicated to be uninitialized. In uses where reads from the wrapped Read impl completely fill the BufReader, initialized and filled are the same, and this extra member isn't required. But in the reported issue, the BufReader wraps a Read impl which will never fill the whole buffer. So the default Read::read_buf implementation repeatedly re-initializes the extra space in the buffer.

This adds back the extra initialized member, which ensures that the default Read::read_buf only zero-initialized the buffer once, and I've tried to add a comment which explains this whole situation.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 7, 2022
@rustbot
Copy link
Collaborator

rustbot commented Oct 7, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 7, 2022
@saethlin saethlin changed the title Avoid defensive re-initialization of the BufReader buffer Avoid repeated re-initialization of the BufReader buffer Oct 7, 2022
@Mark-Simulacrum
Copy link
Member

How difficult is it to write a test here? I suspect it would be pretty annoying, but maybe we can reasonably do it with some helper cfg(test) exposed bits...

@saethlin
Copy link
Member Author

saethlin commented Oct 7, 2022

I'm proposing that rustc-perf add a runtime benchmark that targets a reasonably-real-world use case that tickles this code path: rust-lang/rustc-perf#1460

Does that assuage your concern about testing?

@Mark-Simulacrum
Copy link
Member

I guess it's not the worst way to do this, though it is a pretty weak signal (and doesn't stop us from landing regressions since runtime benchmarks are a ways away from automatically running).

Is it hard for us to directly expose the initialized variable in cfg(test) so we can just assert that it behaves properly?

I'm okay approving this (and also going to nominate for backport as a bad regression) without that, but it would definitely make me feel better to have that direct feedback mechanism.

@Mark-Simulacrum Mark-Simulacrum added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Oct 7, 2022
@saethlin saethlin force-pushed the dont-reinit-buffer branch from 811b6f8 to 95ae993 Compare October 7, 2022 03:32
@saethlin
Copy link
Member Author

saethlin commented Oct 7, 2022

I added a test which asserts that the initialized member behaves the way it currently does. Honestly I'm entirely unclear on what "proper" behavior would be here, because this whole issue is about the initialization which is done by a default/provided impl of an unstable API. And moreover, this is about avoiding initialization which is being done by an API which was ostensibly designed to avoid initialization. And in any case, the problem here was that we deleted this mechanism entirely, because we totally didn't understand what it was for. I think the new comment is as good of protection against making that mistake again as any test I can write.

I'm not entirely sure benchmarks are a weaker signal than any kind of unit test we can write here. The problem reported was that some code became slower due to re-initialization of the buffer. This test is no protection at all against that. We don't have a way to detect that those bytes in the possibly-uninitialized region of the Buffer's buffer have only been initialized once.

So, if you have a better thing to test for, I would love to hear about it.

@the8472
Copy link
Member

the8472 commented Oct 7, 2022

Shouldn't we also replace the manual fill with a memset (ptr::write_bytes)?

@estebank
Copy link
Contributor

estebank commented Oct 7, 2022

though it is a pretty weak signal

It will at least get flagged eventually. It will not stop a nightly regression from landing, but it certainly would stop a stable regression.

@saethlin
Copy link
Member Author

saethlin commented Oct 7, 2022

I wouldn't object to improving the manual fill implementation, but I feel like that's another optimization on the side where here I'm just trying to fix the regression?

@Mark-Simulacrum
Copy link
Member

I am happy with this PR now, happy to leave future improvements to the future.

@bors r+

@bors
Copy link
Contributor

bors commented Oct 7, 2022

📌 Commit 95ae993 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 7, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 7, 2022
Rollup of 6 pull requests

Successful merges:

 - rust-lang#102300 (Use a macro to not have to copy-paste `ConstFnMutClosure::new(&mut fold, NeverShortCircuit::wrap_mut_2_imp)).0` everywhere)
 - rust-lang#102475 (unsafe keyword: trait examples and unsafe_op_in_unsafe_fn update)
 - rust-lang#102760 (Avoid repeated re-initialization of the BufReader buffer)
 - rust-lang#102764 (Check `WhereClauseReferencesSelf` after all other object safety checks)
 - rust-lang#102779 (Fix `type_of` ICE)
 - rust-lang#102780 (run Miri CI when std::sys changes)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit fe42003 into rust-lang:master Oct 7, 2022
@rustbot rustbot added this to the 1.66.0 milestone Oct 7, 2022
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Oct 12, 2022
Use memset to initialize readbuf

The write loop was found to be slow in rust-lang#102727

The proper fix is in rust-lang#102760 but this might still help debug builds and code running under miri by using the write_bytes intrinsic instead of writing one byte at a time.
@m-ou-se m-ou-se added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Oct 19, 2022
@cuviper cuviper mentioned this pull request Oct 20, 2022
@cuviper cuviper removed this from the 1.66.0 milestone Oct 20, 2022
@cuviper cuviper added this to the 1.65.0 milestone Oct 20, 2022
@cuviper cuviper removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Oct 20, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 21, 2022
[beta] backports

- Use rebind instead of dummy binder in `SameTypeModuloInfer` relation rust-lang#102059
- Add missing space between notable trait tooltip and where clause rust-lang#102107
- Avoid repeated re-initialization of the BufReader buffer rust-lang#102760
- Ensure enum cast moves rust-lang#103016
- Fix `TyKind::is_simple_path` rust-lang#103176
- Do anonymous lifetimes remapping correctly for nested rpits rust-lang#103205
- [beta] Cargo backport 1.65.0 rust-lang#103303
- linker: Fix weak lang item linking with combination windows-gnu + LLD + LTO rust-lang#103092

r? `@ghost`
@saethlin saethlin deleted the dont-reinit-buffer branch March 15, 2023 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance regression in 1.64+ when BufReader inner reader doesn't fill the buffer
9 participants