- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
Add comments to explain memory usage optimization #78887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| r? @lcnr (rust_highfive has picked a reviewer for you, use r? to override) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC state.clone_from(...) and state = (...).clone() should be equivalent in functionality, so this shouldn't change the behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have a domain which potentially reuses an allocation when using clone_from this could matter, but I don't really know if that can be the case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't BitSet already do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just do a perf run to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not remove this optimization, even if it doesn't affect perf today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I found this code confusing when I read through it since it makes it look like state is shared between each loop iteration, when in fact it's reinitialized for each iteration.
| @bors try @rust-timer queue | 
| Awaiting bors try build completion | 
| ⌛ Trying commit 172b60de3067c9b66bd111298fc898e739ebb8e6 with merge f79c7551c0ed89caa2987f63188f7a66b7eea242... | 
| ☀️ Try build successful - checks-actions | 
| Queued f79c7551c0ed89caa2987f63188f7a66b7eea242 with parent 1773f60, future comparison URL. | 
| Finished benchmarking try commit (f79c7551c0ed89caa2987f63188f7a66b7eea242): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying  Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never | 
| The results seem pretty neutral. However, I've never analyzed perf results before, so take that with a grain of salt :) | 
| There are small but clear regressions in a lot of benchmarks though? | 
| Looking at wall-clock time, it seems that there were some improvements and some regressions. Btw, are there docs on how to interpret perf results? | 
| Anything based on real-world time is going to be too noisy to be useful for a low-impact change like this. I don't think we have any docs on this at the moment (but many parts are not Rust-specific, they apply to interpreting profiles in general). Regarding this PR, I don't think we want to land this as-is, but I'd be happy to r+ a patch that adds a short comment explaining why  | 
| I agree that this is a small, but fairly clear, regression in instruction counts on multiple benchmarks. Wall times are indeed too noisy for looking at changes like this. We do not have significant documentation on performance triaging, it probably makes sense to write something but I don't know if I can find the time to do so myself. | 
| Okay, I'll change this so that it's just adding an explanatory comment :) | 
172b60d    to
    0242f96      
    Compare
  
    state declaration into loop to make code clearer| // `state` is not actually used between iterations; | ||
| // this is just an optimization to avoid reallocating | ||
| // every iteration. | ||
| let mut state = analysis.bottom_value(body); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an idea for a further optimization: change this to
| let mut state = analysis.bottom_value(body); | |
| let mut state = MaybeUninit::uninit(); | 
since it's always initialized with clone_from below. Might not be worth the unsafety though 🤷
| @bors r+ rollup | 
| 📌 Commit 0242f96 has been approved by  | 
…as-schievink Rollup of 14 pull requests Successful merges: - rust-lang#76765 (Make it more clear what an about async fn's returns when referring to what it returns) - rust-lang#78574 (Use check-pass instead of build-pass in regions ui test suite) - rust-lang#78669 (Use check-pass instead of build-pass in some consts ui test suits) - rust-lang#78847 (Assert that a return place is not used for indexing during integration) - rust-lang#78854 (Workaround for "could not fully normalize" ICE ) - rust-lang#78875 (rustc_target: Further cleanup use of target options) - rust-lang#78887 (Add comments to explain memory usage optimization) - rust-lang#78890 (comment attribution fix) - rust-lang#78896 (Clarified description of write! macro) - rust-lang#78897 (Add missing newline to error message of the default OOM hook) - rust-lang#78898 (add regression test for rust-lang#78892) - rust-lang#78908 ((rustdoc) [src] link for types defined by macros shows invocation, not defintion) - rust-lang#78910 (Fix links to stabilized versions of some intrinsics) - rust-lang#78912 (Add macro test for min-const-generics) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Add explanatory comments so that people understand that it's just an optimization and doesn't affect behavior.