Avoid invalidating CFG caches from MirPatch::apply. #146697
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt
r=me after perf
@bors try @rust-timer queue
Finished benchmarking commit (37ed41a): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive - we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (secondary -3.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary 3.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (secondary 0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 470.95s -> 471.912s (0.20%)
      body.basic_blocks.len()
  );
- let bbs = if self.term_patch_map.is_empty() && self.new_blocks.is_empty() {
+ let bbs = if self.term_patch_map.iter().all(Option::is_none) && self.new_blocks.is_empty() {
So this small perf regression might be real, because we're now iterating over all basic blocks here even if all entries are None 🤔
🤷 I would still be fine with landing this, but we may also consider alternatives. E.g. I would kinda assume that the patch map will be mostly None all the time, so maybe something that's better with sparse entries would perform better here.
Anyways, I haven't looked much at the relevant code yet.
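The trade-off raised in this comment can be sketched in isolation. The snippet below is only an illustration, not the rustc code: `DensePatch`, its constructor, and the two helper methods are hypothetical stand-ins, with a plain `Vec<Option<_>>` in place of the compiler's per-block map and string literals in place of terminators. It assumes the dense map holds one slot per basic block, which is what the comment about iterating over all blocks implies.

```rust
/// Hypothetical stand-in for a dense patch map: one slot per basic block.
struct DensePatch {
    term_patch_map: Vec<Option<&'static str>>,
    new_blocks: Vec<&'static str>,
}

impl DensePatch {
    /// Pre-sizes the map so that every existing block has a `None` slot.
    fn new(num_blocks: usize) -> Self {
        DensePatch { term_patch_map: vec![None; num_blocks], new_blocks: Vec::new() }
    }

    /// Length-based check: O(1), but false whenever the body has any block,
    /// even if no terminator was actually patched.
    fn untouched_by_len(&self) -> bool {
        self.term_patch_map.is_empty() && self.new_blocks.is_empty()
    }

    /// Content-based check: gives the intended answer, but scans every slot.
    fn untouched_by_scan(&self) -> bool {
        self.term_patch_map.iter().all(Option::is_none) && self.new_blocks.is_empty()
    }
}

fn main() {
    let mut patch = DensePatch::new(4);
    // Nothing was patched, yet the length-based check cannot tell.
    assert!(!patch.untouched_by_len());
    assert!(patch.untouched_by_scan());

    // Patching a single terminator flips the content-based check.
    patch.term_patch_map[2] = Some("goto");
    assert!(!patch.untouched_by_scan());
}
```

Under these assumptions the scan costs time proportional to the number of blocks on every call, which is the overhead the reviewer suspects behind the small regression.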
@bors try @rust-timer queue
Finished benchmarking commit (5b81809): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive - we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 0.3%, secondary -0.8%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary -2.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 471.07s -> 470.492s (-0.12%)
Hm, we might be able to get another 0.1-0.2% here by doing some counts-based profiling, looking at how many entries end up in the term_patch_map, and then e.g. using some SsoHashMap with an inline len of 1 or something instead of a HashMap. I expect that slotting in the current SsoHashMap would actually slow things down, as TerminatorKind is too large.
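For readers unfamiliar with the sparse layout being discussed, here is a minimal sketch of the idea. It is an assumption-laden illustration rather than the PR's code: `SparsePatch` and its methods are hypothetical, block indices are plain `usize` values, terminators are string literals, and the standard library `HashMap` stands in for whatever map type the compiler uses (it is not `SsoHashMap`).

```rust
use std::collections::HashMap;

/// Hypothetical sparse patch map: only patched blocks get an entry.
struct SparsePatch {
    term_patch_map: HashMap<usize, &'static str>,
    new_blocks: Vec<&'static str>,
}

impl SparsePatch {
    fn new() -> Self {
        SparsePatch { term_patch_map: HashMap::new(), new_blocks: Vec::new() }
    }

    /// Records a replacement terminator for one block.
    fn patch_terminator(&mut self, block: usize, kind: &'static str) {
        self.term_patch_map.insert(block, kind);
    }

    /// With a sparse map, emptiness is an O(1) check that really means
    /// "no terminator was patched and no block was added".
    fn is_untouched(&self) -> bool {
        self.term_patch_map.is_empty() && self.new_blocks.is_empty()
    }
}

fn main() {
    let mut patch = SparsePatch::new();
    assert!(patch.is_untouched());

    patch.patch_terminator(7, "goto");
    assert!(!patch.is_untouched());
    assert_eq!(patch.term_patch_map.len(), 1);
}
```

A small-size-optimized map would keep the first few entries inline instead of on the heap. Whether that helps depends on how many entries a typical patch holds and how large each value is, which is why the comment suggests profiling the entry counts first.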
r=me after adding the comment
Co-authored-by: lcnr <rust@lcnr.de>
@bors r=lcnr
I'll try
Scheduling: interleave rollup=never PRs with rollups.
@bors p=5
Avoid invalidating CFG caches from MirPatch::apply. Small effort to reduce invalidating CFG caches.
💔 Test failed - checks-actions
A job failed! Check out the build log.
@bors retry (flaky)
☀️ Test successful - checks-actions
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing 6f34f4e (parent) -> eabf390 (this PR)
Test differences
2 doctest diffs were found. These are ignored, as they are noisy.
Test dashboard
Run
cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard eabf390b4ceeb34db9f37e97f435134abbcdea92 --output-dir test-dashboard
And then open the generated dashboard in the test-dashboard output directory.
Job duration changes
How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance
Finished benchmarking commit (eabf390): comparison URL.
Overall result: ✅ improvements - no action needed
@rustbot label: -perf-regression
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 1.0%, secondary 0.8%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary -2.6%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 471.619s -> 470.601s (-0.22%)