Conversation
|
@bors try @rust-timer queue |
I wonder if this PR's approach will preserve some of the benefits of static flags via inlining. It's certainly nicer than having FLAGS everywhere. |
|
Finished benchmarking commit (744269b): comparison URL.
Overall result: ❌ regressions - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 2.4%, secondary -0.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary 3.0%, secondary 3.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 480.396s -> 481.316s (0.19%) |
|
Just enough of a regression that it's hard to justify, alas. I see now that Relatedly, I found the following functions take a
They can be changed to take a |
Force-pushed from cd94110 to 940ba86.
|
@bors try @rust-timer queue |
|
Finished benchmarking commit (06d3be4): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (secondary -5.5%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary -2.8%, secondary -3.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 487.064s -> 481.364s (-1.17%) |
Force-pushed from 940ba86 to e7fb201.
|
|
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
I rebased. |
|
`SemiDynamicQueryDispatcher` is just a `QueryVTable` wrapper with an additional `const FLAGS: QueryFlags` generic parameter that contains three booleans. This arrangement exists as a performance optimization. But the performance effects are very small and it adds quite a bit of complexity to an already overly-complex part of the codebase. If it didn't exist and somebody proposed adding it and asked me to review, I almost certainly wouldn't approve it. This commit removes it. The three booleans in `QueryFlags` are moved into `QueryVTable`. The non-trivial methods of `SemiDynamicQueryDispatcher` become methods of `QueryVTable`.
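For readers unfamiliar with the pattern, here is a minimal sketch of the trade-off that commit message describes. It is not rustc's code: `VtableBefore`, `VtableAfter`, `StaticDispatcher`, the method `enabled_flags`, and the flag names `anon`, `eval_always`, and `feedable` are all hypothetical stand-ins. Stable Rust only accepts primitive const generics, so three `bool` parameters are used here in place of a single `const FLAGS: QueryFlags` parameter.

```rust
// Illustrative only: every type, method, and flag name below is hypothetical
// and does not reproduce rustc's actual definitions.

/// "Before": the vtable holds no flags; a wrapper carries them as const generics.
pub struct VtableBefore {
    pub name: &'static str,
}

/// Three `bool` const parameters stand in for one `const FLAGS: QueryFlags`.
pub struct StaticDispatcher<'a, const ANON: bool, const EVAL_ALWAYS: bool, const FEEDABLE: bool> {
    pub vtable: &'a VtableBefore,
}

impl<'a, const ANON: bool, const EVAL_ALWAYS: bool, const FEEDABLE: bool>
    StaticDispatcher<'a, ANON, EVAL_ALWAYS, FEEDABLE>
{
    /// Each monomorphized copy sees the flags as compile-time constants,
    /// so the optimizer can discard the branches that are never taken.
    pub fn enabled_flags(&self) -> Vec<&'static str> {
        let mut out = Vec::new();
        if ANON {
            out.push("anon");
        }
        if EVAL_ALWAYS {
            out.push("eval_always");
        }
        if FEEDABLE {
            out.push("feedable");
        }
        out
    }
}

/// "After": the flags are ordinary fields that are read at run time.
pub struct VtableAfter {
    pub name: &'static str,
    pub anon: bool,
    pub eval_always: bool,
    pub feedable: bool,
}

impl VtableAfter {
    /// Same logic as above, but a single copy of the code serves every
    /// query and branches on the field values.
    pub fn enabled_flags(&self) -> Vec<&'static str> {
        let mut out = Vec::new();
        if self.anon {
            out.push("anon");
        }
        if self.eval_always {
            out.push("eval_always");
        }
        if self.feedable {
            out.push("feedable");
        }
        out
    }
}
```

In this sketch the "before" form needs a generic wrapper and generic signatures wherever the dispatcher is passed around, while the "after" form is a plain struct. Whether the runtime branches cost anything measurable depends on inlining, which is the question the perf runs in this thread were used to answer.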
It's now `query_vtable` because its return type changed. And thanks to the previous commit it can be manually inlined in several places. (The only remaining calls to it are in `make_dep_kind_vtable_for_query`, which are more challenging to remove.)
Force-pushed from f69a4e1 to 5aebfd6.
|
@bors r=Zalathar |
|
Scheduling: Giving this a bump as it's on the critical path for any subsequent work involving query vtables. @bors p=1 |
Remove `const FLAGS`. *[View all comments](https://triagebot.infra.rust-lang.org/gh-comments/rust-lang/rust/pull/152791)* The performance wins provided by these types are meagre, and I don't think they justify the code complexity they introduce. r? @Zalathar
|
@bors cancel |
|
Auto build cancelled. Cancelled workflows: The next pull request likely to be tested is #153013. |
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing eeb94be (parent) -> b3869b9 (this PR)
Test differences: 2 doctest diffs were found. These are ignored, as they are noisy.
Test dashboard: Run
cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard b3869b94cd1ed4bfa2eb28f301535d5e9599c713 --output-dir test-dashboard
And then open
Job duration changes
How to interpret the job duration changes? Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (b3869b9): comparison URL.
Overall result: ❌ regressions - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary -1.4%, secondary -5.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary 8.6%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 481.498s -> 479.41s (-0.43%) |
|
Perf regressions were minor and deemed worthwhile above for the simplicity improvements. @rustbot label: +perf-regression-triaged |