Run a single huge par_body_owners instead of many small ones after each other. #122140
r? @davidtwco
rustbot has assigned @davidtwco. Use r? to explicitly pick a reviewer.
@bors try @rust-timer queue
Run a single huge par_body_owners instead of many small ones after each other.
This improves parallelism in parallel rustc by avoiding the bottleneck after each individual `par_body_owners` call. Each call has to wait for all of its queries to finish, so if one query runs for a long time, many cores sit idle while waiting for it.
Based on rust-lang#121500.
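To make the bottleneck described above concrete, here is a minimal toy model. It is not rustc code: the per-body costs, the worker count, and the greedy scheduler are all assumptions chosen for illustration. It compares running one parallel loop per pass (with a wait after each loop) against one fused loop in which every body runs all of its passes back to back.

```rust
// Toy model, not rustc code: the costs, the worker count, and the greedy
// scheduler are assumptions made purely for illustration.

/// Greedy scheduling: hand each task to the worker that becomes free first
/// and return the time at which the last worker finishes.
fn makespan(tasks: &[u64], workers: usize) -> u64 {
    let mut free_at = vec![0u64; workers];
    for &cost in tasks {
        let next = (0..workers)
            .min_by_key(|&w| free_at[w])
            .expect("at least one worker");
        free_at[next] += cost;
    }
    free_at.into_iter().max().unwrap_or(0)
}

/// One parallel loop per pass; every loop waits for its slowest task.
fn separate_loops(costs: &[Vec<u64>], workers: usize) -> u64 {
    costs.iter().map(|pass| makespan(pass, workers)).sum()
}

/// One parallel loop overall; each body runs all of its passes in a row.
fn fused_loop(costs: &[Vec<u64>], workers: usize) -> u64 {
    let bodies = costs.first().map_or(0, |pass| pass.len());
    let per_body: Vec<u64> = (0..bodies)
        .map(|b| costs.iter().map(|pass| pass[b]).sum::<u64>())
        .collect();
    makespan(&per_body, workers)
}

/// Two passes over eight bodies; each pass has a single expensive body.
fn example_costs() -> Vec<Vec<u64>> {
    vec![
        vec![10, 1, 1, 1, 1, 1, 1, 1],
        vec![1, 1, 1, 1, 1, 1, 1, 10],
    ]
}

fn main() {
    let costs = example_costs();
    println!("separate loops: {}", separate_loops(&costs, 4));
    println!("fused loop:     {}", fused_loop(&costs, 4));
}
```

In this model the fused loop finishes sooner because idle workers can pick up other bodies instead of waiting at the end of each pass. The model only counts scheduling time and ignores memory effects entirely.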
☀️ Try build successful - checks-actions
Finished benchmarking commit (c0e49b5): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 648.501s -> 652.39s (0.60%)
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (702bd6d): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 648.792s -> 653.885s (0.78%)
r=me if this is ready
@bors r=davidtwco
☀️ Test successful - checks-actions
Finished benchmarking commit (65cd843): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
@rustbot label: -perf-regression
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 647.596s -> 652.28s (0.72%)
Umm, did this help the parallel frontend (and if so, how much)? Otherwise, this looks like a severe regression for the single-threaded version.
(For those who got confused: the linked perf report shows the cycle count rather than the instruction count.)
Ah yes, sorry, forgot to mention that. The same regression shows up in wall time.
That's super weird. Did I thrash the CPU caches for the query tables or something? (Edit: Yup, I did: https://perf.rust-lang.org/compare.html?start=d255c6a57c393db6221b1ff700daea478436f1cd&end=65cd843ae06ad00123c131a431ed5304e4cd577a&stat=cache-misses) Before this PR we invoked all queries of the same kind right after each other; now we invoke all queries for the same id together. Let's revert and revisit.
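The difference between the two invocation orders can be sketched as follows. This is only an illustration: the query kinds are hypothetical placeholders, not real rustc queries, and counting how often consecutive invocations switch to a different kind (and so, presumably, to a different query table) is a crude proxy for locality, not a measurement of cache misses.

```rust
// Illustration only: the query kinds are hypothetical placeholders, and the
// "table switch" count is a rough stand-in for cache behavior, not a
// measurement.

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Kind {
    First,
    Second,
    Third,
}

static KINDS: [Kind; 3] = [Kind::First, Kind::Second, Kind::Third];

/// Order before the PR: all queries of one kind, then all of the next kind.
fn kind_major(ids: &[u32]) -> Vec<(Kind, u32)> {
    KINDS
        .iter()
        .flat_map(|&kind| ids.iter().map(move |&id| (kind, id)))
        .collect()
}

/// Order after the PR: all queries for one id, then all for the next id.
fn id_major(ids: &[u32]) -> Vec<(Kind, u32)> {
    ids.iter()
        .flat_map(|&id| KINDS.iter().map(move |&kind| (kind, id)))
        .collect()
}

/// Number of consecutive invocations that move to a different query kind.
fn table_switches(order: &[(Kind, u32)]) -> usize {
    order
        .windows(2)
        .filter(|pair| pair[0].0 != pair[1].0)
        .count()
}

fn main() {
    let ids = [0, 1, 2, 3];
    println!("kind-major switches: {}", table_switches(&kind_major(&ids)));
    println!("id-major switches:   {}", table_switches(&id_major(&ids)));
}
```

Both orders perform exactly the same set of invocations; only the sequence differs. The id-major order changes kind on every step, which is consistent with the cache-miss increase reported in the linked comparison.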
Revert "Auto merge of #122140 - oli-obk:track_errors13, r=davidtwco"
This reverts commit 65cd843ae06ad00123c131a431ed5304e4cd577a, reversing changes made to d255c6a57c393db6221b1ff700daea478436f1cd.
Reverts rust-lang/rust#122140. It was a large regression in wall time due to thrashing CPU caches.