Use conditional synchronization for Lock #111713
r? @davidtwco (rustbot has picked a reviewer for you, use r? to override)
These commits modify the If this was unintentional then you should revert the changes before this PR is merged. Some changes occurred in src/tools/rustfmt cc @rust-lang/rustfmt |
It looks good to me. cc @cjgillot @nnethercote
Apologies for the delay in reviewing this - looks good to me, but I'll re-assign to someone involved in compiler performance/parallel compiler to double check.
r? @nnethercote
I'm starting to get really uncomfortable with the amount of extra complexity being added for the whole single-thread vs multi-threaded split for the parallel front-end.
More specifically, for this PR, there is an entirely new file, `lock.rs`, which is 242 lines, implementing a critical low-level type, involving `UnsafeCell` and a bunch of `unsafe` blocks, and it has just one two-line comment. That's not enough. Without additional documentation I can't understand the code well enough to review it and give an r+ in good conscience.
I suggest adding:
- A top-level comment to `lock.rs` explaining at a high-level what is going on.
- Comments on more of the types, like `LockRawUnion` and `LockRaw`.
- Comments on and/or within some of the more complex functions.
- Comments on `unsafe` blocks, explaining why they're safe.
- Any other comments above and beyond these would also be fine.
@@ -1,4 +1,4 @@
-use crate::sync::Lock;
+use parking_lot::Mutex;
Why the `Lock` to `Mutex` changes in this file? Is this code not used in the serial front-end?
It's not used in the serial compiler and it's quite cold.
type Target = T;
#[inline]
fn deref(&self) -> &T {
    unsafe { &*self.lock.data.get() }
Why is this unsafe?
The `get` method on `UnsafeCell` is not safe.
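For readers following this exchange, a minimal sketch of where the `unsafe` comes from may help. In the standard library, `UnsafeCell::get` is itself a safe call that returns a raw `*mut T`; it is dereferencing that pointer which needs an `unsafe` block. The `Slot` type and its `read` method below are hypothetical names made up for illustration and are not part of this PR.

```rust
use std::cell::UnsafeCell;

/// Hypothetical wrapper used only for illustration; not a type from the PR.
struct Slot<T> {
    data: UnsafeCell<T>,
}

impl<T> Slot<T> {
    fn new(value: T) -> Self {
        Slot { data: UnsafeCell::new(value) }
    }

    fn read(&self) -> &T {
        // Obtaining the raw pointer is a safe operation.
        let ptr: *mut T = self.data.get();
        // SAFETY: this sketch never creates a `&mut T` to the contents,
        // so the shared reference cannot alias an exclusive one.
        unsafe { &*ptr }
    }
}
```

The `SAFETY` comment is the kind of justification the review above asks for on each `unsafe` block.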
☔ The latest upstream changes (presumably #108714) made this pull request unmergeable. Please resolve the merge conflicts.
@Zoxc any updates on this?
I've added some documentation to this now.
⌛ Testing commit 16eae91781eef660e19400d451ce1bab3a1dacf7 with merge e507b3d655f189b6382af2000caf319468f847b4...
☀️ Test successful - checks-actions
👀 Test was successful, but fast-forwarding failed: 422 Update is not a fast forward
@bors retry
@bors r+
Finished benchmarking commit (e507b3d655f189b6382af2000caf319468f847b4): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 630.625s -> 631.483s (0.14%)
☀️ Test successful - checks-actions
Finished benchmarking commit (61efe9d): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
Warning ⚠: The following benchmark(s) failed to build:
cc @rust-lang/wg-compiler-performance
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 630.791s -> 632.146s (0.21%)
We have possible infinite loops for two rustc-perf benchmarks: exa-0.10.1 and syn-1.0.89. If they persist this PR will have to be backed out.
@nnethercote How can we reproduce this error?
The changes are quite small without
The latest results have 13 benchmarks for syn, while this PR has 11, so it seems like whatever happened was a transient issue maybe?
The instruction-count regression(s) here appear spurious from what I can see. @rustbot label: +perf-regression-triaged
This changes `Lock` to use synchronization only if `mode::is_dyn_thread_safe` could be true. This reduces overhead for the parallel compiler running with 1 thread.

The emitters are changed to use `DynSend` instead of `Send` so they can still use `Lock`.

A Rayon thread pool is not used with 1 thread anymore, as session globals contain `Lock`s which are no longer `Sync`.

Performance improvement with 1 thread and `cfg(parallel_compiler)`:

cc @SparrowLii
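As a rough illustration of the idea in the description, here is a minimal sketch of a lock that pays for real synchronization only when a runtime flag says it might be shared between threads. It is a simplified assumption-laden model, not the PR's `lock.rs`: the `sync` field is a hypothetical stand-in for the result of `mode::is_dyn_thread_safe`, only a non-blocking `try_lock` is shown, and an `AtomicBool` with different access patterns replaces whatever raw lock types the real code uses.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Illustrative lock. `sync` is a hypothetical stand-in for the value of
/// `mode::is_dyn_thread_safe`, captured when the lock is created.
pub struct Lock<T> {
    sync: bool,
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

impl<T> Lock<T> {
    pub fn new(value: T, sync: bool) -> Self {
        Lock { sync, locked: AtomicBool::new(false), data: UnsafeCell::new(value) }
    }

    /// Tries to take the lock without blocking.
    pub fn try_lock(&self) -> Option<LockGuard<'_, T>> {
        let acquired = if self.sync {
            // Possibly shared between threads: use an atomic read-modify-write.
            self.locked
                .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_ok()
        } else if self.locked.load(Ordering::Relaxed) {
            // Single-threaded mode: already held (re-entrant attempt).
            false
        } else {
            // Single-threaded mode: a plain load and store are enough.
            self.locked.store(true, Ordering::Relaxed);
            true
        };
        acquired.then(|| LockGuard { lock: self })
    }
}

/// Releases the lock when dropped.
pub struct LockGuard<'a, T> {
    lock: &'a Lock<T>,
}

impl<T> Deref for LockGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: a guard exists only while `locked` is set, so no other
        // guard can be handing out references to the data.
        unsafe { &*self.lock.data.get() }
    }
}

impl<T> DerefMut for LockGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: same invariant as `deref`; `&mut self` makes this the
        // only reference obtained through this guard.
        unsafe { &mut *self.lock.data.get() }
    }
}

impl<T> Drop for LockGuard<'_, T> {
    fn drop(&mut self) {
        let order = if self.lock.sync { Ordering::Release } else { Ordering::Relaxed };
        self.lock.locked.store(false, order);
    }
}
```

This sketch deliberately does not implement `Sync` for `Lock`, which mirrors the point in the description that `Lock` is no longer `Sync`; sharing it across threads would need an additional mechanism along the lines of the `DynSend` marker mentioned above.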