Replace LockCell with atomic types #56614

Zoxc · 2018-12-07T22:48:50Z

Split from #56509

michaelwoerister · 2018-12-10T15:36:33Z

src/librustc/session/mod.rs

-    pub fn consider_optimizing<T: Fn() -> String>(&self, crate_name: &str, msg: T) -> bool {
+    #[inline(never)]
+    #[cold]
+    pub fn consider_optimizing_cold<T: Fn() -> String>(&self, crate_name: &str, msg: T) -> bool {


Does splitting this into two functions really translate to faster compile-times?

It is used in layout computation. Not sure how hot it is, but it probably won't hurt. Especially if we want to add more MIR optimizations (and debug them)

Anyway this PR is mostly about getting rid of LockCell, I just found this on the way

OK. It does hurt code quality though as it makes things more complicated and thus harder to read and maintain. I personally prefer not to do things like this without evidence that it carries it's weight.

michaelwoerister · 2018-12-10T15:36:55Z

@bors try

bors · 2018-12-10T15:37:07Z

⌛ Trying commit ed78d93 with merge bcc6e63...

@michaelwoerister

Replace LockCell with atomic types Split from #56509 r? @michaelwoerister

bors · 2018-12-10T17:54:54Z

☀️ Test successful - status-travis
State: approved= try=True

ghost · 2018-12-10T18:02:42Z

Just a suggestion: Instead of LockCell<T> and AtomicT, you could use Crossbeam's AtomicCell<T>:
https://docs.rs/crossbeam/0.5.0/crossbeam/atomic/struct.AtomicCell.html

AtomicCell<T> works exactly like Cell<T>, except it's also thread-safe:

If T fits into an AtomicUsize or AtomicBool, atomic operations are used behind the scenes.
If T doesn't fit into any primitive atomic type, one of the hidden global spinlocks is used. Those spinlocks support optimistic reads, which means uncontended reads are much faster than they would be with a normal spinlock, Mutex, or RwLock (code, behchmarks).

ghost · 2018-12-10T21:34:26Z

src/librustc/session/mod.rs

            if c == crate_name {
                assert_eq!(self.query_threads(), 1);
-                let fuel = self.optimization_fuel_limit.get();
+                let fuel = self.optimization_fuel_limit.load(SeqCst);


This function is pretty racy I think.

This should all be in a cas loop and the other section (for print_fuel) should be a fetch add, right?

Granted it was racy before even with locks, but still.

It asserts that the compiler is running in single-threaded mode, so it should be fine. It would be nicer though if the two fields in question would be behind a single lock.

michaelwoerister · 2018-12-11T09:35:08Z

@stjepang This looks great. Yes, that sounds like the way to go. We already have crossbeam-utils in our crate-whitelist.

michaelwoerister · 2018-12-11T09:35:34Z

@rust-timer build bcc6e63

rust-timer · 2018-12-11T09:35:35Z

Success: Queued bcc6e63 with parent 3a75e80, comparison URL.

262: Move AtomicCell into crossbeam-utils r=jeehoonkang a=stjepang At this point, I feel pretty confident about the current state of `AtomicCell` in the sense that we probably won't make any breaking changes to its public API in the future. It's also a reasonably small utility without dependencies that deserves living in `crossbeam-utils`. The Rust compiler would like to use it in place of its `LockCell`, and already has `crossbeam-utils` whitelisted. See [this PR](rust-lang/rust#56614 (comment)). Let's move `AtomicCell` into `crossbeam-utils` so that people can use it without depending on whole `crossbeam`. Co-authored-by: Stjepan Glavina <stjepang@gmail.com>

rust-timer · 2018-12-11T12:59:47Z

Finished benchmarking try commit bcc6e63

michaelwoerister · 2018-12-11T13:45:55Z

The changes here don't seem to make much of a difference performance-wise. I'd say, as a consequence, we should optimize for code quality:

replace LockCell with crossbeam's AtomicCell,
put optimization_fuel_limit and out_of_fuel into one regular Lock together,
don't split consider_optimizing into hot and cold paths.

262: Move AtomicCell into crossbeam-utils r=stjepang a=stjepang At this point, I feel pretty confident about the current state of `AtomicCell` in the sense that we probably won't make any breaking changes to its public API in the future. It's also a reasonably small utility without dependencies that deserves living in `crossbeam-utils`. The Rust compiler would like to use it in place of its `LockCell`, and already has `crossbeam-utils` whitelisted. See [this PR](rust-lang/rust#56614 (comment)). Let's move `AtomicCell` into `crossbeam-utils` so that people can use it without depending on whole `crossbeam`. Co-authored-by: Stjepan Glavina <stjepang@gmail.com>

ghost · 2018-12-11T14:40:07Z

I've just put AtomicCell into crossbeam-utils v0.6.3 so that you can use it:
https://docs.rs/crossbeam-utils/0.6.3/crossbeam_utils/atomic/struct.AtomicCell.html

Zoxc · 2018-12-11T16:21:19Z

@stjepang The use of volatile reads doesn't inspire confidence. Some #[inline] attributes are also lacking.
The only benefits from using crossbeam-utils would be guaranteed u64 support, which we do want.

ghost · 2018-12-11T17:00:21Z

The use of volatile reads doesn't inspire confidence.

FWIW, @RalfJung has reviewed those volatile reads and written comments above them.

Some #[inline] attributes are also lacking.

The code is parametrized over <T> so #[inline] attributes didn't help in benchmarks. But we can always add them if needed.

The only benefits from using crossbeam-utils would be guaranteed u64 support, which we do want.

Another benefit is that reads are very fast (faster than Mutex and naive spinlock).

Additionally, if you don't need .get(), AtomicCell<T> doesn't require T: Copy (just like Cell<T>).

RalfJung · 2018-12-11T17:54:13Z

@stjepang Do you mean the comments in https://github.com/crossbeam-rs/crossbeam/pull/234/files? Those are different accesses though, are they not?

The volatile accesses I looked at all have a comment saying "this is UB but we do it anyway because performance".^^

ghost · 2018-12-11T21:11:23Z

@RalfJung Sorry, my bad! I confused this volatile read with the CAS that issues a SeqCst fence. If you'd like to review this situation, though, that'd be great! :)

We're using a volatile read to read a value that might be concurrently written to. However, we can detect when a data race happens by reading the sequence number. If the value was overwritten while we were reading it, then we try reading again in a loop. The Chase-Lev deque uses a very similar technique.

If we wanted to be pedantic, we'd use multiple AtomicU8 reads to read the value, but that'd be slow (because multiple atomic operations), and we have no AtomicU8 on stable Rust. What are you thoughts on this? (I have deja vu that we already had this discussion somewhere, but can't find it)

michaelwoerister · 2018-12-12T12:50:38Z

The only benefits from using crossbeam-utils would be guaranteed u64 support

The main benefit I see is being able to remove LockCell in favor of something that is maintained outside the compiler.

Zoxc · 2018-12-12T15:57:49Z

The main benefit I see is being able to remove LockCell in favor of something that is maintained outside the compiler.

This PR already does that and uses libstd's atomic types. Using crossbeam-utils wouldn't be an improvement in that regard.

michaelwoerister · 2018-12-12T17:17:47Z

AtomicCell has a slightly nicer interface because it doesn't require giving Ordering arguments. However, if it is still kind of unstable, libstd's atomic types are probably the better choice. @stjepang and @RalfJung, would you say it is production-ready?

Zoxc · 2018-12-12T17:20:49Z

I didn't notice the lack of Ordering arguments. I want those.

michaelwoerister · 2018-12-12T17:53:58Z

I depends. If I don't really need them (like in the cases modified in this PR), I prefer them not to be exposed in the interface.

ghost · 2018-12-12T18:49:26Z

@stjepang and @RalfJung, would you say it is production-ready?

I would say yes.

AtomicCell has a slightly nicer interface because it doesn't require giving Ordering arguments.

Not only that, AtomicCell<T> can't accept Ordering arguments because using Relaxed orderings might cause data races when T: !Copy.

let a = AtomicCell::new(vec![]);

// Thread #1
a.store(vec![1, 2, 3], Relaxed);

// Thread #2
println!("{:?}", a.take(Relaxed)); // unsynchronized read!

RalfJung · 2018-12-12T22:31:23Z

We're using a volatile read to read a value that might be concurrently written to. However, we can detect when a data race happens by reading the sequence number. If the value was overwritten while we were reading it, then we try reading again in a loop. The Chase-Lev deque uses a very similar technique.

This sounds a lot like sequence locks to me, which are a notorious problem for language-level memory models. AFAIK, they cannot be implemented in C and hence not in Rust, either.

So the reason you don't do a relaxed load is because this is arbitrarily-sized data?

Following our own definitions, this is UB because of a read-write race. Following LLVM's definitions, the read is fine but returns undef, and it is impossible to check whether data is undef, so the code is most likely still UB (I haven't had the chance to look at it though).

However, given taht volatile loads cannot be duplicated, the undef probably cannot be materialized as such and hence this is likely to work in practice. But we are bending the rules of both Rust and LLVM here.

If we keep wanting the LLVM memory model, maybe we should just change the documentation and say we are using the LLVM model? The relevant difference is that read-write races are not UB, they just return indeterminate data (which is the same as uninitialized data and padding bits). You are not allowed to do anything with that data, and you should probably wrap it in a MaybeUninitialized, but the read itself is not immediate UB.

For the LLVM side, can we at least get an LLVM developer on the record saying that volatile accesses never return undef, that they act like a load followed by a freeze? That seems like a reasonable assumption (and, curiously, because it would be the only way to actually write out a freeze in LLVM, it would suddenly make volatile actually meaningful in the context of concurrency), but I know LLVM far too little to estimate whether it holds true. Ideally we'd get them to add that to the LangRef...

Cc @rkruppe

RalfJung · 2018-12-12T22:32:10Z

Not only that, AtomicCell can't accept Ordering arguments because using Relaxed orderings might cause data races when T: !Copy.

So which ordering does it use, always release/acquire?

ghost · 2018-12-12T23:10:07Z

This sounds a lot like sequence locks to me, which are a notorious problem for language-level memory models. AFAIK, they cannot be implemented in C and hence not in Rust, either.

Yes, this is a sequence lock. We based our implementation on the paper you've linked.

So the reason you don't do a relaxed load is because this is arbitrarily-sized data?

Exactly.

Following our own definitions, this is UB because of a read-write race. Following LLVM's definitions, the read is fine but returns undef, and it is impossible to check whether data is undef, so the code is most likely still UB (I haven't had the chance to look at it though).

Yeah, we don't use the data if a race happens (it's mem::forgotten). The Chase-Lev deque (crossbeam-deque, which is used by rayon-core and tokio-threadpool) uses a similar trick and forgets data when a data race happens. Goroutine queues in Go's scheduler do the same thing, too.

You are not allowed to do anything with that data, and you should probably wrap it in a MaybeUninitialized, but the read itself is not immediate UB.

MaybeUninitialized is not available on stable yet. Can I use ManuallyDrop instead?

So which ordering does it use, always release/acquire?

I was a big proponent of guaranteeing SeqCst semantics, but after a lengthy dicussion we decided to guarantee that AtomicCell uses at least acquire/release. The reason is, @jeehoonkang says we cannot really guarantee SeqCst because SeqCst semantics are broken in the C/C++11 memory model (paper, talk) in the sense that there's no valid compilation scheme for some architectures (including some ARMs).

RalfJung · 2018-12-13T15:12:45Z

Yeah, we don't use the data if a race happens (it's mem::forgotten).

Ah I see, the data is literally not looked at at all if a race happened, and the detection happens through another channel. The undef/poison is fine then and you can ignore anything I said above about freeze.

The good news is that I think you don't even need a volatile read! You don't care if racy reads yield undef. But you do need the property that read-writes read undef instead of being UB, which LLVM guarantees but Rust does not. (And you do need MaybeUninitialized.) We should better clarify this before other backends are written...

I suspect Rust picked the C11 model because it is much better studied, and that is still true. I know of hardly any academic work on memory models that are as permissive with read-write races as LLVM is. However, there is some very recent work that I should read that takes this into account.

Can I use ManuallyDrop instead?

No. ManuallyDrop doesn't help a bit.

after a lengthy dicussion we decided to guarantee that AtomicCell uses at least acquire/release

I think SeqCst is overrated anyway and release-acquire is what you want almost all the time, so fine for me. ;)

I took a look at the AtomicCell code. If I understood it correctly, a write proceeds as follows:

Acquire "lock" by swapping stamp with 1 (doing an acquire-load but a relaxed store, for whatever reason)
Release fence
Non-atomic store for the actual data
Store old_timestamp+2 in stamp (release write)

A read in the fast case proceeds as

Acquire-load the stamp
Non-atomic load for actual data
Acquire fence
Relaxed-load of stamp, checking it didn't change

I just spent quite a while wondering about what the correctness argument here is, and why the fence is needed, and why the initial swap is Acquire as opposed to Relaxed. I think I got it now, but I also think the library could need some comments so that the reader doesn't have to reverse-engineer the correctness argument by themselves.

Generally, the library has way fewer comments than is adequate for such subtle concurrent data structures. In Arc, every single atomic access/fence and every Ordering choice is carefully justified. I suggest doing the same here (and for any concurrent data structure, really).

michaelwoerister · 2019-01-09T09:42:57Z

OK, let's merge the current version of the PR. The single-threaded implementation of the Atomic types seems harmless enough and I can see that having memory ordering available might be beneficial in some situations. Thanks for updating the code around optimization fuel, @Zoxc!

@bors r+

bors · 2019-01-09T09:42:58Z

📌 Commit 9b47acf has been approved by michaelwoerister

bors · 2019-01-09T11:08:25Z

⌛ Testing commit 9b47acf with merge 664c779...

@michaelwoerister

Replace LockCell with atomic types Split from #56509 r? @michaelwoerister

bors · 2019-01-09T13:48:32Z

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing 664c779 to master...

@michaelwoerister

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

@michaelwoerister

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

@michaelwoerister

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

@michaelwoerister

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

@michaelwoerister

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on #56614 r? @michaelwoerister

rust-highfive assigned michaelwoerister Dec 7, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 7, 2018

Zoxc mentioned this pull request Dec 8, 2018

Tweak query code for performance #56613

Merged

michaelwoerister reviewed Dec 10, 2018

View reviewed changes

bors added a commit that referenced this pull request Dec 10, 2018

Auto merge of #56614 - Zoxc:query-perf2, r=<try>

bcc6e63

Replace LockCell with atomic types Split from #56509 r? @michaelwoerister

ghost reviewed Dec 11, 2018

View reviewed changes

ghost mentioned this pull request Dec 11, 2018

Move AtomicCell into crossbeam-utils crossbeam-rs/crossbeam#262

Merged

Zoxc mentioned this pull request Dec 22, 2018

Optimize try_mark_green and eliminate the lock on dep node colors #57065

Merged

Zoxc added 2 commits December 29, 2018 12:46

Replace LockCell with atomic types

03b7cec

Create a struct for optimization fuel data

9b47acf

Zoxc force-pushed the query-perf2 branch from ed78d93 to 9b47acf Compare December 29, 2018 12:03

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 9, 2019

bors added a commit that referenced this pull request Jan 9, 2019

Auto merge of #56614 - Zoxc:query-perf2, r=michaelwoerister

664c779

Replace LockCell with atomic types Split from #56509 r? @michaelwoerister

bors merged commit 9b47acf into rust-lang:master Jan 9, 2019

bors mentioned this pull request Jan 9, 2019

Add more rust_private attributes #56559

Closed

Zoxc deleted the query-perf2 branch January 9, 2019 16:57

Centril added a commit to Centril/rust that referenced this pull request Jan 15, 2019

Rollup merge of rust-lang#57065 - Zoxc:graph-tweaks, r=michaelwoerister

0936d02

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57065 - Zoxc:graph-tweaks, r=michaelwoerister

e3cfe66

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57065 - Zoxc:graph-tweaks, r=michaelwoerister

d53dd10

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57065 - Zoxc:graph-tweaks, r=michaelwoerister

d552d34

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on rust-lang#56614 r? @michaelwoerister

bors added a commit that referenced this pull request Jan 18, 2019

Auto merge of #57065 - Zoxc:graph-tweaks, r=michaelwoerister

527b8d4

Optimize try_mark_green and eliminate the lock on dep node colors Blocked on #56614 r? @michaelwoerister

aturon mentioned this pull request Aug 16, 2019

Audit sources of shared state across rustc #63643

Open

37 tasks

compiler-errors mentioned this pull request Jan 2, 2024

don't reexport atomic::ordering via rustc_data_structures, use std import #119527

Merged

Replace LockCell with atomic types #56614

Replace LockCell with atomic types #56614

Uh oh!

Conversation

Zoxc commented Dec 7, 2018

Uh oh!

michaelwoerister Dec 10, 2018

Choose a reason for hiding this comment

Uh oh!

Zoxc Dec 10, 2018

Choose a reason for hiding this comment

Uh oh!

Zoxc Dec 10, 2018

Choose a reason for hiding this comment

Uh oh!

michaelwoerister Dec 10, 2018

Choose a reason for hiding this comment

Uh oh!

michaelwoerister commented Dec 10, 2018

Uh oh!

bors commented Dec 10, 2018

Uh oh!

bors commented Dec 10, 2018

Uh oh!

ghost commented Dec 10, 2018 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost Dec 10, 2018

Choose a reason for hiding this comment

Uh oh!

michaelwoerister Dec 11, 2018

Choose a reason for hiding this comment

Uh oh!

michaelwoerister commented Dec 11, 2018

Uh oh!

michaelwoerister commented Dec 11, 2018

Uh oh!

rust-timer commented Dec 11, 2018

Uh oh!

rust-timer commented Dec 11, 2018

Uh oh!

michaelwoerister commented Dec 11, 2018

Uh oh!

ghost commented Dec 11, 2018

Uh oh!

Zoxc commented Dec 11, 2018

Uh oh!

ghost commented Dec 11, 2018

Uh oh!

RalfJung commented Dec 11, 2018

Uh oh!

ghost commented Dec 11, 2018

Uh oh!

michaelwoerister commented Dec 12, 2018

Uh oh!

Zoxc commented Dec 12, 2018

Uh oh!

michaelwoerister commented Dec 12, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Zoxc commented Dec 12, 2018

Uh oh!

michaelwoerister commented Dec 12, 2018

Uh oh!

ghost commented Dec 12, 2018

Uh oh!

RalfJung commented Dec 12, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RalfJung commented Dec 12, 2018

Uh oh!

ghost commented Dec 12, 2018 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RalfJung commented Dec 13, 2018

Uh oh!

michaelwoerister commented Jan 9, 2019

Uh oh!

bors commented Jan 9, 2019

Uh oh!

ghost commented Dec 10, 2018 •

edited by ghost

Loading

michaelwoerister commented Dec 12, 2018 •

edited

Loading

RalfJung commented Dec 12, 2018 •

edited

Loading

ghost commented Dec 12, 2018 •

edited by ghost

Loading