-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent early finalization #30
Prevent early finalization #30
Conversation
Currently `Gc` is defined in the libgc crate. The compiler isn't aware of this so can't perform any specific operations on the `Gc` type. To fix this, we introduce a `GcSmartPointer` trait which `libgc::Gc` will implement. This should be temporary. Eventually, the `Gc` smart pointer will be part of the standard library. Right now though, it's convenient for development that they are separated.
I don't see any problem with what's here so far, other than that we will certainly need to add tests to the test suite to make sure that our pass is doing what we think it should do. I suggest that we'll want a fairly large number of test cases, though they'll mostly be variations on a theme, so shouldn't be too difficult to generate. |
Yep. I've got a fair few done now, and they've mostly teased out invariants in the MIR I missed which I was not aware of. There is a really nasty inlining bug which I'm trying to fix, but once that's done I'll re-push and make this mergeable |
Finally fix the bug where barriers weren't inserted if functions were inlined 🎉 Here's some tests and a fix. I've marked this PR as merge-able now (but it will need squashing) |
Please squash. |
5e9c7d0
to
d09dd07
Compare
Squashed |
bors r+ |
Build failed: |
@jacob-hughes Where are we with this? |
This pass works by extracting MIR from the DefID of a |
Does that mean we can try bors'ing again or ... ? |
I'll push a fix shortly |
bors try |
tryBuild failed: |
This is ready for re-review now. |
Please squash. |
The optimizer (either in MIR or LLVM) may look at GC locals and determine that their stack or register slot can be clobbered early if there are no further uses of them. While such behaviour is legal and conforms to the as-if rule, it can cause the collector finalizing values earlier than expected, which has the potential to cause UAFs. This adds a pass which extends the lifetime of GC locals in candidate functions which are at risk of early finalization by the collector. Candidate functions are those which contain local variables of type `Gc<T>` which have a finalizer that will be run once it is unreachable. This works by inserting the following asm to the end of the function, (directly before the return statement) for each GC local whose lifetime must not be shortened: llvm_asm!("" : : "r"(local_val) : "memory" : "volatile"); This pass is added early, right after borrow checking so that it happens before other passes have a chance to shorten the lifetime of Gc locals.
Squashed |
f6ff7c0
to
fade20b
Compare
bors r+ |
Build succeeded: |
This was included originally because of concerns about finalizing a non-Sync `T` value off-thread. If a value is `Sync`, it means that it is safe for two threads to access its data simultaneuosly: in other words, it has some synchronization guarantees. A common example is `RefCell`, which is safe to send between threads, but is marked `!Sync` because the dynamic borrow checking is not atomic. Finalizing `!Sync` values no longer problematic, because even though finalizers are run off-thread, they will run when the object is dead [1], so there is no chance of a data race. [1]: softdevteam/alloy#30
54: Fix the concurrency semantics for `Gc<T>`. r=ltratt a=jacob-hughes This makes two major changes to the API: 1. It removes the requirement that `T: Sync` for `Gc<T>`. 2. It makes `Gc<T> : Send + Sync` if `T: Send + Sync`, fixing the ergonomic problems raised in #49. `Sync`'s purpose is to ensure that two threads can access the data in `T` in a thread-safe way. In other words, it implies that `T` has synchronisation guarantees. Originally, this was added as a constraint on `T` because any finalizer for `T` would run on a separate thread. However, it's now safe to remove this as a constraint because softdevteam/alloy#30 guarantees that a finalizer won't run early. This means that even without synchronized access, a race can't happen, because it's impossible for the finalizer to access `T`'s data while it's still in use by the mutator. However, even though `Gc<T>` can now implement `Send` -- [thanks to multi-threaded collection support](softdevteam/alloy#31) -- `Gc<T>` still requires that `T: Send`, because `T` could be a type with shared ownership which aliases. This is necessary because off-thread finalization could mutate shared memory without synchronisation. An example using `Rc` makes this clear: ```rust struct Inner(Rc<usize>); fn foo() { let rc = Rc::new(123); { let gc = Gc::new(Inner::new(Rc::clone(rc))); } // Assume `gc` is dead here, so it will be finalized in parallel on a separate thread. // This means `Rc::drop` gets called which has the potential to race with // any `Rc` increment / decrement on the main thread. force_gc(); // Might race with gc's finalizer bar(Rc::clone(rc)); } ``` Since finalizing any non-`Send` value can cause UB, we still disallow the construction of `Gc<T: !Send>` completely. This is certainly the most conservative approach. There are others: - Not invoking finalizers for non-`Send` values. This is valid, since finalizers are not guaranteed to run. However, it's not exactly practical, it would mean that any `Gc<Rc<...>>` type structure would _always_ leak. - Finalize `!Send` values on their mutator thread. This is really dangerous in the general case, because if any locks are held on the shared data by the mutator, this will deadlock (it's basically a variant of the async-signal-safe problem). However, if `T` is `!Send`, deadlocking is unlikely [although not impossible!], and could be an acceptable compromise. It's out of the scope of this PR, but it's something I've been playing around with. Co-authored-by: Jake Hughes <jh@jakehughes.uk>
This is a draft PR because I'm still debugging the test suite.
Currently
Gc
is defined in the libgc crate. The compiler isn't awareof this so can't perform any specific operations on the
Gc
type. Tofix this, we introduce a
GcSmartPointer
trait whichlibgc::Gc
willimplement.
This should be temporary. Eventually, the
Gc
smart pointer will bepart of the standard library. Right now though, it's convenient for
development that they are separated.