-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NLL] prohibit "two-phase borrows" with existing borrows? #56254
Comments
(It's actually not clear if we would want to backport this -- ideally we would, but it's probably a corner case.) |
Nominated for discussion on the next T-lang meeting since this seems to a affect the type system in observable ways and because I'd like to understand this better... provided that we can wait until Thursday... ;) |
I only have theoretical knowledge of NLL’s implementation but it seems extremely hard to forbid this…? |
From what I hear it's actually easy, we just have an additional constraint that such that when the two-phase borrow starts, all existing loans for that ref get killed (like they usually would for a mutable ref). The problem is the "fake read" desugaring we do to make sure that match arms cannot mutate the discriminee: fn foo(x: Option<String>) {
match x {
Some(mut ref s) if s.starts_with("hello") => s.push_str(" world!"),
_ => {},
}
} Becomes something like
When |
I once proposed an alternative to this desugaring that avoids 2-phase-borrows. I was told (by @arielb1 and probably others) it doesn't work because it doesn't preserve pointer identity (if the guard compares addresses it'd notice), but I actually don't see why: I think all pointers are the same as in the desugaring above. Namely, we should do:
where fn fake_mut<'a, 'b, T>(x: &'a &'b T) -> &'a &'b mut T {
std::mem::transmute(x)
}
|
In this translation, And if we really wanted to preserve this property, we could have |
Okay, so we agree that my proposal doesn't break more than what we currently do -- but it might be harder to fix (if we care).
It would however still be the case that the mutable reference was created after the guard runs, which could be observable in terms of Stacked Borrows / LLVM noalias. |
Sure enough. So I think @RalfJung's solution (having an |
seems best to be conservative (in terms of erring on the side of rejecting a larger set of sound programs) and do this now, if we can land a backportable patch in time for the edition. (In other words, I'm in favor of moving forward on this proposal) |
triage: We discussed this in the NLL team meeting last night, and essentially decided that we think we will make a fix for this in the master branch without backporting it. The main risk implied by that decision is that there may be 6 weeks of 2018 edition code that writes code similar to that of
|
also, in terms of triage: I don't think we need to discuss this at the T-compiler meeting. It may or may not merit discussion at the T-lang meeting, given that WG-compiler-nll is planning to plug this hole; just not in a manner that we'd backport to beta... |
Removing from the 2018 edition milestone due to #56254 (comment) |
Lang team discussion:
|
Is there some documentation of why exactly the model prohibits this, and what it would take to have a model that doesn't? Because in particular, it seems like you could have a region for |
I feel a bit torn. The fact that we see various bits of code using this makes me a bit reluctant to rule it out, and a bit more inclined to consider extending stacked borrows to account for it, though it depends on how big of a mess results. It'd be nice however to do that evaluation without time pressure, which is what gives me some incentive to want to rule it out -- and then see what it takes to allow it again. |
@nikomatsakis Likewise. I don't want to put anyone under pressure to cope with this, but I do think we ought to accept this code. |
The basic model is described in this long blog-post of mine. Model for Limited Two-Phase BorrowsThe changes required to that model to support "limited two-phase borrows" (i.e., two-phase borrows that kill existing shared borrows when they are created) are trivial: When retagging for a two-phase borrow, we follow all the usual steps of retagging for a mutable borrow, and then we re-borrow the new borrow for a shared borrow. Done. After However, creating a mutable reference still acts like a "virtual write" to this location, which of course means that outstanding, existing borrows get killed. The reasoning for this is that after creating a mutable reference, we usually want to be sure that there are no aliases. I have implemented this model (it is waiting for a PR to land), and it accepts this test-case. (The entire rest of the miri test suite does not seem to rely on two-phase borrows.) Model for Unlimited Two-Phase BorrowsThe problem with not killing existing shared borrows is that when the two-phase borrows is created, we cannot yet push it to the stack. It would have to "go below" the existing shared borrows so as to not invalidate them. We could maybe put it directly above the item matching the borrow we re-borrow from. I don't like this for two reasons: First, it kills the stack discipline. Second, one can imagine several sound extensions of two-phase borrows (and indeed @arielb1 has imagined all of them^^) that this cannot scale to. For example, in this model, creating a new two-phase borrow would still have to kill outstanding two-phase borrows. Today it seems like you don't want to support this, but given that "it doesn't introduce UB in a naive translation to LLVM" seems to be a sufficient justification, I wouldn't be surprised if a year from now y'all come to me and said "uh, now we'd really rather like to support these other things as well" and then we have to re-design two-phase stacked borrows. ;) I had another idea that should fix these problems. But I haven't had the time to implement it yet, so there may be unforeseen consequences. In this model, we add a new kind of "tag" that a pointer can carry: Beyond
This means that two-phase borrows delay the pushing of "their" item (the activation) to their first write (where a re-borrow to a "full" The downsides of this model are:
Extreme proposalOne issue I have here is that the only constraint for 2-phase-borrows seems to be "it must be sound in a naive memory model". This is in direct conflict with the expressed desire to have stronger invariants for references, that unsafe code must also follow. Are there any constraints in terms of optimizations you still want to perform, or invariants you still want to uphold, that would put another limit on how far 2-phase-borrows are supposed to go? Just to map out the design space a bit, here's a really extreme proposal that essentially gives up on tracking for 2-phase borrows: A 2-phase-borrow is just a copy of the mutable reference it is created from. This means that after let w = &2phase *v;
let vraw = v as *mut _; // let's assume this wouldn't reborrow, but just cast
let wraw = w as *mut _; // let's assume this wouldn't reborrow, but just cast
// Now this is okay, as both pointers carry the same tag
*vraw = 4;
*wraw = 5;
*vraw = 6;
*wraw = 7; The borrow checker would still have to impose some restrictions (writing to I have not implemented this model, but I think it would be feasible. Of course, this would maximally pessimize compiler transformations as the two pointers would now indeed be allowed to alias. But that's what you get if you don't want to impose any limitations on how far 2-phase borrows should go :) |
T-compiler triage: I am planning on taking point on this during this week. |
Use normal mutable borrows in matches `ref mut` borrows are currently two-phase with NLL enabled. This changes them to be proper mutable borrows. To accommodate this, first the position of fake borrows is changed: ```text [ 1. Pre-match ] | [ (old create fake borrows) ] [ 2. Discriminant testing -- check discriminants ] <-+ | | | (once a specific arm is chosen) | | | [ (old read fake borrows) ] | [ 3. Create "guard bindings" for arm ] | [ (create fake borrows) ] | | | [ 4. Execute guard code ] | [ (read fake borrows) ] --(guard is false)-----------+ | | (guard results in true) | [ 5. Create real bindings and execute arm ] | [ Exit match ] ``` The following additional changes are made to accommodate `ref mut` bindings: * We no longer create fake `Shared` borrows. These borrows are no longer needed for soundness, just to avoid some arguably strange cases. * `Shallow` borrows no longer conflict with existing borrows, avoiding conflicting access between the guard borrow access and the `ref mut` borrow. There is some further clean up done in this PR: * Avoid the "later used here" note for Shallow borrows (since it's not relevant with the message provided) * Make any use of a two-phase borrow activate it. * Simplify the cleanup_post_borrowck passes into a single pass. cc #56254 r? @nikomatsakis
More restrictive 2 phase borrows - take 2 Another try at this. Currently changes to a hard error, but we probably want to change it to a lint instead. cc #56254 r? @pnkfelix cc @RalfJung @nikomatsakis
More restrictive 2 phase borrows - take 2 Another try at this. Currently changes to a hard error, but we probably want to change it to a lint instead. cc #56254 r? @pnkfelix cc @RalfJung @nikomatsakis
Also see the tracking issue at #59159 |
Now that PR #58739 has landed, this is no longer P-high priority. (I wouldn't call it "resolved" until we settle the question of what model we will end up using. But we may want to still close this issue and open a different one to track that work?) In any case, re-prioritizing this as P-medium. |
EDIT: Nvm, someone added the
fn with_interior_mutability() {
use std::cell::Cell;
trait Thing: Sized {
fn do_the_thing(&mut self, _s: i32) {}
}
impl<T> Thing for Cell<T> {}
let mut x = Cell::new(1);
let l = &x;
#[allow(unknown_lints, mutable_borrow_reservation_conflict)]
x
.do_the_thing({
x.set(3);
l.set(4); // This is an example of overlapping 2PB! l was created before the implicit mutable borrow of x as receiver of this call.
x.get() + l.get()
})
;
} |
FWIW, I was able to support the two-phase-borrows test cases with outstanding lones that @matthewjasper wrote in a refurbished version of Stacked Borrows (Miri PR: rust-lang/miri#695). There'll be a blog post about this soon (TM). I am not sure if this is the model we want (in fact I think it is not, but for reasons mostly unrelated to two-phase borrows), and I have little idea of the consequences this will have on optimizations. EDIT: The blog post is out |
@Manishearth discovered an intersting piece of code that was not accepted by Stacked Borrows 2.0: fn unsafe_cell_2phase() { unsafe {
let x = &UnsafeCell::new(vec![]); // Tag(0)
// Stack: [0: SharedReadWrite]
let x2 = &*x; // Tag(1)
// Stack: [0: SharedReadWrite, 1: SharedReadWrite]
(*x.get()).push(0); // The raw ptr returned by get: Tag(2)
// Stack: [0: SharedReadWrite, 2: SharedReadWrite, 1: SharedReadWrite]
// Then it gets two-phase-reborrows as a unique with Tag(3):
// Stack: [0: SharedReadWrite, 2: SharedReadWrite, 3: Unique, 1: SharedReadWrite]
// And then 3 gets reborrows as a "proper" unique with Tag(4)
// Stack: [0: SharedReadWrite, 2: SharedReadWrite, 3: Unique, 4: Unique]
// Now this is UB because x2's tag is not in the stack any more.
let _val = (*x2.get()).get(0);
} } I have annotated what happens with the stack. The issue is that we add the I tried to fix this by supporting two-phase borrows "for real": I added a new fn with_interior_mutability() {
use std::cell::Cell;
trait Thing: Sized {
fn do_the_thing(&mut self, _s: i32) {}
}
impl<T> Thing for Cell<T> {}
let mut x = Cell::new(1);
let l = &x;
x
.do_the_thing({
x.set(3);
l.set(4);
x.get() + l.get()
})
;
} What happens here is that after creating the two-phase borrow, we end up with a stack like The issue here is (and this kind of came up before already) that the stack just does not encode enough information about which pointer is derived from which. We'd want to know that So, for now I basically gave up and made two-phase borrows carry |
Nightly miri now passes all of the |
Just a quick note: If we are still interested in gathering data about how much code was affected by adding the restriction to two-phase borrows, one source of data I had not considered is the set of commits that land that reference the lint's tracking issue #59159 (I count thirteen commits as of today that reference the lint issue.) It won't be a complete list, of course, but it is a different data source than say crater. |
@RalfJung raised this example in which the "two-phase" borrow of
x
is compatible with a pre-existing share:This poses a problem for stacked borrows, as well as for the potential refactoring of moving stacked borrows into MIR lowering (#53198) -- roughly for the same reason. It might be nice to change this, but -- if so -- we've got to move quick!
cc @arielb1 @pnkfelix
The text was updated successfully, but these errors were encountered: