Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIR borrow check (under debug flag) #43108

Merged
merged 12 commits into from
Aug 16, 2017
Merged

Conversation

pnkfelix
Copy link
Member

@pnkfelix pnkfelix commented Jul 7, 2017

Here is the current state of MIR borrow check.

It consists of (1.) some refactoring, (2.) a dataflow analysis to identify the borrows themselves, and (3.) a mir "transform" that does the borrow check itself based on the aforementioned dataflow results.

(There's also a drive-by fix to dataflow that I can factor into a separate PR if necessary. Interestingly I could not find a way to observe the bug outside of MIR borrowck.)

To be clear, this branch is not ready to be used as the default borrow check. Thus the code is guarded: To get mir-borrowck to run, you need to either supply an attribute #[rustc_mir_borrowck] or a debug flag -Z borrowck-mir.

Here are the main issues with the current MIR borrowck as it stands in this PR:

  • No Notes emitted yet, just errors. (So the feedback is definitely inferior compared to AST borrowck today)
  • Lvalue rendering differs between Ast and Mir. (Mostly minor, but replacement of field names with indices is very bad; big priority for me to fix ASAP.)
  • Lots of ICEs (presumably because some MIR operations used here have well-formedness assumptions that are violated in borrowck-broken code)
  • Conflates lots of cases that are distinguished by AST-borrowck
  • Conflates "uninitialized" with "moved" (special case of previous bullet, one that I think should be fixed ASAP)

(I am hoping to fix as many of the above issues as I can in the near term, but I also would like to land this even if they are not all fixed, because the rebasing effort is getting to be a real drag.)

@rust-highfive
Copy link
Collaborator

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@pnkfelix
Copy link
Member Author

pnkfelix commented Jul 7, 2017

Here is a gist that summarizes how behavior differs on a number of test cases. I post this mostly to make it very clear that while some things are definitely working, there remains some serious work to do: MIR borrowck status

@pnkfelix
Copy link
Member Author

pnkfelix commented Jul 7, 2017

Oh, one thing that I just realized I should double check, and change if I haven't already made this so.

Currently, to make it easy to compare the results, when you opt-in to MIR-borrowck, I actually run both MIR-borrowck and AST-borrowck.

I have tried to set things up so that errors from MIR-borrowck get emitted with a suffix "(Mir)" in the output, and errors from AST-borrowck get emitted with a suffix "(Ast)".

While I think the above is great for while this is in development, its no good to expose this distinction to end users. So I need to ensure that the "(Ast)" suffix is only emitted if one has MIR-borrowck turned on.

@shepmaster shepmaster added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 7, 2017
Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first round of comments

@@ -934,6 +934,7 @@ pub fn phase_3_run_analysis_passes<'tcx, F, R>(sess: &'tcx Session,
passes.push_pass(MIR_CONST, mir::transform::rustc_peek::SanityCheck);

// What we need to run borrowck etc.
passes.push_pass(MIR_CONST, mir::transform::borrow_check::BorrowckMir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think borrowck should be the last thing in the MIR_VALIDATED passes. It should come after we qualify constants (since that is needed to get an accurate picture of what data has 'static lifetime). (i.e., if we write &(3+4), that -- per the rvalue lifetimes RFC -- should be a &'static i32).

That said, I think I was expecting borrowck to be its own query -- but maybe it's better to just have it be a pass. (If we do make it its own query, we would have to make sure it gets forced by the "optimize" stuff, though I think that already happens...?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So.. why would typeck be a pass while borrowck is a query?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, one reason is that borrowck produces output that other things consume. If nothing else, it produces data about which mut declarations are used that is later gathered up by the lints.

With respect to typeck, I expect it to be eventually subsumed, most likely, into the region inference code, which will be a part of borrowck, and hence to go away as a distinct entity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think we can work al this out as we go. No need to block anything on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So.. why would typeck be a pass while borrowck is a query?

MIR typeck ATM is basically a hacky sanity check. I imagine MIR borrowck/regionck will subsume it.

mir: &'a Mir<'tcx>,
borrows: IndexVec<BorrowIndex, BorrowData<'tcx>>,
loc_map: FxHashMap<Location, BorrowIndex>,
rgn_map: FxHashMap<Region<'tcx>, FxHashSet<BorrowIndex>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd prefer region_map and location_map

});
match stmt.kind {
mir::StatementKind::EndRegion(extent) => {
for idx in self.rgn_map.get(&ReScope(extent)).unwrap_or_else(|| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extract self.rgn_map.get(...) into a temporary, I think, this for loop is kind of hard to read

mir::StatementKind::Assign(_, ref rhs) => {
if let mir::Rvalue::Ref(region, _, _) = *rhs {
let loc = location;
let idx = self.loc_map.get(&loc).unwrap_or_else(|| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: location, index

mir::StatementKind::SetDiscriminant { .. } |
mir::StatementKind::StorageLive(..) |
mir::StatementKind::StorageDead(..) |
mir::StatementKind::Nop => {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay, exhaustive!

@carols10cents
Copy link
Member

Ping @nikomatsakis to get this in your queue for next time you're reviewing PRs (I know you're traveling) :)

@nikomatsakis
Copy link
Contributor

Well, I am not "assigned" this PR, I was just doing some drive-by reviewing, but I will be happy to try and push it through. I expect @arielb1 will provide very useful reviews though =) he tends to have a keener eye than me for flaws.

@nikomatsakis nikomatsakis self-assigned this Jul 17, 2017
@alexcrichton
Copy link
Member

ping r? @arielb1

@carols10cents
Copy link
Member

ping @arielb1 @rust-lang/compiler for review!

@arielb1
Copy link
Contributor

arielb1 commented Jul 24, 2017

Back from vacation! I'll review this as soon as I'll finish catching up on stuff.

@bors
Copy link
Contributor

bors commented Jul 24, 2017

☔ The latest upstream changes (presumably #43454) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton
Copy link
Member

Just another traige ping for you @arielb1

@aidanhs aidanhs added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 3, 2017
@aidanhs
Copy link
Member

aidanhs commented Aug 3, 2017

@pnkfelix unfortunately it looks like this needs a rebase!

@arielb1
Copy link
Contributor

arielb1 commented Aug 8, 2017

I'm done with a bunch of other stuff. Maybe I'll review it tonight.

#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub enum Origin { Ast, Mir }

impl fmt::Display for Origin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want AST borrowck to format as the empty string and MIR borrowck to format as (Mir), at least for the time being?

Copy link
Member Author

@pnkfelix pnkfelix Aug 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm apparently I never pushed up the changes to render Origin::Ast as empty string. Will address pronto. (My plan had been to have the empty string rendering be conditional, so that when one passes -Z borrowck-mir, you get a symmetry between the error messages in terms of whether they say "... (Ast)" or "... (Mir)".)

(Update: well, I did push my intended change in 09f8040c8ddaddb257c6f652258d22a0a901a1a8 ... so now I'm trying to why the PR's diff is not showing the effects of that change...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, maybe I misinterpreted the point of @arielb1's comment? Perhaps the idea is that errors from MIR borrowck should unconditionally include the "... (Mir)", regardless of what flags are set? That sounds good to me.

@@ -934,6 +934,7 @@ pub fn phase_3_run_analysis_passes<'tcx, F, R>(sess: &'tcx Session,
passes.push_pass(MIR_CONST, mir::transform::rustc_peek::SanityCheck);

// What we need to run borrowck etc.
passes.push_pass(MIR_CONST, mir::transform::borrow_check::BorrowckMir);
passes.push_pass(MIR_VALIDATED, mir::transform::qualify_consts::QualifyAndPromoteConstants);
passes.push_pass(MIR_VALIDATED,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a SimplifyBranches pass here within the MIR_VALIDATED pipeline. It's important that this won't run before borrowck, because otherwise code within an if false won't be checked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree with your assessment, but I am left somewhat confused as to how things end up in this state. Does this mean that SimplifyBranches should be in the MIR_OPTIMIZED pipeline instead of the MIR_VALIDATED one?

Copy link
Member Author

@pnkfelix pnkfelix Aug 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see. I guess niko's (and perhaps others' ) expectation was that borrowck would be injected in between MIR_VALIDATED and MIR_OPTIMIZED; but this PR as written is running its MIR borrowck right before the qualify_consts step, which does not match that expectation.

Still not 100% sure how we'll end up dealing with this. I'll add a FIXME regarding the conflict for now.

for idx in borrow_indexes { sets.kill(&idx); }
}

mir::StatementKind::Assign(_, ref rhs) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any consideration of assign/borrow interactions? Or does current borrowck not deal with any of them, so that they have to wait for NLL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current thinking is to wait for NLL to address matters here.

(But even when we do, it more likely would go into transform::borrow_check, not dataflow::impls::borrows, right?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing here is that everything is pretty much orthogonal - NLL determines where each borrow ends, 2φB when a borrow/lifetime starts - depending on the implementation, and borrows when they are active. We should probably place all of these features under the same feature gate, but there's no reason to implement them simultaneously.

Copy link
Contributor

@arielb1 arielb1 Aug 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is, when we have something like

let mut t1 = 1;
let mut t2 = 2;

let mut x = &'α mut t1;
let y = &'β mut *x;

x = &mut y2; // this is an error today, but is actually sound
// `*x` is no longer borrowed here (because of the assignment)
u = (x, y);
print(u);
// - but α & β can't end until here (even with NLL)

The assignment to x "ends" all borrows to things pointed to by x - the borrows still exists, but the alias to them through x is inaccessible, which means there's no need to track accesses through it.

This is also required for the standard x = &mut (*x).next; pattern to work - if the borrow of (*x).next (actually for the old value of x) remained, that won't work the next time we go around the loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, i can see how dataflow::impls::borrows could be affected

}

mir::StatementKind::Assign(_, ref rhs) => {
if let mir::Rvalue::Ref(region, _, _) = *rhs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see that this ended up so compact. I expected more... trouble.

(rhs, span), location, flow_state);
}
StatementKind::SetDiscriminant { ref lvalue, variant_index: _ } => {
// FIXME: should this count as a mutate from borrowck POV?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deaggregator always runs after borrowck, so this doesn't matter. But this should count as a mutate.

StatementKind::Nop |
StatementKind::StorageLive(..) |
StatementKind::StorageDead(..) => {
// ignored by borrowck
Copy link
Contributor

@arielb1 arielb1 Aug 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StorageDead is not ignored by borrowck. It is what causes non-drop values to be dropped when they go out of scope.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! That seems important.

//
// FIXME: is above actually true? Do we want to track
// the fact that uninitialized data can be created via
// `NullOp::Box`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's done by gather_moves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... especially with the changes added by #43772, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And even before that. Yeah.

}
}

// does any loan generated here conflict with another loan also generated here?
Copy link
Contributor

@arielb1 arielb1 Aug 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen? If so, please comment.

My model is that a borrow is a "local" mutate_lvalue + a "global" loan. How is that incorrect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This loop was meant to be analogous to the one here in AST borrowck.

I was starting from an assumption that in general, a given statement in a control-flow graph could introduce multiple loans (and those can be mutually conflicting).

But now that you have questioned the presence of this code, I am experiencing some deja-vu, in that I am realizing once again that MIR is designed such that each borrow arises solely from an Rvalue::Ref, each of which in turn arises solely from a StatementKind::Assign (perhaps that is what you were getting at by pointing out that a borrow "is a mutate_lvalue" ...?), and thus there is at most one loan generated from any given point in the control flow graph.

So, that's good news; not only can this loop go away, but also the supporting code like the IdxSet::pairs method...

@@ -306,3 +237,36 @@ pub(crate) fn drop_flag_effects_for_location<'a, 'tcx, F>(
}
}
}

fn propagate_assignment<'a, 'tcx, F>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be killed by rebase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(confirmed; that whole commit goes away)

@arielb1
Copy link
Contributor

arielb1 commented Aug 14, 2017

The infrastructure looks ok. I'll r+ after a rebase and then we'll go to make it perfect.

@pnkfelix pnkfelix force-pushed the mir-borrowck3c branch 2 times, most recently from ff07dec to 0aefd7d Compare August 16, 2017 10:46
ty::TyRawPtr(_) => {
return false;
}
ty::TyRef(_/*rgn*/, ty::TypeAndMut { ty: _, mutbl: hir::MutImmutable }) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[00:03:55] tidy error: /checkout/src/librustc_mir/transform/borrow_check.rs:771: line longer than 100 chars
[00:03:55] tidy error: /checkout/src/librustc_mir/transform/borrow_check.rs:777: line longer than 100 chars
[00:03:55] tidy error: /checkout/src/librustc_mir/transform/borrow_check.rs:835: line longer than 100 chars
[00:03:55] tidy error: /checkout/src/librustc_mir/transform/borrow_check.rs:968: line longer than 100 chars
[00:03:55] tidy error: /checkout/src/librustc_mir/dataflow/mod.rs:50: line longer than 100 chars
[00:03:56] some tidy checks failed

…ter reuse by mir borrowck).

post-rebase: Do not put "(Ast)" suffix in error msg unless passed `-Z borrowck-mir`.
(But unconditionally include "(Mir)" suffix for mir-borrowck errors.)
post-rebase: addressed review comment: rename `loc_map`/`location_map` and `rgn_map`/`region_map`.

post-rebase: remove now unnecessary `mut` decl.

post-rebase: address comments: bind iterator expr, and alpha-rename `loc`/`location` and `idx`/`index`.
…parent module.

(This code is more general purpose than just supporting drop flag elaboration.)
…parent module.

Refactored `each_bit`, which traverses a `IdxSet`, so that the bulk of
its implementation lives in `rustc_data_structures`.
One can either use `-Z borrowck-mir` or add the `#[rustc_mir_borrowck]` attribute
to opt into MIR based borrow checking.

Note that regardless of whether one opts in or not, AST-based borrow
check will still run as well.  The errors emitted from AST-based
borrow check will include a "(Ast)" suffix in their error message,
while the errors emitted from MIR-based borrow check will include a
"(Mir)" suffix.

post-rebase: removed check for intra-statement mutual conflict;
replaced with assertion checking that at most one borrow is generated
per statement.

post-rebase: removed dead code: `IdxSet::pairs` and supporting stuff.
… in.

Post-rebase: ariel confirmed `SetDiscriminant` should indeed be a mutate.
Added two fixmes: The `SimplifyBranches` pass cannot stay where it is,
and `BorrowckMir` should be a query, not a pass. But I am going to
leave those changes to a future PR.
@pnkfelix
Copy link
Member Author

@bors r=arielb1

@bors
Copy link
Contributor

bors commented Aug 16, 2017

📌 Commit 8738a08 has been approved by arielb1

@bors
Copy link
Contributor

bors commented Aug 16, 2017

⌛ Testing commit 8738a08 with merge 00a6797...

bors added a commit that referenced this pull request Aug 16, 2017
MIR borrow check (under debug flag)

Here is the current state of MIR borrow check.

It consists of (1.) some refactoring, (2.) a dataflow analysis to identify the borrows themselves, and (3.) a mir "transform" that does the borrow check itself based on the aforementioned dataflow results.

(There's also a drive-by fix to dataflow that I can factor into a separate PR if necessary. Interestingly I could not find a way to observe the bug outside of MIR borrowck.)

To be clear, this branch is not ready to be used as the default borrow check. Thus the code is guarded: To get mir-borrowck to run, you need to either supply an attribute `#[rustc_mir_borrowck]` or a debug flag `-Z borrowck-mir`.

Here are the main issues with the current MIR borrowck as it stands in this PR:

 * No Notes emitted yet, just errors. (So the feedback is definitely inferior compared to AST borrowck today)
 * Lvalue rendering differs between Ast and Mir. (Mostly minor, but replacement of field names with indices is very bad; big priority for me to fix ASAP.)
 * Lots of ICEs (presumably because some MIR operations used here have well-formedness assumptions that are violated in borrowck-broken code)
 * Conflates lots of cases that are distinguished by AST-borrowck
 * Conflates "uninitialized" with "moved" (special case of previous bullet, one that I think should be fixed ASAP)

 (I am hoping to fix as many of the above issues as I can in the near term, but I also would like to land this even if they are *not* all fixed, because the rebasing effort is getting to be a real drag.)
@bors
Copy link
Contributor

bors commented Aug 16, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing 00a6797 to master...

@bors bors merged commit 8738a08 into rust-lang:master Aug 16, 2017
@bors bors mentioned this pull request Aug 16, 2017
7 tasks
@pnkfelix pnkfelix mentioned this pull request Aug 21, 2017
bors added a commit that referenced this pull request Aug 21, 2017
Mir borrowck as query

Turn the `mir-borrowck` pass (aka "transform") into a query.

(If I had realized how relatively easy this was going to be, I would have made it part of #43108. `let hindsight = 20/20;`)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants