Scopes in mir #32428

nikomatsakis · 2016-03-22T16:15:08Z

This PR adds scopes to MIR. There is a tree of scopes (each represented by a ScopeId). Every statement, variable, and terminator now has an associated scope and span. It also adds a -Z dump-mir switch one can use to conveniently examine the MIR as optimizations proceed.

The intention is two-fold. First, to support MIR debug-info. This PR does not attempt to modify trans to make use of the scope information, however.

Second, in a more temporary capacity, to support the goal of moving regionck and borowck into the MIR. To that end, the PR also constructs a "scope auxiliary" table storing the extent of each span (this is kept separate from the main MIR, since it contains node-ids) and the dom/post-dom of the region in the graph where the scope occurs. When we move to non-lexical lifetimes, I expect this auxiliary information to be discarded, but that is still some ways in the future (requires, at minimum, an RFC, and there are some thorny details to work out -- though I've got an in-progress draft).

Right now, I'm just dropping this auxiliary information after it is constructed. I was debating for some time whether to add some sort of sanity tests, but decided to just open this PR instead, because I couldn't figure out what such a test would look like (and we don't have independent tests for this today beyond the regionck and borrowck tests).

I'd prefer not to store the auxiliary data into any kind of "per-fn" map. Rather, I'd prefer that we do regionck/borrowck/whatever-else immediately after construction -- that is, we build the MIR for fn X and immediately thereafter do extended correctness checking on it. This will reduce peak memory usage and also ensure that the auxiliary data doesn't exist once optimizations begin. It also clarifies the transition point where static checks are complete and MIR can be more freely optimized.

cc @rust-lang/compiler @nagisa

rust-highfive · 2016-03-22T16:15:20Z

r? @Aatch

(rust_highfive has picked a reviewer for you, use r? to override)

eddyb · 2016-03-22T16:35:29Z

src/librustc/mir/repr.rs

+}
+
+#[derive(Copy, Clone, Debug, RustcEncodable, RustcDecodable)]
+pub struct ScopeId(u32);


I almost feel like we should have a general-purpose Id<T> type for these kinds of things.

I've been wanting for some time to make a macro for this purpose.

On Tue, Mar 22, 2016 at 09:36:17AM -0700, Eduard-Mihai Burtescu wrote:

#[inline]

fn index(&self, index: ScopeId) -> &ScopeData {

&self.vec[index.index()]

}
+}

+impl IndexMut for ScopeDataVec {

#[inline]

fn index_mut(&mut self, index: ScopeId) -> &mut ScopeData {

&mut self.vec[index.index()]

}
+}

+#[derive(Copy, Clone, Debug, RustcEncodable, RustcDecodable)]
+pub struct ScopeId(u32);

I almost feel like we should have a general-purpose Id<T> type for these kinds of things.

You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
https://github.com/rust-lang/rust/pull/32428/files/9e8bcf4901c547a29ceed5e27606110ff26a9842#r57020384

At least in the HIR, it would be more practical to use a single generic type IMO.

nagisa · 2016-03-23T00:37:26Z

src/librustc_mir/transform/simplify_cfg.rs

        let mut changed = true;
        while changed {
+            pretty::dump_mir(tcx, "simplify_cfg", &counter, id, mir, None);


I suppose that was for debugging purposes?

@nagisa

I suppose that was for debugging purposes?

I was debugging things, but I left it in on purpose. My intention is that eventually we will have the option to dump MIR at every potentially interesting point during execution.

nikomatsakis · 2016-03-23T00:49:24Z

Haven't had time to review comments yet; I just added a kind of drive-by cleanup of scope dropping code to make it a bit less complex, and supply a better span to the resume terminator. Seems to work locally but I havent' done a -Z orbit build.

nagisa · 2016-03-23T12:30:17Z

src/librustc_mir/build/scope.rs

+        resumeblk
+    } else {
+        // Diverging from any other scope chains up to the previous scope.
+        build_diverge_scope(tcx, cfg, unit_temp, earlier_scopes)


I originally avoided recursion, because the common case is to have many scopes each having few drops. This means that even for simple functions we would recurse a lot (on top of already big amount of recursion in other MIR build code), which potentially will not work out very well on small stack environments (i.e. musl).

This will also needlessly recurse on empty scopes (which are quite common), which were previously ignored by the filter(|s| !s.drops.is_empty() || s.free.is_some()).

nikomatsakis · 2016-03-23T13:44:25Z

@nagisa

I missed two of your comments, an you raise a good point! You wrote:

Is the arg_scope_id here the ID of the scope of whole function?

and

I suspect this might be not correct. AFAIR the loop above does not pop any scopes, therefore this will be the scope_id of the inner-most scope. I’m not sure if that’s the real intention.

The way I coded it up, every terminator etc must have a scope, but that leaves the question of what scope the return block itself is. Perhaps we want an outer-most scope that is never popped. This would I guess correspond to the ParameterScope extent. Return would target that extent (but not exit it).
Hmm, good catch, I think you are correct that the scope there is probably wrong. I wll ad

nikomatsakis · 2016-03-23T14:14:46Z

er, not the ParameterScope, but the CallSiteScope

bors · 2016-03-23T19:37:12Z

☔ The latest upstream changes (presumably #32390) made this pull request unmergeable. Please resolve the merge conflicts.

We should probably settle on some conventions here. In MIR code, I have generally been importing `*`, but perhaps borrowck does not want to do that.

also, when exiting a scope, assign the final goto terminator with the target scope's id

nikomatsakis · 2016-03-23T20:44:17Z

Rebased, make check passes locally.

nikomatsakis · 2016-03-23T20:44:40Z

@nagisa

Weren’t we supposed to execute drops first and then frees second (drops may depend on memory still being allocated, but frees do not otherwise care about things being (not-)dropped)?

I believe the code as written does execute drops first. Note that it builds up the drops in reverse.

nikomatsakis · 2016-03-23T21:09:29Z

@nagisa btw, the review sort of "defaulted" to @Aatch, which is fine with me -- but I don't know how much time he has. You want to be reviewer? (seeing as you've already taken a look) Does anyone have objections to this basic approach?

One thing that occurred to me while discussing with @pnkfelix -- we should figure out how we plan to handle "lifetime start/lifetime end" intrinsics in MIR trans. If we want to emit them when we exit a scope, we have to keep the "scope-auxiliary" information up-to-date. This seems plausible, but it may be easier to do it by having a SCOPE_EXIT(x) statement (so it would be carried around automatically). If we added such an instruction, it would also be nice for dataflow, I think, since it would correspond to kills (e.g. for all loans tied to that scope). (In trans, it would correspond to emitting lifetime-end intrinsics).

An alternative would be added lifetime-end explicitly in MIR, but it seems like that will result in many more instructions -- you would need N*M "words" where N is number of variables and M is number of exits from a scope, vs N+M (one word per variable to tie it to a scope, one word per exit for the statement).

nikomatsakis · 2016-03-23T21:52:32Z

(To be clear, if we were going to have SCOPE_EXIT statements, I'd remove the doms/postdoms fields from "scope auxiliary" -- and I guess just change it to ScopeExtents)

nagisa · 2016-03-23T23:02:57Z

@nikomatsakis I don’t think I have much more outstanding comments other than my dislike about needlessly making exit_scope recursive again.

re lifetimes: it seems to me we already store the necessary information in scopes and could implicitly insert e.g. lifetime-end on scope exits. I’m not sure it is necessary to encode MIR variable lifetimes in MIR itself. At very least it would make writing MIR passes considerably more complex, because most of these do not need or want to care about variable lifetimes at all.

nagisa · 2016-03-23T23:16:26Z

r? @nagisa

Sure.

nikomatsakis · 2016-03-24T00:03:31Z

re lifetimes: it seems to me we already store the necessary information in scopes and could implicitly insert e.g. lifetime-end on scope exits.

I don't quite follow what you mean. The current version of this branch stores the information in the doms/post-doms vectors, which are going to be a pain to maintain (much harder than pseudo-instructions, which virtually maintain themselves; even better if they take a vector of operands that link to the variables that are to be dropped, though that gets back to the N*M storage question). I was imagining that we would just throw them way once optimization begins, but of course we can't do that if we need them in mir-trans.

I was thinking that perhaps we could recompute the end-lifetime instructions, but that's not really true. The problem is borrows: we'd need (at least) the region information for each borrow, which I plan to erase. But even if we didn't erase it, those are just going to be regions of the CFG which must be maintained, which again complicates later optimizations and so forth.

Basically: the reason to contemplate adding statements is to ensure that other optimizations don't have to think about this. These statements essentially are the same as DROP, but they are not terminators (and hence are only needed for variables that are Copy).

nikomatsakis · 2016-03-24T00:06:43Z

@nagisa

I don’t think I have much more outstanding comments other than my dislike about needlessly making exit_scope recursive again.

Heh. Recursion seemed pretty natural, but I guess I can rewrite it into a loop easily enough.

while I'm at it, remove the "extra caching" that I was doing for no good reason except laziness. Basically before I was caching at each scope in the chain, but there's not really a reason to do that, since the cached entry point at level N is always equal to the last cached exit point from level N-1.

nikomatsakis · 2016-03-24T00:51:00Z

@nagisa made it iterative.

We now visit more things (e.g., types) and also follow a deliberate style designed to reduce omissions.

nagisa · 2016-03-24T11:34:53Z

src/librustc_mir/build/scope.rs

            return None;
        }
+        assert!(!self.scopes.is_empty()); // or `all` above would be true


Is this assertion really necessary or, perhaps, it is misplaced and should go above the if?

@nagisa it's more a comment than an assertion, I guess. Although I think that, in practice, the list of scopes cannot be empty here, so it could in principle go above the if

nagisa · 2016-03-24T11:36:19Z

Commits up to 0769865 look okay to me. It seems you have more plans in mind for this PR, so just ping me once it is ready for final review.

nikomatsakis · 2016-03-24T13:20:35Z

@nagisa meh, I think I may well just land what you reviewed, and we can iterate more in other PRs.

nikomatsakis · 2016-03-24T13:26:39Z

(@pnkfelix reviewed the visitor commit)

nikomatsakis · 2016-03-24T13:26:43Z

@bors r=nagisa

bors · 2016-03-24T13:26:44Z

📌 Commit 091a007 has been approved by nagisa

bors · 2016-03-25T06:12:58Z

⌛ Testing commit 091a007 with merge f1578d3...

@nagisa

Scopes in mir This PR adds scopes to MIR. There is a tree of scopes (each represented by a `ScopeId`). Every statement, variable, and terminator now has an associated scope and span. It also adds a `-Z dump-mir` switch one can use to conveniently examine the MIR as optimizations proceed. The intention is two-fold. First, to support MIR debug-info. This PR does not attempt to modify trans to make use of the scope information, however. Second, in a more temporary capacity, to support the goal of moving regionck and borowck into the MIR. To that end, the PR also constructs a "scope auxiliary" table storing the extent of each span (this is kept separate from the main MIR, since it contains node-ids) and the dom/post-dom of the region in the graph where the scope occurs. When we move to non-lexical lifetimes, I expect this auxiliary information to be discarded, but that is still some ways in the future (requires, at minimum, an RFC, and there are some thorny details to work out -- though I've got an in-progress draft). Right now, I'm just dropping this auxiliary information after it is constructed. I was debating for some time whether to add some sort of sanity tests, but decided to just open this PR instead, because I couldn't figure out what such a test would look like (and we don't have independent tests for this today beyond the regionck and borrowck tests). I'd prefer not to store the auxiliary data into any kind of "per-fn" map. Rather, I'd prefer that we do regionck/borrowck/whatever-else immediately after construction -- that is, we build the MIR for fn X and immediately thereafter do extended correctness checking on it. This will reduce peak memory usage and also ensure that the auxiliary data doesn't exist once optimizations begin. It also clarifies the transition point where static checks are complete and MIR can be more freely optimized. cc @rust-lang/compiler @nagisa

bors · 2016-03-25T12:03:11Z

rust-highfive assigned Aatch Mar 22, 2016

eddyb reviewed Mar 22, 2016
View reviewed changes

nikomatsakis force-pushed the scopes-in-mir branch from 1804fd8 to c3e7929 Compare March 23, 2016 00:37

nagisa reviewed Mar 23, 2016
View reviewed changes

nikomatsakis force-pushed the scopes-in-mir branch 2 times, most recently from 443d02f to 07e4843 Compare March 23, 2016 09:59

nagisa reviewed Mar 23, 2016
View reviewed changes

nikomatsakis added 2 commits March 23, 2016 16:37

integrate scopes into MIR

464c02e

record a scope for each VarDecl

323d7f4

nikomatsakis added 3 commits March 23, 2016 16:42

add comments on remaining fields

2b96cfb

Update borrowck to use repr::* instead of a mix

1c0fa34

We should probably settle on some conventions here. In MIR code, I have generally been importing `*`, but perhaps borrowck does not want to do that.

introduce "call-site-scope" as the outermost scope

a276e75

also, when exiting a scope, assign the final goto terminator with the target scope's id

nikomatsakis force-pushed the scopes-in-mir branch from e1962b8 to a276e75 Compare March 23, 2016 20:43

rust-highfive assigned nagisa and unassigned Aatch Mar 23, 2016

rework MIR visitor

ed7c30b

We now visit more things (e.g., types) and also follow a deliberate style designed to reduce omissions.

nagisa reviewed Mar 24, 2016
View reviewed changes

pacify the merciless tidy

091a007

bors merged commit 091a007 into rust-lang:master Mar 25, 2016

bors mentioned this pull request Mar 25, 2016

Flatten rustc and rustc_trans module hierarchy slightly. #32432

Merged

nikomatsakis deleted the scopes-in-mir branch March 30, 2016 16:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scopes in mir #32428

Scopes in mir #32428

nikomatsakis commented Mar 22, 2016

rust-highfive commented Mar 22, 2016

eddyb Mar 22, 2016

nikomatsakis Mar 22, 2016

eddyb Mar 22, 2016

nagisa Mar 23, 2016

nikomatsakis Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nagisa Mar 23, 2016

nagisa Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

bors commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nagisa commented Mar 23, 2016

nagisa commented Mar 23, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nagisa Mar 24, 2016

nikomatsakis Mar 24, 2016

nagisa commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

bors commented Mar 24, 2016

bors commented Mar 25, 2016

bors commented Mar 25, 2016

Scopes in mir #32428

Scopes in mir #32428

Conversation

nikomatsakis commented Mar 22, 2016

rust-highfive commented Mar 22, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Mar 23, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

bors commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nikomatsakis commented Mar 23, 2016

nagisa commented Mar 23, 2016

nagisa commented Mar 23, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nagisa commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

nikomatsakis commented Mar 24, 2016

bors commented Mar 24, 2016

bors commented Mar 25, 2016

bors commented Mar 25, 2016