@@ -2,13 +2,12 @@ Region inference
22
33# Terminology
44
5- Note that we use the terms region and lifetime interchangeably,
6- though the term ` lifetime ` is preferred.
5+ Note that we use the terms region and lifetime interchangeably.
76
87# Introduction
98
109Region inference uses a somewhat more involved algorithm than type
11- inference. It is not the most efficient thing ever written though it
10+ inference. It is not the most efficient thing ever written though it
1211seems to work well enough in practice (famous last words). The reason
1312that we use a different algorithm is because, unlike with types, it is
1413impractical to hand-annotate with regions (in some cases, there aren't
@@ -25,22 +24,42 @@ once.
2524
2625The constraints are always of one of three possible forms:
2726
28- - ConstrainVarSubVar(R_i, R_j) states that region variable R_i
29- must be a subregion of R_j
30- - ConstrainRegSubVar(R, R_i) states that the concrete region R
31- (which must not be a variable) must be a subregion of the variable R_i
32- - ConstrainVarSubReg(R_i, R) is the inverse
27+ - ` ConstrainVarSubVar(Ri, Rj) ` states that region variable Ri must be
28+ a subregion of Rj
29+ - ` ConstrainRegSubVar(R, Ri) ` states that the concrete region R (which
30+ must not be a variable) must be a subregion of the variable Ri
31+ - ` ConstrainVarSubReg(Ri, R) ` states the variable Ri shoudl be less
32+ than the concrete region R. This is kind of deprecated and ought to
33+ be replaced with a verify (they essentially play the same role).
34+
35+ In addition to constraints, we also gather up a set of "verifys"
36+ (what, you don't think Verify is a noun? Get used to it my
37+ friend!). These represent relations that must hold but which don't
38+ influence inference proper. These take the form of:
39+
40+ - ` VerifyRegSubReg(Ri, Rj) ` indicates that Ri <= Rj must hold,
41+ where Rj is not an inference variable (and Ri may or may not contain
42+ one). This doesn't influence inference because we will already have
43+ inferred Ri to be as small as possible, so then we just test whether
44+ that result was less than Rj or not.
45+ - ` VerifyGenericBound(R, Vb) ` is a more complex expression which tests
46+ that the region R must satisfy the bound ` Vb ` . The bounds themselves
47+ may have structure like "must outlive one of the following regions"
48+ or "must outlive ALL of the following regions. These bounds arise
49+ from constraints like ` T: 'a ` -- if we know that ` T: 'b ` and ` T: 'c `
50+ (say, from where clauses), then we can conclude that ` T: 'a ` if `'b:
51+ 'a` *or* ` 'c: 'a`.
3352
3453# Building up the constraints
3554
3655Variables and constraints are created using the following methods:
3756
3857- ` new_region_var() ` creates a new, unconstrained region variable;
39- - ` make_subregion(R_i, R_j ) ` states that R_i is a subregion of R_j
40- - ` lub_regions(R_i, R_j ) -> R_k ` returns a region R_k which is
41- the smallest region that is greater than both R_i and R_j
42- - ` glb_regions(R_i, R_j ) -> R_k ` returns a region R_k which is
43- the greatest region that is smaller than both R_i and R_j
58+ - ` make_subregion(Ri, Rj ) ` states that Ri is a subregion of Rj
59+ - ` lub_regions(Ri, Rj ) -> Rk ` returns a region Rk which is
60+ the smallest region that is greater than both Ri and Rj
61+ - ` glb_regions(Ri, Rj ) -> Rk ` returns a region Rk which is
62+ the greatest region that is smaller than both Ri and Rj
4463
4564The actual region resolution algorithm is not entirely
4665obvious, though it is also not overly complex.
@@ -54,14 +73,6 @@ Alternatively, you can call `commit()` which ends all snapshots.
5473Snapshots can be recursive---so you can start a snapshot when another
5574is in progress, but only the root snapshot can "commit".
5675
57- # Resolving constraints
58-
59- The constraint resolution algorithm is not super complex but also not
60- entirely obvious. Here I describe the problem somewhat abstractly,
61- then describe how the current code works. There may be other, smarter
62- ways of doing this with which I am unfamiliar and can't be bothered to
63- research at the moment. - NDM
64-
6576## The problem
6677
6778Basically our input is a directed graph where nodes can be divided
@@ -83,31 +94,20 @@ Before resolution begins, we build up the constraints in a hashmap
8394that maps ` Constraint ` keys to spans. During resolution, we construct
8495the actual ` Graph ` structure that we describe here.
8596
86- ## Our current algorithm
87-
88- We divide region variables into two groups: Expanding and Contracting.
89- Expanding region variables are those that have a concrete region
90- predecessor (direct or indirect). Contracting region variables are
91- all others.
92-
93- We first resolve the values of Expanding region variables and then
94- process Contracting ones. We currently use an iterative, fixed-point
95- procedure (but read on, I believe this could be replaced with a linear
96- walk). Basically we iterate over the edges in the graph, ensuring
97- that, if the source of the edge has a value, then this value is a
98- subregion of the target value. If the target does not yet have a
99- value, it takes the value from the source. If the target already had
100- a value, then the resulting value is Least Upper Bound of the old and
101- new values. When we are done, each Expanding node will have the
102- smallest region that it could possibly have and still satisfy the
103- constraints.
104-
105- We next process the Contracting nodes. Here we again iterate over the
106- edges, only this time we move values from target to source (if the
107- source is a Contracting node). For each contracting node, we compute
108- its value as the GLB of all its successors. Basically contracting
109- nodes ensure that there is overlap between their successors; we will
110- ultimately infer the largest overlap possible.
97+ ## Computing the values for region variables
98+
99+ The algorithm is a simple dataflow algorithm. Each region variable
100+ begins as empty. We iterate over the constraints, and for each constraint
101+ we grow the relevant region variable to be as big as it must be to meet all the
102+ constraints. This means the region variables can grow to be ` 'static ` if
103+ necessary.
104+
105+ ## Verification
106+
107+ After all constraints are fully propoagated, we do a "verification"
108+ step where we walk over the verify bounds and check that they are
109+ satisfied. These bounds represent the "maximal" values that a region
110+ variable can take on, basically.
111111
112112# The Region Hierarchy
113113
0 commit comments