Review feedback.

dotnet · Oct 4, 2018 · 5ed2b35
1 parent ead5e0f
Showing 1 changed file with 13 additions and 4 deletions.
diff --git a/Documentation/design-docs/object-stack-allocation.md b/Documentation/design-docs/object-stack-allocation.md
@@ -4,12 +4,12 @@ This document describes work to enable object stack allocation in .NET Core.
 
 ## Motivation
 
-In .NET instances of object types are allocated on the garbage-collected heap.
+In .NET instances of reference types are allocated on the garbage-collected heap.
 Such allocations have performance overhead at garbage collection time. The allocator also has to ensure that the memory is fully zero-initialized.
 If the lifetime of an object is bounded by the lifetime of the allocating method, the allocation
 may be moved to the stack. The benefits of this optimization:
 
-* The pressure on the garbage collector is reduced because the GC heap becomes smaller.
+* The pressure on the garbage collector is reduced because the GC heap becomes smaller. The garbage collector doesn't have to be involved in allocating or deallocating these objects.
 * Object field accesses may become cheaper if the compiler is able to do scalar replacement of the fields of the stack-allocated object
 (i.e., if the fields can be promoted).
 * Some field zero-initializations may be elided by the compiler.
@@ -32,7 +32,7 @@ Several escape algorithms have been implemented in different Java implementation
 [[1]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.73.4799&rep=rep1&type=pdf)
 is the most precise and most expensive (it is based on connection graphs) and was used in the context of a static Java compiler,
 [[3]](https://pdfs.semanticscholar.org/1b33/dff471644f309392049c2791bca9a7f3b19c.pdf)
-is the least precise and cheapest (it doesn't track references through assignments of fields) and was used in MSR's Marmot implementation
+is the least precise and cheapest (it doesn't track references through assignments of fields) and was used in MSR's Marmot implementation.
 [[2]](https://www.usenix.org/legacy/events/vee05/full_papers/p111-kotzmann.pdf)
 is between
 [[1]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.73.4799&rep=rep1&type=pdf) and 
@@ -42,6 +42,8 @@ both in analysis precision and cost. It was used in Java HotSpot.
 Effectiveness of object stack allocation depends in large part on whether escape analysis is done inter-procedurally.
 With intra-procedural analysis only, the compiler has to assume that arguments escape at all non-inlined call sites,
 which blocks many stack allocations. In particular, assuming that 'this' argument always escapes hurts the optimization.
+[[4]](http://www.ssw.uni-linz.ac.at/Research/Papers/Stadler14/Stadler2014-CGO-PEA.pdf) describes an approach that
+handles objects that only escape on some paths by promoting them to the heap "just in time" as control reaches those paths.
 
 There are several choices for where escape analysis can be performed:
 
@@ -54,7 +56,8 @@ There are several choices for where escape analysis can be performed:
 
 **Cons:**
 * The jit analyzes methods top-down, i.e., callers before callees (when inlining), which doesn't fit well with the stack allocation optimization.
-* Full interprocedural analysis is too expensive for the jit, even at high tiering levels.
+* Full interprocedural analysis is expensive for the jit, even at high tiering levels. Background on-demand/full interprocedural analysis might be feasible
+  if we have the ability to memoize method properties with (in)validation.
 
 Possible approaches to interprocedural analysis in the jit:
 * We can run escape analysis concurrently with inlining and analyze callee's parameters for escaping while inspecting
@@ -93,6 +96,10 @@ newobj for the object that was determined to be non-escaping. Note that assembli
 An alternative is to annotate parameters with escape information so that the annotations can be verified by the jit with
 local analysis.
 
+If the methods whose info was used for interprocedural escape analysis are allowed to change after the analysis, either the jit needs
+to inline those methods or there should be a mechanism to immediately revoke methods with stack allocated objects that relied on
+that analysis.
+
 ## Other restrictions on stack allocations
 
 * Objects with finalizers can't be stack-allocated since they always escape to the finalizer queue. 
@@ -188,3 +195,5 @@ Also, we may be able to reuse the infrastructure from other projects, i.e., [ILS
 [[2] Thomas Kotzmann and Hanspeter Moessenbroeck. Escape Analysis in the Context of Dynamic Compilation and Deoptimization](https://www.usenix.org/legacy/events/vee05/full_papers/p111-kotzmann.pdf)
 
 [[3] David Gay and Bjarne Steensgaard. Fast Escape Analysis and Stack Allocation for Object-Based Programs](https://pdfs.semanticscholar.org/1b33/dff471644f309392049c2791bca9a7f3b19c.pdf)
+
+[[4] Lukas Stadler et al. Partial Escape Analysis and Scalar Replacement for Java](http://www.ssw.uni-linz.ac.at/Research/Papers/Stadler14/Stadler2014-CGO-PEA.pdf)
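
The design doc touched by this diff notes that, with intra-procedural analysis only, the compiler has to assume that arguments escape at all non-inlined call sites, and that parameter escape information could relax this. The following is a minimal toy sketch of that idea in Python, not the jit's implementation: the statement encoding, `escaping_allocations`, the `summaries` map, and the method names `Foo.Use` and `Foo.Log` are all hypothetical, and the analysis is flow-insensitive and ignores fields.

```python
# Toy, flow-insensitive escape analysis over a hypothetical mini-IR.
# Statements (all illustrative, not a real jit IR):
#   ("new", dst)            dst = new object; the site is named after dst
#   ("copy", dst, src)      dst = src
#   ("return", src)         return src
#   ("store_static", src)   SomeStatic = src
#   ("call", callee, args)  callee(args...), a non-inlined call


def _add(target, items):
    """Union items into target; report whether target grew."""
    before = len(target)
    target |= items
    return len(target) != before


def escaping_allocations(body, summaries=None):
    """Return the allocation sites in body that may escape the method.

    summaries (an assumption of this sketch) maps a callee name to the set
    of parameter indices that may escape in that callee. A call to a callee
    without a summary is treated conservatively: every argument escapes.
    """
    points_to = {}   # local name -> allocation sites it may reference
    escaped = set()  # allocation sites that may outlive the method
    changed = True
    while changed:   # iterate to a fixed point; statement order is ignored
        changed = False
        for stmt in body:
            op = stmt[0]
            if op == "new":
                changed |= _add(points_to.setdefault(stmt[1], set()), {stmt[1]})
            elif op == "copy":
                _, dst, src = stmt
                changed |= _add(points_to.setdefault(dst, set()),
                                points_to.get(src, set()))
            elif op in ("return", "store_static"):
                changed |= _add(escaped, points_to.get(stmt[1], set()))
            elif op == "call":
                _, callee, args = stmt
                known = None if summaries is None else summaries.get(callee)
                for i, arg in enumerate(args):
                    # No summary: assume the argument escapes.
                    if known is None or i in known:
                        changed |= _add(escaped, points_to.get(arg, set()))
            else:
                raise ValueError(f"unknown statement: {stmt!r}")
    return escaped


body = [
    ("new", "a"),
    ("new", "b"),
    ("copy", "c", "a"),
    ("call", "Foo.Use", ["c"]),
    ("call", "Foo.Log", ["b"]),
]
intra_only = escaping_allocations(body)
with_summaries = escaping_allocations(
    body, {"Foo.Use": set(), "Foo.Log": {0}}
)
```

In this sketch `intra_only` contains both `a` and `b`, so neither allocation could be moved to the stack, while `with_summaries` contains only `b`, leaving `a` as a stack-allocation candidate. If callee summaries can be invalidated after the analysis, any result that relied on them would have to be revisited, which is the concern the added paragraph in this commit raises.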