a simple and flow-insensitive alias analysis
This commit implements a simple, flow-insensitive alias analysis using an
approach inspired by the escape analysis algorithm explained in the old JVM paper [^JVM05].

`EscapeLattice` is extended so that it also keeps track of possible field values.
In more detail, `x::EscapeLattice` has a new field,
`x.FieldSets::Union{Vector{IdSet{Any}},Bool}`, where:
- `x.FieldSets === false` indicates the fields of `x` haven't been analyzed yet
- `x.FieldSets === true` indicates the fields of `x` can't be analyzed,
  e.g. the type of `x` is not concrete and thus the number of its fields
  can't be known precisely
- otherwise `x.FieldSets::Vector{IdSet{Any}}` holds all the possible
  values of each field, where `x.FieldSets[i]` keeps all possibilities
  that the `i`th field can be
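
As a rough illustration of the three states above, here is a minimal sketch. The names `describe` and `init_fieldsets` are hypothetical helpers made up for this example and are not part of the package; `Base.IdSet` is used as the identity-based set type.

```julia
# Hypothetical stand-in for the `FieldSets` property (not the package's definition).
const FieldSetsT = Union{Vector{Base.IdSet{Any}},Bool}

# Report which of the three states a `FieldSets` value is in.
describe(fs::Bool) = fs ? :unanalyzable : :not_analyzed
describe(fs::Vector{Base.IdSet{Any}}) = :known

# Build one empty set per field for a concrete type `T`;
# return `true` when the number of fields can't be known precisely.
function init_fieldsets(T::Type)
    isconcretetype(T) || return true
    return Base.IdSet{Any}[Base.IdSet{Any}() for _ in 1:fieldcount(T)]
end

fs = init_fieldsets(Base.RefValue{String})
push!(fs[1], "s")  # record one possible value of the first field
```

With these definitions, `init_fieldsets(Any)` returns `true` because `Any` is not concrete, while `Base.RefValue{String}` yields a one-element vector of sets.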

And now, in addition to managing escape lattice elements, the analysis
state also maintains an "alias set" `state.aliasset::IntDisjointSet{Int}`,
which is implemented as a disjoint set of aliased arguments and SSA statements.
When the fields of object `x` are known precisely (i.e. `x.FieldSets isa Vector{IdSet{Any}}` holds),
the alias set is updated each time `z = getfield(x, y)` is encountered in a way that `z` is
aliased to all values of `x.FieldSets[y]`, so that escape information imposed on `z` will be
propagated to all the aliased values and `z` can be replaced with an aliased value later.
Note that when the fields of object `x` can't be known precisely (i.e. `x.FieldSets` is `true`)
and `z = getfield(x, y)` is analyzed, escape information of `z` is propagated to `x` rather
than to any of `x`'s fields. This is the most conservative propagation, since escape information
imposed on `x` will end up being propagated to all of its fields anyway at definitions of `x`
(i.e. `:new` expression or `setfield!` call).
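
The alias-set update on `getfield` can be sketched as follows. This is a simplified illustration under stated assumptions, not the commit's implementation: `AliasSet`, `find_root!`, `alias!`, `in_same_set` and `alias_getfield!` are hypothetical names, a hand-written union-find stands in for `IntDisjointSet{Int}`, and arguments and SSA statements are represented by plain integer ids.

```julia
# Minimal union-find over integer ids (hypothetical stand-in for `IntDisjointSet{Int}`).
struct AliasSet
    parents::Vector{Int}
end
AliasSet(n::Int) = AliasSet(collect(1:n))  # every id starts in its own set

function find_root!(s::AliasSet, x::Int)
    while s.parents[x] != x
        s.parents[x] = s.parents[s.parents[x]]  # path halving
        x = s.parents[x]
    end
    return x
end

# Merge the sets containing `x` and `y`.
function alias!(s::AliasSet, x::Int, y::Int)
    rx, ry = find_root!(s, x), find_root!(s, y)
    s.parents[rx] = ry
    return s
end

in_same_set(s::AliasSet, x::Int, y::Int) = find_root!(s, x) == find_root!(s, y)

# On `z = getfield(x, y)` with precisely known fields:
# alias `z` with every value recorded for field `y` of `x`.
function alias_getfield!(s::AliasSet, z::Int, fieldsets::Vector{Vector{Int}}, y::Int)
    for v in fieldsets[y]
        alias!(s, z, v)
    end
    return s
end

# Assumed ids: 1 = argument `s`, 2 = the `Ref` allocation, 3 = the `getfield` result.
aliases = AliasSet(3)
ref_fieldsets = [[1]]  # the only field of the `Ref` may hold the argument
alias_getfield!(aliases, 3, ref_fieldsets, 1)
```

After the update, `in_same_set(aliases, 3, 1)` holds while `in_same_set(aliases, 3, 2)` does not, so escape information imposed on the loaded value reaches the argument but not the allocation itself.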

[^JVM05]: Escape Analysis in the Context of Dynamic Compilation and Deoptimization.
          Thomas Kotzmann and Hanspeter Mössenböck, 2005, June.
          <https://dl.acm.org/doi/10.1145/1064979.1064996>.

Now this alias analysis should allow us to implement a "stronger" SROA,
which eliminates the allocation of `r` within the following code:
```julia
julia> result = analyze_escapes((String,)) do s
           r = Ref(s)
           broadcast(identity, r)
       end
#3(_2::String *, _3::Base.RefValue{String} ◌) in Main at REPL[2]:2
2 ↓ 1 ─ %1 = %new(Base.RefValue{String}, _2)::Base.RefValue{String}                                                                                                                                        │╻╷╷     Ref
3 ✓ │   %2 = Core.tuple(%1)::Tuple{Base.RefValue{String}}                                                                                                                                                  │╻       broadcast
  ↓ │   %3 = Core.getfield(%2, 1)::Base.RefValue{String}                                                                                                                                                   ││
  ◌ └──      goto #3 if not true                                                                                                                                                                           ││╻╷      materialize
  ◌ 2 ─      nothing::Nothing                                                                                                                                                                              │
  * 3 ┄ %6 = Base.getfield(%3, :x)::String                                                                                                                                                                 │││╻╷╷╷╷   copy
  ◌ └──      goto #4                                                                                                                                                                                       ││││┃       getindex
  ◌ 4 ─      goto #5                                                                                                                                                                                       ││││
  ◌ 5 ─      goto #6                                                                                                                                                                                       │││
  ◌ 6 ─      goto #7                                                                                                                                                                                       ││
  ◌ 7 ─      return %6                                                                                                                                                                                     │

julia> EscapeAnalysis.get_aliases(result.state.aliasset, Core.SSAValue(6), result.ir)
2-element Vector{Union{Core.Argument, Core.SSAValue}}:
 Core.Argument(2)
 :(%6)
```
Note that the allocation `%1` isn't analyzed as `ReturnEscape`, although `_2` is.
aviatesk committed Nov 17, 2021
1 parent 4ba961c commit fd63869
Showing 4 changed files with 813 additions and 78 deletions.
26 changes: 25 additions & 1 deletion README.md
@@ -10,6 +10,12 @@ This analysis works on a lattice called `x::EscapeLattice`, which holds the foll
the caller simply because it's passed as call argument
- `x.ThrownEscape::Bool`: indicates `x` may escape to somewhere through an exception (possibly as a field)
- `x.EscapeSites::BitSet`: records program counters (SSA numbers) where `x` can escape
- `x.FieldSets::Union{Vector{IdSet{Any}},Bool}`: maintains the sets of possible values of fields of `x`:
* `x.FieldSets === false` indicates the fields of `x` haven't been analyzed yet
* `x.FieldSets === true` indicates the fields of `x` can't be analyzed, e.g. the type of `x`
is not concrete and thus the number of its fields can't be known precisely
* otherwise `x.FieldSets::Vector{IdSet{Any}}` holds all the possible values of each field,
where `x.FieldSets[i]` keeps all possibilities that the `i`th field can be
- `x.ArgEscape::Int` (not implemented yet): indicates it will escape to the caller through `setfield!` on argument(s)
* `-1` : no escape
* `0` : unknown or multiple
@@ -30,7 +36,7 @@ An abstract state will be initialized with the bottom(-like) elements:
is slightly lower than `NoEscape`, but at the same time doesn't represent any meaning
other than it's not analyzed yet (thus it's not formally part of the lattice).

Escape analysis implementation is based on the data-flow algorithm described in the paper [^MM02].
Escape analysis implementation is based on the data-flow algorithm described in the old paper [^MM02].
The analysis works on the lattice of [`EscapeLattice`](@ref) and transitions lattice elements
from the bottom to the top in a _backward_ way, i.e. data flows from usage sites to definitions,
until every lattice element converges to a fixed point by maintaining a (conceptual) working set
@@ -39,6 +45,24 @@ The analysis only manages a single global state that tracks `EscapeLattice` of e
and SSA statement, but also note that some flow-sensitivity is encoded as program counters
recorded in the `EscapeSites` property of each lattice element.

The analysis also collects alias information using an approach inspired by
the escape analysis algorithm explained in yet another old paper [^JVM05].
In addition to managing escape lattice elements, the analysis state also maintains an "alias set",
which is implemented as a disjoint set of aliased arguments and SSA statements.
When the fields of object `x` are known precisely (i.e. `x.FieldSets isa Vector{IdSet{Any}}` holds),
the alias set is updated each time `z = getfield(x, y)` is encountered in a way that `z` is
aliased to all values of `x.FieldSets[y]`, so that escape information imposed on `z` will be
propagated to all the aliased values and `z` can be replaced with an aliased value later.
Note that when the fields of object `x` can't be known precisely (i.e. `x.FieldSets` is `true`)
and `z = getfield(x, y)` is analyzed, escape information of `z` is propagated to `x` rather
than to any of `x`'s fields. This is the most conservative propagation, since escape information
imposed on `x` will end up being propagated to all of its fields anyway at definitions of `x`
(i.e. `:new` expression or `setfield!` call).

[^MM02]: _A Graph-Free approach to Data-Flow Analysis_.
Markus Mohnen, 2002, April.
<https://api.semanticscholar.org/CorpusID:28519618>.

[^JVM05]: _Escape Analysis in the Context of Dynamic Compilation and Deoptimization_.
Thomas Kotzmann and Hanspeter Mössenböck, 2005, June.
<https://dl.acm.org/doi/10.1145/1064979.1064996>.