inlining: relax finalizer inlining control-flow restriction #46651

aviatesk · 2022-09-06T15:10:51Z

Eager finalizer inlining (#45272) currently has a restriction that
requires all the def/uses to be in a same basic block.

This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a finalizer
call to be inlined, since in that case it is safe to insert the body of
finalizer at the end of all the def/uses, e.g.

const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization3(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        register_finalizer!(o)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
    end
end
let src = code_typed1(cfg_finalization3, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization3(IOBuffer())
    @test get_finalization_count() == 1000
end

There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a finalizer call
in a case when a finalizer block post-dominateds all the def/uses e.g.

function cfg_finalization5(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(i)
        if i == 1000
            safeprint(io, o.x, '\n')
        elseif i > 0
            safeprint(io, o.x)
        end
        register_finalizer!(o)
    end
end
let src = code_typed1(cfg_finalization5, (IO,))
    # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses
    @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1
end

aviatesk · 2022-09-06T15:13:29Z

test/compiler/inline.jl

-    @test count(isnew, src.code) == 0
+    # FIXME
+    @test_broken count(isnew, src.code) == 0 && count(iscall(DoAllocNoEscapeSparam), src.code) == 0


This is not because of this PR, but I found that this test case has been broken.
On master:

julia> src = code_typed1(Tuple{Any}) do x for i = 1:1000 DoAllocNoEscapeSparam(x) end end CodeInfo( 1 ─ goto #7 if not true 2 ┄ %2 = φ (#1 => 1, #6 => %9)::Int64 │ Main.DoAllocNoEscapeSparam(x)::Any │ %4 = (%2 === 1000)::Bool └── goto #4 if not %4 3 ─ goto #5 4 ─ %7 = Base.add_int(%2, 1)::Int64 └── goto #5 5 ┄ %9 = φ (#4 => %7)::Int64 │ %10 = φ (#3 => true, #4 => false)::Bool │ %11 = Base.not_int(%10)::Bool └── goto #7 if not %11 6 ─ goto #2 7 ┄ return nothing )

Maybe because of #45062? /cc @ianatol

Sure thing, I will take a look today

vtjnash

sgtm

base/compiler/ssair/passes.jl

Keno · 2022-09-07T07:40:05Z

base/compiler/ssair/passes.jl

-    all(check_in_range, defuse.uses) || return nothing
-    all(check_in_range, defuse.defs) || return nothing
+    all(check_defuse, defuse.uses) || return nothing
+    all(check_defuse, defuse.defs) || return nothing

    # For now: Require all statements in the basic block range to be nothrow.
    all(minval:maxval) do idx::Int


This should probably only look at the statements in blocks on the domination path, but we can do that in the followup. I'd also like a error_is_fatal optimizer setting that assumes that any errors are fatal and thus finalizers don't need to run after errors, implying that we can skip this legality check.

Keno · 2022-09-07T08:31:17Z

With this restriction lifted, it's no longer sufficient to just inline the finalizer after the numerically last SSA value that uses it. Something needs to ensure the finalizer happens after the last use on any path to the exit:

julia> @noinline nothrow_side_effect(x) =
           Base.@assume_effects :total !:effect_free @ccall jl_(x::Any)::Cvoid
nothrow_side_effect (generic function with 1 method)

julia> mutable struct DoAllocWithField
           x
           function DoAllocWithField(x)
               finalizer(new(x)) do this
                   nothrow_side_effect(this.x)
               end
           end
       end

julia> function foo(b::Bool)
           s = DoAllocWithField(1)
           if b
               nothrow_side_effect(s.x)
               return 1
           else
               return 0
           end
       end
foo (generic function with 1 method)

julia> code_typed(foo, Tuple{Bool})
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = %new(Main.:(var"#3#4"))::var"#3#4"
│   %2 = %new(Main.DoAllocWithField, 1)::DoAllocWithField
└──      goto #3 if not b
2 ─ %4 = Base.getfield(%2, :x)::Any
│        invoke %1(%2::DoAllocWithField)::Nothing
│        Main.nothrow_side_effect(%4)::Any
└──      return 1
3 ─      return 0
) => Int64

julia> foo(true)
1
1
1

julia> foo(false)
0

Keno

I think this is the right direction overall, but see the correctness issue mentioned in the main issue body.

staticfloat · 2022-09-08T18:48:10Z

Just FYI, multiple windows buildbots hardlocked on this PR, and I had to manually reboot them. I'm not sure if that's a coincidence or not, but you can see the builds here:

So perhaps something on this branch causes things to jam up in a bad way? Of course, it is a failing of the current windows buildbots that they do not more forcefully kill the windows processes and get stuck like this, but that's something we are working to fix in the Buildkite era.

Keno · 2022-09-21T03:24:39Z

I've implemented the post-dominator analysis and updated this. I think this should be reasonably complete now. @aviatesk take a look, please.

fixed

aviatesk · 2022-09-21T09:52:27Z

Looks great, thanks sooo much for implementing the post-domination analysis, that is very exciting itself.
I think there seems to be a problem in finalizer insertion point though, MRE would be:

const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
@test Core.Compiler.is_finalizer_inlineable(Base.infer_effects(add_finalization_count!, (Int,)))

mutable struct DoAllocWithFieldInter
    x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
    finalizer(obj) do this
        add_finalization_count!(this.x)
    end
end

function cfg_finalization6(io)
    for i = -999:1000
        o = DoAllocWithFieldInter(0)
        register_finalizer!(o)
        if i == 1000
            o.x = i # with `setfield!`
        elseif i > 0
            safeprint(io, o.x, '\n')
        end
        # <= shouldn't the finalizer be inlined here?
    end
end
let src = code_typed1(cfg_finalization6, (IO,))
    @test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
    init_finalization_count!()
    cfg_finalization6(IOBuffer())
    @test get_finalization_count() == 1000 # this fails
end

Currently the finalizer is inlined after the allocation site, so can't observe the field value changed by setfield!:

julia> src = code_typed1(cfg_finalization6, (IO,))
CodeInfo(
1 ──       goto #11 if not true
2 ┄─ %2  = φ (#1 => -999, #10 => %20)::Int64
│    %3  = φ (#1 => -999, #10 => %21)::Int64
│    %4  = %new(Main.DoAllocWithFieldInter, 0)::DoAllocWithFieldInter
│    %5  = Base.getfield(%4, :x)::Int64
│          invoke Main.add_finalization_count!(%5::Int64)::Int64
│    %7  = (%2 === 1000)::Bool
└───       goto #4 if not %7
3 ──       Base.setfield!(%4, :x, %2)::Int64
└───       goto #6
4 ── %11 = Base.slt_int(0, %2)::Bool
└───       goto #6 if not %11
5 ── %13 = Base.getfield(%4, :x)::Int64
└───       invoke Main.safeprint(io::IO, %13::Any, '\n'::Vararg{Any})::Any
6 ┄─ %15 = (%3 === 1000)::Bool
└───       goto #8 if not %15
7 ──       goto #9
8 ── %18 = Base.add_int(%3, 1)::Int64
└───       goto #9
9 ┄─ %20 = φ (#8 => %18)::Int64
│    %21 = φ (#8 => %18)::Int64
│    %22 = φ (#7 => true, #8 => false)::Bool
│    %23 = Base.not_int(%22)::Bool
└───       goto #11 if not %23
10 ─       goto #2
11 ┄       return nothing
)

aviatesk

Just minor style nits.

test/compiler/inline.jl

base/compiler/ssair/passes.jl

base/compiler/ssair/domtree.jl

Keno · 2022-09-21T10:25:20Z

I think there seems to be a problem in finalizer insertion point though, MRE would be:

Should be fixed by the following, if you want to apply that and add the test

diff --git a/base/compiler/ssair/passes.jl b/base/compiler/ssair/passes.jl
index 00577526ac..1e9d616a96 100644
--- a/base/compiler/ssair/passes.jl
+++ b/base/compiler/ssair/passes.jl
@@ -1105,8 +1105,8 @@ end
 is_nothrow(ir::IRCode, pc::Int) = (ir.stmts[pc][:flag] & IR_FLAG_NOTHROW) ≠ 0

 function reachable_blocks(cfg::CFG, from_bb, to_bb = nothing)
-    worklist = Int[]
-    visited = BitSet()
+    worklist = Int[from_bb]
+    visited = BitSet(from_bb)
     if to_bb !== nothing
         push!(visited, to_bb)
     end
@@ -1116,7 +1116,6 @@ function reachable_blocks(cfg::CFG, from_bb, to_bb = nothing)
             push!(worklist, bb)
         end
     end
-    visit!(from_bb)
     while !isempty(worklist)
         foreach(visit!, cfg.blocks[pop!(worklist)].succs)
     end
@@ -1148,7 +1147,15 @@ function try_resolve_finalizer!(ir::IRCode, idx::Int, finalizer_idx::Int, defuse
     bb_insert_idx = finalizer_idx
     function note_block_use!(bb, idx)
         new_bb_insert_block = nearest_common_dominator(get!(lazypostdomtree), bb_insert_block, bb)
-        bb_insert_idx = new_bb_insert_block == bb_insert_block ? max(idx, bb_insert_idx) : nothing
+        if new_bb_insert_block == bb
+            if bb == bb_insert_block
+                bb_insert_idx = max(bb_insert_idx, idx)
+            else
+                bb_insert_idx = idx
+            end
+        else
+            bb_insert_idx = nothing
+        end
         bb_insert_block = new_bb_insert_block
         nothing
     end

Plus an appropriate adjustment to the other use of reachable_blocks:

-         blocks = reachable_blocks(ir.cfg, finalizer_bb, bb_insert_block)
+        # Collect all reachable blocks between the finalizer registration and the
+        # insertion point
+        blocks = finalizer_bb == bb_insert_block ? Int[finalizer_bb] :
+            reachable_blocks(ir.cfg, finalizer_bb, bb_insert_block)

aviatesk · 2022-09-21T13:05:27Z

base/compiler/ssair/passes.jl

+    function check_range_nothrow(range)
+        return all(finalizer_idx:maxval) do sidx::Int


This function seems to do nothing actually (maxval is actually set to 0). I will give a try to enable it.

Good catch, thanks. It should obviously check the range

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.

ianatol

I don't fully understand all of the domination logic here, but in general it seems to make sense. LGTM!

ianatol · 2022-09-21T16:13:25Z

base/compiler/ssair/domtree.jl

@@ -109,10 +109,16 @@ end

 length(D::DFSTree) = length(D.from_pre)

-function DFS!(D::DFSTree, blocks::Vector{BasicBlock})
+function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_post_dominator::Bool)


Suggested change

function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_post_dominator::Bool)

function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_postdominator::Bool)

Per #46651 (comment)

Fair callout, but I don't like this variable name that much anyway, so rather than going through a new CI cycle for this, I'll probably just rename this on the next refactor. DFS is really independent of domination, so this should probably just be is_reversed.

Found while testing bigger examples.

Found while testing bigger examples. Fixes the logic introduced in #46651

aviatesk requested a review from Keno September 6, 2022 15:10

aviatesk added compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) don't squash Don't squash merge labels Sep 6, 2022

aviatesk commented Sep 6, 2022

View reviewed changes

aviatesk force-pushed the avi/inline-cfg-finalizer branch from b112600 to 28f12c5 Compare September 6, 2022 16:20

vtjnash reviewed Sep 6, 2022

View reviewed changes

base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved

Keno reviewed Sep 7, 2022

View reviewed changes

Keno previously requested changes Sep 7, 2022

View reviewed changes

aviatesk force-pushed the avi/inline-cfg-finalizer branch 3 times, most recently from bb2e304 to 252b159 Compare September 8, 2022 01:51

aviatesk removed the don't squash Don't squash merge label Sep 8, 2022

aviatesk force-pushed the avi/inline-cfg-finalizer branch 2 times, most recently from c742a32 to 7dcc2b5 Compare September 8, 2022 14:30

aviatesk changed the title ~~inlining: relax finalizer inlining control-flow restriction~~ wip: inlining: relax finalizer inlining control-flow restriction Sep 8, 2022

aviatesk mentioned this pull request Sep 12, 2022

optimizer: fixup IR_FLAG_NOTHROW flagging #46653

Merged

Keno force-pushed the avi/inline-cfg-finalizer branch from 7dcc2b5 to b01d02b Compare September 21, 2022 03:21

Keno changed the title ~~wip: inlining: relax finalizer inlining control-flow restriction~~ inlining: relax finalizer inlining control-flow restriction Sep 21, 2022

Keno force-pushed the avi/inline-cfg-finalizer branch 2 times, most recently from 6d143ca to 18bd85e Compare September 21, 2022 03:24

aviatesk commented Sep 21, 2022

View reviewed changes

test/compiler/inline.jl Outdated Show resolved Hide resolved

base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved

base/compiler/ssair/passes.jl Outdated Show resolved Hide resolved

base/compiler/ssair/domtree.jl Outdated Show resolved Hide resolved

Keno reviewed Sep 21, 2022

View reviewed changes

base/compiler/ssair/domtree.jl Outdated Show resolved Hide resolved

aviatesk force-pushed the avi/inline-cfg-finalizer branch from bd8e4a1 to 56c8a9c Compare September 21, 2022 13:02

aviatesk commented Sep 21, 2022

View reviewed changes

aviatesk force-pushed the avi/inline-cfg-finalizer branch from 56c8a9c to 64f7e99 Compare September 21, 2022 13:35

ianatol approved these changes Sep 21, 2022

View reviewed changes

Keno merged commit 62ac26a into master Sep 21, 2022

Keno deleted the avi/inline-cfg-finalizer branch September 21, 2022 22:59

Keno added a commit that referenced this pull request Sep 22, 2022

Follow-up #46651 - Fix a slight logic bug

9390f7b

Found while testing bigger examples.

Keno added a commit that referenced this pull request Sep 22, 2022

Follow-up #46651 - Fix a slight logic bug

be04d9a

Found while testing bigger examples.

Keno added a commit that referenced this pull request Sep 22, 2022

Follow-up #46651 - Fix a slight logic bug

d594566

Found while testing bigger examples.

Keno added a commit that referenced this pull request Sep 22, 2022

Fix small logic bug in finalizer inlining

a959ed8

Found while testing bigger examples. Fixes the logic introduced in #46651

jmkuhn mentioned this pull request Oct 2, 2022

Nightly build crash when runing some GMT tests that call functions in the GDAL lib #47003

Closed

timholy mentioned this pull request Jan 17, 2024

Control flow: use domtree analysis JuliaDebug/LoweredCodeUtils.jl#98

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

inlining: relax finalizer inlining control-flow restriction #46651

inlining: relax finalizer inlining control-flow restriction #46651

aviatesk commented Sep 6, 2022 •

edited

Loading

aviatesk Sep 6, 2022

aviatesk Sep 6, 2022

ianatol Sep 6, 2022

vtjnash left a comment

Keno Sep 7, 2022

Keno commented Sep 7, 2022

Keno left a comment

staticfloat commented Sep 8, 2022 •

edited

Loading

Keno commented Sep 21, 2022

aviatesk commented Sep 21, 2022

aviatesk left a comment

Keno commented Sep 21, 2022

aviatesk Sep 21, 2022

Keno Sep 21, 2022

ianatol left a comment

ianatol Sep 21, 2022

Keno Sep 21, 2022

		function check_range_nothrow(range)
		return all(finalizer_idx:maxval) do sidx::Int

	function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_post_dominator::Bool)
	function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_postdominator::Bool)

inlining: relax finalizer inlining control-flow restriction #46651

inlining: relax finalizer inlining control-flow restriction #46651

Conversation

aviatesk commented Sep 6, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vtjnash left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Keno commented Sep 7, 2022

Keno left a comment

Choose a reason for hiding this comment

staticfloat commented Sep 8, 2022 • edited Loading

Keno commented Sep 21, 2022

aviatesk commented Sep 21, 2022

aviatesk left a comment

Choose a reason for hiding this comment

Keno commented Sep 21, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianatol left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aviatesk commented Sep 6, 2022 •

edited

Loading

staticfloat commented Sep 8, 2022 •

edited

Loading