Eager finalizer insertion #45272

Keno · 2022-05-11T07:13:22Z

This is a variant of the eager-finalization idea
(e.g. as seen in #44056), but with a focus on the mechanism
of finalizer insertion, since I need a similar pass downstream.
Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular,
I want to be able to insert finalize call such that I can
subsequently SROA the mutable object. This requires a couple
design points that are more stringent than the pass from #44056,
so I decided to prototype them as an independent PR. The primary
things I need here that are not seen in #44056 are:

The ability to forgo finalizer registration with the runtime
entirely (requires additional legality analyis)
The ability to inline the registered finalizer at the deallocation
point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin
that is recognized by inference and inlining (such that inference
can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted
to have work, but does not yet extend to the motivating case I had.
Nevertheless, I felt that this was a good checkpoint to synchronize
with other efforts along these lines.

Currently working demo:

julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core.finalizer(this) do this
                   global total_deallocations[] += 1
               end
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_428() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 4373384000 to i64*), align 64
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3]:3 within `foo`
; ┌ @ REPL[2]:4 within `DoAlloc`
; │┌ @ REPL[2]:5 within `#3`
; ││┌ @ Base.jl within `setproperty!`
     store i64 %0, i64* inttoptr (i64 4373384000 to i64*), align 64
; └└└
;  @ REPL[3]:4 within `foo`
  ret void
}

julia> foo()

julia> total_deallocations[]
1000

Thoughts @jpsamaroo @aviatesk ?

jpsamaroo

Awesome!

test/compiler/inline.jl

jpsamaroo · 2022-05-11T16:17:02Z

This is great! I think this approach will probably cover the cases that #44056 intended to address.

Note: I could only follow the general gist of what you're doing here, so my review approval really just pertains to the idea in general. I haven't thoroughly scrutinized the implementation.

Keno · 2022-05-16T04:59:29Z

I've addressed the correctness issues that made me call this "WIP", though I haven't yet expanded the legality analysis to allow all the cases I want yet. That said, I think that can be part of a follow up PR (including any potential interaction with EA), so I think this is basically good to go.

Keno · 2022-05-16T20:37:08Z

@nanosoldier runtests(ALL, vs = ":master")

nanosoldier · 2022-05-17T04:33:05Z

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

maleadt · 2022-05-17T16:19:00Z

base/compiler/ssair/inlining.jl

+        # Handle finalizer
+        if sig.f === Core._add_finalizer
+            if isa(info, FinalizerInfo)
+                # Only inline finalizers that are known nothrow and notls.


Why do we have this restriction? Would it be invalid to inline finalizers and have them run in the original task environment?

Obligatory GPU context: our finalizers access the TLS to inspect the current stream. On finalizer tasks, there's no such stream, so we perform a blocking free, whereas on 'regular' tasks the memory operation can be ordered against other operations on that stream. I had assumed that inlining finalizers would additionally get them to use the local stream, promoting blocking frees to asynchronously-ordered ones. Maybe that's too magical though.

It's to prevent things like #42752 where finalizers unexpectedly mess with task local state. I think we're now conceptually moving into the direction of finalizers having their own pseudo-task (maybe even a real task eventually), so inlining a finalizer that looks at task state would not be legal since it changes the task environment.

That said, for your case, you want to allow that to happen, even though it's not technically semantically legal. I think for that case, we could just put a notls effect override on the finalizer.

Keno · 2022-05-18T04:30:29Z

I worked through PkgEval and fixed 3 issues that I could reproduce, but none were related to this PR. The remaining issues are the known random precompile segfaults in the SciML stack, which as usual, I don't see locally. One of these days we'll need to rr a PkgEval run to catch those. Overall though, no issues I can see with merging this from a PkgEval perspective.

src/builtins.c

base/compiler/abstractinterpretation.jl

base/compiler/ssair/passes.jl

base/compiler/types.jl

src/builtins.c

test/compiler/inline.jl

base/compiler/stmtinfo.jl

base/gcutils.jl

base/compiler/ssair/passes.jl

Split out from #45272. This effect models the legality of moving code between tasks. It is somewhat related to effect-free/consistent, but only with respect to task-local state. As an example consider something like: ``` global glob function bar() @async (global glob = 1; some_other_code()) end ``` The newly created task is not effect-free, but it would be legal to inline the assignment of `glob` into `bar` (as long it is inlined before the creation of the task of `some_other_code` does not access `glob`). For comparison, the following is neither `notls`, nor `effect_free`: ``` function bar() @async (task_local_storage()[:var] = 1; some_other_code()) end ``` The same implies to implicit task-local state such as the RNG state. Implementation wise, there isn't a lot here, because the implicit tainting by ccall is the correct conservative default. In the future, we may want to annotate various ccalls as being permissible for notls, but let's worry about that when we have a case that needs it.

Split out from #45272. This is prepratory work towards adding optimization passes that recognize this builtin. This PR adds `Core.finalizer` with essentially the same interface as `Base.finalizer`, but without the error checking or raw-C-pointer feature. In future commits, the Core.finalizer interface will likely expand slightly, but Base.finalizer will remain unchanged and is the supported interface for this functionality.

- renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code - improved type stabilities by handling edge cases

- added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code

inlining: follow #45272, improve the finalizer inlining implementation

- added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code

…plementation - added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code

I added the flag in JuliaLang#45272, but didn't hook it up to the nothrow model. In the fullness of time, these should forwarded from inference and only recomputed if necessary, but for the moment, just make it work.

…plementation - added more comments to explain the purpose of code - properly handle `ConstantCase` - erase `Core.finalizer` call more aggressively - improved type stabilities by handling edge cases - renamed `AddFianlizerUse` to `FinalizerUse` - removed dead pieces of code

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia Base.@assume_effects :nothrow safeprint(@nospecialize x...) = print(x...) @noinline nothrow_side_effect(x) = Base.@assume_effects :total !:effect_free @CCall jl_(x::Any)::Cvoid mutable struct DoAllocWithFieldInter x end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this nothrow_side_effect(nothing) end end let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end end end # the finalzier call gets inlined @test count(isinvoke(:nothrow_side_effect), src.code) == 1 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block is post-dominated by all the def/uses e.g. ``` let src = code_typed1() do for i = -1000:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(o.x, '\n') elseif i > 0 safeprint(o.x) end register_finalizer!(o) end end # currently this is broken @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses are in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` There is a room for further improvement though -- if we have a way to compute the post-domination, we can also inline a `finalizer` call in a case when a `finalizer` block post-dominateds all the def/uses e.g. ```julia function cfg_finalization5(io) for i = -999:1000 o = DoAllocWithFieldInter(i) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end register_finalizer!(o) end end let src = code_typed1(cfg_finalization5, (IO,)) # TODO we can fix this case by checking a case when a finalizer block is post-dominated by all the def/uses @test_broken count(isinvoke(:nothrow_side_effect), src.code) == 1 end ```

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful. Co-authored-by: Keno Fischer <keno@juliacomputing.com>

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful. Co-authored-by: Keno Fischer <keno@juliacomputing.com>

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.

@noinline

Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.

jpsamaroo approved these changes May 11, 2022

View reviewed changes

test/compiler/inline.jl Outdated Show resolved Hide resolved

Keno force-pushed the kf/eagerfinalizers branch 2 times, most recently from ff35e48 to 458c7f6 Compare May 12, 2022 00:52

Keno changed the title ~~Very WIP: Eager finalizer insertion~~ Eager finalizer insertion May 16, 2022

Keno force-pushed the kf/eagerfinalizers branch from 458c7f6 to 67ec007 Compare May 16, 2022 04:58

Keno force-pushed the kf/eagerfinalizers branch from 67ec007 to dc8a715 Compare May 16, 2022 18:05

maleadt reviewed May 17, 2022

View reviewed changes

Keno mentioned this pull request May 18, 2022

isdispatchtuple inconsistency for typeof(Union{}) #45347

Closed