Refine effects based on optimizer-derived information #50805
Conversation
Force-pushed from 76d19dc to 2ab07cc
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" and 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, it prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization IR, because it is impossible to tell whether the `:consistent`-cy was tainted by the first or the second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub`, since it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and out of line with the existing designs of `:nonoverlayed` and `:noinbounds`, whose overrides are prohibited, we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` is probably justified.
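As a rough sketch of what the split enables (the method body and name here are illustrative, not from the PR; this assumes the post-PR `Base.@assume_effects` setting list, which includes `:noub`):

```julia
# With the split, `:noub` can be asserted independently of `:consistent`.
# The `@inbounds` access below would otherwise taint `:noub`; the annotation
# shifts responsibility for the precondition (x is non-empty) to the caller.
Base.@assume_effects :noub function first_unchecked(x::Vector{Int})
    return @inbounds x[1]
end
```

Before this commit, the only way to make such a method eligible for concrete evaluation was to override `:consistent` wholesale, which implicitly asserted both meanings at once.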
Now, there is a bit of a complication that we have generally so far taken the position that the optimizer may do non-IPO-safe optimizations, although in practice we never actually implemented any. This was a sensible choice, because we weren't really doing anything with the post-optimized IR other than feeding it into codegen anyway. However, with irinterp and this change, there's now two consumers of IPO-safely optimized IR. I do still think we may at some point want to run passes that allow IPO-unsafe optimizations, but we can always add them at the end of the pipeline.
We don't do as many as we should, but there is a long way between doing only a few and being prohibited from doing any ever. For a direct example, `Base.inferencebarrier` is required to be optimized away, but must not improve IPO information afterwards. Other similar examples may include:
1. the legality of removing boundscheck depends on having determined whether it is guaranteed to be not inlined (which premise is broken by this PR);
2. folding `Expr(:new)` with undef fields to a constant, and other similar operations with undefined bits such as accessing `.parameters` or doing math with floating point numbers, assume IPO cannot see that change to the implementation;
3. DCE cannot mark a value as unused after all loads are removed (changing it to readnone), since later IPO effects may incorrectly assume this argument could be optimized away in the caller.
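A minimal sketch of the first example (illustrative only):

```julia
# `Base.inferencebarrier` hides the type of its argument from inference, so
# callers of `f` see a return type of `Any`. The optimizer is expected to
# remove the barrier, but that removal must not retroactively tighten the
# IPO-visible return type, or callers already compiled against `Any` would
# disagree with the recorded interprocedural information.
f(x) = Base.inferencebarrier(x) + 1
```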
You're not prohibited from doing any ever, you're just prohibited from doing them before the analysis pass runs. There's no problem doing more things afterwards.
Yes, that needed to be adjusted in this PR.
This PR is on the path of overhauling this mechanism. It's possible there's some extra tweaking that needs to happen here in this intermediate step.
I don't believe we do this in the current pass pipeline.
We don't currently derive such information interprocedurally.
In particular though, this means the analysis pass must run before inlining, since inlining already ran all passes. Which seems hardly distinguishable from not being able to do any useful optimizations?
I am talking about the default environment, which is not affected by that PR. In the default cases where it is not forced on or forced off, the effect of inlining on boundscheck branches is known to not be consistent, but that inconsistency is usually removed by the inlining pass. Though I suppose that falls into the larger bucket above that inlining is not allowed before the refinement analysis, so it might not matter as an independent point.
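A hedged sketch of the boundscheck/inlining interaction being discussed (function names are illustrative; this assumes the default `--check-bounds` setting, where `@boundscheck` blocks are elided only when the method is inlined into an `@inbounds` region):

```julia
# Whether the `@boundscheck` block executes depends on the inlining decision:
# if `getindex_checked` is inlined into the caller's `@inbounds` region, the
# check is elided; if it stays as a call, the check runs. So the observable
# behavior is not `:consistent` until inlining has resolved the branch.
function getindex_checked(a::Vector{Int}, i::Int)
    @boundscheck checkbounds(a, i)
    return @inbounds a[i]
end

call_it(a, i) = @inbounds getindex_checked(a, i)
```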
As discussed today, we'll stop doing :boundscheck transforms in inlining and move those to codegen instead. Then we can cache the IR as IPO safe. At whatever point in the future we want to do IPO unsafe IR transforms at the julia level, we'll have a callback from codegen back into julia that does that.
@nanosoldier `runbenchmarks("inference", vs=":master")`
Force-pushed from 2ab07cc to 0aa6b47
How did we decide to deal with the fact this means that the result of codegen (e.g. validity of using concrete eval) will no longer be consistent with the IPO flags on the same CodeInfo object?
You mean CodeInstance? They already have two sets of flags, but if something is determined for IPO, it should always be valid even after codegen transforms.
```
@@ -509,7 +711,7 @@ function run_passes(
@pass "compact 2" ir = compact!(ir)
```
This pass currently seems to not be IPO safe, since the access to `ci.propagate_inbounds` (and `:boundscheck` removal folding) is not currently IPO safe.
Yes, I'm working on that right now.
96b55e0
to
5f7d950
Compare
```
@@ -1647,9 +1647,9 @@ function infer_effects(@nospecialize(f), @nospecialize(types=default_tt(f));
     for match in matches.matches
         match = match::Core.MethodMatch
         frame = Core.Compiler.typeinf_frame(interp,
-            match.method, match.spec_types, match.sparams, #=run_optimizer=#false)
+            match.method, match.spec_types, match.sparams, #=run_optimizer=#true)
```
A new optional argument `post_optimization_analysis::Bool=true` might be useful for testing/debugging purposes. We can add it later when needed though.
```
     mi = result.edge
-    if mi !== nothing && is_foldable(effects)
+    if mi !== nothing && is_foldable(effects, !stmt_taints_inbounds_consistency(sv))
```
Maybe we need a similar change to `concrete_eval_invoke`?
Actually, I don't think it's required here at all. Both concrete eval and semi concrete eval are safe here independent of the state of the inbounds flag.
```
boundscheck = stmt.args[end]
argextype(boundscheck, ir) === Bool || return false
isa(boundscheck, SSAValue) || return false
return true
```
Include `isexpr(ir[boundscheck][:stmt], :boundscheck)` here?
This is meant to check for having a boundscheck argument rather than the boundscheck expr, I'll rename the function.
base/compiler/optimize.jl (outdated)
```
else isa(stmt, GotoIfNot)
    bb = block_for_inst(ir, idx)
    sa, sb = ir.cfg.blocks[bb].succs
    for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness((sa, sb), nothing), get!(lazydomtree))
```
```
-    for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness((sa, sb), nothing), get!(lazydomtree))
+    for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness(ir.cfg.blocks[bb].succs, nothing), get!(lazydomtree))
```
and then we can keep `BlockLiveness.def_bbs` typed as `Vector{Int}`.
```
@@ -1390,7 +1410,7 @@ function process_node!(compact::IncrementalCompact, result_idx::Int, inst::Instr
     if cfg_transforms_enabled
         # Rename phi node edges
         let bb_rename_pred=bb_rename_pred
-            map!(i::Int32->bb_rename_pred[i], stmt.edges, stmt.edges)
+            map!(i::Int32->i == 0 ? 0 : bb_rename_pred[i], stmt.edges, stmt.edges)
```
Is this change necessary for this PR, or a fix for some related issues?
It fixed some test failure, but I don't think it's related to this PR other than by coincidence. There's a special `i == 0` predecessor for entry into catch blocks.
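For context, a hedged sketch of the kind of source that exercises this (illustrative; the exact IR shape and whether edge 0 appears depend on the Julia version and lowering):

```julia
# Lowering try/catch introduces an implicit control-flow entry into the
# catch block; PhiNode edges reaching such a block can carry the sentinel
# value 0, which the renaming map above must pass through rather than use
# as an index into `bb_rename_pred`.
function parse_or_default(s, default)
    local y
    try
        y = parse(Int, s)
    catch
        y = default
    end
    return y
end
```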
base/compiler/optimize.jl (outdated)
```
if isexpr(stmt, :enter)
    # try/catch not yet modeled
    had_trycatch = true
    return true
```
```
-    return true
+    return false
```
Shouldn't this `return false` to bail out of the scan?
base/compiler/optimize.jl (outdated)
```
    return
end
else
    callback(idx, stmt)
```
```
-    callback(idx, stmt)
+    callback(idx)
```
`stmt::PhiNode` seems to never be used.
Doesn't absint-based constprop' usually lead to better inference? I'm in favor of preserving sources for irinterp, but I want to confirm my understanding that irinterp is faster but less precise than the absint constprop'.
I don't think it's necessarily always true that irinterp is less precise than absint. For example, irinterp can in theory propagate through more complicated sroa patterns. I didn't track down exactly what was going on in this case, but the failing test was arrayops 1746.
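A hedged sketch of the kind of pattern meant here (the function is illustrative, not from the PR):

```julia
# After SROA eliminates the mutable `Ref`, the post-optimization IR reduces
# to `return 42`, so irinterp running over that IR can fold the result to a
# constant directly, whereas abstract-interpretation constprop has to model
# the allocation and mutation and may give up earlier.
function sroa_constant()
    r = Ref(0)
    r[] = 42
    return r[]
end
```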
Force-pushed from 25cf427 to 1ed5f15
The optimizer may be able to derive information that is not available to inference. For example, it may SROA a mutable value to derive additional constant information. Additionally, some effects, like :consistent, are path-dependent and should ideally be scanned once all optimizations are done. Now, there is a bit of a complication that we have generally so far taken the position that the optimizer may do non-IPO-safe optimizations, although in practice we never actually implemented any. This was a sensible choice, because we weren't really doing anything with the post-optimized IR other than feeding it into codegen anyway. However, with irinterp and this change, there's now two consumers of IPO-safely optimized IR. I do still think we may at some point want to run passes that allow IPO-unsafe optimizations, but we can always add them at the end of the pipeline.

With these changes, the effect analysis is a lot more precise. For example, we can now derive :consistent for these functions:

```
function f1(b)
    if Base.inferencebarrier(b)
        error()
    end
    return b
end

function f3(x)
    @fastmath sqrt(x)
    return x
end
```

and we can derive `:nothrow` for this function:

```
function f2()
    if Ref(false)[]
        error()
    end
    return true
end
```
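One way to inspect the refined effects interactively (a sketch; the exact printed format of the effect bits varies across Julia versions):

```julia
# `Base.infer_effects` reports the effect bits inference derives for a call
# signature; after this change the `+c` (:consistent) bit should be derivable
# for `f1` from the commit message above.
Base.infer_effects(f1, (Bool,))
```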
I was seeing a fair number of inference instabilities on master and tracked this down to us deleting the inferred code after codegen, making it unavailable to irinterp.
Also adds some cosmetic changes.
…A` (#51318) Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code. --------- Co-authored-by: Julian Samaroo <jpsamaroo@jpsamaroo.me>
This particular fix was part of JuliaLang#50805, but it wasn't included in version 1.10, leading to situations where an incorrect `:nothrow` could be derived in 1.10 (JuliaLang#53062). This commit implements a minimal correction for 1.10 and also adds some test cases. Fixes JuliaLang#53062.