-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce improved SROA pass #27126
Introduce improved SROA pass #27126
Conversation
This is a rebased and fixed version of the improved SROA pass from #26778. There's a decent piece of new infrastructure wrapped up in this: The ability to insert new nodes during compaction. This is a bit tricky because it requires tracking which version of the statements buffer a given SSAValue belongs to. At the moment this is done mostly manually, but I'm hoping to clean that up in the future. The idea of the new SROA pass is fairly straightforward: Given a use of an interesting value, it traces through all phi nodes, finding all leaves, applies whatever transformation to those leaves and then re-inserts a phi nest corresponding to the phi nest of the original value.
I apologize in advance for some of the messiness in this PR, but since it's on the critical path for 0.7alpha, my plan is to merge this as soon as acceptable by CI (and nanosoldier), so we can merge #25261 and tag the alpha. There's a whole bunch of cleanup to be done after. |
@nanosoldier |
elseif isa(op, Union{OldSSAValue, NewSSAValue}) | ||
#@Base.show ir | ||
@verify_error "Left over SSA marker" | ||
error() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to fix these error()
calls before release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the verifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's why I said before release, not before merge
end | ||
|
||
function resort_pending!(compact) | ||
sort!(compact.pending_perm, DEFAULT_STABLE, Order.By(x->compact.pending_nodes[x].pos)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this supposed to be returning compact.pending_perm
?
sort!(compact.pending_perm, DEFAULT_STABLE, Order.By(x->compact.pending_nodes[x].pos)) | ||
end | ||
|
||
function insert_node!(compact::IncrementalCompact, before, @nospecialize(typ), @nospecialize(val), attach_after::Bool=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type signature, or use dispatch
end | ||
|
||
function insert_node_here!(compact::IncrementalCompact, @nospecialize(val), @nospecialize(typ), ltable_idx::Int) | ||
function insert_node_here!(compact::IncrementalCompact, @nospecialize(val), @nospecialize(typ), ltable_idx::Int, reverse_affinity=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type signature
# Add count for new use | ||
if count_added_node!(compact, v) | ||
push!(compact.late_fixup, idx.id) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setindex!
should return first argument
# Add count for new use | ||
if count_added_node!(compact, v) | ||
push!(compact.late_fixup, idx.id) | ||
end | ||
end | ||
|
||
function setindex!(compact::IncrementalCompact, @nospecialize(v), idx::Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setindex!
should return first argument
@@ -509,10 +621,14 @@ end | |||
|
|||
function getindex(view::TypesView, idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type-signature
@@ -528,57 +644,114 @@ function done(compact::IncrementalCompact, (idx, _a)::Tuple{Int, Int}) | |||
return idx > length(compact.ir.stmts) && (compact.new_nodes_idx > length(compact.perm)) | |||
end | |||
|
|||
function getindex(view::TypesView, idx::NewSSAValue) | |||
@assert isa(view.ir, IncrementalCompact) | |||
compact = view.ir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use type-assert here for better performance and better error messages
return process_newnode!(compact, new_idx, new_node_entry, idx, active_bb, true) | ||
elseif !isempty(compact.pending_perm) && | ||
(entry = compact.pending_nodes[compact.pending_perm[1]]; | ||
entry.attach_after ? entry.pos == idx - 1 : entry.pos == idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make the control flow here less convoluted?
@@ -673,22 +853,29 @@ function next(compact::IncrementalCompact, (idx, active_bb)::Tuple{Int, Int}) | |||
return Pair{Int, Any}(old_result_idx, compact.result[old_result_idx]), (compact.idx, active_bb) | |||
end | |||
|
|||
function maybe_erase_unused!(extra_worklist, compact, idx) | |||
effect_free = stmt_effect_free(compact.result[idx], compact, compact.ir.mod) | |||
function maybe_erase_unused!(extra_worklist, compact, idx, callback = x->nothing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type-signature
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Looks fine. I'll merge this and address the review in a follow up PR. |
What's the PR number of the followup? This PR re-introduced many of the performance issues common to the rest of this new code (failing to annotate types in argument signatures, using |
Still on my todo list. There were a number of correctness issues to address first. |
* Use the field index passed in in `lift_leaves` The caller has already done all the computation including bound checking. The `field` computed in this function is also affecting all the following iterations which is almost certainly wrong. * Remove unnecessary type check on `field` in `lift_leaves` since it is always `Int` * Move a branch disabling `return nothing` higher up * Remove some duplicated calculation on field index in `getfield_elim_pass!` * Fix `try_compute_fieldidx` to return `nothing` for non-`Int` `Integer` field index. This can cause `getfield_nothrow` to return incorrect result. It also gives the caller worse type info about the return value. * Teach `getfield_nothrow` that `isbits` field cannot be undefined and getfield on such field cannot throw. This is already handled in `isdefined_tfunc`. ---- Ref #26948 (fa02d34) Ref #27126 (9100329)
* Use the field index passed in in `lift_leaves` The caller has already done all the computation including bound checking. The `field` computed in this function is also affecting all the following iterations which is almost certainly wrong. * Remove unnecessary type check on `field` in `lift_leaves` since it is always `Int` * Move a branch disabling `return nothing` higher up * Remove some duplicated calculation on field index in `getfield_elim_pass!` * Fix `try_compute_fieldidx` to return `nothing` for non-`Int` `Integer` field index. This can cause `getfield_nothrow` to return incorrect result. It also gives the caller worse type info about the return value. * Teach `getfield_nothrow` that `isbits` field cannot be undefined and getfield on such field cannot throw. This is already handled in `isdefined_tfunc`. ---- Ref #26948 (fa02d34) Ref #27126 (9100329)
* Use the field index passed in in `lift_leaves` The caller has already done all the computation including bound checking. The `field` computed in this function is also affecting all the following iterations which is almost certainly wrong. * Remove unnecessary type check on `field` in `lift_leaves` since it is always `Int` * Move a branch disabling `return nothing` higher up * Remove some duplicated calculation on field index in `getfield_elim_pass!` * Fix `try_compute_fieldidx` to return `nothing` for non-`Int` `Integer` field index. This can cause `getfield_nothrow` to return incorrect result. It also gives the caller worse type info about the return value. * Teach `getfield_nothrow` that `isbits` field cannot be undefined and getfield on such field cannot throw. This is already handled in `isdefined_tfunc`. * Fix a few wrong use of `isbits` in dead branches ---- Ref #26948 (fa02d34) Ref #27126 (9100329)
* Use the field index passed in in `lift_leaves` The caller has already done all the computation including bound checking. The `field` computed in this function is also affecting all the following iterations which is almost certainly wrong. * Remove unnecessary type check on `field` in `lift_leaves` since it is always `Int` * Move a branch disabling `return nothing` higher up * Remove some duplicated calculation on field index in `getfield_elim_pass!` * Fix `try_compute_fieldidx` to return `nothing` for non-`Int` `Integer` field index. This can cause `getfield_nothrow` to return incorrect result. It also gives the caller worse type info about the return value. * Teach `getfield_nothrow` that `isbits` field cannot be undefined and getfield on such field cannot throw. This is already handled in `isdefined_tfunc`. * Fix a few wrong use of `isbits` in dead branches ---- Ref #26948 (fa02d34) Ref #27126 (9100329)
This has been there since JuliaLang#27126 (9100329) Dudging by the use of `reverse_mapping[stmt_val]` below this should either be a branch condition or an assert.
* Use the field index passed in in `lift_leaves` The caller has already done all the computation including bound checking. The `field` computed in this function is also affecting all the following iterations which is almost certainly wrong. * Remove unnecessary type check on `field` in `lift_leaves` since it is always `Int` * Move a branch disabling `return nothing` higher up * Remove some duplicated calculation on field index in `getfield_elim_pass!` * Fix `try_compute_fieldidx` to return `nothing` for non-`Int` `Integer` field index. This can cause `getfield_nothrow` to return incorrect result. It also gives the caller worse type info about the return value. * Teach `getfield_nothrow` that `isbits` field cannot be undefined and getfield on such field cannot throw. This is already handled in `isdefined_tfunc`. * Fix a few wrong use of `isbits` in dead branches ---- Ref JuliaLang#26948 (fa02d34) Ref JuliaLang#27126 (9100329)
This is a rebased and fixed version of the improved SROA pass from #26778.
There's a decent piece of new infrastructure wrapped up in this: The ability
to insert new nodes during compaction. This is a bit tricky because it requires
tracking which version of the statements buffer a given SSAValue belongs to.
At the moment this is done mostly manually, but I'm hoping to clean that up
in the future. The idea of the new SROA pass is fairly straightforward:
Given a use of an interesting value, it traces through all phi nodes, finding
all leaves, applies whatever transformation to those leaves and then re-inserts
a phi nest corresponding to the phi nest of the original value.