Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Put all CSEs into SSA #106637

Merged
merged 29 commits into from
Nov 7, 2024
Merged

JIT: Put all CSEs into SSA #106637

merged 29 commits into from
Nov 7, 2024

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Aug 19, 2024

This adds an SSA updater that can incrementally put locals that were added to the IR into SSA form and starts using this infrastructure from CSE. Previously only single-def CSEs were put into SSA, which can cause e.g. IV opts to miss out on them.

The SSA updater requires all uses and definitions to be supplied. Based on the definitions it is then possible to compute the candidate blocks for phi nodes in the usual way as the iterated dominance frontier. Since we do not have liveness for the local we do not insert the phi definitions eagerly; instead, we recursively compute the reaching def for each use by walking its dominators.
Walking the dominators we either expect to find a real definition, or to hit a block that is part of the iterated dominance frontier of the definitions, in which case we know we have a live phi definition that we can insert.

Inserting phi definitions needs to recursively do the same thing for the phi arguments. To make this faster we could memoize the reaching SSA numbers for each block, though this is not currently done.

This also requires computing the DFS tree, dominators, and dominance frontiers after RBO (which invalidates those). Since the DFS tree and dominator tree is usually computed anyway by IV opts, this is not that costly.
The cost is further reduced by avoiding any of these computations in the single-def case (which CSE already handled). This special case is also moved into the SSA updater.

Finally, since IV opts need liveness to work, the SSA inserter also computes liveness for the inserted locals. This is done by marking all paths from each use back to its reaching definition as having that local live-in.

Fix #109412

This adds an SSA updater that can incrementally put locals that were
added to the IR into SSA form and starts using this infrastructure from
CSE. It updates CSE to use this SSA updater to put all CSEs into SSA.
Previously only single-def CSEs were put into SSA, which can cause e.g.
IV opts to miss out on them.

The SSA updater requires all uses and definitions to be supplied. Based
on the definitions it is then possible to compute the candidate blocks
for phi nodes in the usual way as the iterated dominance frontier. Since
we do not have liveness for the local we do not insert the phi
definitions eagerly; instead, we recursively compute the reaching def
for each use by walking its dominators.
Walking the dominators we either expect to find a real definition, or to
hit a block that is part of the iterated dominance frontier of the
definitions, in which case we know we have a live phi definition that we
can insert.

Inserting phi definitions needs to recursively do the same thing for the
phi arguments. To make this faster we could memoize the reaching SSA
numbers for each block, though this is not currently done.

This also requires computing the DFS tree, dominators, and dominance
frontiers after RBO (which invalidates those). The DFS tree and
dominators should be reusable by IV opts in most cases, though currently
we invalidate them after CSE because assertion prop sometimes will
change the flow graph and does not know how to invalidate the DFS tree.
This can also be fixed to improve throughput.

We currently do not get much use out of this because IV opts also need
liveness for the IVs. More specifically it needs to know whether the IVs
are live out of the loop; it should be possible to keep some breadcrumbs
around for inserted locals to cheaply know this, I think.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 19, 2024
@jakobbotsch

This comment was marked as outdated.

@jakobbotsch
Copy link
Member Author

TP diffs for benchmarks.run_pgo now look like:

Base: 120833007777, Diff: 120935018667, +0.0844%

317810259  : NA       : 25.72% : +0.2630% : public: void __cdecl FlowGraphDominanceFrontiers::ComputeIteratedDominanceFrontier(struct BasicBlock *, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>> *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
162099515  : NA       : 13.12% : +0.1342% : public: static class FlowGraphDominanceFrontiers * __cdecl FlowGraphDominanceFrontiers::Build(class FlowGraphDominatorTree *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
39047720   : NA       : 3.16%  : +0.0323% : private: static void __cdecl SsaBuilder::AddNewPhiArg(class Compiler *, struct BasicBlock *, struct Statement *, struct GenTreePhi *, unsigned int, unsigned int, struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
38346141   : NA       : 3.10%  : +0.0317% : public: bool __cdecl Compiler::fgMorphBlockStmt(struct BasicBlock *, struct Statement *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
18905583   : +52.64%  : 1.53%  : +0.0156% : public: virtual void __cdecl CSE_HeuristicCommon::PerformCSE(class CSE_Candidate *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
13835238   : NA       : 1.12%  : +0.0114% : private: static struct Statement * __cdecl SsaBuilder::InsertPhi(class Compiler *, struct BasicBlock *, unsigned int)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
8753827    : +0.85%   : 0.71%  : +0.0072% : public: __cdecl `public: unsigned int __cdecl Compiler::fgRunDfs<class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_1>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_2>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_3>, 0>(class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_1>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_2>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_3>)'::`2'::<lambda_1>::operator()(struct BasicBlock *) const
7221209    : +0.28%   : 0.58%  : +0.0060% : public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
7119522    : +2.54%   : 0.58%  : +0.0059% : public: static class FlowGraphDominatorTree * __cdecl FlowGraphDominatorTree::Build(class FlowGraphDfsTree const *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
6499831    : +7.25%   : 0.53%  : +0.0054% : private: bool __cdecl jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>::ensure_capacity(unsigned __int64)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
4424130    : +12.56%  : 0.36%  : +0.0037% : private: void __cdecl JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, struct FlowEdge *, class CompAllocator, class JitHashTableBehavior>::Grow(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
3705888    : NA       : 0.30%  : +0.0031% : public: static void __cdecl SsaBuilder::InsertInSsa(class Compiler *, unsigned int, class ArrayStack<struct UseDefLocation> &, class ArrayStack<struct UseDefLocation> &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
3361499    : NA       : 0.27%  : +0.0028% : private: struct UseDefLocation __cdecl IncrementalSsaBuilder::FindOrCreateReachingDef(struct UseDefLocation const &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
2892505    : +0.42%   : 0.23%  : +0.0024% : public: unsigned short __cdecl Compiler::optAddAssertion(struct Compiler::AssertionDsc *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
2651134    : +0.69%   : 0.21%  : +0.0022% : BasicBlock::VisitAllSuccs<`AllSuccessorEnumerator::AllSuccessorEnumerator'::`2'::<lambda_1> >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
2535856    : +4.24%   : 0.21%  : +0.0021% : public: struct FlowEdge * __cdecl Compiler::BlockPredsWithEH(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
1743270    : NA       : 0.14%  : +0.0014% : public: void __cdecl IncrementalSsaBuilder::Insert(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
1693606    : +0.58%   : 0.14%  : +0.0014% : public: void __cdecl RangeCheck::MergeEdgeAssertions(unsigned int, unsigned __int64 *const &, struct Range *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
1583199    : NA       : 0.13%  : +0.0013% : private: bool __cdecl IncrementalSsaBuilder::FindReachingDefInBlock(struct UseDefLocation const &, struct BasicBlock *, struct UseDefLocation *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
1557652    : +1.89%   : 0.13%  : +0.0013% : public: void __cdecl Compiler::fgValueNumberPhiDef(struct GenTreeLclVar *, struct BasicBlock *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
1349037    : +0.75%   : 0.11%  : +0.0011% : VisitEHSuccs<0,`AllSuccessorEnumerator::AllSuccessorEnumerator'::`2'::<lambda_1> >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
1261895    : +0.82%   : 0.10%  : +0.0010% : `Compiler::optReachable'::`12'::<lambda_1>::operator()                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
-5032126   : -0.49%   : 0.41%  : -0.0042% : memset                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
-26356661  : -50.03%  : 2.13%  : -0.0218% : private: void __cdecl SsaBuilder::AddPhiArg(struct BasicBlock *, struct Statement *, struct GenTreePhi *, unsigned int, unsigned int, struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
-30940686  : -15.02%  : 2.50%  : -0.0256% : private: void __cdecl SsaBuilder::InsertPhiFunctions(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
-36618036  : -100.00% : 2.96%  : -0.0303% : public: bool __cdecl Compiler::fgMorphBlockStmt(struct BasicBlock *, struct Statement *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
-139925973 : -100.00% : 11.32% : -0.1158% : private: void __cdecl SsaBuilder::ComputeDominanceFrontiers(struct BasicBlock **, int, class JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>, class CompAllocator, class JitHashTableBehavior> *)                                                                                                                                                                                                                                                                                                                                                                                                                         
-322436319 : -100.00% : 26.09% : -0.2668% : private: void __cdecl SsaBuilder::ComputeIteratedDominanceFrontier(struct BasicBlock *, class JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>, class CompAllocator, class JitHashTableBehavior> const *, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>> *)                                                                                                                                                                                                                                                                                                                       

So looks small enough that we can take this change without much more consideration.

One thing is whether we need to memoize reaching defs in some way to avoid quadratic (or even potentially exponential) behavior. Need to think about that a bit.

@jakobbotsch
Copy link
Member Author

benchmarks.run_pgo seems to be one of the least affected collections. The most affected one is benchmarks.run. With some manual reconciliation of renamed functions in the diff, the detailed TP diff for it looks like:

Base: 40469192366, Diff: 40551805661, +0.2041%

16221652 : +33.98% : 13.58% : +0.0401% : private: void __cdecl SsaBuilder::ComputeDominanceFrontiers(struct BasicBlock **, int, class JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>, class CompAllocator, class JitHashTableBehavior> *)                                                                                                                                                                                                                                                                                                                                                                                                                         
14025772 : +48.54% : 11.74% : +0.0347% : public: virtual void __cdecl CSE_HeuristicCommon::PerformCSE(class CSE_Candidate *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
11802368 : NA      : 9.88%  : +0.0292% : private: static void __cdecl SsaBuilder::AddNewPhiArg(class Compiler *, struct BasicBlock *, struct Statement *, struct GenTreePhi *, unsigned int, unsigned int, struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
6227765  : +2.88%  : 5.21%  : +0.0154% : public: __cdecl `public: unsigned int __cdecl Compiler::fgRunDfs<class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_1>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_2>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_3>, 0>(class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_1>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_2>, class `public: class FlowGraphDfsTree * __cdecl Compiler::fgComputeDfs<0>(void)'::`2'::<lambda_3>)'::`2'::<lambda_1>::operator()(struct BasicBlock *) const
4914583  : +0.60%  : 4.11%  : +0.0121% : public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
4651869  : +6.52%  : 3.89%  : +0.0115% : public: static class FlowGraphDominatorTree * __cdecl FlowGraphDominatorTree::Build(class FlowGraphDfsTree const *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
4369172  : +17.68% : 3.66%  : +0.0108% : private: bool __cdecl jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>::ensure_capacity(unsigned __int64)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
3469086  : NA      : 2.90%  : +0.0086% : private: static struct Statement * __cdecl SsaBuilder::InsertPhi(class Compiler *, struct BasicBlock *, unsigned int)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
2793726  : NA      : 2.34%  : +0.0069% : public: static void __cdecl SsaBuilder::InsertInSsa(class Compiler *, unsigned int, class ArrayStack<struct UseDefLocation> &, class ArrayStack<struct UseDefLocation> &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
2747882  : NA      : 2.30%  : +0.0068% : private: struct UseDefLocation __cdecl IncrementalSsaBuilder::FindOrCreateReachingDef(struct UseDefLocation const &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
2745600  : +25.93% : 2.30%  : +0.0068% : private: void __cdecl JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, struct FlowEdge *, class CompAllocator, class JitHashTableBehavior>::Grow(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
2538778  : +2.19%  : 2.13%  : +0.0063% : public: void __cdecl RangeCheck::MergeEdgeAssertions(unsigned int, unsigned __int64 *const &, struct Range *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
2129497  : +12.45% : 1.78%  : +0.0053% : public: struct FlowEdge * __cdecl Compiler::BlockPredsWithEH(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
1869746  : +2.27%  : 1.57%  : +0.0046% : BasicBlock::VisitAllSuccs<`AllSuccessorEnumerator::AllSuccessorEnumerator'::`2'::<lambda_1> >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
1822996  : NA      : 1.53%  : +0.0045% : private: bool __cdecl IncrementalSsaBuilder::FindReachingDefInBlock(struct UseDefLocation const &, struct BasicBlock *, struct UseDefLocation *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
1790224  : NA      : 1.50%  : +0.0044% : public: void __cdecl IncrementalSsaBuilder::Insert(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
1522909  : +0.56%  : 1.28%  : +0.0038% : public: unsigned short __cdecl Compiler::optAddAssertion(struct Compiler::AssertionDsc *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
1018306  : +2.48%  : 0.85%  : +0.0025% : VisitEHSuccs<0,`AllSuccessorEnumerator::AllSuccessorEnumerator'::`2'::<lambda_1> >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
891863   : +2.70%  : 0.75%  : +0.0022% : public: struct Range __cdecl RangeCheck::ComputeRange(struct BasicBlock *, struct GenTree *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
790041   : +4.09%  : 0.66%  : +0.0020% : public: bool __cdecl Compiler::fgMorphBlockStmt(struct BasicBlock *, struct Statement *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
717239   : +6.03%  : 0.60%  : +0.0018% : public: struct FlowEdge * __cdecl Compiler::BlockDominancePreds(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
595655   : +0.18%  : 0.50%  : +0.0015% : memset                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
573612   : +2.48%  : 0.48%  : +0.0014% : public: void __cdecl Compiler::fgValueNumberPhiDef(struct GenTreeLclVar *, struct BasicBlock *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
551275   : +31.48% : 0.46%  : +0.0014% : public: struct EHblkDsc * __cdecl Compiler::ehGetBlockExnFlowDsc(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
383941   : +0.44%  : 0.32%  : +0.0009% : public: void __cdecl Compiler::compInit(class ArenaAllocator *, struct CORINFO_METHOD_STRUCT_*, class ICorJitInfo *, struct CORINFO_METHOD_INFO *, struct InlineInfo *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
374654   : +2.29%  : 0.31%  : +0.0009% : public: struct Scev * __cdecl ScalarEvolutionContext::Simplify(struct Scev *, struct SimplificationAssumptions const &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
367471   : +0.29%  : 0.31%  : +0.0009% : private: void __cdecl SsaBuilder::BlockRenameVariables(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
312764   : +2.04%  : 0.26%  : +0.0008% : public: bool __cdecl JitHashTable<struct GenTree *, struct JitPtrKeyFuncs<struct GenTree>, struct Scev *, class CompAllocator, class JitHashTableBehavior>::Set(struct GenTree *, struct Scev *, enum JitHashTable<struct GenTree *, struct JitPtrKeyFuncs<struct GenTree>, struct Scev *, class CompAllocator, class JitHashTableBehavior>::SetKind)                                                                                                                                                                                                                                                                                                                                                                                         
304582   : +0.07%  : 0.26%  : +0.0008% : public: struct GenTree * __cdecl Compiler::optAssertionProp_LclVar(unsigned __int64 *const &, struct GenTreeLclVarCommon *, struct Statement *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
297432   : +4.41%  : 0.25%  : +0.0007% : public: enum RelopEvaluationResult __cdecl ScalarEvolutionContext::EvaluateRelop(unsigned int)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
281935   : +0.55%  : 0.24%  : +0.0007% : private: void __cdecl JitHashTable<struct CORINFO_FIELD_STRUCT_*, struct JitPtrKeyFuncs<struct CORINFO_FIELD_STRUCT_>, class FieldSeq, class CompAllocator, class JitHashTableBehavior>::Grow(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
254204   : +2.33%  : 0.21%  : +0.0006% : public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::ClearD(struct BitVecTraits *, unsigned __int64 *&)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
221830   : +1.38%  : 0.19%  : +0.0005% : private: void __cdecl SsaDefArray<class LclSsaVarDsc>::GrowArray(class CompAllocator)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
220372   : +0.14%  : 0.18%  : +0.0005% : public: enum PhaseStatus __cdecl Compiler::optAssertionPropMain(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
197369   : +3.85%  : 0.17%  : +0.0005% : public: bool __cdecl RangeCheck::ComputeDoesOverflow(struct BasicBlock *, struct GenTree *, struct Range const &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
194313   : +0.26%  : 0.16%  : +0.0005% : public: void __cdecl Compiler::optImpliedAssertions(unsigned short, unsigned __int64 *&)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
186496   : +2.80%  : 0.16%  : +0.0005% : public: struct Range __cdecl RangeCheck::GetRange(struct BasicBlock *, struct GenTree *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
178680   : +0.22%  : 0.15%  : +0.0004% : private: void __cdecl SsaBuilder::ComputeIteratedDominanceFrontier(struct BasicBlock *, class JitHashTable<struct BasicBlock *, struct JitPtrKeyFuncs<struct BasicBlock>, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>>, class CompAllocator, class JitHashTableBehavior> const *, class jitstd::vector<struct BasicBlock *, class jitstd::allocator<struct BasicBlock *>> *)                                                                                                                                                                                                                                                                                                                       
175067   : +0.24%  : 0.15%  : +0.0004% : public: void __cdecl DataFlow::ForwardAnalysis<class AssertionPropFlowCallback>(class AssertionPropFlowCallback &)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
174825   : +0.88%  : 0.15%  : +0.0004% : public: void __cdecl ValueNumStore::GetConstantBoundInfo(unsigned int, struct ValueNumStore::ConstantBoundInfo *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
172806   : +0.31%  : 0.14%  : +0.0004% : protected: void __cdecl Compiler::compCompile(void **, unsigned int *, class JitFlags *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
172736   : +2.73%  : 0.14%  : +0.0004% : public: void __cdecl RangeCheck::MergeEdgeAssertions(struct GenTreeLclVarCommon *, unsigned __int64 *const &, struct Range *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
164132   : +3.16%  : 0.14%  : +0.0004% : public: static struct Range __cdecl RangeOps::Merge(struct Range const &, struct Range const &, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
162139   : +3.63%  : 0.14%  : +0.0004% : public: struct Range __cdecl RangeCheck::ComputeRangeForBinOp(struct BasicBlock *, struct GenTreeOp *, bool)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
155473   : +0.24%  : 0.13%  : +0.0004% : public: void __cdecl Compiler::optImpliedByTypeOfAssertions(unsigned __int64 *&)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
150968   : +0.33%  : 0.13%  : +0.0004% : BasicBlock::VisitAllSuccs<`Compiler::optReachable'::`12'::<lambda_1> >                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
147163   : +0.33%  : 0.12%  : +0.0004% : public: unsigned short __cdecl Compiler::optGlobalAssertionIsEqualOrNotEqualZero(unsigned __int64 *const &, struct GenTree *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
146382   : +0.76%  : 0.12%  : +0.0004% : public: enum PhaseStatus __cdecl Compiler::optVNBasedDeadStoreRemoval(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
139613   : +0.29%  : 0.12%  : +0.0003% : `Compiler::optReachable'::`12'::<lambda_1>::operator()                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
138149   : +0.06%  : 0.12%  : +0.0003% : public: unsigned int __cdecl ValueNumStore::VNForFunc(enum var_types, enum VNFunc, unsigned int, unsigned int)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
136733   : +0.65%  : 0.11%  : +0.0003% : `LoopLocalOccurrences::GetOrCreateMap'::`2'::<lambda_1>::operator()                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
130909   : +0.04%  : 0.11%  : +0.0003% : public: unsigned int * __cdecl JitHashTable<struct ValueNumStore::VNDefFuncApp<2>, struct ValueNumStore::VNDefFuncAppKeyFuncs<2>, unsigned int, class CompAllocator, class JitHashTableBehavior>::LookupPointerOrAdd(struct ValueNumStore::VNDefFuncApp<2>, unsigned int)                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
124058   : +0.16%  : 0.10%  : +0.0003% : public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::IntersectionD(struct BitVecTraits *, unsigned __int64 *&, unsigned __int64 *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
-374413  : -3.47%  : 0.31%  : -0.0009% : public: enum PhaseStatus __cdecl Compiler::fgSsaBuild(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
-7939897 : -26.44% : 6.65%  : -0.0196% : private: void __cdecl SsaBuilder::AddPhiArg(struct BasicBlock *, struct Statement *, struct GenTreePhi *, unsigned int, unsigned int, struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
-9406160 : -14.99% : 7.88%  : -0.0232% : private: void __cdecl SsaBuilder::InsertPhiFunctions(void)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    

@jakobbotsch
Copy link
Member Author

This seems to also break strength reduction sometimes, need to investigate that.

image

// Remarks:
// Recall that the dominance frontier of a block B is the set of blocks
// B3 such that there exists some B2 s.t. B3 is a successor of B2, and
// B dominates B2. Note that this dominance need not be strict -- B2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// B dominates B2. Note that this dominance need not be strict -- B2
// B dominates B2 but not B3. Note that this dominance need not be strict -- B2

return reachingDef;
}

if (BitVecOps::IsMember(&m_poTraits, m_iteratedDominanceFrontiers, dom->bbPostorderNum))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like if you are worried about worst-case costs and can be allowed to fail to put a local into SSA, you could see if there is any particularly large IDF and just punt if there is one (maybe?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also maybe only handle the two-def case (not sure we ever really see more than two defs for a CSE)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like if you are worried about worst-case costs and can be allowed to fail to put a local into SSA, you could see if there is any particularly large IDF and just punt if there is one (maybe?).

Yeah that's a good point, the number of recursive calls is bounded by the IDF size. I just dislike making the updater partial in this way...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with your suggestion and skip the insertion into SSA if the IDF size is > 100. However, I didn't see a need to restrict the number of defs we allow in the inserter.

@jakobbotsch
Copy link
Member Author

This seems to also break strength reduction sometimes, need to investigate that.

The problem here was the fact that I added the lvTracked check in optLocalHasNonLoopUses. I did that separately in #109505, to take those regressions separately (this was a correctness fix). To resolve those regressions I have then added the necessary liveness computation to this PR: that is, when inserting a local into SSA we now also compute the set of blocks in which that local is live-in, which is then stored in a side table. The cost of that computation does not look to be too bad -- overall TP cost of SSA insertion + liveness computation is around 0.2% for win-x64, with some collections being outliers in both directions. I think that cost is acceptable given that this gets us a fair amount of new strength reductions.

Still have some other regressions to investigate.

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS -- this now computes liveness for SSA-inserted locals, see my comment above. Can you review those new parts?

Diffs.

I went through a bunch of ASP.NET regressions and they fell into two categories:

  • Some regressions are because of new eliminated comparisons in assertion prop, which can lead to diferent register allocation due to fewer uses of those locals, and this can end up being a regression overall
  • Other regressions happen when IVs created by IV opts do not get a register allocated (nothing new here)

Sometimes strength reduction is also a size regression while being a perfscore regression, but there is nothing new there.

Some stats for number of strength reductions in win-x64 collections with this PR:
aspnet: 1710 -> 1819 (+6.4%)
benchmarks.run_pgo: 2177 -> 2252 (+3.4%)
libraries_tests.run: 10483 -> 11115 (+6.0%) with 74 new misses
realworld: 471 -> 523 (+11%)

@jakobbotsch jakobbotsch marked this pull request as ready for review November 6, 2024 16:32
@jakobbotsch
Copy link
Member Author

This seems to be more expensive for arm64, likely because there are more CSEs that are now having their liveness computed.
One thing is that we only very rarely need that liveness, for IV opts only, so if we could only compute it at that point it would be better. But it would require us to keep the list of reaching defs/uses up to date between CSE and IV opts, while currently we only need to keep the liveness information up to date. Probably not impossible, but seems impractical given that assertion prop runs there. An alternative would be to find some way to predict whether we'll need the liveness, but I also didn't see a simple way of doing that from CSE.

m_queue.Reset();
m_queue.Push(use.Block);

while (!m_queue.Empty())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a check that you actually do find the def block, and/or don't try to search past the first block? (that is, def dominates use)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we necessarily find the reaching def (we might hit a block marked live-in already due to another use), but makes sense to add an assert that we don't get to the first block. I added that assert into AddInsertedSsaLiveIn since it never makes sense to mark these added locals as live into the first BB.

@jakobbotsch jakobbotsch merged commit 9c1f53e into dotnet:main Nov 7, 2024
108 checks passed
@jakobbotsch jakobbotsch deleted the ssa-updating branch November 7, 2024 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: Cloned loop not fully strength reduced on arm64
2 participants