[AIEX] Add WAW Sticky Register dep. mutator #183
QoR results (updated on 01.09):
The regression in MulAttributeBroadcasting_aie2_bf16_0 is related to slightly more spilling inside the innermost loop.
@@ -47,6 +47,11 @@ struct AIEBaseRegisterInfo : public TargetRegisterInfo {
  bool isSimplifiableReservedReg(MCRegister PhysReg) const override {
    return false;
  }

  // Whether a reserved register is sticky
Perhaps add a definition of 'sticky' here. In particular, I think that the stickiness is a priori a property of an operand in an instruction. A reset of a flag is a non-sticky operation.
static cl::opt<unsigned> WAWStickyRegistersMemOpsThreshold(
    "aie-waw-sticky-register-mem-threshold", cl::Hidden, cl::init(4),
    cl::desc("Number of memory instructions to enable the register exclusion "
             "heuristic in WAW sticky registers dep. removal"));

// These are debugging/testing options.
nit: the new ones are also testing options
auto *RI = static_cast<const AIEBaseRegisterInfo *>(TRI);
const MachineBasicBlock *MBB = DAG->getBB();

if (MBB) {
Could you unindent this with an early return, i.e. if (!MBB) return;?
BitVector AllRegs(RI->getNumRegs());
AllRegs.reset();
// Here, we analyze which sticky registers are explicitly redefined
// of read. We also track all instructions implicitly reading or
nit: or read
// The first thing to test is the tuning parameter: we only consider
// cases where the number of memory ops are <= the threshold.
if (MIs.size() <= WAWStickyRegistersMemOpsThreshold &&
    ((all_of(MIs, IsLoad) || all_of(MIs, IsStore))))
Hmm. I guess you only need to remember NumLoads and NumStores to evaluate this condition
Since I am interested in cases where all instructions must be loads or stores (per sticky register), I would need a per-register set of counters. I think the code is a bit clearer this way.
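For illustration only, here is a minimal sketch of the counter-based alternative discussed above, written against plain C++ rather than LLVM. The names `RegId`, `MemKind`, `MemOpCounts` and `PerRegCounts` are hypothetical and do not come from the PR. It evaluates the same condition as the snippet under review (size within the threshold, and all loads or all stores) from three counters per register.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-ins for the real LLVM types; not the PR's code.
using RegId = unsigned;
enum class MemKind { Load, Store, Other };

// Counters kept per sticky register instead of a list of instructions.
struct MemOpCounts {
  std::size_t NumLoads = 0;
  std::size_t NumStores = 0;
  std::size_t NumOthers = 0;

  void record(MemKind Kind) {
    switch (Kind) {
    case MemKind::Load:
      ++NumLoads;
      break;
    case MemKind::Store:
      ++NumStores;
      break;
    case MemKind::Other:
      ++NumOthers;
      break;
    }
  }

  std::size_t total() const { return NumLoads + NumStores + NumOthers; }

  // Same shape as: size() <= Threshold && (all loads || all stores).
  // An empty set satisfies both "all" checks, as all_of does on an empty range.
  bool isHomogeneousWithin(std::size_t Threshold) const {
    const std::size_t Total = total();
    return Total <= Threshold && (NumLoads == Total || NumStores == Total);
  }
};

// One set of counters per register, as noted in the reply above.
using PerRegCounts = std::map<RegId, MemOpCounts>;
```

The trade-off is the one raised in the thread: the counters use less memory than a per-register instruction list, but they still have to be kept per register.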
continue;

if (!AllRegs.test(Reg))
  SU.removePred(Dep);
Doesn't this invalidate the iterator for Dep?
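To make the concern concrete, the sketch below uses a toy node whose predecessor list is a `std::vector`, where erasing an element invalidates iterators at or after the erased position. The types `Edge` and `Node` and the helper `removeOutputDepsOn` are hypothetical and only model the situation; whether the loop in the PR is affected depends on how it iterates over the real predecessor list. One common way to sidestep the question is to collect the edges first and remove them in a second pass.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a scheduling DAG edge and node; not LLVM's types.
struct Edge {
  int PredId;    // id of the predecessor node
  unsigned Reg;  // register carried by the dependency
  bool IsOutput; // true for an output (WAW) dependency
};

inline bool operator==(const Edge &A, const Edge &B) {
  return A.PredId == B.PredId && A.Reg == B.Reg && A.IsOutput == B.IsOutput;
}

struct Node {
  std::vector<Edge> Preds;

  // Erases the first matching edge. std::vector::erase invalidates iterators
  // at or after the erased position, so this must not run inside a loop that
  // is still walking Preds.
  void removePred(const Edge &E) {
    auto It = std::find(Preds.begin(), Preds.end(), E);
    if (It != Preds.end())
      Preds.erase(It);
  }
};

// Two-pass removal: first collect copies of the edges to drop, then remove
// them once the iteration over Preds has finished.
std::size_t removeOutputDepsOn(Node &N, unsigned Reg) {
  std::vector<Edge> ToRemove;
  for (const Edge &E : N.Preds)
    if (E.IsOutput && E.Reg == Reg)
      ToRemove.push_back(E);
  for (const Edge &E : ToRemove)
    N.removePred(E);
  return ToRemove.size();
}
```

The extra vector costs a few copies, but the removal loop no longer depends on the iteration order or stability of the container being modified.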
continue;

if (!AllRegs.test(Reg))
  SU.removePred(Dep);
I never know. Does this also remove the successor of Dep's SU?
According to the docs, it removes the specified edge as a pred of the current node if it exists. It also removes the current node as a successor of the specified node.
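The behavior quoted from the docs can be pictured with a toy graph in which every edge is stored twice, once in the predecessor list of its target and once in the successor list of its source. The `ToyNode` type below is a hypothetical model written for this note, not LLVM's `SUnit`. It only illustrates why a single `removePred` call has to update both lists to keep the graph consistent.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical toy node; each edge is recorded on both of its endpoints.
struct ToyNode {
  int Id;
  std::vector<ToyNode *> Preds;
  std::vector<ToyNode *> Succs;

  // Adds P as a predecessor of this node and this node as a successor of P.
  void addPred(ToyNode &P) {
    Preds.push_back(&P);
    P.Succs.push_back(this);
  }

  // Removes P as a predecessor of this node if the edge exists, and removes
  // this node from P's successor list so both sides stay in sync.
  void removePred(ToyNode &P) {
    auto It = std::find(Preds.begin(), Preds.end(), &P);
    if (It == Preds.end())
      return; // no such edge, nothing to do
    Preds.erase(It);
    auto SuccIt = std::find(P.Succs.begin(), P.Succs.end(), this);
    if (SuccIt != P.Succs.end())
      P.Succs.erase(SuccIt);
  }
};
```

In this model, calling `B.removePred(A)` after `B.addPred(A)` leaves both `B.Preds` and `A.Succs` empty, and a second call is a no-op.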
3a7c26b
to
5558c5e
Compare
@@ -584,6 +679,8 @@ AIEBaseSubtarget::getSMSMutationsImpl(const Triple &TT) {
  std::vector<std::unique_ptr<ScheduleDAGMutation>> Mutations;
  if (!TT.isAIE1()) {
    Mutations.emplace_back(std::make_unique<SWPWAWEdges>());
    if (EnableWAWStickyRegisters)
      Mutations.emplace_back(std::make_unique<WAWStickyRegistersEdges>());
Would it make sense to enable that mutation in other schedulers?
I observed no effect in the other schedulers. Maybe we can keep it off there to save time, and turn it on only if we see an opportunity in Post SWP.
// or read. We also track all instructions implicitly reading or
// defining such registers.
std::map<const Register, SmallVector<const MachineInstr *, 16>> RegMIsMap;
for (const MachineInstr &MI : *MBB) {
Nit: I think we only need to iterate between RegionBegin and RegionEnd?
; RUN: llc -O2 -mtriple=aie2 -aie-pipeliner-waw-sticky-registers=false \
; RUN: %s -o - | FileCheck %s --check-prefix=WAW-STICKY-OFF
; RUN: llc -O2 -mtriple=aie2 -debug-only=pipeliner %s -o - 2>&1 > /dev/null \
; RUN: | FileCheck %s --check-prefix=WAW-STICKY-ON-PIPELINER
Nit: I'd rather have a single set of check lines (with the new combiner on), otherwise it makes the test really long
I reduced the test.
cd4b00d
to
e302e40
Compare
@gbossu and @martien-de-jong, I updated the PR and also rebased. The QoR results are updated.
!13 = !{!"llvm.loop.mustprogress"}
!14 = !{!"llvm.loop.itercount.range", i64 4}
;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
; WAW-STICKY-ON-PIPELINER: {{.*}}
Can the check lines still be auto generated?
I think not, sadly! If you agree, I can keep just the instructions and drop the pipeliner info.
I simplified the test! Now it can be ;-)
default:
  return false;
}
}
Could this just be return AIE2::mSRmRegClass.contains(PhysReg)?
Yes, if we can include srCarry. Semantically we can, because we detect register reads. However, if I remember correctly, including it is another case where the extra freedom leads to unexpected behavior.
// VST.CONV ...
// In this case, by removing dependencies between pairs of VST.CONVs,
// we give too much freedom to the scheduler to do good, but also
// not good choices. In this way, we filter those cases off.
This is really sad that giving more freedom is bad :( Do you have an example of where things go wrong?
I expected that mutator to be a simple iteration over output dependencies, and if the output dep is on a sticky reg, just remove it.
Yes, I was expecting it to be simple as well, but the results were frustrating. I will prepare a run with the heuristic turned off.
Hi @gbossu, QoR results without the heuristic part:
|--------------------------|---------------|-----------------------------|--------------------------------------|--------------|--------------------|--------------------|----------------|----------------------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|----------------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|------
---------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|-----
-------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|--------------|--------------|---------------|---------------|------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|--------------|--------------------|-----------
-----|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|------------------------------------|--------------------------------------|--------------|----------------------------|-----------------------------|----------------|-----------------|--------------|-------------------------|--------------|-------------------------|--------------|-------------------|-----------------------------|-------------------------------|------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Core_Compute_Cycle_Count | Neg_aie2_1 | HardSigmoidTemplated_int8_0 | MulAttributeBroadcasting_aie2_bf16_0 | Add2D_0 | Add2D_Standalone_1 | Add2D_Standalone_0 | MulBf16_aie2_0 | MulBroadcastingBf16_aie2_0 | Abs_bf16_0 | Abs_int8_0 | Add2D_1 | Add2D_bf16_0 | Add2D_bf16_1 | AddAttributeBroadcasting_aie2_int8 | AddBroadcasting_aie2_0 | Add_aie2_0 | ArgMax1d_bf16_0 | ArgMax1d_int8_0 | ArgMin1d_bf16_0 | ArgMin1d_int8_0 | AvgPool2D_0 | AvgPool2D_1 | AvgPool2D_aie2_bfloat16_0 | AvgPool2D_aie2_bfloat16_1 | AvgPool2D_aie2_int8_0 | AvgPool2D_aie2_int8_1 | BatchNorm1d_aie2_bfloat16 | BatchNorm1d_aie2_int8 | BatchNorm2D_0 | BatchNorm2D_1 | BilinearInterpolation_0 | BilinearInterpolation_1 | BitShift_AIE2_int8 | BitwiseAnd_int8_0 | BitwiseNot_aie2_0 | BitwiseOr_int8_0 | BitwiseXor_aie2_int8 | Cast_aie2_bfloat16 | Cast_aie2_bfloat16_1 | Cast_aie2_int8_1 | Ceil_AIE2_bfloat16 | Ceil_AIE2_int8 | ChannelsFirstFlatten_bf16_0 | ChannelsFirstFlatten_int8_0 | Clip_aie2_bf16 | Clip_aie2_int8 | CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_bfloat16_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2_ptr_interface | Conv1D_DW_AIE2_bf16_0 | Conv1D_DW_AIE2_bf16_1 | Conv1D_DW_AIE2_int8_0 | Conv1D_DW_AIE2_int8_1 | Conv2D_0 | Conv2D_1 | Conv2D_7x7s2_Layer1_0 | Conv2D_7x7s2_Layer1_1 | Conv2D_11x11s4_0 | Conv2D_11x11s4_1 | Conv2D_11x11s4_Layer1_0 | Conv2D_11x11s4_Layer1_1 | Conv2D_DW_0 | Conv2D_DW_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_FC_0 | Conv2D_FC_1 | Conv2D_LReLU_0 | Conv2D_LReLU_1 | Conv2D_ReLU_0 | Conv2D_ReLU_1 | Conv2D_ReLU_Standalone_0 | Conv2D_ReLU_Standalone_1 | Conv2D_ReLU_int8_0 | Conv2D_ReLU_int8_1 | Conv2D_SV60 | Conv2D_Transpose_AIE2_0 | Conv2D_Transpose_AIE2_1 | Conv2D_bf16_0 | Conv2D_bf16_1 | Conv2D_mixed_batch_0 | 
Conv2D_mixed_batch_1 | DegroupG4_aie2_bf16_0 | DegroupG4_aie2_bf16_1 | DegroupG4_aie2_int8_0 | DegroupG4_aie2_int8_1 | DegroupG8_aie2_bf16_0 | DegroupG8_aie2_bf16_1 | DegroupG8_aie2_int8_0 | DegroupG8_aie2_int8_1 | DilatedConv2D_1 | DivAttributeBroadcasting_aie2_bf16_0 | DivAttributeBroadcasting_aie2_int8_0 | DivBroadcasting_aie2_0 | DivBroadcasting_aie2_1 | EleMax_aie2_bfloat16 | EleMax_aie2_int8 | EleMin_aie2_bfloat16 | EleMin_aie2_int8 | ElemDiv_aie2_0 | ElemDiv_aie2_1 | Elu_aie2_bf16_0 | Elu_aie2_int8_0 | Erf_aie2_bf16_0 | Erf_aie2_int8_0 | Erf_aie2_int8_0_ptr_interface | Exp_bf16_0 | Exp_bf16_1 | Expand_aie2_bfloat16 | Expand_aie2_int8 | Floor_aie2_0 | Floor_aie2_1 | FullyConnect_aie2_bf16 | FullyConnect_aie2_int8 | GELU_0 | GELU_1 | GEMM_bf16_0 | GEMM_bf16_1 | GEMM_int8_0 | GEMM_int8_1 | GEMV_0 | GEMV_1 | GeluTemplated_aie2_bf16 | GeluTemplated_aie2_int8 | GroupG4_aie2_bf16_0 | GroupG4_aie2_bf16_1 | GroupG4_aie2_int8_0 | GroupG4_aie2_int8_1 | GroupG8_aie2_bf16_0 | GroupG8_aie2_bf16_1 | GroupG8_aie2_int8_0 | GroupG8_aie2_int8_1 | Group_Conv2D_0 | Group_Conv2D_1 | HardSigmoid_bf16_0 | HardSigmoid_bf16_1 | HardSigmoid_int8_0 | HardSigmoid_int8_1 | HardswishAsHardsigmoid_aie2_0 | Hardswish_aie2_0 | InstanceNormPart1_aie2_bf16_0 | InstanceNormPart1_aie2_int8_0 | InstanceNormPart2_aie2_bf16_0 | InstanceNormPart2_aie2_int8_0 | InterpolateLinear1D_AIE2_bfloat16 | InterpolateLinear1D_AIE2_int8 | LayerNormC8Part1_aie2_bf16_0 | LayerNormC8Part1_aie2_int8_0 | LayerNormC8Part2_aie2_bf16_0 | LayerNormC8Part2_aie2_int8_0 | LayerNorm_0 | LayerNorm_1 | Log_bf16_0 | Log_int8_0 | LogicalNot_aie2_0 | LogicalXor_aie2_int8 | MaxPool2D_0 | MaxPool2D_1 | Mish_aie2_bfloat16 | Mish_aie2_int8 | Mul2d_bf16_0 | Mul2d_bf16_1 | MulAttributeBroadcasting_aie2_int8_0 | MulBroadcasting_aie2_0 | Mul_aie2_0 | Neg_aie2_0 | Pad2D_0 | Pad2D_1 | Pad3D_AIE2_bfloat16 | Pad3D_AIE2_int8 | PixelShuffle_aie2_bf16 | PixelShuffle_aie2_int8 | PixelUnshuffle_bf16_0 | PixelUnshuffle_int8_0 | 
PowAttributeBroadcasting_aie2_bf16_0 | PowAttributeBroadcasting_aie2_int8_0 | Pow_bf16_0 | Pow_int8_0 | Range_bfloat16_aie2_0 | Range_bfloat16_aie2_1 | Range_int8_aie2_0 | Range_int8_aie2_1 | Reciprocal_aie2_0 | ReduceMax_bf16_0 | ReduceMax_bf16_1 | ReduceMax_int8_0 | ReduceMax_int8_1 | ReduceMeanAxis_1_aie2_bf16 | ReduceMeanAxis_1_aie2_int8 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_2_aie2_int8 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_3_aie2_int8 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_4_aie2_int8 | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_5_aie2_int8 | ReduceMeanAxis_6_aie2_bf16 | ReduceMeanAxis_6_aie2_int8 | ReduceMeanAxis_7_aie2_bf16 | ReduceMeanAxis_7_aie2_int8 | ReduceMeanNoc8_AIE2_bfloat16 | ReduceMeanNoc8_AIE2_int8 | ReduceMin1D_aie2_bf16 | ReduceMin1D_aie2_int8 | ReduceMin_bf16_0 | ReduceMin_bf16_1 | ReduceMin_int8_0 | ReduceMin_int8_1 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_1_aie2_int8 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_2_aie2_int8 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_3_aie2_int8 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_4_aie2_int8 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_5_aie2_int8 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_6_aie2_int8 | ReduceSumAxis_7_aie2_bf16 | ReduceSumAxis_7_aie2_int8 | ReduceSum_bf16_0 | ReduceSum_bf16_1 | ReduceSum_int8_0 | ReduceSum_int8_1 | Round_aie2_0 | Round_aie2_1 | Rsqrt_aie2_bf16_0 | Rsqrt_aie2_int8_0 | Scale_Add_bf16_0 | Scale_Add_bf16_1 | Select_aie2_bf16 | Select_aie2_int8 | Shrink_aie2_0 | Shrink_aie2_1 | SiLU_aie2_bf16 | SiLU_aie2_int8 | SiLU_aie2_int8_1 | SigmoidTemplated_int8_0 | SigmoidTemplated_int8_1 | Sigmoid_int8_0 | Sigmoid_int8_1 | Sign_bf16_0 | Sign_bf16_1 | Sign_int8_0 | Sign_int8_1 | Sin_aie2_bf16 | Sin_aie2_int8 | Slice_bfloat16_0 | Slice_int8_0 | Softmax_1 | Softmax_bf16_0 | Softmax_bf16_1 | Sqrt_bf16_0 | Sqrt_bf16_1 | Sqrt_int8_0 | Sqrt_int8_1 | Squeeze_bfloat16_0 | Squeeze_int8_0 | SubAttributeBroadcasting_aie2_int8_0 | SubBroadcasting_aie2_int8_0 | 
SubBroadcasting_aie2_int8_0_ptr_interface | Sub_aie2_int8_0 | Sub_aie2_int8_0_ptr_interface | Tanh_0 | Tanh_1 | Tile_aie2_bf16_0 | Tile_aie2_int8_1 | Topk1D_bf16_0 | Topk1D_bf16_1 | Topk1D_int8_0 | Topk1D_int8_1 | Topk2D_bf16_0 | Topk2D_bf16_1 | Topk2D_int8_0 | Topk2D_int8_1 | Transpose_aie2_bf16_021 | Transpose_aie2_bf16_021_pad | Transpose_aie2_bf16_102 | Transpose_aie2_bf16_102_pad | Transpose_aie2_bf16_120 | Transpose_aie2_bf16_120_pad | Transpose_aie2_bf16_201 | Transpose_aie2_bf16_201_pad | Transpose_aie2_bf16_210 | Transpose_aie2_bf16_210_pad | Transpose_aie2_int8_021 | Transpose_aie2_int8_021_pad | Transpose_aie2_int8_102 | Transpose_aie2_int8_102_pad | Transpose_aie2_int8_120 | Transpose_aie2_int8_120_pad | Transpose_aie2_int8_201 | Transpose_aie2_int8_201_pad | Transpose_aie2_int8_210 | Transpose_aie2_int8_210_pad | bfloat16 | int8 | Sigmoid_bf16_0 | Sigmoid_bf16_1 | Requantize_0 | AddAttributeBroadcasting_aie2_bf16 | SubAttributeBroadcasting_aie2_bf16_0 | Requantize_1 | AddBroadcastingBf16_aie2_0 | SubBroadcasting_aie2_bf16_0 | AddBf16_aie2_0 | Sub_aie2_bf16_0 | Scale_Add_0 | SigmoidTemplated_bf16_0 | Tanh_int8_0 | TanhTemplated_aie2_int8 | Tanh_int8_1 | Reciprocal_aie2_1 | HardSigmoidTemplated_bf16_0 | HardswishAsHardsigmoid_aie2_1 | Hardswish_aie2_1 | Mul2D_0 | Mul2D_1 | Rescale_aie2_int8_0 | TanhTemplated_aie2_bfloat16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 | Quantile #4 | Quantile #5 | Quantile #6 | Quantile #7 | Quantile #8 | Quantile #9 |
|--------------------------|---------------|-----------------------------|--------------------------------------|--------------|--------------------|--------------------|----------------|----------------------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|----------------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|------
---------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|-----
-------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|--------------|--------------|---------------|---------------|------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|--------------|--------------------|-----------
-----|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|------------------------------------|--------------------------------------|--------------|----------------------------|-----------------------------|----------------|-----------------|--------------|-------------------------|--------------|-------------------------|--------------|-------------------|-----------------------------|-------------------------------|------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Before | 343 | 257 | 1415 | 215 | 480 | 320 | 1076 | 1131 | 407 | 510 | 466 | 257 | 311 | 919 | 887 | 832 | 351 | 409 | 380 | 303 | 1068 | 780 | 3271 | 2263 | 1068 | 780 | 716 | 460 | 386 | 735 | 724 | 378 | 2008 | 482 | 215 | 482 | 720 | 923 | 923 | 794 | 1413 | 446 | 13603 | 14379 | 204 | 271 | 1627 | 987 | 1629 | 992 | 1624 | 987 | 987 | 3466 | 4025 | 1525 | 1748 | 7973 | 2483 | 5891 | 1617 | 5427 | 5426 | 4280 | 2983 | 2949 | 861 | 1204 | 3921 | 2702 | 1185 | 2180 | 5268 | 1281 | 27957 | 1281 | 2537 | 10149 | 930 | 863 | 39065 | 10453 | 25433 | 39790 | 11106 | 22356 | 605 | 1024 | 366 | 576 | 749 | 1151 | 438 | 639 | 5519 | 5500 | 8123 | 2139 | 1483 | 333 | 230 | 333 | 230 | 2081 | 1422 | 1375 | 577 | 2894 | 2554 | 2533 | 7328 | 1467 | 1960 | 1907 | 301 | 881 | 1090 | 829 | 2594 | 3426 | 3636 | 7677 | 3048 | 36207 | 469 | 387 | 1389 | 1214 | 497 | 1534 | 314 | 830 | 1028 | 1661 | 557 | 893 | 3863 | 4382 | 1030 | 706 | 458 | 470 | 1368 | 1368 | 2891 | 11139 | 13060 | 12216 | 15107 | 11999 | 8890 | 7820 | 11980 | 11222 | 19381 | 16452 | 4784 | 1617 | 225 | 528 | 793 | 585 | 5661 | 9418 | 440 | 296 | 612 | 475 | 420 | 408 | 568 | 1684 | 9272 | 9627 | 8570 | 8570 | 17152 | 17152 | 41102 | 4312 | 35597 | 4312 | 3962 | 2658 | 1203 | 1791 | 1363 | 7193 | 9433 | 14509 | 19315 | 13541 | 7438 | 13577 | 7500 | 7735 | 2932 | 13547 | 7465 | 7714 | 2949 | 7721 | 2928 | 6867 | 2111 | 22137 | 61745 | 193 | 169 | 7193 | 18469 | 8797 | 19069 | 12647 | 7149 | 12669 | 7190 | 7593 | 2879 | 12684 | 7211 | 7596 | 2882 | 7579 | 2855 | 6816 | 2071 | 19617 | 12217 | 20005 | 11482 | 367 | 1248 | 3603 | 2376 | 1741 | 1741 | 328 | 213 | 671 | 759 | 3482 | 2968 | 2966 | 1467 | 1467 | 91 | 110 | 1078 | 210 | 416 | 122 | 3009 | 841 | 945 | 1545 | 502 | 7632 | 1643 | 29776 | 3792 | 19157 | 19157 | 207 | 207 | 919 | 865 | 865 | 810 | 810 | 2694 | 3558 | 4264 | 2595 | 1217 | 169 | 836 | 118 | 34469 | 303 | 30723 | 252 | 1856 | 2338 | 1155 | 1140 | 1856 | 1752 | 1871 | 1767 | 1868 | 1868 | 2685 
| 3612 | 1149 | 1089 | 2686 | 2686 | 2700 | 2544 | 2694 | 2538 | 1199 | 862 | 2011 | 1327 | 1346 | 760 | 760 | 738 | 727 | 705 | 673 | 651 | 369 | 2138 | 348 | 310 | 420 | 2400 | 616 | 1790 | 1785 | 458 | 458 | 326 | 2524 | | | | | | | | | | | |
|--------------------------|---------------|-----------------------------|--------------------------------------|--------------|--------------------|--------------------|----------------|----------------------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|----------------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|------
---------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|-----
-------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|--------------|--------------|---------------|---------------|------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|--------------|--------------------|-----------
-----|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|------------------------------------|--------------------------------------|--------------|----------------------------|-----------------------------|----------------|-----------------|--------------|-------------------------|--------------|-------------------------|--------------|-------------------|-----------------------------|-------------------------------|------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| After | 490 | 284 | 1554 | 229 | 510 | 334 | 1119 | 1174 | 407 | 510 | 466 | 257 | 311 | 919 | 887 | 832 | 351 | 409 | 380 | 303 | 1068 | 780 | 3271 | 2263 | 1068 | 780 | 716 | 460 | 386 | 735 | 724 | 378 | 2008 | 482 | 215 | 482 | 720 | 923 | 923 | 794 | 1413 | 446 | 13603 | 14379 | 204 | 271 | 1627 | 987 | 1629 | 992 | 1624 | 987 | 987 | 3466 | 4025 | 1525 | 1748 | 7973 | 2483 | 5891 | 1617 | 5427 | 5426 | 4280 | 2983 | 2949 | 861 | 1204 | 3921 | 2702 | 1185 | 2180 | 5268 | 1281 | 27957 | 1281 | 2537 | 10149 | 930 | 863 | 39065 | 10453 | 25433 | 39790 | 11106 | 22356 | 605 | 1024 | 366 | 576 | 749 | 1151 | 438 | 639 | 5519 | 5500 | 8123 | 2139 | 1483 | 333 | 230 | 333 | 230 | 2081 | 1422 | 1375 | 577 | 2894 | 2554 | 2533 | 7328 | 1467 | 1960 | 1907 | 301 | 881 | 1090 | 829 | 2594 | 3426 | 3636 | 7677 | 3048 | 36207 | 469 | 387 | 1389 | 1214 | 497 | 1534 | 314 | 830 | 1028 | 1661 | 557 | 893 | 3863 | 4382 | 1030 | 706 | 458 | 470 | 1368 | 1368 | 2891 | 11139 | 13060 | 12216 | 15107 | 11999 | 8890 | 7820 | 11980 | 11222 | 19381 | 16452 | 4784 | 1617 | 225 | 528 | 793 | 585 | 5661 | 9418 | 440 | 296 | 612 | 475 | 420 | 408 | 568 | 1684 | 9272 | 9627 | 8570 | 8570 | 17152 | 17152 | 41102 | 4312 | 35597 | 4312 | 3962 | 2658 | 1203 | 1791 | 1363 | 7193 | 9433 | 14509 | 19315 | 13541 | 7438 | 13577 | 7500 | 7735 | 2932 | 13547 | 7465 | 7714 | 2949 | 7721 | 2928 | 6867 | 2111 | 22137 | 61745 | 193 | 169 | 7193 | 18469 | 8797 | 19069 | 12647 | 7149 | 12669 | 7190 | 7593 | 2879 | 12684 | 7211 | 7596 | 2882 | 7579 | 2855 | 6816 | 2071 | 19617 | 12217 | 20005 | 11482 | 367 | 1248 | 3603 | 2376 | 1741 | 1741 | 328 | 213 | 671 | 759 | 3482 | 2968 | 2966 | 1467 | 1467 | 91 | 110 | 1078 | 210 | 416 | 122 | 3009 | 841 | 945 | 1545 | 502 | 7632 | 1643 | 29776 | 3792 | 19157 | 19157 | 207 | 207 | 919 | 865 | 865 | 810 | 810 | 2694 | 3558 | 4264 | 2595 | 1217 | 169 | 836 | 118 | 34469 | 303 | 30723 | 252 | 1856 | 2338 | 1155 | 1140 | 1856 | 1752 | 1871 | 1767 | 1868 | 1868 | 2685 
| 3612 | 1149 | 1089 | 2686 | 2686 | 2700 | 2544 | 2694 | 2538 | 1199 | 862 | 2009 | 1325 | 1343 | 757 | 757 | 735 | 724 | 702 | 670 | 648 | 367 | 2078 | 338 | 300 | 406 | 2248 | 556 | 1590 | 1585 | 404 | 404 | 233 | 1188 | -0.22% | 4.46 | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% |
|--------------------------|---------------|-----------------------------|--------------------------------------|--------------|--------------------|--------------------|----------------|----------------------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|----------------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|------
---------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|-----
-------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|--------------|--------------|---------------|---------------|------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|--------------|--------------------|-----------
-----|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|------------------------------------|--------------------------------------|--------------|----------------------------|-----------------------------|----------------|-----------------|--------------|-------------------------|--------------|-------------------------|--------------|-------------------|-----------------------------|-------------------------------|------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Total diff | REGR(+42.86%) | REGR(+10.51%) | REGR(+9.82%) | REGR(+6.51%) | REGR(+6.25%) | REGR(+4.38%) | REGR(+4.00%) | REGR(+3.80%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | 
SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | 
SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(-0.10%) | IMPR(-0.15%) | IMPR(-0.22%) | IMPR(-0.39%) | IMPR(-0.39%) | IMPR(-0.41%) | IMPR(-0.41%) | IMPR(-0.43%) | IMPR(-0.45%) | IMPR(-0.46%) | IMPR(-0.54%) | IMPR(-2.81%) | IMPR(-2.87%) | IMPR(-3.23%) | IMPR(-3.33%) | IMPR(-6.33%) | IMPR(-9.74%) | IMPR(-11.17%) | IMPR(-11.20%) | IMPR(-11.79%) | IMPR(-11.79%) | IMPR(-28.53%) | IMPR(-52.93%) | -0.22% | 4.46 | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% |
|--------------------------|---------------|-----------------------------|--------------------------------------|--------------|--------------------|--------------------|----------------|----------------------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|----------------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|------
---------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|-----
-------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|--------------|--------------|---------------|---------------|------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|--------------|--------------------|-----------
-----|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|------------------------------------|--------------------------------------|--------------|----------------------------|-----------------------------|----------------|-----------------|--------------|-------------------------|--------------|-------------------------|--------------|-------------------|-----------------------------|-------------------------------|------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
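For reading the `Total diff` row: each entry appears to be the relative cycle change from `Before` to `After`, tagged `REGR`, `SAME` or `IMPR`. The sketch below reproduces a few entries from the table. The function names and the 0.1% cut-off between `SAME` and `IMPR`/`REGR` are assumptions for illustration; the script that generated the table is not shown here.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Relative change in percent; positive means more cycles after the change.
double relativeDiffPercent(long before, long after) {
  assert(before > 0 && "baseline cycle count must be positive");
  return 100.0 * static_cast<double>(after - before) /
         static_cast<double>(before);
}

// Formats an entry like "REGR(+42.86%)".
// thresholdPercent is a hypothetical cut-off, chosen only because it is
// consistent with the SAME(-0.10%) and IMPR(-0.15%) entries in the table.
std::string classify(long before, long after, double thresholdPercent = 0.1) {
  const double diff = relativeDiffPercent(before, after);
  const char *tag = "SAME";
  if (diff > thresholdPercent)
    tag = "REGR";
  else if (diff < -thresholdPercent)
    tag = "IMPR";
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%s(%+.2f%%)", tag, diff);
  return buf;
}
```

For example, `classify(343, 490)` gives `REGR(+42.86%)` and `classify(2524, 1188)` gives `IMPR(-52.93%)`, matching the first and last benchmark columns above.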
Indeed not great... I have a pending change that teaches the scheduler a bit more about LCDs. Once @F-Stuckmann merges #168 I will open the PR. Maybe that will help with those regressions.
Let's hope so ;-)
So the extra regressions seem to be Neg, Add2D and HardSigmoid. Could you maybe check them again?
Note that I'm not concerned about Neg: it is a simple II=1 case that we should handle in the post-pipeliner. FYI @martien-de-jong
(Same for HardSigmoid, actually. It is a bit of work, but we'll get it done in the post-pipeliner.)
So the only regression to really check is Add2D. I'd be very happy if we now stay at II=10 and can simplify the logic of this DAGMutator.
I can see 1 more cycle for `Add2D_0` and 2 more for `HardSigmoidTemplated_int8_0` in the innermost loop. I will rebase the branch and see how it affects this PR, with the heuristic disabled.
Hi @gbossu, here are the results comparing aie-public with this PR (heuristics off):
|-----------------------------------|---------------|-----------------------------|--------------------------------------|--------------------|--------------|--------------|--------------|--------------------|----------------|----------------------------|-----------------------------|----------------------------|------------------------------------|--------------------------------------|----------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|----------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|----------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|---------
-------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|---------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------
|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|------------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|------------
--|--------------|---------------|---------------|------------------|--------------|--------------|----------------|--------------|--------------|--------------|--------------|--------------------|----------------|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|-------------------------|--------------|-------------------------------|------------------|-------------------------|-----------------------------|-------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Core_Compute_Cycle_Count | Neg_aie2_1 | HardSigmoidTemplated_int8_0 | MulAttributeBroadcasting_aie2_bf16_0 | Add2D_Standalone_1 | Add2D_0 | Requantize_0 | Requantize_1 | Add2D_Standalone_0 | MulBf16_aie2_0 | MulBroadcastingBf16_aie2_0 | SubBroadcasting_aie2_bf16_0 | AddBroadcastingBf16_aie2_0 | AddAttributeBroadcasting_aie2_bf16 | SubAttributeBroadcasting_aie2_bf16_0 | Softmax_bf16_0 | Abs_bf16_0 | Abs_int8_0 | Add2D_1 | Add2D_bf16_0 | Add2D_bf16_1 | AddAttributeBroadcasting_aie2_int8 | AddBf16_aie2_0 | AddBroadcasting_aie2_0 | Add_aie2_0 | ArgMax1d_bf16_0 | ArgMax1d_int8_0 | ArgMin1d_bf16_0 | ArgMin1d_int8_0 | AvgPool2D_0 | AvgPool2D_1 | AvgPool2D_aie2_bfloat16_0 | AvgPool2D_aie2_bfloat16_1 | AvgPool2D_aie2_int8_0 | AvgPool2D_aie2_int8_1 | BatchNorm1d_aie2_bfloat16 | BatchNorm1d_aie2_int8 | BatchNorm2D_0 | BatchNorm2D_1 | BilinearInterpolation_0 | BilinearInterpolation_1 | BitShift_AIE2_int8 | BitwiseAnd_int8_0 | BitwiseNot_aie2_0 | BitwiseOr_int8_0 | BitwiseXor_aie2_int8 | Cast_aie2_bfloat16 | Cast_aie2_bfloat16_1 | Cast_aie2_int8 | Cast_aie2_int8_1 | Ceil_AIE2_bfloat16 | Ceil_AIE2_int8 | ChannelsFirstFlatten_bf16_0 | ChannelsFirstFlatten_int8_0 | Clip_aie2_bf16 | Clip_aie2_int8 | CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_bfloat16_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2_ptr_interface | Conv1D_DW_AIE2_bf16_0 | Conv1D_DW_AIE2_bf16_1 | Conv1D_DW_AIE2_int8_0 | Conv1D_DW_AIE2_int8_1 | Conv2D_0 | Conv2D_1 | Conv2D_7x7s2_Layer1_0 | Conv2D_7x7s2_Layer1_1 | Conv2D_11x11s4_0 | Conv2D_11x11s4_1 | Conv2D_11x11s4_Layer1_0 | Conv2D_11x11s4_Layer1_1 | Conv2D_DW_0 | Conv2D_DW_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_FC_0 | Conv2D_FC_1 | Conv2D_LReLU_0 | Conv2D_LReLU_1 | Conv2D_ReLU_0 | Conv2D_ReLU_1 | 
Conv2D_ReLU_Standalone_0 | Conv2D_ReLU_Standalone_1 | Conv2D_ReLU_int8_0 | Conv2D_ReLU_int8_1 | Conv2D_SV60 | Conv2D_Transpose_AIE2_0 | Conv2D_Transpose_AIE2_1 | Conv2D_bf16_0 | Conv2D_bf16_1 | Conv2D_mixed_batch_0 | Conv2D_mixed_batch_1 | DegroupG4_aie2_bf16_0 | DegroupG4_aie2_bf16_1 | DegroupG4_aie2_int8_0 | DegroupG4_aie2_int8_1 | DegroupG8_aie2_bf16_0 | DegroupG8_aie2_bf16_1 | DegroupG8_aie2_int8_0 | DegroupG8_aie2_int8_1 | DilatedConv2D_1 | DivAttributeBroadcasting_aie2_bf16_0 | DivAttributeBroadcasting_aie2_int8_0 | DivBroadcasting_aie2_0 | DivBroadcasting_aie2_1 | EleMax_aie2_bfloat16 | EleMax_aie2_int8 | EleMin_aie2_bfloat16 | EleMin_aie2_int8 | ElemDiv_aie2_0 | ElemDiv_aie2_1 | Elu_aie2_bf16_0 | Elu_aie2_int8_0 | Erf_aie2_bf16_0 | Erf_aie2_int8_0 | Erf_aie2_int8_0_ptr_interface | Exp_bf16_0 | Exp_bf16_1 | Expand_aie2_bfloat16 | Expand_aie2_int8 | Floor_aie2_0 | Floor_aie2_1 | FullyConnect_aie2_bf16 | FullyConnect_aie2_int8 | GELU_0 | GELU_1 | GEMM_bf16_0 | GEMM_bf16_1 | GEMM_int8_0 | GEMM_int8_1 | GEMV_0 | GEMV_1 | GeluTemplated_aie2_bf16 | GeluTemplated_aie2_int8 | GroupG4_aie2_bf16_0 | GroupG4_aie2_bf16_1 | GroupG4_aie2_int8_0 | GroupG4_aie2_int8_1 | GroupG8_aie2_bf16_0 | GroupG8_aie2_bf16_1 | GroupG8_aie2_int8_0 | GroupG8_aie2_int8_1 | Group_Conv2D_0 | Group_Conv2D_1 | HardSigmoid_bf16_0 | HardSigmoid_bf16_1 | HardSigmoid_int8_0 | HardSigmoid_int8_1 | HardswishAsHardsigmoid_aie2_0 | Hardswish_aie2_0 | InstanceNormPart1_aie2_bf16_0 | InstanceNormPart1_aie2_int8_0 | InstanceNormPart2_aie2_bf16_0 | InstanceNormPart2_aie2_int8_0 | InterpolateLinear1D_AIE2_bfloat16 | InterpolateLinear1D_AIE2_int8 | LayerNormC8Part1_aie2_bf16_0 | LayerNormC8Part1_aie2_int8_0 | LayerNormC8Part2_aie2_bf16_0 | LayerNormC8Part2_aie2_int8_0 | LayerNorm_0 | LayerNorm_1 | Log_bf16_0 | Log_int8_0 | LogicalNot_aie2_0 | LogicalXor_aie2_int8 | MaxPool2D_0 | MaxPool2D_1 | Mish_aie2_bfloat16 | Mish_aie2_int8 | Mul2d_bf16_0 | Mul2d_bf16_1 | MulAttributeBroadcasting_aie2_int8_0 | 
MulBroadcasting_aie2_0 | Mul_aie2_0 | Neg_aie2_0 | Pad2D_0 | Pad2D_1 | Pad3D_AIE2_bfloat16 | Pad3D_AIE2_int8 | PixelShuffle_aie2_bf16 | PixelShuffle_aie2_int8 | PixelUnshuffle_bf16_0 | PixelUnshuffle_int8_0 | PowAttributeBroadcasting_aie2_bf16_0 | PowAttributeBroadcasting_aie2_int8_0 | Pow_bf16_0 | Pow_int8_0 | Range_bfloat16_aie2_0 | Range_bfloat16_aie2_1 | Range_int8_aie2_0 | Range_int8_aie2_1 | Reciprocal_aie2_0 | ReduceMax_bf16_0 | ReduceMax_bf16_1 | ReduceMax_int8_0 | ReduceMax_int8_1 | ReduceMeanAxis_1_aie2_bf16 | ReduceMeanAxis_1_aie2_int8 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_2_aie2_int8 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_3_aie2_int8 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_4_aie2_int8 | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_5_aie2_int8 | ReduceMeanAxis_6_aie2_bf16 | ReduceMeanAxis_6_aie2_int8 | ReduceMeanAxis_7_aie2_bf16 | ReduceMeanAxis_7_aie2_int8 | ReduceMeanNoc8_AIE2_bfloat16 | ReduceMeanNoc8_AIE2_int8 | ReduceMin1D_aie2_bf16 | ReduceMin1D_aie2_int8 | ReduceMin_bf16_0 | ReduceMin_bf16_1 | ReduceMin_int8_0 | ReduceMin_int8_1 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_1_aie2_int8 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_2_aie2_int8 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_3_aie2_int8 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_4_aie2_int8 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_5_aie2_int8 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_6_aie2_int8 | ReduceSumAxis_7_aie2_bf16 | ReduceSumAxis_7_aie2_int8 | ReduceSum_bf16_0 | ReduceSum_bf16_1 | ReduceSum_int8_0 | ReduceSum_int8_1 | Round_aie2_0 | Round_aie2_1 | Rsqrt_aie2_bf16_0 | Rsqrt_aie2_int8_0 | Scale_Add_bf16_0 | Scale_Add_bf16_1 | Select_aie2_bf16 | Select_aie2_int8 | Shrink_aie2_0 | Shrink_aie2_1 | SiLU_aie2_bf16 | SiLU_aie2_int8 | SiLU_aie2_int8_1 | SigmoidTemplated_int8_0 | SigmoidTemplated_int8_1 | Sigmoid_int8_0 | Sigmoid_int8_1 | Sign_bf16_0 | Sign_bf16_1 | Sign_int8_0 | Sign_int8_1 | Sin_aie2_bf16 | Sin_aie2_int8 | Slice_bfloat16_0 | Slice_int8_0 | 
Softmax_1 | Softmax_bf16_1 | Sqrt_bf16_0 | Sqrt_bf16_1 | Sqrt_int8_0 | Sqrt_int8_1 | Squeeze_bfloat16_0 | Squeeze_int8_0 | SubAttributeBroadcasting_aie2_int8_0 | SubBroadcasting_aie2_int8_0 | SubBroadcasting_aie2_int8_0_ptr_interface | Sub_aie2_bf16_0 | Sub_aie2_int8_0 | Sub_aie2_int8_0_ptr_interface | Tanh_0 | Tanh_1 | Tile_aie2_bf16_0 | Tile_aie2_int8_1 | Topk1D_bf16_0 | Topk1D_bf16_1 | Topk1D_int8_0 | Topk1D_int8_1 | Topk2D_bf16_0 | Topk2D_bf16_1 | Topk2D_int8_0 | Topk2D_int8_1 | Transpose_aie2_bf16_021 | Transpose_aie2_bf16_021_pad | Transpose_aie2_bf16_102 | Transpose_aie2_bf16_102_pad | Transpose_aie2_bf16_120 | Transpose_aie2_bf16_120_pad | Transpose_aie2_bf16_201 | Transpose_aie2_bf16_201_pad | Transpose_aie2_bf16_210 | Transpose_aie2_bf16_210_pad | Transpose_aie2_int8_021 | Transpose_aie2_int8_021_pad | Transpose_aie2_int8_102 | Transpose_aie2_int8_102_pad | Transpose_aie2_int8_120 | Transpose_aie2_int8_120_pad | Transpose_aie2_int8_201 | Transpose_aie2_int8_201_pad | Transpose_aie2_int8_210 | Transpose_aie2_int8_210_pad | bfloat16 | int8 | Sigmoid_bf16_0 | Sigmoid_bf16_1 | Scale_Add_0 | Scale_Add_1 | Tanh_int8_0 | TanhTemplated_aie2_int8 | Tanh_int8_1 | HardswishAsHardsigmoid_aie2_1 | Hardswish_aie2_1 | SigmoidTemplated_bf16_0 | HardSigmoidTemplated_bf16_0 | Reciprocal_aie2_1 | Mul2D_0 | Mul2D_1 | Rescale_aie2_int8_0 | TanhTemplated_aie2_bfloat16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 | Quantile #4 | Quantile #5 | Quantile #6 | Quantile #7 | Quantile #8 | Quantile #9 |
|-----------------------------------|---------------|-----------------------------|--------------------------------------|--------------------|--------------|--------------|--------------|--------------------|----------------|----------------------------|-----------------------------|----------------------------|------------------------------------|--------------------------------------|----------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|----------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|----------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|---------
-------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|---------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------
|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|------------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|------------
--|--------------|---------------|---------------|------------------|--------------|--------------|----------------|--------------|--------------|--------------|--------------|--------------------|----------------|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|-------------------------|--------------|-------------------------------|------------------|-------------------------|-----------------------------|-------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Before | 343 | 257 | 1415 | 482 | 217 | 1281 | 705 | 322 | 1076 | 1131 | 705 | 727 | 760 | 760 | 7631 | 376 | 510 | 466 | 257 | 311 | 914 | 673 | 882 | 826 | 351 | 409 | 380 | 302 | 1068 | 780 | 3271 | 2263 | 1068 | 780 | 716 | 460 | 386 | 735 | 724 | 378 | 2008 | 467 | 215 | 467 | 720 | 923 | 923 | 765 | 765 | 1413 | 446 | 13603 | 14379 | 204 | 271 | 1597 | 971 | 1599 | 976 | 1594 | 971 | 971 | 3357 | 3901 | 1540 | 1774 | 7973 | 2507 | 5891 | 1617 | 5427 | 5426 | 4280 | 2983 | 2949 | 861 | 1204 | 3921 | 2702 | 1185 | 2180 | 5268 | 1281 | 27957 | 1281 | 2537 | 10187 | 930 | 863 | 39065 | 10453 | 24959 | 38645 | 11106 | 22356 | 605 | 992 | 366 | 560 | 749 | 1151 | 438 | 639 | 5519 | 5500 | 8123 | 2139 | 1483 | 333 | 210 | 333 | 210 | 2081 | 1422 | 1375 | 577 | 2894 | 2554 | 2533 | 7328 | 1467 | 1960 | 1907 | 301 | 881 | 1090 | 829 | 2594 | 3426 | 3636 | 7677 | 3048 | 36207 | 469 | 387 | 1389 | 1214 | 497 | 1534 | 314 | 830 | 1028 | 1661 | 557 | 893 | 3863 | 4382 | 1030 | 706 | 456 | 468 | 1368 | 1368 | 2869 | 11139 | 13052 | 12216 | 14979 | 11967 | 8890 | 7820 | 11724 | 11094 | 19189 | 16260 | 4784 | 1617 | 225 | 528 | 793 | 585 | 5661 | 9418 | 440 | 296 | 632 | 475 | 420 | 408 | 568 | 1684 | 9272 | 9627 | 8570 | 8570 | 17152 | 17152 | 41102 | 4312 | 35597 | 4312 | 3962 | 2658 | 1203 | 1791 | 1363 | 7193 | 9433 | 14509 | 19315 | 13438 | 7438 | 13474 | 7500 | 7634 | 2932 | 13444 | 7465 | 7613 | 2949 | 7620 | 2928 | 6737 | 2111 | 22137 | 61413 | 193 | 169 | 7193 | 18469 | 8797 | 19069 | 12349 | 7149 | 12372 | 7190 | 7464 | 2879 | 12387 | 7211 | 7467 | 2882 | 7450 | 2855 | 6686 | 2071 | 19617 | 12209 | 19670 | 11400 | 367 | 1248 | 3603 | 2376 | 1709 | 1709 | 329 | 206 | 672 | 759 | 3608 | 2968 | 2966 | 1275 | 1275 | 91 | 110 | 1078 | 210 | 417 | 123 | 3009 | 841 | 945 | 1545 | 502 | 1643 | 29776 | 3792 | 19157 | 19157 | 207 | 207 | 914 | 860 | 860 | 651 | 804 | 804 | 2694 | 3558 | 4264 | 2595 | 1217 | 169 | 836 | 118 | 34469 | 303 | 30723 | 252 | 1856 | 2338 | 1155 | 
1140 | 1856 | 1752 | 1871 | 1767 | 1868 | 1868 | 2685 | 3612 | 1149 | 1089 | 2686 | 2686 | 2700 | 2544 | 2694 | 2538 | 1197 | 862 | 2011 | 1327 | 369 | 369 | 348 | 310 | 420 | 1727 | 1722 | 2138 | 617 | 2432 | 458 | 458 | 326 | 2524 | | | | | | | | | | | |
|-----------------------------------|---------------|-----------------------------|--------------------------------------|--------------------|--------------|--------------|--------------|--------------------|----------------|----------------------------|-----------------------------|----------------------------|------------------------------------|--------------------------------------|----------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|----------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|----------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|---------
-------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|---------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------
|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|------------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|------------
--|--------------|---------------|---------------|------------------|--------------|--------------|----------------|--------------|--------------|--------------|--------------|--------------------|----------------|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|-------------------------|--------------|-------------------------------|------------------|-------------------------|-----------------------------|-------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| After (mutator without heuristic) | 463 | 284 | 1554 | 511 | 230 | 1342 | 734 | 335 | 1119 | 1174 | 706 | 728 | 761 | 761 | 7632 | 376 | 510 | 466 | 257 | 311 | 914 | 673 | 882 | 826 | 351 | 409 | 380 | 302 | 1068 | 780 | 3271 | 2263 | 1068 | 780 | 716 | 460 | 386 | 735 | 724 | 378 | 2008 | 467 | 215 | 467 | 720 | 923 | 923 | 765 | 765 | 1413 | 446 | 13603 | 14379 | 204 | 271 | 1597 | 971 | 1599 | 976 | 1594 | 971 | 971 | 3357 | 3901 | 1540 | 1774 | 7973 | 2507 | 5891 | 1617 | 5427 | 5426 | 4280 | 2983 | 2949 | 861 | 1204 | 3921 | 2702 | 1185 | 2180 | 5268 | 1281 | 27957 | 1281 | 2537 | 10187 | 930 | 863 | 39065 | 10453 | 24959 | 38645 | 11106 | 22356 | 605 | 992 | 366 | 560 | 749 | 1151 | 438 | 639 | 5519 | 5500 | 8123 | 2139 | 1483 | 333 | 210 | 333 | 210 | 2081 | 1422 | 1375 | 577 | 2894 | 2554 | 2533 | 7328 | 1467 | 1960 | 1907 | 301 | 881 | 1090 | 829 | 2594 | 3426 | 3636 | 7677 | 3048 | 36207 | 469 | 387 | 1389 | 1214 | 497 | 1534 | 314 | 830 | 1028 | 1661 | 557 | 893 | 3863 | 4382 | 1030 | 706 | 456 | 468 | 1368 | 1368 | 2869 | 11139 | 13052 | 12216 | 14979 | 11967 | 8890 | 7820 | 11724 | 11094 | 19189 | 16260 | 4784 | 1617 | 225 | 528 | 793 | 585 | 5661 | 9418 | 440 | 296 | 632 | 475 | 420 | 408 | 568 | 1684 | 9272 | 9627 | 8570 | 8570 | 17152 | 17152 | 41102 | 4312 | 35597 | 4312 | 3962 | 2658 | 1203 | 1791 | 1363 | 7193 | 9433 | 14509 | 19315 | 13438 | 7438 | 13474 | 7500 | 7634 | 2932 | 13444 | 7465 | 7613 | 2949 | 7620 | 2928 | 6737 | 2111 | 22137 | 61413 | 193 | 169 | 7193 | 18469 | 8797 | 19069 | 12349 | 7149 | 12372 | 7190 | 7464 | 2879 | 12387 | 7211 | 7467 | 2882 | 7450 | 2855 | 6686 | 2071 | 19617 | 12209 | 19670 | 11400 | 367 | 1248 | 3603 | 2376 | 1709 | 1709 | 329 | 206 | 672 | 759 | 3608 | 2968 | 2966 | 1275 | 1275 | 91 | 110 | 1078 | 210 | 417 | 123 | 3009 | 841 | 945 | 1545 | 502 | 1643 | 29776 | 3792 | 19157 | 19157 | 207 | 207 | 914 | 860 | 860 | 651 | 804 | 804 | 2694 | 3558 | 4264 | 2595 | 1217 | 169 | 836 | 118 | 34469 | 303 | 30723 | 
252 | 1856 | 2338 | 1155 | 1140 | 1856 | 1752 | 1871 | 1767 | 1868 | 1868 | 2685 | 3612 | 1149 | 1089 | 2686 | 2686 | 2700 | 2544 | 2694 | 2538 | 1197 | 862 | 2009 | 1325 | 367 | 367 | 338 | 300 | 406 | 1590 | 1585 | 1954 | 556 | 2155 | 404 | 404 | 233 | 1143 | -0.23% | 4.32 | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% |
|-----------------------------------|---------------|-----------------------------|--------------------------------------|--------------------|--------------|--------------|--------------|--------------------|----------------|----------------------------|-----------------------------|----------------------------|------------------------------------|--------------------------------------|----------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|----------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|----------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|---------
-------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|---------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------
|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|------------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|------------
--|--------------|---------------|---------------|------------------|--------------|--------------|----------------|--------------|--------------|--------------|--------------|--------------------|----------------|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|-------------------------|--------------|-------------------------------|------------------|-------------------------|-----------------------------|-------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Total diff | REGR(+34.99%) | REGR(+10.51%) | REGR(+9.82%) | REGR(+6.02%) | REGR(+5.99%) | REGR(+4.76%) | REGR(+4.11%) | REGR(+4.04%) | REGR(+4.00%) | REGR(+3.80%) | REGR(+0.14%) | REGR(+0.14%) | REGR(+0.13%) | REGR(+0.13%) | SAME(+0.01%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | 
SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | 
SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%) | SAME(-0.10%) | IMPR(-0.15%) | IMPR(-0.54%) | IMPR(-0.54%) | IMPR(-2.87%) | IMPR(-3.23%) | IMPR(-3.33%) | IMPR(-7.93%) | IMPR(-7.96%) | IMPR(-8.61%) | IMPR(-9.89%) | IMPR(-11.39%) | IMPR(-11.79%) | IMPR(-11.79%) | IMPR(-28.53%) | IMPR(-54.71%) | -0.23% | 4.32 | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% | +0.00% |
|-----------------------------------|---------------|-----------------------------|--------------------------------------|--------------------|--------------|--------------|--------------|--------------------|----------------|----------------------------|-----------------------------|----------------------------|------------------------------------|--------------------------------------|----------------|--------------|--------------|--------------|--------------|--------------|------------------------------------|----------------|------------------------|--------------|-----------------|-----------------|-----------------|-----------------|--------------|--------------|---------------------------|---------------------------|-----------------------|-----------------------|---------------------------|-----------------------|---------------|---------------|-------------------------|-------------------------|--------------------|-------------------|-------------------|------------------|----------------------|--------------------|----------------------|----------------|------------------|--------------------|----------------|-----------------------------|-----------------------------|----------------|----------------|--------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------|----------------------------------------------|--------------------------------------------------|----------------------------------------------|------------------------------------------------------------|-----------------------|-----------------------|-----------------------|-----------------------|--------------|--------------|-----------------------|-----------------------|------------------|------------------|-------------------------|-------------------------|--------------|--------------|------------------|------------------|--------------|--------------|----------------|---------
-------|---------------|---------------|--------------------------|--------------------------|--------------------|--------------------|--------------|-------------------------|-------------------------|---------------|---------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------|--------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|------------------|----------------------|------------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-------------------------------|--------------|--------------|----------------------|------------------|--------------|--------------|------------------------|------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------------------------|-------------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|---------------------|----------------|----------------|--------------------|--------------------|--------------------|--------------------|-------------------------------|------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|--------------|--------------|--------------|--------------|-------------------|----------------------|--------------|--------------|--------------------|----------------
|--------------|--------------|--------------------------------------|------------------------|--------------|--------------|--------------|--------------|---------------------|-----------------|------------------------|------------------------|-----------------------|-----------------------|--------------------------------------|--------------------------------------|--------------|--------------|-----------------------|-----------------------|-------------------|-------------------|-------------------|------------------|------------------|------------------|------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|------------------------------|--------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|--------------|--------------|-------------------|-------------------|------------------|------------------|------------------|------------------|---------------|---------------|----------------|----------------|------------------|-------------------------|-------------------------|----------------|----------------|--------------|--------------|------------
--|--------------|---------------|---------------|------------------|--------------|--------------|----------------|--------------|--------------|--------------|--------------|--------------------|----------------|--------------------------------------|-----------------------------|-------------------------------------------|-----------------|-----------------|-------------------------------|--------------|--------------|------------------|------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|-----------------------------|--------------|--------------|----------------|----------------|--------------|--------------|--------------|-------------------------|--------------|-------------------------------|------------------|-------------------------|-----------------------------|-------------------|---------------|---------------|---------------------|-----------------------------|--------------|------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
Interestingly, we have the following situation for Add2D_0:
II = 9 in both cases, but the "real final" II is 10 without the mutation and 11 with the mutation. In total, we have 4 VADDs:
- Without the mutation: the first VADD starts in stage 0 and the rest in stage 1.
- With the mutation: all VADDs are executed in stage 1.
Starting one VADD in stage 0 saves at least one phi node in the loop: each VADD requires its 2 operands in phi nodes, but if a VADD is executed in stage 0, only its result is forwarded to the next stage by a phi node.
The approach with the mutation looks reasonable, with all VADDs starting in the same stage.
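To make the phi-node argument concrete, here is a rough sketch under a deliberately simplified cost model. The model is an assumption for illustration only, not the pipeliner's actual accounting: a VADD executed in stage 0 costs one phi (its result), any later-stage VADD costs two (its operands). `countPhis` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of the argument in this comment:
// - a VADD executed in stage 0 only forwards its result (1 phi node);
// - a VADD executed in a later stage needs both operands in phi nodes (2).
// Real phi counts also depend on operand sharing, which is ignored here.
int countPhis(const std::vector<int> &VAddStages) {
  int Phis = 0;
  for (int Stage : VAddStages) {
    assert(Stage >= 0 && "stage indices are non-negative");
    Phis += (Stage == 0) ? 1 : 2;
  }
  return Phis;
}
```

With the stage assignments described above, `countPhis({0, 1, 1, 1})` gives 7 (without the mutation) and `countPhis({1, 1, 1, 1})` gives 8 (with the mutation), i.e. one phi node of difference under this model.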
Now we can break some WAW dependencies related to status registers when such registers are not explicitly read or written.
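A minimal, self-contained sketch of the idea in this commit message follows. It does not use the LLVM scheduling API: `Instr`, `Dep`, `DepKind` and `dropStickyWAW` are hypothetical names introduced for illustration, and the removal rule (drop an output dependency on a sticky register only when neither endpoint names that register explicitly) is an assumption about how the mutator behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins for scheduler data structures (not LLVM types).
enum class DepKind { Data, Anti, Output };

struct Dep {
  int Pred;     // index of the earlier instruction
  int Succ;     // index of the later instruction
  DepKind Kind; // Output == write-after-write (WAW)
  int Reg;      // register that carries the dependency
};

struct Instr {
  // Registers this instruction reads or writes as explicit operands.
  std::set<int> ExplicitRegs;
};

// Drop WAW edges on sticky registers when neither endpoint touches the
// register explicitly. Sticky flags only accumulate, so the relative order
// of two implicit writes is assumed not to matter.
std::vector<Dep> dropStickyWAW(const std::vector<Instr> &Instrs,
                               std::vector<Dep> Deps,
                               const std::set<int> &StickyRegs) {
  auto IsRemovable = [&](const Dep &D) {
    assert(D.Pred >= 0 && D.Pred < static_cast<int>(Instrs.size()));
    assert(D.Succ >= 0 && D.Succ < static_cast<int>(Instrs.size()));
    return D.Kind == DepKind::Output && StickyRegs.count(D.Reg) != 0 &&
           Instrs[D.Pred].ExplicitRegs.count(D.Reg) == 0 &&
           Instrs[D.Succ].ExplicitRegs.count(D.Reg) == 0;
  };
  Deps.erase(std::remove_if(Deps.begin(), Deps.end(), IsRemovable),
             Deps.end());
  return Deps;
}
```

Edges into an instruction that explicitly reads or writes the sticky register are kept, so a reset or a read of the flag still observes all earlier writes.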
e302e40 to 239abac
LGTM, let's go with the heuristic approach for now. I was experimenting a bit today, and I think we can get II=10 back with extra improvements to the loop-aware convergence strategy. Until that is productized, I think it's fine to increase the scheduling freedom bit by bit.