[LV][NFC] Add branch weight test showing incorrect behaviour #144682
@llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

Changes

This patch adds a test that shows incorrect branch weights being set in function EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck

Full diff: https://github.com/llvm/llvm-project/pull/144682.diff

2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index f1470fd1f7314..5a58144bbcca0 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7682,6 +7682,8 @@ EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck(
BranchInst &BI =
*BranchInst::Create(Bypass, LoopVectorPreHeader, CheckMinIters);
if (hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
+ // FIXME: See test Transforms/LoopVectorize/branch-weights.ll. I don't
+ // think the MainLoopStep is correct.
unsigned MainLoopStep = UF * VF.getKnownMinValue();
unsigned EpilogueLoopStep =
EPI.EpilogueUF * EPI.EpilogueVF.getKnownMinValue();
diff --git a/llvm/test/Transforms/LoopVectorize/branch-weights.ll b/llvm/test/Transforms/LoopVectorize/branch-weights.ll
index e11f77d8aeaec..d162e7aff5f32 100644
--- a/llvm/test/Transforms/LoopVectorize/branch-weights.ll
+++ b/llvm/test/Transforms/LoopVectorize/branch-weights.ll
@@ -1,53 +1,81 @@
-; RUN: opt < %s -S -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -enable-epilogue-vectorization -epilogue-vectorization-force-VF=4 | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --filter "br" --filter "^.*:" --version 5
+; RUN: opt < %s -S -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -enable-epilogue-vectorization \
+; RUN: -epilogue-vectorization-force-VF=4 | FileCheck %s --check-prefix=MAINVF4IC1_EPI4
+; RUN: opt < %s -S -passes=loop-vectorize -force-vector-interleave=2 -force-vector-width=4 -enable-epilogue-vectorization \
+; RUN: -epilogue-vectorization-force-VF=4 | FileCheck %s --check-prefix=MAINVF4IC2_EPI4
-; CHECK-LABEL: @f0(
-;
-; CHECK: entry:
-; CHECK: br i1 %cmp.entry, label %iter.check, label %exit, !prof [[PROF_F0_ENTRY:![0-9]+]]
-;
-; CHECK: iter.check:
-; CHECK: br i1 %min.iters.check, label %vec.epilog.scalar.ph, label %vector.scevcheck, !prof [[PROF_F0_UNLIKELY:![0-9]+]]
-;
-; CHECK: vector.scevcheck:
-; CHECK: br i1 %4, label %vec.epilog.scalar.ph, label %vector.main.loop.iter.check, !prof [[PROF_F0_UNLIKELY]]
-;
-; CHECK: vector.main.loop.iter.check:
-; CHECK: br i1 %min.iters.check1, label %vec.epilog.ph, label %vector.ph, !prof [[PROF_F0_UNLIKELY]]
-;
-; CHECK: vector.ph:
-; CHECK: br label %vector.body
-;
-; CHECK: vector.body:
-; CHECK: br i1 {{.+}}, label %middle.block, label %vector.body, !prof [[PROF_F0_VECTOR_BODY:![0-9]+]]
-;
-; CHECK: middle.block:
-; CHECK: br i1 %cmp.n, label %exit.loopexit, label %vec.epilog.iter.check, !prof [[PROF_F0_MIDDLE_BLOCKS:![0-9]+]]
-;
-; CHECK: vec.epilog.iter.check:
-; CHECK: br i1 %min.epilog.iters.check, label %vec.epilog.scalar.ph, label %vec.epilog.ph, !prof [[PROF_F0_VEC_EPILOGUE_SKIP:![0-9]+]]
-;
-; CHECK: vec.epilog.ph:
-; CHECK: br label %vec.epilog.vector.body
-;
-; CHECK: vec.epilog.vector.body:
-; CHECK: br i1 {{.+}}, label %vec.epilog.middle.block, label %vec.epilog.vector.body, !prof [[PROF_F0_VEC_EPILOG_VECTOR_BODY:![0-9]+]]
-;
-; CHECK: vec.epilog.middle.block:
-; CHECK: br i1 %cmp.n{{.+}}, label %exit.loopexit, label %vec.epilog.scalar.ph, !prof [[PROF_F0_MIDDLE_BLOCKS:![0-9]+]]
-;
-; CHECK: vec.epilog.scalar.ph:
-; CHECK: br label %loop
-;
-; CHECK: loop:
-; CHECK: br i1 %cmp.loop, label %loop, label %exit.loopexit, !prof [[PROF_F0_LOOP:![0-9]+]]
+; FIXME: For MAINVF4IC2_EPI4 the branch weights in the terminator of
+; the VEC_EPILOG_ITER_CHECK block should be [4,4] since we process 8
+; scalar iterations in the main loop, leaving the remaining count to
+; be in the range [0,7]. That gives a 4:4 chance of skipping the
+; vector epilogue. I believe the problem lies in
+; EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck
+; where the main loop VF is set to the same value as the epilogue VF.
+define void @f0(i8 %n, i32 %len, ptr %p) !prof !0 {
+; MAINVF4IC1_EPI4-LABEL: define void @f0(
+; MAINVF4IC1_EPI4-SAME: i8 [[N:%.*]], i32 [[LEN:%.*]], ptr [[P:%.*]]) !prof [[PROF0:![0-9]+]] {
+; MAINVF4IC1_EPI4: [[ENTRY:.*:]]
+; MAINVF4IC1_EPI4: br i1 [[CMP_ENTRY:%.*]], label %[[ITER_CHECK:.*]], label %[[EXIT:.*]], !prof [[PROF1:![0-9]+]]
+; MAINVF4IC1_EPI4: [[ITER_CHECK]]:
+; MAINVF4IC1_EPI4: br i1 [[MIN_ITERS_CHECK:%.*]], label %[[VEC_EPILOG_SCALAR_PH:.*]], label %[[VECTOR_SCEVCHECK:.*]], !prof [[PROF2:![0-9]+]]
+; MAINVF4IC1_EPI4: [[VECTOR_SCEVCHECK]]:
+; MAINVF4IC1_EPI4: br i1 [[TMP4:%.*]], label %[[VEC_EPILOG_SCALAR_PH]], label %[[VECTOR_MAIN_LOOP_ITER_CHECK:.*]], !prof [[PROF2]]
+; MAINVF4IC1_EPI4: [[VECTOR_MAIN_LOOP_ITER_CHECK]]:
+; MAINVF4IC1_EPI4: br i1 [[MIN_ITERS_CHECK1:%.*]], label %[[VEC_EPILOG_PH:.*]], label %[[VECTOR_PH:.*]], !prof [[PROF2]]
+; MAINVF4IC1_EPI4: [[VECTOR_PH]]:
+; MAINVF4IC1_EPI4: br label %[[VECTOR_BODY:.*]]
+; MAINVF4IC1_EPI4: [[VECTOR_BODY]]:
+; MAINVF4IC1_EPI4: br i1 [[TMP8:%.*]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !prof [[PROF3:![0-9]+]], !llvm.loop [[LOOP4:![0-9]+]]
+; MAINVF4IC1_EPI4: [[MIDDLE_BLOCK]]:
+; MAINVF4IC1_EPI4: br i1 [[CMP_N:%.*]], label %[[EXIT_LOOPEXIT:.*]], label %[[VEC_EPILOG_ITER_CHECK:.*]], !prof [[PROF7:![0-9]+]]
+; MAINVF4IC1_EPI4: [[VEC_EPILOG_ITER_CHECK]]:
+; MAINVF4IC1_EPI4: br i1 [[MIN_EPILOG_ITERS_CHECK:%.*]], label %[[VEC_EPILOG_SCALAR_PH]], label %[[VEC_EPILOG_PH]], !prof [[PROF8:![0-9]+]]
+; MAINVF4IC1_EPI4: [[VEC_EPILOG_PH]]:
+; MAINVF4IC1_EPI4: br label %[[VEC_EPILOG_VECTOR_BODY:.*]]
+; MAINVF4IC1_EPI4: [[VEC_EPILOG_VECTOR_BODY]]:
+; MAINVF4IC1_EPI4: br i1 [[TMP12:%.*]], label %[[VEC_EPILOG_MIDDLE_BLOCK:.*]], label %[[VEC_EPILOG_VECTOR_BODY]], !prof [[PROF9:![0-9]+]], !llvm.loop [[LOOP10:![0-9]+]]
+; MAINVF4IC1_EPI4: [[VEC_EPILOG_MIDDLE_BLOCK]]:
+; MAINVF4IC1_EPI4: br i1 [[CMP_N8:%.*]], label %[[EXIT_LOOPEXIT]], label %[[VEC_EPILOG_SCALAR_PH]], !prof [[PROF7]]
+; MAINVF4IC1_EPI4: [[VEC_EPILOG_SCALAR_PH]]:
+; MAINVF4IC1_EPI4: br label %[[LOOP:.*]]
+; MAINVF4IC1_EPI4: [[LOOP]]:
+; MAINVF4IC1_EPI4: br i1 [[CMP_LOOP:%.*]], label %[[LOOP]], label %[[EXIT_LOOPEXIT]], !prof [[PROF11:![0-9]+]], !llvm.loop [[LOOP12:![0-9]+]]
+; MAINVF4IC1_EPI4: [[EXIT_LOOPEXIT]]:
+; MAINVF4IC1_EPI4: br label %[[EXIT]]
+; MAINVF4IC1_EPI4: [[EXIT]]:
;
-; CHECK: exit.loopexit:
-; CHECK: br label %exit
+; MAINVF4IC2_EPI4-LABEL: define void @f0(
+; MAINVF4IC2_EPI4-SAME: i8 [[N:%.*]], i32 [[LEN:%.*]], ptr [[P:%.*]]) !prof [[PROF0:![0-9]+]] {
+; MAINVF4IC2_EPI4: [[ENTRY:.*:]]
+; MAINVF4IC2_EPI4: br i1 [[CMP_ENTRY:%.*]], label %[[ITER_CHECK:.*]], label %[[EXIT:.*]], !prof [[PROF1:![0-9]+]]
+; MAINVF4IC2_EPI4: [[ITER_CHECK]]:
+; MAINVF4IC2_EPI4: br i1 [[MIN_ITERS_CHECK:%.*]], label %[[VEC_EPILOG_SCALAR_PH:.*]], label %[[VECTOR_SCEVCHECK:.*]], !prof [[PROF2:![0-9]+]]
+; MAINVF4IC2_EPI4: [[VECTOR_SCEVCHECK]]:
+; MAINVF4IC2_EPI4: br i1 [[TMP4:%.*]], label %[[VEC_EPILOG_SCALAR_PH]], label %[[VECTOR_MAIN_LOOP_ITER_CHECK:.*]], !prof [[PROF2]]
+; MAINVF4IC2_EPI4: [[VECTOR_MAIN_LOOP_ITER_CHECK]]:
+; MAINVF4IC2_EPI4: br i1 [[MIN_ITERS_CHECK1:%.*]], label %[[VEC_EPILOG_PH:.*]], label %[[VECTOR_PH:.*]], !prof [[PROF2]]
+; MAINVF4IC2_EPI4: [[VECTOR_PH]]:
+; MAINVF4IC2_EPI4: br label %[[VECTOR_BODY:.*]]
+; MAINVF4IC2_EPI4: [[VECTOR_BODY]]:
+; MAINVF4IC2_EPI4: br i1 [[TMP9:%.*]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !prof [[PROF3:![0-9]+]], !llvm.loop [[LOOP4:![0-9]+]]
+; MAINVF4IC2_EPI4: [[MIDDLE_BLOCK]]:
+; MAINVF4IC2_EPI4: br i1 [[CMP_N:%.*]], label %[[EXIT_LOOPEXIT:.*]], label %[[VEC_EPILOG_ITER_CHECK:.*]], !prof [[PROF7:![0-9]+]]
+; MAINVF4IC2_EPI4: [[VEC_EPILOG_ITER_CHECK]]:
+; MAINVF4IC2_EPI4: br i1 [[MIN_EPILOG_ITERS_CHECK:%.*]], label %[[VEC_EPILOG_SCALAR_PH]], label %[[VEC_EPILOG_PH]], !prof [[PROF8:![0-9]+]]
+; MAINVF4IC2_EPI4: [[VEC_EPILOG_PH]]:
+; MAINVF4IC2_EPI4: br label %[[VEC_EPILOG_VECTOR_BODY:.*]]
+; MAINVF4IC2_EPI4: [[VEC_EPILOG_VECTOR_BODY]]:
+; MAINVF4IC2_EPI4: br i1 [[TMP13:%.*]], label %[[VEC_EPILOG_MIDDLE_BLOCK:.*]], label %[[VEC_EPILOG_VECTOR_BODY]], !prof [[PROF9:![0-9]+]], !llvm.loop [[LOOP10:![0-9]+]]
+; MAINVF4IC2_EPI4: [[VEC_EPILOG_MIDDLE_BLOCK]]:
+; MAINVF4IC2_EPI4: br i1 [[CMP_N8:%.*]], label %[[EXIT_LOOPEXIT]], label %[[VEC_EPILOG_SCALAR_PH]], !prof [[PROF11:![0-9]+]]
+; MAINVF4IC2_EPI4: [[VEC_EPILOG_SCALAR_PH]]:
+; MAINVF4IC2_EPI4: br label %[[LOOP:.*]]
+; MAINVF4IC2_EPI4: [[LOOP]]:
+; MAINVF4IC2_EPI4: br i1 [[CMP_LOOP:%.*]], label %[[LOOP]], label %[[EXIT_LOOPEXIT]], !prof [[PROF12:![0-9]+]], !llvm.loop [[LOOP13:![0-9]+]]
+; MAINVF4IC2_EPI4: [[EXIT_LOOPEXIT]]:
+; MAINVF4IC2_EPI4: br label %[[EXIT]]
+; MAINVF4IC2_EPI4: [[EXIT]]:
;
-; CHECK: exit:
-; CHECK: ret void
-
-define void @f0(i8 %n, i32 %len, ptr %p) !prof !0 {
entry:
%cmp.entry = icmp sgt i32 %len, 0
br i1 %cmp.entry, label %loop, label %exit, !prof !1
@@ -72,11 +100,33 @@ exit:
!0 = !{!"function_entry_count", i64 13}
!1 = !{!"branch_weights", i32 12, i32 1}
!2 = !{!"branch_weights", i32 1234, i32 1}
-
-; CHECK: [[PROF_F0_ENTRY]] = !{!"branch_weights", i32 12, i32 1}
-; CHECK: [[PROF_F0_UNLIKELY]] = !{!"branch_weights", i32 1, i32 127}
-; CHECK: [[PROF_F0_VECTOR_BODY]] = !{!"branch_weights", i32 1, i32 307}
-; CHECK: [[PROF_F0_MIDDLE_BLOCKS]] = !{!"branch_weights", i32 1, i32 3}
-; CHECK: [[PROF_F0_VEC_EPILOGUE_SKIP]] = !{!"branch_weights", i32 4, i32 0}
-; CHECK: [[PROF_F0_VEC_EPILOG_VECTOR_BODY]] = !{!"branch_weights", i32 0, i32 0}
-; CHECK: [[PROF_F0_LOOP]] = !{!"branch_weights", i32 2, i32 1}
+;.
+; MAINVF4IC1_EPI4: [[PROF0]] = !{!"function_entry_count", i64 13}
+; MAINVF4IC1_EPI4: [[PROF1]] = !{!"branch_weights", i32 12, i32 1}
+; MAINVF4IC1_EPI4: [[PROF2]] = !{!"branch_weights", i32 1, i32 127}
+; MAINVF4IC1_EPI4: [[PROF3]] = !{!"branch_weights", i32 1, i32 307}
+; MAINVF4IC1_EPI4: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]], [[META6:![0-9]+]]}
+; MAINVF4IC1_EPI4: [[META5]] = !{!"llvm.loop.isvectorized", i32 1}
+; MAINVF4IC1_EPI4: [[META6]] = !{!"llvm.loop.unroll.runtime.disable"}
+; MAINVF4IC1_EPI4: [[PROF7]] = !{!"branch_weights", i32 1, i32 3}
+; MAINVF4IC1_EPI4: [[PROF8]] = !{!"branch_weights", i32 4, i32 0}
+; MAINVF4IC1_EPI4: [[PROF9]] = !{!"branch_weights", i32 0, i32 0}
+; MAINVF4IC1_EPI4: [[LOOP10]] = distinct !{[[LOOP10]], [[META5]], [[META6]]}
+; MAINVF4IC1_EPI4: [[PROF11]] = !{!"branch_weights", i32 2, i32 1}
+; MAINVF4IC1_EPI4: [[LOOP12]] = distinct !{[[LOOP12]], [[META5]]}
+;.
+; MAINVF4IC2_EPI4: [[PROF0]] = !{!"function_entry_count", i64 13}
+; MAINVF4IC2_EPI4: [[PROF1]] = !{!"branch_weights", i32 12, i32 1}
+; MAINVF4IC2_EPI4: [[PROF2]] = !{!"branch_weights", i32 1, i32 127}
+; MAINVF4IC2_EPI4: [[PROF3]] = !{!"branch_weights", i32 1, i32 153}
+; MAINVF4IC2_EPI4: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]], [[META6:![0-9]+]]}
+; MAINVF4IC2_EPI4: [[META5]] = !{!"llvm.loop.isvectorized", i32 1}
+; MAINVF4IC2_EPI4: [[META6]] = !{!"llvm.loop.unroll.runtime.disable"}
+; MAINVF4IC2_EPI4: [[PROF7]] = !{!"branch_weights", i32 1, i32 7}
+; MAINVF4IC2_EPI4: [[PROF8]] = !{!"branch_weights", i32 4, i32 0}
+; MAINVF4IC2_EPI4: [[PROF9]] = !{!"branch_weights", i32 0, i32 0}
+; MAINVF4IC2_EPI4: [[LOOP10]] = distinct !{[[LOOP10]], [[META5]], [[META6]]}
+; MAINVF4IC2_EPI4: [[PROF11]] = !{!"branch_weights", i32 1, i32 3}
+; MAINVF4IC2_EPI4: [[PROF12]] = !{!"branch_weights", i32 2, i32 1}
+; MAINVF4IC2_EPI4: [[LOOP13]] = distinct !{[[LOOP13]], [[META5]]}
+;.
LGTM with a few additional inline comments, thanks!
// FIXME: See test Transforms/LoopVectorize/branch-weights.ll. I don't
// think the MainLoopStep is correct.
- // FIXME: See test Transforms/LoopVectorize/branch-weights.ll. I don't
- // think the MainLoopStep is correct.
+ // FIXME: UF and VF are the same as EPI.EpilogueUF and EPI.EpilogueVF, so MainLoopStep is the same as EpilogueLoopStep. See test Transforms/LoopVectorize/branch-weights.ll.
Yes, they definitely are the same, as we are in EpilogueVectorizerEpilogueLoop.
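To make the arithmetic behind the FIXME concrete, here is a minimal C++ sketch. It assumes the iteration count left over after the main vector loop is uniformly distributed in [0, MainLoopStep), which is the reasoning given in the test comment. The helper name `epilogueSkipWeights` is hypothetical and is not part of LoopVectorize.cpp; it only models the skip/enter ratio for the `vec.epilog.iter.check` branch.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models the branch weights for the
// check that decides whether to skip the vector epilogue.
// Assumption: the remaining scalar iteration count is uniformly distributed
// in [0, MainLoopStep).
// Returns {weight of skipping the epilogue, weight of entering it}.
std::array<uint32_t, 2> epilogueSkipWeights(uint32_t MainLoopStep,
                                            uint32_t EpilogueLoopStep) {
  // Remainders below EpilogueLoopStep cannot fill one epilogue iteration,
  // so they bypass the vector epilogue.
  uint32_t SkipCount = std::min(MainLoopStep, EpilogueLoopStep);
  return {SkipCount, MainLoopStep - SkipCount};
}
```

With both steps equal to 4 the sketch yields {4, 0}, which matches the `[[PROF8]]` weights currently checked under both prefixes. With a main loop step of 8 (VF=4, interleave 2) and an epilogue step of 4 it yields {4, 4}, the value the FIXME in the test expects for MAINVF4IC2_EPI4.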
; MAINVF4IC1_EPI4: [[VEC_EPILOG_PH]]:
; MAINVF4IC1_EPI4: br label %[[VEC_EPILOG_VECTOR_BODY:.*]]
; MAINVF4IC1_EPI4: [[VEC_EPILOG_VECTOR_BODY]]:
; MAINVF4IC1_EPI4: br i1 [[TMP12:%.*]], label %[[VEC_EPILOG_MIDDLE_BLOCK:.*]], label %[[VEC_EPILOG_VECTOR_BODY]], !prof [[PROF9:![0-9]+]], !llvm.loop [[LOOP10:![0-9]+]]
I think it would be helpful if we had at least the exit condition for the vector loop and the induction increment, but I'm not sure if it would be easy to include.
It's easy enough to include the exit condition by adding a filter for icmp.
df3b170 to d354a57
This patch adds a test that shows incorrect branch weights being set in function
EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck