-
Notifications
You must be signed in to change notification settings - Fork 15k
[InstCombine] Strip leading zero indices from GEP #155415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
GEPs are often in the form `gep [N x %T], ptr %p, i64 0, i64 %idx`. Canonicalize these to `gep %T, ptr %p, i64 %idx`. This enables transforms that only support one GEP index to work and improves CSE.
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang Author: Nikita Popov (nikic) ChangesGEPs are often in the form This enables transforms that only support one GEP index to work and improves CSE. Various transforms were recently hardened to make sure they still work without the leading index. #155404 is the last problematic transform I'm aware of. Patch is 213.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/155415.diff 56 Files Affected:
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 59e1b134850a9..cb23efdb8f263 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -72,7 +72,7 @@ struct anon_struct {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont3:
// SANITIZE-WITH-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// SANITIZE-WITH-ATTR-NEXT: store i32 [[VAL]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA4:![0-9]+]]
// SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -81,7 +81,7 @@ struct anon_struct {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 [[VAL]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2:![0-9]+]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -90,7 +90,7 @@ struct anon_struct {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 [[VAL]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2:![0-9]+]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -99,7 +99,7 @@ struct anon_struct {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 [[VAL]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2:![0-9]+]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -120,7 +120,7 @@ void test1(struct annotated *p, int index, int val) {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont6:
// SANITIZE-WITH-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = tail call i32 @llvm.smax.i32(i32 [[COUNTED_BY_LOAD]], i32 0)
// SANITIZE-WITH-ATTR-NEXT: [[CONV:%.*]] = shl i32 [[TMP2]], 2
// SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA4]]
@@ -134,7 +134,7 @@ void test1(struct annotated *p, int index, int val) {
// NO-SANITIZE-WITH-ATTR-NEXT: [[COUNTED_BY_LOAD:%.*]] = load i32, ptr [[COUNTED_BY_GEP]], align 4
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = tail call i32 @llvm.smax.i32(i32 [[COUNTED_BY_LOAD]], i32 0)
// NO-SANITIZE-WITH-ATTR-NEXT: [[CONV:%.*]] = shl i32 [[TMP0]], 2
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV]], ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -142,7 +142,7 @@ void test1(struct annotated *p, int index, int val) {
// SANITIZE-WITHOUT-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -150,7 +150,7 @@ void test1(struct annotated *p, int index, int val) {
// NO-SANITIZE-WITHOUT-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -243,7 +243,7 @@ size_t test2_bdos_cast(struct annotated *p) {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont3:
// SANITIZE-WITH-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// SANITIZE-WITH-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA4]]
// SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -251,7 +251,7 @@ size_t test2_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -259,7 +259,7 @@ size_t test2_bdos_cast(struct annotated *p) {
// SANITIZE-WITHOUT-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -267,7 +267,7 @@ size_t test2_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITHOUT-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[INDEX]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[INDEX]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -350,7 +350,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// SANITIZE-WITH-ATTR-NEXT: [[RESULT:%.*]] = add i32 [[FLEXIBLE_ARRAY_MEMBER_SIZE]], 244
// SANITIZE-WITH-ATTR-NEXT: [[TMP4:%.*]] = and i32 [[RESULT]], 252
// SANITIZE-WITH-ATTR-NEXT: [[CONV2:%.*]] = select i1 [[TMP3]], i32 [[TMP4]], i32 0
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV2]], ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA4]]
// SANITIZE-WITH-ATTR-NEXT: [[DOTNOT81:%.*]] = icmp eq i32 [[DOTCOUNTED_BY_LOAD]], 3
// SANITIZE-WITH-ATTR-NEXT: br i1 [[DOTNOT81]], label [[HANDLER_OUT_OF_BOUNDS18:%.*]], label [[CONT19:%.*]], !prof [[PROF8:![0-9]+]], !nosanitize [[META2]]
@@ -370,7 +370,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// SANITIZE-WITH-ATTR-NEXT: [[RESULT25:%.*]] = add i32 [[FLEXIBLE_ARRAY_MEMBER_SIZE]], 240
// SANITIZE-WITH-ATTR-NEXT: [[TMP7:%.*]] = and i32 [[RESULT25]], 252
// SANITIZE-WITH-ATTR-NEXT: [[CONV27:%.*]] = select i1 [[TMP6]], i32 [[TMP7]], i32 0
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX36:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM31]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX36:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[IDXPROM31]]
// SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV27]], ptr [[ARRAYIDX36]], align 4, !tbaa [[TBAA4]]
// SANITIZE-WITH-ATTR-NEXT: [[IDXPROM42:%.*]] = sext i32 [[FAM_IDX]] to i64
// SANITIZE-WITH-ATTR-NEXT: [[DOTCOUNTED_BY_LOAD44:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
@@ -389,7 +389,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// SANITIZE-WITH-ATTR-NEXT: tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB10:[0-9]+]], i64 [[IDXPROM60]]) #[[ATTR8]], !nosanitize [[META2]]
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont67:
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX65:%.*]] = getelementptr inbounds nuw [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM60]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX65:%.*]] = getelementptr inbounds nuw i32, ptr [[ARRAY]], i64 [[IDXPROM60]]
// SANITIZE-WITH-ATTR-NEXT: [[COUNT50:%.*]] = sext i32 [[DOTCOUNTED_BY_LOAD44]] to i64
// SANITIZE-WITH-ATTR-NEXT: [[TMP10:%.*]] = sub nsw i64 [[COUNT50]], [[IDXPROM42]]
// SANITIZE-WITH-ATTR-NEXT: [[TMP11:%.*]] = tail call i64 @llvm.smax.i64(i64 [[TMP10]], i64 0)
@@ -411,7 +411,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = and i32 [[RESULT]], 252
// NO-SANITIZE-WITH-ATTR-NEXT: [[CONV1:%.*]] = select i1 [[TMP0]], i32 [[TMP1]], i32 0
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV1]], ptr [[ARRAYIDX3]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[COUNTED_BY_LOAD7:%.*]] = load i32, ptr [[COUNTED_BY_GEP]], align 4
// NO-SANITIZE-WITH-ATTR-NEXT: [[FLEXIBLE_ARRAY_MEMBER_SIZE9:%.*]] = shl i32 [[COUNTED_BY_LOAD7]], 2
@@ -419,9 +419,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP2:%.*]] = icmp sgt i32 [[COUNTED_BY_LOAD7]], 3
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP3:%.*]] = and i32 [[RESULT10]], 252
// NO-SANITIZE-WITH-ATTR-NEXT: [[CONV12:%.*]] = select i1 [[TMP2]], i32 [[TMP3]], i32 0
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ADD:%.*]] = add nsw i32 [[INDEX]], 1
-// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM14:%.*]] = sext i32 [[ADD]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM14]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX15:%.*]] = getelementptr i8, ptr [[ARRAYIDX3]], i64 4
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV12]], ptr [[ARRAYIDX15]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM17:%.*]] = sext i32 [[FAM_IDX]] to i64
// NO-SANITIZE-WITH-ATTR-NEXT: [[COUNTED_BY_LOAD20:%.*]] = load i32, ptr [[COUNTED_BY_GEP]], align 4
@@ -434,9 +432,7 @@ size_t test3_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP8:%.*]] = shl i32 [[DOTTR]], 2
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], 252
// NO-SANITIZE-WITH-ATTR-NEXT: [[CONV26:%.*]] = select i1 [[TMP7]], i32 [[TMP9]], i32 0
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ADD28:%.*]] = add nsw i32 [[INDEX]], 2
-// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM29:%.*]] = sext i32 [[ADD28]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX30:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM29]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX30:%.*]] = getelementptr i8, ptr [[ARRAYIDX3]], i64 8
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 [[CONV26]], ptr [[ARRAYIDX30]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -445,15 +441,11 @@ size_t test3_bdos_cast(struct annotated *p) {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX5:%.*]] = getelementptr i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX5]], align 4, !tbaa [[TBAA2]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ADD:%.*]] = add nsw i32 [[INDEX]], 1
-// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM17:%.*]] = sext i32 [[ADD]] to i64
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX18:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM17]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX18:%.*]] = getelementptr i8, ptr [[ARRAYIDX5]], i64 4
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX18]], align 4, !tbaa [[TBAA2]]
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ADD31:%.*]] = add nsw i32 [[INDEX]], 2
-// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM32:%.*]] = sext i32 [[ADD31]] to i64
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX33:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM32]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX33:%.*]] = getelementptr i8, ptr [[ARRAYIDX5]], i64 8
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX33]], align 4, !tbaa [[TBAA2]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -462,15 +454,11 @@ size_t test3_bdos_cast(struct annotated *p) {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr [[ARRAY]], i64 [[IDXPROM]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX3]], align 4, !tbaa [[TBAA2]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ADD:%.*]] = add nsw i32 [[INDEX]], 1
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM9:%.*]] = sext i32 [[ADD]] to i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM9]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX10:%.*]] = getelementptr i8, ptr [[ARRAYIDX3]], i64 4
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA2]]
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ADD17:%.*]] = add nsw i32 [[INDEX]], 2
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM18:%.*]] = sext i32 [[ADD17]] to i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX19:%.*]] = getelementptr inbounds [0 x i32], ptr [[ARRAY]], i64 0, i64 [[IDXPROM18]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX19:%.*]] = getelementptr i8, ptr [[ARRAYIDX3]], i64 8
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 255, ptr [[ARRAYIDX19]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -632,7 +620,7 @@ size_t test4_bdos_cast2(struct annotated *p, int index) {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont3:
// SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 16
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[TMP1]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 [[IDXPROM]]
// SANITIZE-WITH-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA4]]
// SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -641,7 +629,7 @@ size_t test4_bdos_cast2(struct annotated *p, int index) {
// NO-SANITIZE-WITH-ATTR-NEXT: entry:
// NO-SANITIZE-WITH-ATTR-NEXT: [[TMP0:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 16
// NO-SANITIZE-WITH-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[TMP0]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[IDXPROM]]
// NO-SANITIZE-WITH-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITH-ATTR-NEXT: ret void
//
@@ -650,7 +638,7 @@ size_t test4_bdos_cast2(struct annotated *p, int index) {
// SANITIZE-WITHOUT-ATTR-NEXT: entry:
// SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 16
// SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[TMP0]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[IDXPROM]]
// SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -659,7 +647,7 @@ size_t test4_bdos_cast2(struct annotated *p, int index) {
// NO-SANITIZE-WITHOUT-ATTR-NEXT: entry:
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[TMP0:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 16
// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[TMP0]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[IDXPROM]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: store i32 -1, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]]
// NO-SANITIZE-WITHOUT-ATTR-NEXT: ret void
//
@@ -704,7 +692,7 @@ size_t test5_bdos(struct anon_struct *p) {
// SANITIZE-WITH-ATTR-NEXT: unreachable, !nosanitize [[META2]]
// SANITIZE-WITH-ATTR: cont6:
// SANITIZE-WITH-ATTR-NEXT: [[TMP1:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 16
-// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [0 x i32], ptr [[TMP1]], i64 0, i64 [[IDXPROM]]
+// SANITIZE-WITH-ATTR-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 [[IDXPROM]]
// SANITIZE-WITH-ATTR-NEXT: [[FLEXIBLE_ARRAY_MEMBER_SIZE:%.*]] = shl nuw i64 [[COUNTED_BY_LOAD]], 2
// SANITIZE-WITH-ATTR-...
[truncated]
|
Oh - I was about to put up fd286bd. That was only going to try with array types, not with vectors too, and I believe the only other semi-useful part was some DA tests I had tried adding without array indices (which looked fine for all the ones I tested). The differences I have seen have mostly been extra hoisting causing extra unrolling. I would be interested in if @dtcxzyw finds anything more. |
@zyw-bot csmith-fuzz |
Based on llvm-opt-benchmark, the main impact this has is enabling the gep + add -> gep + gep in many more cases (there are a few other transforms that only support single index gep, but they're less significant). This mostly has positive effects (as many gep + adds become a constant offset gep where the base was CSEd). The main issue I saw is that we now sometimes fail to eliminate null checks of the form |
I checked whether we can mitigate this by folding null checks before doing this transform, but at least in the one case I looked at, the null check only gets inlined after this fold has already happened. |
I also checked whether not going this transform for constant negative offsets might make sense, but it turns out that it's pretty widely beneficial even in that case. So I don't think there's anything we can really do about those regressions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
It appears that this broke our libc on GPU buildbot: https://lab.llvm.org/buildbot/#/builders/10/builds/12560 A failing configuration can be reproduced with |
This updates tests/codegen-llvm/issues/issue-118306.rs to pass also after llvm/llvm-project#155415
@jplehr I don't have an AMDGPU environment to debug this. Is it possible to reduce this further? |
Yeah, I'm sure it is. Is IR dump what's needed here, either original or before some specific pass? |
Not sure whether it's going to be sufficient, but the original IR dump would be a good starting point at least. |
Adjust issue-118306.rs test after LLVM change This updates tests/codegen-llvm/issues/issue-118306.rs to pass also after llvm/llvm-project#155415
Rollup merge of #146116 - zmodem:issue_118306_fix, r=nikic Adjust issue-118306.rs test after LLVM change This updates tests/codegen-llvm/issues/issue-118306.rs to pass also after llvm/llvm-project#155415
I finally got some IR and maybe some other info. I attached the IR after the pass of LCSSA which runs before GVN. Let me know how helpful that is or what I can do to help better with this issue. |
I don't see anything immediately problematic looking at that function, https://godbolt.org/z/34fq6Erz8. But I'm not sure overall, cc @arsenm. |
Just a guess but possibly related to @ritter-x2a's work with using GEP flags and fixing incorrect addressing mode matching |
GVN doesn't seem to do anything particularly interesting there. I tried running this through llc to see how the codegen changes, but that gives:
Does this need extra llc flags beyond the triple? |
This IR is not suitable for running through codegen again. We cannot deal with IR that has already had control flow intrinsics inserted. Reproducers should be IR at the start of codegen, not the end of the codegen IR pipeline |
Oh, I see. I thought this was a GVN run in the middle-end, but it's one in the backend. As a side note, I noticed this comment: llvm-project/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp Lines 1380 to 1393 in 2f5500e
This looks very outdated. EarlyCSE supports both commutative instructions and poison flag intersection. You may be able to save some compile-time by using EarlyCSE instead of GVN there... |
If there is value in it, I can see to extract the IR earlier in the pipeline and attach that here. |
Yes, need the IR at the start of codegen |
Attached is the IR from pre codegen. Apologies that this may again not be what is needed. I always find the clang-linker-wrapper thing a bit confusing. |
I reverted this patch locally as this restores the tests back to passing and then retrieved the IR from the same point in the pipeline -- attached below. Good case, line 189
Bad case, line 189
I do not know whether this is any significant though. Mostly caught my attention as a difference and that the integer literal that went away ( |
I looked at the diff between those two, and all the changes look correct to me. The only real difference is related to __cxa_thread_finalize, which is probably just a difference in baselines (it was added yesterday). |
There are more diffs in the llc output, but I find it hard to interpret those without more knowledge about AMDGPU. |
I think the relevant part of the diff is probably: --- libc-gpu-precodegen-good.txt 2025-09-04 14:13:56.751818563 +0100
+++ libc-gpu-precodegen.txt 2025-09-04 14:13:49.402694527 +0100
@@ -719 +720 @@
- %call.i572 = tail call fastcc noundef zeroext i1 @_ZN22__llvm_libc_22_0_0_git7testing8internal4testIiEEbPNS1_10RunContextENS0_8TestCondET_S6_PKcS8_NS1_8LocationE(ptr noundef %2, i32 noundef 240, i32 noundef 240, ptr noundef addrspacecast (ptr addrspace(4) @.str.10 to ptr), ptr noundef addrspacecast (ptr addrspace(4) @.str.24 to ptr), i32 133) #28
+ %call.i572 = tail call fastcc noundef zeroext i1 @_ZN22__llvm_libc_22_0_0_git7testing8internal4testIiEEbPNS1_10RunContextENS0_8TestCondET_S6_PKcS8_NS1_8LocationE(ptr noundef %2, i32 noundef poison, i32 noundef 240, ptr noundef addrspacecast (ptr addrspace(4) @.str.10 to ptr), ptr noundef addrspacecast (ptr addrspace(4) @.str.24 to ptr), i32 133) #27 Note that in the good case the first three arguments to the call are "%2, 240, 240" while in the bad case they are "%2, poison, 240". |
Ah yes, good point. That means the issue was already introduced at an earlier point in the middle-end pipeline, not during codegen. |
I checked out I have attached all IR files (except precodegen), but for more insight at what exact points in the pipeline they are produced @jhuber6 needs to help me. These are produced with 01-preopt-bad.txt |
This diff looks suspicious to me in function --- 01-preopt-good.txt 2025-09-05 10:39:57.226224451 +0100
+++ 01-preopt-bad.txt 2025-09-05 10:39:56.149207707 +0100
@@ -11112,10 +11113,10 @@
if.then4: ; preds = %if.end
- %sub7 = add nsw i64 %conv, -1
- %4 = extractelement <4 x i8> <i8 0, i8 -64, i8 -32, i8 -16>, i64 %sub7
- %5 = load i32, ptr %0, align 4, !tbaa !58
+ %4 = add nuw nsw i64 %conv, 4294967295
+ %5 = extractelement <4 x i8> <i8 0, i8 -64, i8 -32, i8 -16>, i64 %4
+ %6 = load i32, ptr %0, align 4, !tbaa !58
%sh_prom = trunc nuw nsw i64 %mul to i32
- %shr = lshr i32 %5, %sh_prom
+ %shr = lshr i32 %6, %sh_prom
%shr.tr = trunc i32 %shr to i8
- %or.narrow = or i8 %4, %shr.tr
- %6 = zext i8 %or.narrow to i32
+ %or.narrow = or i8 %5, %shr.tr
+ %7 = zext i8 %or.narrow to i32
br label %if.end15 I am guessing that the |
Given this is in the |
Yes, you're currently dumping the pre-LTO IR, but we'd want the full optimized one produced by the compiler. (Should be possible to get it with Does anyone know which transform is capable of converting a load from a constant global into a dynamic extract from a vector constant? I tried and failed to reproduce that behavior on simpler test cases. |
GEPs are often in the form
gep [N x %T], ptr %p, i64 0, i64 %idx
. Canonicalize these togep %T, ptr %p, i64 %idx
.This enables transforms that only support one GEP index to work and improves CSE.
Various transforms were recently hardened to make sure they still work without the leading index. #155404 is the last problematic transform I'm aware of.