[Delinearization] Extract array dimensions from global declarations #175158
base: main
@llvm/pr-subscribers-llvm-analysis
Author: Sjoerd Meijer (sjoerdmeijer)
Changes
This is extracted from #156342, which implements multiple things at the same time. This patch focuses on one thing only: extracting array bounds from global declarations to help loop interchange with our motivating example: test/Transforms/LoopInterchange/large-nested-4d.ll.
With this patch, array bounds are extracted from a global variable declaration. The result is that Delinearization now succeeds and the different subscripts are recognised, leading to improved DA results.
A few remarks on this approach to extract bounds from global declarations:
Next steps:
Patch is 52.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/175158.diff
25 Files Affected:
diff --git a/llvm/include/llvm/Analysis/Delinearization.h b/llvm/include/llvm/Analysis/Delinearization.h
index d48b57cc4284f..a616fbb03e6f4 100644
--- a/llvm/include/llvm/Analysis/Delinearization.h
+++ b/llvm/include/llvm/Analysis/Delinearization.h
@@ -44,7 +44,15 @@ void collectParametricTerms(ScalarEvolution &SE, const SCEV *Expr,
/// (third step of delinearization).
void computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
SmallVectorImpl<const SCEV *> &Subscripts,
- SmallVectorImpl<const SCEV *> &Sizes);
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize = nullptr);
+
+
+bool delinearizeUsingArrayInfo(ScalarEvolution &SE, const SCEV *AccessFn,
+ SmallVectorImpl<const SCEV *> &Subscripts,
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize);
+
/// Split this SCEVAddRecExpr into two vectors of SCEVs representing the
/// subscripts and sizes of an array access.
///
diff --git a/llvm/lib/Analysis/Delinearization.cpp b/llvm/lib/Analysis/Delinearization.cpp
index 4263c6f49993e..cd0481c939a28 100644
--- a/llvm/lib/Analysis/Delinearization.cpp
+++ b/llvm/lib/Analysis/Delinearization.cpp
@@ -21,6 +21,7 @@
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PassManager.h"
@@ -345,7 +346,8 @@ void llvm::findArrayDimensions(ScalarEvolution &SE,
void llvm::computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
SmallVectorImpl<const SCEV *> &Subscripts,
- SmallVectorImpl<const SCEV *> &Sizes) {
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize) {
// Early exit in case this SCEV is not an affine multivariate function.
if (Sizes.empty())
return;
@@ -397,9 +399,13 @@ void llvm::computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
Subscripts.push_back(R);
}
- // Also push in last position the remainder of the last division: it will be
- // the access function of the innermost dimension.
- Subscripts.push_back(Res);
+ if (!Res->isZero()) {
+ // This is only needed when the outermost array size is not known. Res = 0
+ // when the outermost array dimension is known, as for example when reading
+ // array sizes from a local or global declaration.
+ Subscripts.push_back(Res);
+ LLVM_DEBUG(dbgs() << "Subscripts push_back Res: " << *Res << "\n");
+ }
std::reverse(Subscripts.begin(), Subscripts.end());
@@ -411,6 +417,139 @@ void llvm::computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
});
}
+// Extract array dimensions from global variable declarations and return true
+// if array dimensions were successfully extracted.
+//
+// TODO:
+// - This can easily be extended to also extract dimensions from alloca
+// instructions.
+// - Global variable declarations might be subject to LLVM IR simplifications,
+// i.e. dimenions might be omitted. Adapting this function to that should
+// be easy as we would need to emit and pick up meta-data (or something
+// similar).
+//
+static bool
+extractArrayInfoFromGlobal(ScalarEvolution &SE, Value *BasePtr,
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize) {
+ // Clear output vector.
+ Sizes.clear();
+
+ LLVM_DEBUG(
+ dbgs() << "extractArrayInfoFromGlobal called with BasePtr: "
+ << *BasePtr << "\n");
+
+ // Distinguish between simple array accesses and complex pointer arithmetic,
+ // i.e. check if this is a simple array access pattern:
+ // GEP [N x T]* @array, 0, idx
+ // This represents direct indexing like array[i], which should use array
+ // dimensions.
+ if (auto *GEP = dyn_cast<GetElementPtrInst>(BasePtr)) {
+ if (GEP->getNumIndices() == 2) {
+ auto *FirstIdx = dyn_cast<ConstantInt>(GEP->getOperand(1));
+ if (FirstIdx && FirstIdx->isZero()) {
+ // Simple array access: extract dimensions from the underlying array
+ // type
+ Value *Source = GEP->getPointerOperand()->stripPointerCasts();
+ return extractArrayInfoFromGlobal(SE, Source, Sizes, ElementSize);
+ }
+ }
+ // Complex GEPs like (&array[offset])[index] represent pointer arithmetic,
+ // not simple array indexing. These should be handled by parametric
+ // delinearization to preserve the linearized byte-offset semantics rather
+ // than treating them as multidimensional array accesses.
+ return false;
+ }
+
+ // Check if BasePtr is a global variable (todo: check also alloca's).
+ Type *ElementType = nullptr;
+ if (auto *GV = dyn_cast<GlobalVariable>(BasePtr)) {
+ ElementType = GV->getValueType();
+ LLVM_DEBUG(dbgs() << "Found global variable with type: " << *ElementType
+ << "\n");
+ } else {
+ LLVM_DEBUG(dbgs() << "No global found for base pointer\n");
+ return false;
+ }
+
+ // Extract dimensions from nested array types.
+ Type *I64Ty = Type::getInt64Ty(SE.getContext());
+
+ while (auto *ArrayTy = dyn_cast<ArrayType>(ElementType)) {
+ uint64_t Size = ArrayTy->getNumElements();
+ const SCEV *SizeSCEV = SE.getConstant(I64Ty, Size);
+ Sizes.push_back(SizeSCEV);
+ ElementType = ArrayTy->getElementType();
+ LLVM_DEBUG(dbgs() << " Found array dimension: " << Size << "\n");
+ }
+
+ if (Sizes.empty()) {
+ LLVM_DEBUG(dbgs() << "No array dimensions found in type\n");
+ return false;
+ }
+
+ // Add element size as the last element for computeAccessFunctions algorithm.
+ Sizes.push_back(ElementSize);
+
+ LLVM_DEBUG({
+ dbgs() << "Extracted array info from global for base pointer "
+ << *BasePtr << "\n";
+ dbgs() << "Dimensions: ";
+ for (const SCEV *Size : Sizes)
+ dbgs() << *Size << " ";
+ dbgs() << "\n";
+ });
+
+ return true;
+}
+
+// Try to infer array information and dimensions. Currently, we only extract
+// bounds from static global variable declarations. Work in progress is to
+// make this more general by using type information emitted by the front-end.
+bool llvm::delinearizeUsingArrayInfo(ScalarEvolution &SE, const SCEV *AccessFn,
+ SmallVectorImpl<const SCEV *> &Subscripts,
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize) {
+ // Clear output vectors.
+ Subscripts.clear();
+ Sizes.clear();
+
+ const SCEVUnknown *BasePointer =
+ dyn_cast<SCEVUnknown>(SE.getPointerBase(AccessFn));
+ if (!BasePointer) {
+ LLVM_DEBUG(dbgs() << "No base pointer for AccessFn: " << *AccessFn << "\n");
+ return false;
+ }
+
+ Value *BasePtr = BasePointer->getValue();
+
+ // Extract array dimensions from global declarations.
+ if (!extractArrayInfoFromGlobal(SE, BasePtr, Sizes, ElementSize))
+ return false;
+
+ // Get the full SCEV expression and subtract the base pointer to get
+ // offset-only expression.
+ const SCEV *Expr = SE.getMinusSCEV(AccessFn, BasePointer);
+
+ computeAccessFunctions(SE, Expr, Subscripts, Sizes, ElementSize);
+ if (Sizes.empty() || Subscripts.empty())
+ return false;
+
+ // Validate dimension consistency: subscripts should match array dimensions
+ // (Sizes includes element size as last element, so array dimensions =
+ // Sizes.size() - 1)
+ unsigned ArrayDims = Sizes.size() - 1;
+ if (Subscripts.size() != ArrayDims) {
+ LLVM_DEBUG(
+ dbgs() << "delinearizeUsingArrayInfo: Dimension mismatch - "
+ << Subscripts.size() << " subscripts for " << ArrayDims
+ << " array dimensions. Falling back to parametric method.\n");
+ return false;
+ }
+
+ return true;
+}
+
/// Splits the SCEV into two vectors of SCEVs representing the subscripts and
/// sizes of an array access. Returns the remainder of the delinearization that
/// is the offset start of the array. The SCEV->delinearize algorithm computes
@@ -468,6 +607,16 @@ void llvm::delinearize(ScalarEvolution &SE, const SCEV *Expr,
Subscripts.clear();
Sizes.clear();
+ if (delinearizeUsingArrayInfo(SE, Expr, Subscripts, Sizes, ElementSize))
+ return;
+
+ LLVM_DEBUG(dbgs() << "delinearize falling back to parametric method\n");
+
+ // Fall back to parametric delinearization.
+ if (const SCEVUnknown *BasePointer =
+ dyn_cast<SCEVUnknown>(SE.getPointerBase(Expr)))
+ Expr = SE.getMinusSCEV(Expr, BasePointer);
+
// First step: collect parametric terms.
SmallVector<const SCEV *, 4> Terms;
collectParametricTerms(SE, Expr, Terms);
@@ -826,7 +975,6 @@ void printDelinearization(raw_ostream &O, Function *F, LoopInfo *LI,
// Do not delinearize if we cannot find the base pointer.
if (!BasePointer)
break;
- AccessFn = SE->getMinusSCEV(AccessFn, BasePointer);
O << "\n";
O << "Inst:" << Inst << "\n";
@@ -835,8 +983,7 @@ void printDelinearization(raw_ostream &O, Function *F, LoopInfo *LI,
SmallVector<const SCEV *, 3> Subscripts, Sizes;
auto IsDelinearizationFailed = [&]() {
- return Subscripts.size() == 0 || Sizes.size() == 0 ||
- Subscripts.size() != Sizes.size();
+ return Subscripts.empty() || Sizes.empty();
};
delinearize(*SE, AccessFn, Subscripts, Sizes, SE->getElementSize(&Inst));
@@ -847,26 +994,37 @@ void printDelinearization(raw_ostream &O, Function *F, LoopInfo *LI,
SE->getElementSize(&Inst));
}
- if (IsDelinearizationFailed()) {
- O << "failed to delinearize\n";
- continue;
- }
+ if (IsDelinearizationFailed()) {
+ O << "failed to delinearize\n";
+ continue;
+ }
+
+ O << "Base offset: " << *BasePointer << "\n";
+ O << "ArrayDecl";
+ // Print [Unknown] when the outermost dimension of the array is not known.
+ // Sizes[NumSizes - 1] is the array element size.
+ int NumSubscripts = Subscripts.size();
+ int NumSizes = Sizes.size();
+ if (NumSizes == NumSubscripts)
+ O << "[UnknownSize]";
+
+ // Handle different size relationships between Subscripts and Sizes.
+ if (NumSizes > 0) {
+ // Print array dimensions (all but the last, which is element size).
+ for (const SCEV *Size : ArrayRef(Sizes).drop_back())
+ O << "[" << *Size << "]";
+
+ // Print element size (last element in Sizes array).
+ O << " with elements of " << *Sizes[NumSizes - 1] << " bytes.\n";
+ } else {
+ O << " unknown sizes.\n";
+ }
+
+ O << "ArrayRef";
+ for (int i = 0; i < NumSubscripts; i++)
+ O << "[" << *Subscripts[i] << "]";
+ O << "\n";
- O << "Base offset: " << *BasePointer << "\n";
- O << "ArrayDecl[UnknownSize]";
- int Size = Subscripts.size();
- for (int i = 0; i < Size - 1; i++)
- O << "[" << *Sizes[i] << "]";
- O << " with elements of " << *Sizes[Size - 1] << " bytes.\n";
-
- O << "ArrayRef";
- for (int i = 0; i < Size; i++)
- O << "[" << *Subscripts[i] << "]";
- O << "\n";
-
- bool IsValid = validateDelinearizationResult(*SE, Sizes, Subscripts);
- O << "Delinearization validation: " << (IsValid ? "Succeeded" : "Failed")
- << "\n";
}
}
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 23820853e6fee..b6ee848ebd605 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3363,9 +3363,9 @@ bool DependenceInfo::tryDelinearizeFixedSize(
const SCEV *ElemSize = SE->getElementSize(Src);
assert(ElemSize == SE->getElementSize(Dst) && "Different element sizes");
SmallVector<const SCEV *, 4> SrcSizes, DstSizes;
- if (!delinearizeFixedSizeArray(*SE, SE->removePointerBase(SrcAccessFn),
+ if (!delinearizeFixedSizeArray(*SE, SrcAccessFn /*SE->removePointerBase(SrcAccessFn)*/,
SrcSubscripts, SrcSizes, ElemSize) ||
- !delinearizeFixedSizeArray(*SE, SE->removePointerBase(DstAccessFn),
+ !delinearizeFixedSizeArray(*SE, DstAccessFn /*SE->removePointerBase(DstAccessFn)*/,
DstSubscripts, DstSizes, ElemSize))
return false;
diff --git a/llvm/test/Analysis/Delinearization/a.ll b/llvm/test/Analysis/Delinearization/a.ll
index 5d2d4dc29206e..55ca1c70d0a86 100644
--- a/llvm/test/Analysis/Delinearization/a.ll
+++ b/llvm/test/Analysis/Delinearization/a.ll
@@ -11,11 +11,10 @@
define void @foo(i64 %n, i64 %m, i64 %o, ptr nocapture %A) #0 {
; CHECK-LABEL: 'foo'
; CHECK-NEXT: Inst: store i32 1, ptr %arrayidx11.us.us, align 4
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}(28 + (4 * (-4 + (3 * %m)) * %o)),+,(8 * %m * %o)}<%for.i>,+,(12 * %o)}<%for.j>,+,20}<%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}(28 + (4 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,(12 * %o)}<%for.j>,+,20}<%for.k>
; CHECK-NEXT: Base offset: %A
; CHECK-NEXT: ArrayDecl[UnknownSize][%m][%o] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{3,+,2}<nuw><%for.i>][{-4,+,3}<nw><%for.j>][{7,+,5}<nw><%for.k>]
-; CHECK-NEXT: Delinearization validation: Failed
;
entry:
%cmp32 = icmp sgt i64 %n, 0
diff --git a/llvm/test/Analysis/Delinearization/byte_offset.ll b/llvm/test/Analysis/Delinearization/byte_offset.ll
index 743dcfcca6400..b17082dd3e31a 100644
--- a/llvm/test/Analysis/Delinearization/byte_offset.ll
+++ b/llvm/test/Analysis/Delinearization/byte_offset.ll
@@ -13,7 +13,7 @@
define void @foo(ptr %A, i64 %i2, i64 %arg, i1 %c) {
; CHECK-LABEL: 'foo'
; CHECK-NEXT: Inst: store float 0.000000e+00, ptr %arrayidx, align 4
-; CHECK-NEXT: AccessFunction: ({0,+,%i2}<%outer.loop> + %unknown)
+; CHECK-NEXT: AccessFunction: ({%A,+,%i2}<%outer.loop> + %unknown)
; CHECK-NEXT: failed to delinearize
;
entry:
diff --git a/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll b/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
index 7e5c5142dccbc..911f109272c16 100644
--- a/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
+++ b/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
@@ -7,18 +7,16 @@ target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
define void @mat_mul(ptr %C, ptr %A, ptr %B, i64 %N) !kernel_arg_addr_space !2 !kernel_arg_access_qual !3 !kernel_arg_type !4 !kernel_arg_base_type !4 !kernel_arg_type_qual !5 {
; CHECK-LABEL: 'mat_mul'
; CHECK-NEXT: Inst: %tmp = load float, ptr %arrayidx, align 4
-; CHECK-NEXT: AccessFunction: {(4 * %N * %call),+,4}<%for.inc>
+; CHECK-NEXT: AccessFunction: {((4 * %N * %call) + %A),+,4}<%for.inc>
; CHECK-NEXT: Base offset: %A
; CHECK-NEXT: ArrayDecl[UnknownSize][%N] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[%call][{0,+,1}<nuw><nsw><%for.inc>]
-; CHECK-NEXT: Delinearization validation: Failed
; CHECK-EMPTY:
; CHECK-NEXT: Inst: %tmp5 = load float, ptr %arrayidx4, align 4
-; CHECK-NEXT: AccessFunction: {(4 * %call1),+,(4 * %N)}<%for.inc>
+; CHECK-NEXT: AccessFunction: {((4 * %call1) + %B),+,(4 * %N)}<%for.inc>
; CHECK-NEXT: Base offset: %B
; CHECK-NEXT: ArrayDecl[UnknownSize][%N] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.inc>][%call1]
-; CHECK-NEXT: Delinearization validation: Failed
;
entry:
br label %entry.split
diff --git a/llvm/test/Analysis/Delinearization/divide_by_one.ll b/llvm/test/Analysis/Delinearization/divide_by_one.ll
index 3d8e55984291e..747b2c5489785 100644
--- a/llvm/test/Analysis/Delinearization/divide_by_one.ll
+++ b/llvm/test/Analysis/Delinearization/divide_by_one.ll
@@ -14,18 +14,16 @@ target datalayout = "e-m:e-p:32:32-i1:32-i64:64-a:0-n32"
define void @test(ptr nocapture %dst, i32 %stride, i32 %bs) {
; CHECK-LABEL: 'test'
; CHECK-NEXT: Inst: %0 = load i8, ptr %arrayidx, align 1
-; CHECK-NEXT: AccessFunction: {{\{\{}}(-1 + ((1 + %bs) * %stride)),+,(-1 * %stride)}<%for.cond1.preheader>,+,1}<nw><%for.body3>
+; CHECK-NEXT: AccessFunction: {{\{\{}}(-1 + ((1 + %bs) * %stride) + %dst),+,(-1 * %stride)}<%for.cond1.preheader>,+,1}<nw><%for.body3>
; CHECK-NEXT: Base offset: %dst
; CHECK-NEXT: ArrayDecl[UnknownSize][%stride] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{(1 + %bs),+,-1}<nw><%for.cond1.preheader>][{-1,+,1}<nw><%for.body3>]
-; CHECK-NEXT: Delinearization validation: Failed
; CHECK-EMPTY:
; CHECK-NEXT: Inst: store i8 %0, ptr %arrayidx7, align 1
-; CHECK-NEXT: AccessFunction: {{\{\{}}(%stride * %bs),+,(-1 * %stride)}<%for.cond1.preheader>,+,1}<nsw><%for.body3>
+; CHECK-NEXT: AccessFunction: {{\{\{}}((%stride * %bs) + %dst),+,(-1 * %stride)}<%for.cond1.preheader>,+,1}<nw><%for.body3>
; CHECK-NEXT: Base offset: %dst
; CHECK-NEXT: ArrayDecl[UnknownSize][%stride] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{%bs,+,-1}<nsw><%for.cond1.preheader>][{0,+,1}<nuw><nsw><%for.body3>]
-; CHECK-NEXT: Delinearization validation: Failed
;
entry:
%cmp20 = icmp sgt i32 %bs, -1
diff --git a/llvm/test/Analysis/Delinearization/fixed_size_array.ll b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
index 250d46c81a25b..284e4e95b8a28 100644
--- a/llvm/test/Analysis/Delinearization/fixed_size_array.ll
+++ b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
@@ -11,11 +11,8 @@
define void @a_i_j_k(ptr %a) {
; CHECK-LABEL: 'a_i_j_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,128}<nw><%for.j.header>,+,4}<nw><%for.k>
-; CHECK-NEXT: Base offset: %a
-; CHECK-NEXT: ArrayDecl[UnknownSize][8][32] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
-; CHECK-NEXT: Delinearization validation: Succeeded
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}%a,+,1024}<nw><%for.i.header>,+,128}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: failed to delinearize
;
entry:
br label %for.i.header
@@ -60,11 +57,8 @@ exit:
define void @a_i_nj_k(ptr %a) {
; CHECK-LABEL: 'a_i_nj_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}896,+,1024}<nuw><nsw><%for.i.header>,+,-128}<nw><%for.j.header>,+,4}<nw><%for.k>
-; CHECK-NEXT: Base offset: %a
-; CHECK-NEXT: ArrayDecl[UnknownSize][8][32] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{7,+,-1}<nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
-; CHECK-NEXT: Delinearization validation: Succeeded
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}(896 + %a),+,1024}<nw><%for.i.header>,+,-128}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: failed to delinearize
;
entry:
br label %for.i.header
@@ -116,18 +110,12 @@ exit:
define void @a_ijk_b_i2jk(ptr %a, ptr %b) {
; CHECK-LABEL: 'a_ijk_b_i2jk'
; CHECK-NEXT: Inst: store i32 1, ptr %a.idx, align 4
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
-; CHECK-NEXT: Base offset: %a
-; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
-; CHECK-NEXT: Delinearization validation: Succeeded
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}%a,+,1024}<nw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: failed to delinearize
; CHECK-EMPTY:
; CHECK-NEXT: Inst: store i32 1, ptr %b.idx, align 4
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
-; CHECK-NEXT: Base offset: %b
-; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
-; CHECK-NEXT: Delinearization validation: Succeeded
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}%b,+,1024}<nw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: failed to delinearize
;
entry:
br label %for.i.header
@@ -180,11 +168,8 @@ exit:
define void @a_i_2j1_k(ptr %a) {
; CHECK-...
[truncated]
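The printing change in printDelinearization() above can be hard to follow from the diff alone. The following is a minimal standalone sketch of the same rule, assuming plain strings in place of SCEVs; `formatArrayDecl` is a hypothetical name and is not part of LLVM. It shows that `[UnknownSize]` is printed only when there are as many sizes as subscripts, because the last entry of Sizes is the element size.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the ArrayDecl printing in the diff (not LLVM API).
// Sizes holds the known array dimensions followed by the element size.
// NumSubscripts is the number of subscripts that delinearization recovered.
std::string formatArrayDecl(const std::vector<std::string> &Sizes,
                            std::size_t NumSubscripts) {
  std::string Out = "ArrayDecl";
  // One size per subscript means the outermost dimension is not known,
  // since the last size is the element size rather than a dimension.
  if (Sizes.size() == NumSubscripts)
    Out += "[UnknownSize]";
  if (Sizes.empty())
    return Out + " unknown sizes.";
  // Print every known dimension, i.e. all entries but the last.
  for (std::size_t I = 0; I + 1 < Sizes.size(); ++I)
    Out += "[" + Sizes[I] + "]";
  // The last entry is the element size in bytes.
  Out += " with elements of " + Sizes.back() + " bytes.";
  return Out;
}
```

With three sizes and three subscripts this reproduces the CHECK line from a.ll; with three sizes and two subscripts, as when all dimensions come from a declaration, no `[UnknownSize]` is printed.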
You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h -- llvm/include/llvm/Analysis/Delinearization.h llvm/lib/Analysis/Delinearization.cpp --diff_from_common_commit
View the diff from clang-format here.
diff --git a/llvm/include/llvm/Analysis/Delinearization.h b/llvm/include/llvm/Analysis/Delinearization.h
index a616fbb03..4f46d93e8 100644
--- a/llvm/include/llvm/Analysis/Delinearization.h
+++ b/llvm/include/llvm/Analysis/Delinearization.h
@@ -47,7 +47,6 @@ void computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
SmallVectorImpl<const SCEV *> &Sizes,
const SCEV *ElementSize = nullptr);
-
bool delinearizeUsingArrayInfo(ScalarEvolution &SE, const SCEV *AccessFn,
SmallVectorImpl<const SCEV *> &Subscripts,
SmallVectorImpl<const SCEV *> &Sizes,
diff --git a/llvm/lib/Analysis/Delinearization.cpp b/llvm/lib/Analysis/Delinearization.cpp
index cd0481c93..55f193939 100644
--- a/llvm/lib/Analysis/Delinearization.cpp
+++ b/llvm/lib/Analysis/Delinearization.cpp
@@ -428,16 +428,14 @@ void llvm::computeAccessFunctions(ScalarEvolution &SE, const SCEV *Expr,
// be easy as we would need to emit and pick up meta-data (or something
// similar).
//
-static bool
-extractArrayInfoFromGlobal(ScalarEvolution &SE, Value *BasePtr,
- SmallVectorImpl<const SCEV *> &Sizes,
- const SCEV *ElementSize) {
+static bool extractArrayInfoFromGlobal(ScalarEvolution &SE, Value *BasePtr,
+ SmallVectorImpl<const SCEV *> &Sizes,
+ const SCEV *ElementSize) {
// Clear output vector.
Sizes.clear();
- LLVM_DEBUG(
- dbgs() << "extractArrayInfoFromGlobal called with BasePtr: "
- << *BasePtr << "\n");
+ LLVM_DEBUG(dbgs() << "extractArrayInfoFromGlobal called with BasePtr: "
+ << *BasePtr << "\n");
// Distinguish between simple array accesses and complex pointer arithmetic,
// i.e. check if this is a simple array access pattern:
@@ -492,8 +490,8 @@ extractArrayInfoFromGlobal(ScalarEvolution &SE, Value *BasePtr,
Sizes.push_back(ElementSize);
LLVM_DEBUG({
- dbgs() << "Extracted array info from global for base pointer "
- << *BasePtr << "\n";
+ dbgs() << "Extracted array info from global for base pointer " << *BasePtr
+ << "\n";
dbgs() << "Dimensions: ";
for (const SCEV *Size : Sizes)
dbgs() << *Size << " ";
@@ -983,7 +981,7 @@ void printDelinearization(raw_ostream &O, Function *F, LoopInfo *LI,
SmallVector<const SCEV *, 3> Subscripts, Sizes;
auto IsDelinearizationFailed = [&]() {
- return Subscripts.empty() || Sizes.empty();
+ return Subscripts.empty() || Sizes.empty();
};
delinearize(*SE, AccessFn, Subscripts, Sizes, SE->getElementSize(&Inst));
@@ -1023,8 +1021,7 @@ void printDelinearization(raw_ostream &O, Function *F, LoopInfo *LI,
O << "ArrayRef";
for (int i = 0; i < NumSubscripts; i++)
O << "[" << *Subscripts[i] << "]";
- O << "\n";
-
+ O << "\n";
}
}
🐧 Linux x64 Test Results
Failed Tests
Polly
Polly.CodeGen/invariant-load-dimension.ll
If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the [truncated]
🪟 Windows x64 Test Results
Failed Tests
Polly
Polly.CodeGen/invariant-load-dimension.ll
If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the [truncated]
This is extracted from llvm#156342, which implements multiple things at the same time. This patch focuses on one thing only: extracting array bounds from global declarations to help loop interchange with our motivating example: test/Transforms/LoopInterchange/large-nested-4d.ll.
With this patch, array bounds are extracted from a global variable declaration. The result is that Delinearization now succeeds and the different subscripts are recognised, leading to improved DA results.
A few remarks on this approach to extract bounds from global declarations:
- This is certainly not meant to be the silver bullet to make Delinearization as robust and precise as possible, but the idea is to use bounds information if it is available, and thus every little helps.
- Global declarations have type information, but this might change in the future due to LLVM IR simplifications. That's okay because the transition from this to emitting some additional type information is trivial, so we will support that when this becomes relevant.
- The bigger picture is to let the front-end emit subscript or bound information in the form of assumes, see llvm#159046, which needs to be revived, but that is orthogonal to this work.
Next steps:
- Minor changes are required in DependenceAnalysis and `delinearizeFixedSizeArray()` to actually use this; currently this is only used in `printDelinearization()` in Delinearization.
- Add support for allocas; that should now be a small incremental change.
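To illustrate how declared bounds turn a linear offset back into subscripts, here is a minimal standalone sketch. It assumes constant byte offsets and plain integers instead of SCEV expressions, and `delinearizeOffset` is a hypothetical helper that is not part of the patch or of LLVM. It divides by the element size and then peels off one dimension at a time from the innermost outwards, rejecting the access if anything is left over.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM API): split a constant byte offset into one
// subscript per declared dimension. Dims lists the dimensions outermost
// first, e.g. {10, 20} for [10 x [20 x i32]]. Returns false if the offset
// does not map onto an element of the declared array.
bool delinearizeOffset(uint64_t ByteOffset, const std::vector<uint64_t> &Dims,
                       uint64_t ElemSize, std::vector<uint64_t> &Subscripts) {
  Subscripts.clear();
  if (Dims.empty() || ElemSize == 0 || ByteOffset % ElemSize != 0)
    return false;
  uint64_t Rest = ByteOffset / ElemSize; // offset counted in elements
  Subscripts.assign(Dims.size(), 0);
  // Walk from the innermost dimension outwards: the remainder is the
  // subscript for that dimension, the quotient carries to the next one.
  for (std::size_t I = Dims.size(); I-- > 0;) {
    if (Dims[I] == 0) {
      Subscripts.clear();
      return false;
    }
    Subscripts[I] = Rest % Dims[I];
    Rest /= Dims[I];
  }
  // A non-zero leftover means the offset lies outside the declared bounds.
  if (Rest != 0) {
    Subscripts.clear();
    return false;
  }
  return true;
}
```

For a `[10 x [20 x i32]]` global, byte offset 268 is element 67, which this sketch splits into subscripts 3 and 7. The patch works on symbolic expressions and falls back to parametric delinearization on a mismatch, which this sketch only approximates by returning false.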
Force-pushed from 39e21fc to 2d0fb2d.
nikic left a comment:
Next to the functional change, this also changes the debug output in a way that makes it hard to see what is actually affected. It looks like the access function now includes the base pointer and delinearization results are no longer validated. Why these changes? How are they related to the rest of the PR?
This precommits a test for llvm#175158 which should demonstrate that Delinearization succeeds when we extract loop bounds from the global variable definition.
Yeah, the changes in the debug messages are maybe slightly unfortunate; I will see if I can reduce them. The most important change is the new test case, which now succeeds with this change.
@test_array_10x20 = global [10 x [20 x i32]] zeroinitializer
There are 2 interpretations for this
The second interpretation would follow the trend of LLVM-IR going lower level, which could mean that eventually it might be replaced with an "allocate 800 bytes" instruction, in which case I fear we would eventually end up at the same problem we have with using the dimensional information from
However, the information from delinearization could be interpreted as only a hint, not semantic, as e.g. Polly does. Runtime condition checks [truncated]
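A small sketch of why the shape matters if the global is viewed as nothing more than 800 bytes: the same byte offset yields different subscripts under different assumed shapes. This is an illustration only, assuming constant offsets and non-zero sizes; `as2D` is a hypothetical helper and not LLVM code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not LLVM API): read a byte offset into a flat buffer
// as a {row, column} pair, given an assumed inner dimension and element size.
// Both Inner and ElemSize are assumed to be non-zero.
std::array<uint64_t, 2> as2D(uint64_t ByteOffset, uint64_t Inner,
                             uint64_t ElemSize) {
  uint64_t Linear = ByteOffset / ElemSize; // offset counted in elements
  return {{Linear / Inner, Linear % Inner}};
}
```

Both a 10x20 and a 20x10 array of 4-byte elements occupy 800 bytes, yet offset 268 reads as subscripts 3 and 7 under the first shape and 6 and 7 under the second, so the declared type is the only thing that distinguishes them.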