
Conversation

@cofibrant
Contributor

This patch provides an approximation of the memory locations touched by `llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`, enabling dead store elimination and GVN to remove redundant loads and dead stores.

CC @fhahn

@llvmbot added the llvm:analysis (Includes value tracking, cost tables and constant folding) and llvm:transforms labels Oct 14, 2025
@llvmbot
Member

llvmbot commented Oct 14, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Nathan Corbyn (cofibrant)

Changes

This patch provides an approximation of the memory locations touched by `llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`, enabling dead store elimination and GVN to remove redundant loads and dead stores.

CC @fhahn
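For intuition, the size computation the patch adds to `MemoryLocation::getForArgument` can be sketched outside of LLVM as follows (the function and parameter names here are illustrative, not part of the patch):

```python
def matrix_memory_location_size(elem_size, stride, rows, cols):
    """Approximate the bytes touched by a column-major matrix load/store.

    Returns (size_in_bytes, is_precise). A dynamic stride is modelled as
    stride=None, for which only "somewhere after the pointer" is known.
    """
    if stride is None:
        # Dynamic stride: no bound on the accessed size.
        return None, False
    size = elem_size * stride * cols
    # With stride == rows the columns are contiguous, so the size is exact;
    # a larger stride leaves gaps between columns, making it an upper bound.
    return size, stride == rows
```

For example, a 4x2 matrix of doubles with stride 4 touches exactly 64 bytes, while the same matrix with stride 8 touches at most 128 bytes.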


Full diff: https://github.com/llvm/llvm-project/pull/163368.diff

4 Files Affected:

  • (modified) llvm/lib/Analysis/MemoryLocation.cpp (+27)
  • (added) llvm/test/Analysis/BasicAA/matrix-intrinsics.ll (+47)
  • (added) llvm/test/Transforms/DeadStoreElimination/matrix-intrinsics.ll (+82)
  • (added) llvm/test/Transforms/GVN/matrix-intrinsics.ll (+98)
diff --git a/llvm/lib/Analysis/MemoryLocation.cpp b/llvm/lib/Analysis/MemoryLocation.cpp
index dcc51178b975a..123a1444a5b71 100644
--- a/llvm/lib/Analysis/MemoryLocation.cpp
+++ b/llvm/lib/Analysis/MemoryLocation.cpp
@@ -288,6 +288,33 @@ MemoryLocation MemoryLocation::getForArgument(const CallBase *Call,
                             LocationSize::precise(DL.getTypeStoreSize(
                                 II->getArgOperand(1)->getType())),
                             AATags);
+    case Intrinsic::matrix_column_major_load:
+    case Intrinsic::matrix_column_major_store: {
+      bool IsLoad = II->getIntrinsicID() == Intrinsic::matrix_column_major_load;
+      assert(ArgIdx == (IsLoad ? 0 : 1) && "Invalid argument index");
+
+      auto *Stride = dyn_cast<ConstantInt>(II->getArgOperand(IsLoad ? 1 : 2));
+      uint64_t Rows =
+          cast<ConstantInt>(II->getArgOperand(IsLoad ? 3 : 4))->getZExtValue();
+      uint64_t Cols =
+          cast<ConstantInt>(II->getArgOperand(IsLoad ? 4 : 5))->getZExtValue();
+
+      // The stride is dynamic, so there's nothing we can say.
+      if (!Stride)
+        return MemoryLocation(Arg, LocationSize::afterPointer(), AATags);
+
+      uint64_t ConstStride = Stride->getZExtValue();
+      auto *VT = cast<VectorType>(IsLoad ? II->getType()
+                                         : II->getArgOperand(0)->getType());
+      TypeSize Size =
+          DL.getTypeStoreSize(VT->getScalarType()) * ConstStride * Cols;
+
+      // In the unstrided case, we have a precise size, ...
+      if (ConstStride == Rows)
+        return MemoryLocation(Arg, LocationSize::precise(Size), AATags);
+      // otherwise we merely obtain an upper bound.
+      return MemoryLocation(Arg, LocationSize::upperBound(Size), AATags);
+    }
     }
 
     assert(
diff --git a/llvm/test/Analysis/BasicAA/matrix-intrinsics.ll b/llvm/test/Analysis/BasicAA/matrix-intrinsics.ll
new file mode 100644
index 0000000000000..b5f12f5daeb49
--- /dev/null
+++ b/llvm/test/Analysis/BasicAA/matrix-intrinsics.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -aa-pipeline=basic-aa -passes=gvn -S < %s | FileCheck %s
+
+; BasicAA should prove that loads from sufficiently large static offsets
+; don't overlap with matrix loads with a statically known size.
+
+define <8 x double> @non_overlapping_strided_load(ptr %p) {
+; CHECK-LABEL: define <8 x double> @non_overlapping_strided_load(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[P_OFFSET:%.*]] = getelementptr inbounds double, ptr [[P]], i64 16
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[P_OFFSET]], i32 8, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[P]], i64 8, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd <8 x double> [[L]], [[L]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %p.offset = getelementptr inbounds double, double* %p, i64 16
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %p.offset, i32 8, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %p, i64 8, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %p.offset, i32 8, i1 false, i32 4, i32 2)
+  %s = fadd <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @overlapping_strided_load(ptr %p) {
+; CHECK-LABEL: define <8 x double> @overlapping_strided_load(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[P_OFFSET:%.*]] = getelementptr inbounds double, ptr [[P]], i64 15
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[P_OFFSET]], i32 8, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[P]], i64 8, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[L_2:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[P_OFFSET]], i32 8, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd <8 x double> [[L]], [[L_2]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %p.offset = getelementptr inbounds double, double* %p, i64 15
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %p.offset, i32 8, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %p, i64 8, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %p.offset, i32 8, i1 false, i32 4, i32 2)
+  %s = fadd <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+declare <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr, i32, i1, i32, i32)
+declare void @llvm.matrix.column.major.store.v8f64.i32(<8 x double>, ptr, i32, i1, i32, i32)
diff --git a/llvm/test/Transforms/DeadStoreElimination/matrix-intrinsics.ll b/llvm/test/Transforms/DeadStoreElimination/matrix-intrinsics.ll
new file mode 100644
index 0000000000000..5f397e5f82181
--- /dev/null
+++ b/llvm/test/Transforms/DeadStoreElimination/matrix-intrinsics.ll
@@ -0,0 +1,82 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=dse -S < %s | FileCheck %s
+
+define void @dead_unstrided_store(ptr noalias %src, ptr noalias %dst) {
+; CHECK-LABEL: define void @dead_unstrided_store(
+; CHECK-SAME: ptr noalias [[SRC:%.*]], ptr noalias [[DST:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 4, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 4, i1 false, i32 4, i32 2)
+  ret void
+}
+
+define void @live_strided_store(ptr noalias %src, ptr noalias %dst) {
+; CHECK-LABEL: define void @live_strided_store(
+; CHECK-SAME: ptr noalias [[SRC:%.*]], ptr noalias [[DST:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 200, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 100, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 200, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 100, i1 false, i32 4, i32 2)
+  ret void
+}
+
+define void @dead_strided_store(ptr noalias %src, ptr noalias %dst) {
+; CHECK-LABEL: define void @dead_strided_store(
+; CHECK-SAME: ptr noalias [[SRC:%.*]], ptr noalias [[DST:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 200, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 100, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 200, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 100, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 100, i1 false, i32 4, i32 2)
+  ret void
+}
+
+define void @dead_dynamically_strided_store(ptr noalias %src, ptr noalias %dst, i64 %stride) {
+; CHECK-LABEL: define void @dead_dynamically_strided_store(
+; CHECK-SAME: ptr noalias [[SRC:%.*]], ptr noalias [[DST:%.*]], i64 [[STRIDE:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 [[STRIDE]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 %stride, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 %stride, i1 false, i32 4, i32 2)
+  ret void
+}
+
+define void @live_dynamically_strided_store(ptr noalias %src, ptr noalias %dst, i64 %stride, i64 %stride.2) {
+; CHECK-LABEL: define void @live_dynamically_strided_store(
+; CHECK-SAME: ptr noalias [[SRC:%.*]], ptr noalias [[DST:%.*]], i64 [[STRIDE:%.*]], i64 [[STRIDE_2:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 [[STRIDE]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    call void @llvm.matrix.column.major.store.v8f64.i64(<8 x double> [[L]], ptr [[DST]], i64 [[STRIDE_2]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 %stride, i1 false, i32 4, i32 2)
+  call void @llvm.matrix.column.major.store(<8 x double> %l, ptr %dst, i64 %stride.2, i1 false, i32 4, i32 2)
+  ret void
+}
+
+declare <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr, i32, i1, i32, i32)
+declare void @llvm.matrix.column.major.store.v8f64.i32(<8 x double>, ptr, i32, i1, i32, i32)
diff --git a/llvm/test/Transforms/GVN/matrix-intrinsics.ll b/llvm/test/Transforms/GVN/matrix-intrinsics.ll
new file mode 100644
index 0000000000000..18d8a450fccd1
--- /dev/null
+++ b/llvm/test/Transforms/GVN/matrix-intrinsics.ll
@@ -0,0 +1,98 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=gvn -S < %s | FileCheck %s
+
+define <8 x double> @redundant_unstrided_load(ptr %src) {
+; CHECK-LABEL: define <8 x double> @redundant_unstrided_load(
+; CHECK-SAME: ptr [[SRC:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @redundant_strided_load(ptr %src) {
+; CHECK-LABEL: define <8 x double> @redundant_strided_load(
+; CHECK-SAME: ptr [[SRC:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 200, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 200, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 200, i1 false, i32 4, i32 2)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @necessary_unstrided_load(ptr %src) {
+; CHECK-LABEL: define <8 x double> @necessary_unstrided_load(
+; CHECK-SAME: ptr [[SRC:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 4, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[L_2:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 2, i1 false, i32 2, i32 4)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L_2]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 4, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 2, i1 false, i32 2, i32 4)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @necessary_strided_load(ptr %src) {
+; CHECK-LABEL: define <8 x double> @necessary_strided_load(
+; CHECK-SAME: ptr [[SRC:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 200, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[L_2:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 100, i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L_2]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 200, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 100, i1 false, i32 4, i32 2)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @redundant_dynamically_strided_load(ptr %src, i32 %stride) {
+; CHECK-LABEL: define <8 x double> @redundant_dynamically_strided_load(
+; CHECK-SAME: ptr [[SRC:%.*]], i32 [[STRIDE:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 [[STRIDE]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 %stride, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 %stride, i1 false, i32 4, i32 2)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+define <8 x double> @necessary_dynamically_strided_load(ptr %src, i32 %stride, i32 %stride.2) {
+; CHECK-LABEL: define <8 x double> @necessary_dynamically_strided_load(
+; CHECK-SAME: ptr [[SRC:%.*]], i32 [[STRIDE:%.*]], i32 [[STRIDE_2:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 [[STRIDE]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[L_2:%.*]] = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr [[SRC]], i32 [[STRIDE_2]], i1 false, i32 4, i32 2)
+; CHECK-NEXT:    [[S:%.*]] = fadd contract <8 x double> [[L]], [[L_2]]
+; CHECK-NEXT:    ret <8 x double> [[S]]
+;
+entry:
+  %l = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 %stride, i1 false, i32 4, i32 2)
+  %l.2 = call <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr %src, i32 %stride.2, i1 false, i32 4, i32 2)
+  %s = fadd contract <8 x double> %l, %l.2
+  ret <8 x double> %s
+}
+
+declare <8 x double> @llvm.matrix.column.major.load.v8f64.i32(ptr, i32, i1, i32, i32)
+declare void @llvm.matrix.column.major.store.v8f64.i32(<8 x double>, ptr, i32, i1, i32, i32)

@cofibrant cofibrant marked this pull request as draft October 14, 2025 13:55
Contributor

@fhahn fhahn left a comment


For the title, something like `[MemoryLocation] Support strided matrix loads/stores` would be more accurate

@cofibrant cofibrant changed the title [Matrix] Enable DSE and GVN for matrix loads and stores [MemoryLocation] Support strided matrix loads / stores Oct 15, 2025
@cofibrant cofibrant force-pushed the cofibrant/gvn-matrix-load-store branch from 463c02c to b40ddb5 Compare October 22, 2025 13:52
@cofibrant cofibrant marked this pull request as ready for review October 22, 2025 13:52
@fhahn fhahn requested review from dtcxzyw, jroelofs and nikic October 22, 2025 13:56
@cofibrant cofibrant force-pushed the cofibrant/gvn-matrix-load-store branch from b40ddb5 to adbfde0 Compare October 22, 2025 14:13
Contributor

@nikic nikic left a comment


LGTM

auto *VT = cast<VectorType>(IsLoad ? II->getType()
                                   : II->getArgOperand(0)->getType());
assert(Cols != 0 && "Matrix cannot have 0 columns");
TypeSize Size = DL.getTypeStoreSize(VT->getScalarType()) *
Contributor


Suggested change:
-      TypeSize Size = DL.getTypeStoreSize(VT->getScalarType()) *
+      TypeSize Size = DL.getTypeAllocSize(VT->getScalarType()) *

Not that it is likely to make a difference here, but my understanding is that this is a GEP stride, which uses the alloc size.
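The distinction matters for types whose store size and alloc size differ. A hedged sketch, assuming x86-64's default data layout, where `x86_fp80` has a 10-byte store size but a 16-byte alloc size (padded to its 16-byte alignment):

```python
# Why a GEP-style stride should use the alloc size: consecutive array
# elements are spaced by the *alloc* size (store size plus alignment
# padding), while the store size counts only the bytes actually written.
# The sizes below assume x86-64's layout of x86_fp80.
STORE_SIZE = 10  # bytes written for one x86_fp80
ALLOC_SIZE = 16  # spacing between consecutive x86_fp80 array elements

def footprint(elem_size, stride, cols):
    # Bytes spanned by a column-major access with the given element stride.
    return elem_size * stride * cols
```

For a 4x2 access with stride 4, `footprint(ALLOC_SIZE, 4, 2)` gives 128 bytes, while using the store size would give only 80 and under-count the span by missing the inter-element padding.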

Contributor Author


Thanks!

Contributor

@fhahn fhahn left a comment


LGTM, thanks

@fhahn fhahn merged commit 8e7e9d4 into llvm:main Oct 22, 2025
10 checks passed
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 22, 2025
…63368)

This patch provides an approximation of the memory locations touched by
`llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`,
enabling dead store elimination and GVN to remove redundant loads and
dead stores.

PR: llvm/llvm-project#163368
@cofibrant cofibrant deleted the cofibrant/gvn-matrix-load-store branch October 22, 2025 20:44
dvbuka pushed a commit to dvbuka/llvm-project that referenced this pull request Oct 27, 2025
This patch provides an approximation of the memory locations touched by
`llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`,
enabling dead store elimination and GVN to remove redundant loads and
dead stores.

PR: llvm#163368
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Oct 29, 2025
This patch provides an approximation of the memory locations touched by
`llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`,
enabling dead store elimination and GVN to remove redundant loads and
dead stores.

PR: llvm#163368
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 30, 2025
This patch provides an approximation of the memory locations touched by
`llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`,
enabling dead store elimination and GVN to remove redundant loads and
dead stores.

PR: llvm#163368