[mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank #112907
Conversation
Since https://reviews.llvm.org/D114086, 0-d vectors have been supported in VectorType. This patch removes the scalar-based handling of 0-d vectors from the TransferOpReduceRank pattern. That handling specifically introduced tensor.extract_slice during vectorization, which prevented vectorization from folding transfer_read/transfer_write slices properly; the changes in the vectorization test files reflect this. There are other places where lowering patterns still side-step handling 0-d vectors properly by turning them into scalars, but this patch focuses only on the vector.transfer_x patterns.
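The effect of the change is visible in the updated test expectations: a 0-d read is now kept in vector form instead of being scalarized. A before/after sketch of the lowering, reconstructed from the CHECK lines in the diff below (value names are illustrative):

```mlir
// Before this patch: TransferOpReduceRank rewrote a 0-d transfer_read
// into a scalar load followed by a broadcast.
%s = tensor.extract %src[] : tensor<f32>
%v = vector.broadcast %s : f32 to vector<1xf32>

// After this patch: the read stays a vector.transfer_read producing a
// 0-d vector, which downstream patterns can fold with slicing ops.
%r = vector.transfer_read %src[], %pad : tensor<f32>, vector<f32>
%v2 = vector.broadcast %r : vector<f32> to vector<1xf32>
```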
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-linalg Author: Kunwar Grover (Groverkss) Full diff: https://github.com/llvm/llvm-project/pull/112907.diff 4 Files Affected:
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
index 344cfc0cbffb93..f9428a4ce28640 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
@@ -358,31 +358,10 @@ struct TransferOpReduceRank
op, "map is not a minor identity with broadcasting");
}
- // TODO: support zero-dimension vectors natively. See:
- // https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.
- // In the meantime, lower these to a scalar load when they pop up.
- if (reducedShapeRank == 0) {
- Value newRead;
- if (isa<TensorType>(op.getShapedType())) {
- newRead = rewriter.create<tensor::ExtractOp>(
- op.getLoc(), op.getSource(), op.getIndices());
- } else {
- newRead = rewriter.create<memref::LoadOp>(
- op.getLoc(), originalVecType.getElementType(), op.getSource(),
- op.getIndices());
- }
- return rewriter
- .create<vector::BroadcastOp>(op.getLoc(), originalVecType, newRead)
- .getVector();
- }
-
SmallVector<int64_t> newShape(
originalVecType.getShape().take_back(reducedShapeRank));
SmallVector<bool> newScalableDims(
originalVecType.getScalableDims().take_back(reducedShapeRank));
- // Vector rank cannot be zero. Handled by TransferReadToVectorLoadLowering.
- if (newShape.empty())
- return rewriter.notifyMatchFailure(op, "rank-reduced vector is 0-d");
VectorType newReadType = VectorType::get(
newShape, originalVecType.getElementType(), newScalableDims);
diff --git a/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir b/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
index c55a0c558bc2f1..5a6da3a06387a5 100644
--- a/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
+++ b/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
@@ -503,8 +503,8 @@ func.func @transfer_read_within_async_execute(%A : memref<2x2xf32>) -> !async.to
// CHECK-LABEL: transfer_read_with_tensor
func.func @transfer_read_with_tensor(%arg: tensor<f32>) -> vector<1xf32> {
- // CHECK: %[[EXTRACTED:.*]] = tensor.extract %{{.*}}[] : tensor<f32>
- // CHECK-NEXT: %[[RESULT:.*]] = vector.broadcast %[[EXTRACTED]] : f32 to vector<1xf32>
+ // CHECK: %[[EXTRACTED:.*]] = vector.transfer_read %{{.*}}[], %{{.*}} : tensor<f32>, vector<f32>
+ // CHECK-NEXT: %[[RESULT:.*]] = vector.broadcast %[[EXTRACTED]] : vector<f32> to vector<1xf32>
// CHECK-NEXT: return %[[RESULT]] : vector<1xf32>
%f0 = arith.constant 0.0 : f32
%0 = vector.transfer_read %arg[], %f0 {permutation_map = affine_map<()->(0)>} :
diff --git a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
index 2c56b7139fec49..4c549357dbfed6 100644
--- a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
+++ b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
@@ -109,9 +109,7 @@ func.func @vectorize_nd_tensor_extract_transfer_read_basic(
// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG0]][%[[IDX1]], %[[IDX2]], %[[C0:.*]]], %[[CST_0]] {in_bounds = [true, true, true]} : tensor<3x3x3xf32>, vector<1x1x3xf32>
// CHECK: vector.transfer_write %[[READ]], %[[ARG1]][%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true]} : vector<1x1x3xf32>, tensor<1x1x3xf32>
-// Same as example above, but reading into a column tensor. Note that after the
-// vectorizatoin, the `TransferOpReduceRank` will replace
-// `vector.transfer_read` with `tensor.extract -> scalar`.
+// Same as example above, but reading into a column tensor.
// TODO: Currently this fails to vectorise when the indices are non-constant.
@@ -135,9 +133,10 @@ func.func @vectorize_nd_tensor_extract_transfer_read_basic_column(
// CHECK-LABEL: func.func @vectorize_nd_tensor_extract_transfer_read_basic_column(
// CHECK-SAME: %[[INPUT:.*]]: tensor<3x3x3xf32>,
// CHECK-SAME: %[[OUTPUT:.*]]: tensor<3x1x1xf32>)
-// CHECK: %[[C0:.*]] = arith.constant 0 : index
-// CHECK: %[[EXTRACT:.*]] = tensor.extract %[[INPUT]]{{\[}}%[[C0]], %[[C0]], %[[C0]]] : tensor<3x3x3xf32>
-// CHECK: %[[BCAST:.*]] = vector.broadcast %[[EXTRACT]] : f32 to vector<3x1x1xf32>
+// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
+// CHECK-DAG: %[[CST_0:.*]] = arith.constant 0.000000e+00 : f32
+// CHECK: %[[READ:.*]] = vector.transfer_read %[[INPUT]]{{\[}}%[[C0]], %[[C0]], %[[C0]]], %[[CST_0]] : tensor<3x3x3xf32>, vector<f32>
+// CHECK: %[[BCAST:.*]] = vector.broadcast %[[READ]] : vector<f32> to vector<3x1x1xf32>
// CHECK: %[[RES:.*]] = vector.transfer_write %[[BCAST]], %[[OUTPUT]]{{\[}}%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true]} : vector<3x1x1xf32>, tensor<3x1x1xf32>
// CHECK: return %[[RES]] : tensor<3x1x1xf32>
@@ -514,8 +513,9 @@ func.func @vectorize_nd_tensor_extract_with_tensor_extract(%input_1: tensor<1x20
// CHECK-SAME: %[[INPUT_2:.*]]: tensor<257x24xf32>,
// CHECK: %[[EXTRACTED_0_IDX_0:.*]] = arith.constant 0 : index
// CHECK: %[[EXTRACTED_0_IDX_1:.*]] = vector.extractelement %{{.*}}[%{{.*}} : i32] : vector<4xindex>
-// First `tensor.extract` from the generic Op - loop invariant scalar load.
-// CHECK: tensor.extract %[[INPUT_1]][%[[EXTRACTED_0_IDX_0]], %[[EXTRACTED_0_IDX_1]]] : tensor<1x20xi32>
+// First `vector.transfer_read` from the generic Op - loop invariant scalar load.
+// CHECK: vector.transfer_read %[[INPUT_1]][%[[EXTRACTED_0_IDX_0]], %[[EXTRACTED_0_IDX_1]]]
+// CHECK-SAME: tensor<1x20xi32>, vector<i32>
// The following `tensor.extract` from the generic Op s a contiguous load (all Ops used
// for address calculation also satisfy the required conditions).
// CHECK: vector.transfer_read %[[INPUT_2]][%{{.*}}, %{{.*}}, %{{.*}} {in_bounds = [true, true]} : tensor<257x24xf32>, vector<1x4xf32>
@@ -718,8 +718,8 @@ func.func @vectorize_0d_tensor_extract(%arg0: tensor<f32>, %arg2: tensor<1x1x3xf
// CHECK-LABEL: func.func @vectorize_0d_tensor_extract(
// CHECK-SAME: %[[ARG_0:.*]]: tensor<f32>
-// CHECK: %[[EXTRACT:.*]] = tensor.extract %[[ARG_0]][] : tensor<f32>
-// CHECK: vector.broadcast %[[EXTRACT]] : f32 to vector<1x1x3xf32>
+// CHECK: %[[EXTRACT:.*]] = vector.transfer_read %[[ARG_0]][], %{{.+}} : tensor<f32>
+// CHECK: vector.broadcast %[[EXTRACT]] : vector<f32> to vector<1x1x3xf32>
module attributes {transform.with_named_sequence} {
transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
diff --git a/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir b/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
index 4d8e4a8296fb5a..f90111b4c88618 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
@@ -26,8 +26,8 @@ func.func @vector_transfer_ops_0d_memref(%mem: memref<f32>, %vec: vector<1x1x1xf
func.func @vector_transfer_ops_0d_tensor(%src: tensor<f32>) -> vector<1xf32> {
%f0 = arith.constant 0.0 : f32
-// CHECK-NEXT: %[[S:.*]] = tensor.extract %[[SRC]][] : tensor<f32>
-// CHECK-NEXT: %[[V:.*]] = vector.broadcast %[[S]] : f32 to vector<1xf32>
+// CHECK: %[[S:.*]] = vector.transfer_read %[[SRC]][]
+// CHECK: %[[V:.*]] = vector.broadcast %[[S]] : vector<f32> to vector<1xf32>
%res = vector.transfer_read %src[], %f0 {in_bounds = [true], permutation_map = affine_map<()->(0)>} :
tensor<f32>, vector<1xf32>
@llvm/pr-subscribers-mlir-vector Author: Kunwar Grover (Groverkss)
LGTM
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/12515
I'm not reverting it because the failure looks unrelated. If someone thinks it's related, please feel free to revert.
…lvm#112907) Since llvm@ddf2d62 , 0-d vectors are supported in VectorType. This patch removes 0-d vector handling with scalars for the TransferOpReduceRank pattern. This pattern specifically introduces tensor.extract_slice during vectorization, causing vectorization to not fold transfer_read/transfer_write slices properly. The changes in vectorization test files reflect this. There are other places where lowering patterns are still side-stepping from handling 0-d vectors properly, by turning them into scalars, but this patch only focuses on the vector.transfer_x patterns.
…ceRank (llvm#112907)" This reverts commit 1004865.
…uceRank (llvm#112907)" This reverts commit 8323ca8.
…ceRank (llvm#112907)" This reverts commit 1004865. Failing CI as discussed here: * iree-org/iree#19135