Re-enable aarch64 package builds #19135
I was told that the underlying issue has been resolved. Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Just a few too many spaces but otherwise looks good.
(Just applied the suggestions via a batch commit, hope that's okay for you @banach-space.)
Thanks @marbre! 🤞🏻 this works 😅
Thanks! Runners are online again. The build/test job already spotted a regression in one test too 👀 https://github.com/iree-org/iree/actions/runs/11814365861/job/32913288871#step:6:9488
BTW, a clean revert of #19116 from the GitHub UI may have been easier to make than this manually authored commit.
I've not been able to repro and have run out of screen time for today :( Will try again tomorrow.
EDIT: I am able to repro with:
attrs = {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[2, 2, 3], [1, 1, 4], [0, 0, 0], [0, 0, 0]]>}
Has the tile size selection logic been updated recently?
MLIR repro:
func.func @vectorization_test(%extracted_slice : tensor<1x1x3xi32>, %arg0: index, %arg2: index, %3: tensor<2x4xi32>, %4: tensor<1x3x2x4xi32>) -> tensor<1x1x3xi32> {
  %c3 = arith.constant 3 : index
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %8 = linalg.generic {
    indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
    iterator_types = ["parallel", "parallel", "parallel"]}
    outs(%extracted_slice : tensor<1x1x3xi32>) {
  ^bb0(%out: i32):
    %9 = linalg.index 0 : index
    %10 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%9, %arg0)
    %11 = linalg.index 1 : index
    %12 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%11, %arg2)
    %13 = linalg.index 2 : index
    %extracted = tensor.extract %3[%10, %c0] : tensor<2x4xi32>
    %14 = arith.index_cast %extracted : i32 to index
    %extracted_0 = tensor.extract %3[%10, %c1] : tensor<2x4xi32>
    %15 = arith.index_cast %extracted_0 : i32 to index
    %extracted_1 = tensor.extract %3[%10, %c3] : tensor<2x4xi32>
    %16 = arith.index_cast %extracted_1 : i32 to index
    %17 = arith.maxsi %16, %c0 : index
    %18 = arith.minui %17, %c1 : index
    %19 = arith.maxsi %15, %c0 : index
    %20 = arith.minui %19, %c1 : index
    %21 = arith.maxsi %14, %c0 : index
    %22 = arith.minui %21, %c1 : index
    %23 = arith.addi %18, %12 : index
    %24 = arith.addi %22, %13 : index
    %extracted_2 = tensor.extract %4[%c0, %23, %20, %24] : tensor<1x3x2x4xi32>
    linalg.yield %extracted_2 : i32
  } -> tensor<1x1x3xi32>
  return %8 : tensor<1x1x3xi32>
}
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
    // %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %0 vector_sizes [1, 1, 4] {vectorize_nd_extract} : !transform.any_op
    transform.yield
  }
}
}
Here's the offending patch: #19007. Ping @Groverkss :)
I had a look; this seems to be a bug in the upstream masked vectorization implementation. The transfer_read operation infers the mask type differently than vectorization does:
transfer_read: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Vector/IR/VectorOps.cpp#L4123
It looks like vectorization is wrong here and should be using an inverse map instead of the transfer_read indexing map. The mask needs an inverse map.
Actually, looking more, that's not where it's coming from. There is a bug in linalg vectorization for vector.transfer_read with broadcast permutation maps when using custom vectorization hooks. Vectorizing a tensor.extract using a custom vectorization hook creates a transfer_read with a permutation map of (d0, d1) -> (0, 0, 0). This custom hook does not mask it; masking is done after the hook has run. The masking picks an identity map (wrong!), which does not match the indexing map that should have been used for masking the transfer_read.
The reason that patch uncovered it is that, before, we had a hack: if we saw a transfer_read of rank 0, we would simply turn it into memref.load/tensor.extract. But that only works when you have a full broadcast. If the transfer_read generated by the custom vectorization hook has any indexing map that is not an identity map, this will break.
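To make the mismatch described above concrete, here is a minimal Python sketch. It is a simplified model and not the MLIR implementation. It assumes the mask shape of a transfer_read is obtained by applying the inverse of the permutation map, with broadcast results dropped, to the vector shape. The function names `inferred_mask_shape` and `identity_mask_shape` are hypothetical and exist only for this illustration; broadcast results (the `0`s in the map) are encoded as `None`.

```python
from typing import List, Optional


def inferred_mask_shape(perm_results: List[Optional[int]],
                        vector_shape: List[int]) -> List[int]:
    """Model of mask-shape inference via an inverse permutation map.

    perm_results[i] is the source dimension feeding vector dimension i,
    or None when that vector dimension is a broadcast (constant 0 result).
    The mask has one entry per source dimension that is actually read.
    """
    if len(perm_results) != len(vector_shape):
        raise ValueError("permutation map and vector shape ranks differ")
    size_of_source_dim = {}
    for position, source_dim in enumerate(perm_results):
        if source_dim is None:
            continue  # broadcast dimension: contributes nothing to the mask
        size_of_source_dim[source_dim] = vector_shape[position]
    # Order the mask dimensions by source dimension (the "inverse" view).
    return [size_of_source_dim[d] for d in sorted(size_of_source_dim)]


def identity_mask_shape(vector_shape: List[int]) -> List[int]:
    """Model of the buggy assumption: the mask mirrors the vector shape."""
    return list(vector_shape)


if __name__ == "__main__":
    # Full broadcast, as in (d0, d1) -> (0, 0, 0) with vector sizes [1, 1, 4].
    broadcast_map = [None, None, None]
    print(inferred_mask_shape(broadcast_map, [1, 1, 4]))  # rank-0 mask
    print(identity_mask_shape([1, 1, 4]))                 # rank-3 mask
```

Under this model the full-broadcast read expects a rank-0 mask while the identity assumption produces a rank-3 one, which matches the kind of disagreement described in the comment. For an identity permutation map the two functions agree, which would explain why the problem only shows up with broadcast maps.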
Thanks for digging into this!
But your patch didn't touch CPU lowering? Was that an LLVM patch that uncovered this? Do you know which?
Created a smaller repro and moved the discussion here: llvm/llvm-project#116197
See iree-org#19135 for a discussion. Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
As discussed with @Groverkss offline, #19007 exposes a bug related to masked vectorization. I have a prototype fix for that, but it needs more work/consideration. Sending a revert in the meantime. Note that the code generated by the vectorizer will still fail verification, but the buggy/problematic part gets folded away by subsequent transformations.
…ceRank (llvm#112907)" This reverts commit 1004865. Failing CI as discussed here: * iree-org/iree#19135
It was reported that the underlying issue has been resolved. Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com> Co-authored-by: Marius Brehler <marius.brehler@amd.com>
It was reported that the underlying issue has been resolved. Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com> Co-authored-by: Marius Brehler <marius.brehler@amd.com> Signed-off-by: Giacomo Serafini <179146510+giacs-epic@users.noreply.github.com>
See iree-org#19135 for a discussion. Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com> Signed-off-by: Giacomo Serafini <179146510+giacs-epic@users.noreply.github.com>