Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable aarch64 package builds #19135

Merged
merged 2 commits into from
Nov 13, 2024

Conversation

banach-space
Copy link
Collaborator

I was told that the underlying issue has been resolved.

I was told that the underlying issue has been resolved.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
@banach-space banach-space force-pushed the andrzej/re-enable-aarch64-runners branch from a5f81b0 to 001db5a Compare November 13, 2024 11:01
Copy link
Member

@marbre marbre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few too many spaces but otherwise looks good.

(Just applied the suggestions via a batch commit, hope that's okay for you @banach-space.)

.github/workflows/build_package.yml Outdated Show resolved Hide resolved
.github/workflows/build_package.yml Outdated Show resolved Hide resolved
.github/workflows/build_package.yml Outdated Show resolved Hide resolved
@marbre marbre enabled auto-merge (squash) November 13, 2024 11:13
@marbre marbre merged commit ea03080 into iree-org:main Nov 13, 2024
33 of 36 checks passed
@banach-space
Copy link
Collaborator Author

Thanks @marbre ! 🤞🏻 this works 😅

@ScottTodd
Copy link
Member

Thanks! Runners are online again. The build/test job already spotted a regression in one test too 👀 https://github.com/iree-org/iree/actions/runs/11814365861/job/32913288871#step:6:9488

FAILED: tests/e2e/stablehlo_ops/check_llvm-cpu-host_local-task_gather.mlir_module.vmfb /__w/iree/iree/build-arm64/tests/e2e/stablehlo_ops/check_llvm-cpu-host_local-task_gather.mlir_module.vmfb 
cd /__w/iree/iree/build-arm64/tests/e2e/stablehlo_ops && /__w/iree/iree/build-arm64/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo --iree-input-demote-f64-to-f32 --iree-llvmcpu-target-cpu=host /__w/iree/iree/tests/e2e/stablehlo_ops/gather.mlir -o check_llvm-cpu-host_local-task_gather.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/__w/iree/iree/build-arm64\" --iree-llvmcpu-embedded-linker-path=\"/__w/iree/iree/build-arm64/llvm-project/bin/lld\" --iree-llvmcpu-wasm-linker-path=\"/__w/iree/iree/build-arm64/llvm-project/bin/lld\"
failed to translate executables
failed to translate executables
/__w/iree/iree/tests/e2e/stablehlo_ops/gather.mlir:92:13: error: 'vector.transfer_read' op inferred mask type ('vector<i1>') and mask operand type ('vector<1x1x4xi1>') don't match
  %result = "stablehlo.gather"(%operand, %start_indices) {
            ^

BTW, a clean revert of #19116 from the GitHub UI may have been easier to make than this manually authored commit.

@banach-space banach-space deleted the andrzej/re-enable-aarch64-runners branch November 13, 2024 16:16
@banach-space
Copy link
Collaborator Author

banach-space commented Nov 13, 2024

I've not been able to repro and have run out of screen time for today :( Will try again tomorrow.

EDIT

I am able to repro with iree-compile. Looks like the vectorizer is failing. Things seems fine with mlir-opt when we let the vectorize decide what vector shape to use. Things brake when using vector shapes as implied by IREE:

attrs =  {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[2, 2, 3], [1, 1, 4], [0, 0, 0], [0, 0, 0]]>}

Have tile size selection logic been updated recently?

MLIR repro:

func.func @vectorization_test(%extracted_slice : tensor<1x1x3xi32>, %arg0: index, %arg2: index, %3: tensor<2x4xi32>, %4: tensor<1x3x2x4xi32>) -> tensor<1x1x3xi32>{
%c3 = arith.constant 3 :index
%c0 = arith.constant 0 :index
%c1 = arith.constant 1 :index

%8 = linalg.generic {
  indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
  iterator_types = ["parallel", "parallel", "parallel"]}
  outs(%extracted_slice : tensor<1x1x3xi32>) {
  ^bb0(%out: i32):
    %9 = linalg.index 0 : index
    %10 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%9, %arg0)
    %11 = linalg.index 1 : index
    %12 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%11, %arg2)
    %13 = linalg.index 2 : index
    %extracted = tensor.extract %3[%10, %c0] : tensor<2x4xi32>
    %14 = arith.index_cast %extracted : i32 to index
    %extracted_0 = tensor.extract %3[%10, %c1] : tensor<2x4xi32>
    %15 = arith.index_cast %extracted_0 : i32 to index
    %extracted_1 = tensor.extract %3[%10, %c3] : tensor<2x4xi32>
    %16 = arith.index_cast %extracted_1 : i32 to index
    %17 = arith.maxsi %16, %c0 : index
    %18 = arith.minui %17, %c1 : index
    %19 = arith.maxsi %15, %c0 : index
    %20 = arith.minui %19, %c1 : index
    %21 = arith.maxsi %14, %c0 : index
    %22 = arith.minui %21, %c1 : index
    %23 = arith.addi %18, %12 : index
    %24 = arith.addi %22, %13 : index
    %extracted_2 = tensor.extract %4[%c0, %23, %20, %24] : tensor<1x3x2x4xi32>
    linalg.yield %extracted_2 : i32
  } -> tensor<1x1x3xi32>

  return %8 : tensor<1x1x3xi32>
}
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
    // %2 = transform.structured.vectorize_children_and_apply_patterns %1  { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %0 vector_sizes [1, 1, 4] {vectorize_nd_extract} : !transform.any_op
    transform.yield
  }
}
}

@banach-space
Copy link
Collaborator Author

Here's the offending patch: #19007. Ping @Groverkss :)

@Groverkss
Copy link
Contributor

Here's the offending patch: #19007. Ping @Groverkss :)

I had a look, this seems to be a bug in upstream masking vectorization implementation. The transfer_read operation infers the return type differently than vectorization does:

transfer_read: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Vector/IR/VectorOps.cpp#L4123
vectorization: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L1402C18-L1402C31

It looks like vectorization is wrong here, and should be using an inverse map instead of using the transfer_read indexing map. Mask needs an inverse map.

@Groverkss
Copy link
Contributor

Here's the offending patch: #19007. Ping @Groverkss :)

I had a look, this seems to be a bug in upstream masking vectorization implementation. The transfer_read operation infers the return type differently than vectorization does:

transfer_read: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Vector/IR/VectorOps.cpp#L4123 vectorization: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L1402C18-L1402C31

It looks like vectorization is wrong here, and should be using an inverse map instead of using the transfer_read indexing map. Mask needs an inverse map.

Actually, looking more, that's not where it's coming from. There is a bug in linalg vectorization for vector.transfer_read with broadcast permutation maps when using custom vectorization hooks:

https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L1457

Vectorizing a tensor.extract using a custom vectorization hook creates a transfer_read with a permutation map of (d0, d1) -> (0, 0, 0). This custom hook does not mask it. Masking is done after this hook is ran. The masking picks an identity map (wrong!), which does not match the indexing map that should have been used for masking of transfer_read.

The reason that patch uncovered it was because before, we were doing a hack where if we saw a transfer_read of 0 rank, we would simply turn it into memref.load/tensor.extract . But that only works for the case where you have a full broadcast. If you have any indexing map for the transfer_read generated by the custom vectorization hook which is not an identity map, this will break.

@banach-space
Copy link
Collaborator Author

Thanks for digging into this!

The reason that patch uncovered it

But your patch didn't touch CPU lowering? Was that an LLVM patch that uncovered this? Do you know which?

@banach-space
Copy link
Collaborator Author

Created a smaller repro and moved the discussion here: llvm/llvm-project#116197

banach-space added a commit to banach-space/iree that referenced this pull request Nov 14, 2024
See iree-org#19135 for a discussion.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
banach-space added a commit to banach-space/iree that referenced this pull request Nov 14, 2024
See iree-org#19135 for a discussion.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
@banach-space
Copy link
Collaborator Author

As discussed with @Groverkss offline, #19007 exposes a bug related to masked vectorization. I have a prototype fix for that, but it needs more work/consideration. Sending a revert in the meantime:

Note, the code generated by the vectorizer will still fail verification, but the buggy/problematic part gets folded away by subsequent transformations.

banach-space added a commit to banach-space/llvm-project that referenced this pull request Nov 14, 2024
banach-space added a commit that referenced this pull request Nov 15, 2024
See #19135 for a discussion.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 18, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 18, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 21, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 21, 2024
kuhar pushed a commit to iree-org/llvm-project that referenced this pull request Nov 25, 2024
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request Nov 25, 2024
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request Nov 26, 2024
kuhar pushed a commit to iree-org/llvm-project that referenced this pull request Nov 26, 2024
kuhar pushed a commit to iree-org/llvm-project that referenced this pull request Nov 27, 2024
kuhar pushed a commit to iree-org/llvm-project that referenced this pull request Nov 28, 2024
kuhar pushed a commit to iree-org/llvm-project that referenced this pull request Nov 29, 2024
Groverkss pushed a commit to Groverkss/iree that referenced this pull request Dec 1, 2024
It was reported told that the underlying issue has been resolved.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Co-authored-by: Marius Brehler <marius.brehler@amd.com>
Groverkss pushed a commit to Groverkss/iree that referenced this pull request Dec 1, 2024
See iree-org#19135 for a discussion.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024
It was reported told that the underlying issue has been resolved.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Co-authored-by: Marius Brehler <marius.brehler@amd.com>
Signed-off-by: Giacomo Serafini <179146510+giacs-epic@users.noreply.github.com>
giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024
See iree-org#19135 for a discussion.

Signed-off-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Signed-off-by: Giacomo Serafini <179146510+giacs-epic@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants