-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SplitLogicalObjFifo] Fix split-logicalobjfifo pass to analyse unique producers/consumers ObjFifos #1060
base: main
Are you sure you want to change the base?
Conversation
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
...er/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.h
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
def3a83
to
0904a94
Compare
.../plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.cpp
Outdated
Show resolved
Hide resolved
.../plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.cpp
Outdated
Show resolved
Hide resolved
...er/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.h
Outdated
Show resolved
Hide resolved
26ef401
to
1703271
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/IR/AMDAIELogicalObjFifoOpInterface.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
531ee50
to
da1c62d
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
2e57be9
to
8c3f762
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This revision is much more concise and cleaner. I have more comments on the test.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
#executable_target_amdaie_pdi_fb = #hal.executable.target<"amd-aie", "amdaie-pdi-fb", {num_cols = 8 : i32, num_rows = 4 : i32, target_device = "npu4", ukernels = "none"}> | ||
#translation = #iree_codegen.translation_info<pipeline = Custom> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this test works as in your description num_cols=2 and num_rows=1
, while here it's num_cols=8 and num_rows=4
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the intention : how much to split is not only dependent on the number of available columns but also the unique L2<->L1 pairs -> so we use gcd between the two (and in some cases we might need to take gcd between source/target sizes)
"to keep the test case concise it demonstrates a similar splitting strategy for 1 row and 2 columns.
" means the actual compute is taking place on 1 row and 2 columns. I could change num_cols=2
and num_rows=1
but that'd defeat the purpose.
I've updated the comments.
3594447
to
42868e6
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
2de4496
to
f7bcd0d
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one nit and also update the PR title to producers/consumers instead of L1/L2.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
…SplitLogicalObjFifos.cpp Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com>
In order to decide the split factor we were basing the inference solely on the number of columns available.
Because of this, for 4x8 array and for the new pipeline, we were splitting L2 buffers :-
As a result, the tiles being assigned were :-
This causes exhaustion of DMA channels. Refer to this thread for the discussion thread.
The L2 buffer split should be :-
So that later on when the tiles are being assigned, the expected no. of tile assignments for LHS/RHS/OUT matches the corresponding L2 buffers.
This PR aims to analyse the number of unique producers/consumers ObjFifos for the ObjFifo being split..
e2e CI test for Matmul both with/without ukernel via
pack-peel-4-level-tiling
pipeline targeting 4x8 array on Strix have been added.Signed-off-by: Abhishek Varma abhvarma@amd.com