AIRRtToNpuPass SHIM DMA BD optimization #550

erwei-xilinx · 2024-04-24T23:02:22Z

When tiling any wrap > 1023 into two wraps, make the largest integer factor the outer wrap, so that the new stride is smaller and less likely to exceed 1M.
Code quality improvements.
Avoid using the SymbolTable::getSymbolUses method, which turns out to be quite slow on large IRs.
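
To illustrate the first point, here is a minimal sketch of the wrap-splitting idea. It is not the code in AIRRtToNpuPass: the function name `split_wrap` is hypothetical, and reading "1M" as 2**20 is an assumption. A dimension `(wrap, stride)` is split into an outer and an inner dimension; the inner one keeps the original stride and the outer one gets `inner_wrap * stride`, so a larger outer wrap means a smaller inner wrap and therefore a smaller new stride.

```python
MAX_WRAP = 1023        # wrap limit quoted in the description
MAX_STRIDE = 1 << 20   # assumption: "1M" interpreted as 2**20


def split_wrap(wrap, stride):
    """Split one (wrap, stride) dimension into [(outer), (inner)] dimensions.

    Hypothetical helper for illustration only. The outer wrap is the
    largest factor of `wrap` that does not exceed MAX_WRAP.
    """
    if wrap <= MAX_WRAP:
        return [(wrap, stride)]  # already fits, nothing to split
    # Search downwards for the largest factor that fits in one wrap.
    outer = next(f for f in range(MAX_WRAP, 0, -1) if wrap % f == 0)
    inner = wrap // outer
    if inner > MAX_WRAP:
        # e.g. a prime wrap > 1023: two factors are not enough.
        raise ValueError(f"cannot split wrap {wrap} into two wraps <= {MAX_WRAP}")
    # Element i = o * inner + k sits at o * (inner * stride) + k * stride.
    return [(outer, inner * stride), (inner, stride)]


dims = split_wrap(2048, 4096)
print(dims)  # [(512, 16384), (4, 4096)]
```

With the factors swapped (outer wrap 4, inner wrap 512), the new outer stride would be 512 * 4096 = 2097152, which is above the assumed limit, whereas the ordering above gives 16384.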

newling

Thanks!

erwei-xilinx and others added 4 commits April 24, 2024 15:13

When tiling wrap>1023, swap inner and outer wrap so that it is less likely to get stride>1M

1ebc066

Code quality; improve performance by avoiding use of 'getSymbolUses' method

abfc0b7

faster

af92f6d

Merge branch 'main' of github.com:erwei-xilinx/mlir-air-erwei into improve_aiex_npu_dma_gemm_size_limit

2e0da14

erwei-xilinx requested a review from newling April 25, 2024 01:07

newling approved these changes Apr 25, 2024

erwei-xilinx merged commit 4faaa09 into Xilinx:main Apr 25, 2024
9 checks passed

erwei-xilinx deleted the improve_aiex_npu_dma_gemm_size_limit branch April 25, 2024 16:08

erwei-xilinx mentioned this pull request Apr 25, 2024

Improve compile time (AIRRtToNpuPass) #551

Closed
