Commit 8205791

[Refactor] Remove small array reuse condition in shared memory allocation merging (#654)
- Eliminated the condition that disabled the reuse of small arrays (const_nbits <= 32) in the `MergeSharedMemoryAllocations` function, allowing for more flexible memory management.
- Added a comment in `OptimizeForTarget` to clarify the order of applying `MergeSharedMemoryAllocations` after `SplitHostDevice`, ensuring correct allocation site handling in device functions.
1 parent 6e994b1 commit 8205791

2 files changed, +2 -5 lines changed

src/transform/merge_shared_memory_allocations.cc

Lines changed: 0 additions & 4 deletions
@@ -965,10 +965,6 @@ class SharedMemoryRewriter : public StmtExprMutator {
     StorageEntry *e = it->second;
     ICHECK_NE(e->allocs.size(), 0U);
 
-    // disable reuse of small arrays
-    if (e->const_nbits > 0 && e->const_nbits <= 32)
-      return;
-
     // normal free.
     if (e->const_nbits != 0) {
       const_free_map_.insert({e->const_nbits, e});
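
The effect of the removed guard can be illustrated with a toy model. The sketch below is not the TileLang implementation: `ToyPlanner`, its `alloc`/`free` methods, and the "smallest free entry that fits" reuse policy are all assumptions made for illustration. Only the guard condition (`const_nbits > 0 and const_nbits <= 32`) is taken from the diff above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact object
class StorageEntry:
    """Toy stand-in for the C++ StorageEntry; only the size is modeled."""
    const_nbits: int
    allocs: List[str] = field(default_factory=list)


class ToyPlanner:
    """Hypothetical, simplified allocator used only to show the guard's effect."""

    def __init__(self, skip_small: bool):
        self.skip_small = skip_small  # True models the behavior before this commit
        self.free_list: List[StorageEntry] = []
        self.entries: List[StorageEntry] = []

    def alloc(self, name: str, nbits: int) -> StorageEntry:
        # Assumed policy: reuse the smallest free entry that is large enough.
        candidates = [e for e in self.free_list if e.const_nbits >= nbits]
        if candidates:
            entry = min(candidates, key=lambda e: e.const_nbits)
            self.free_list.remove(entry)
        else:
            entry = StorageEntry(const_nbits=nbits)
            self.entries.append(entry)
        entry.allocs.append(name)
        return entry

    def free(self, entry: StorageEntry) -> None:
        # The guard removed by this commit: small entries were never recycled.
        if self.skip_small and 0 < entry.const_nbits <= 32:
            return
        self.free_list.append(entry)


def total_bits(skip_small: bool) -> int:
    """Allocate two 32-bit buffers with disjoint lifetimes; return total storage."""
    planner = ToyPlanner(skip_small)
    a = planner.alloc("a", 32)
    planner.free(a)
    b = planner.alloc("b", 32)
    planner.free(b)
    return sum(e.const_nbits for e in planner.entries)
```

In this toy model, `total_bits(skip_small=True)` ends up with two separate 32-bit entries, while `total_bits(skip_small=False)` reuses the first entry for the second buffer.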

tilelang/engine/phase.py

Lines changed: 2 additions & 1 deletion
@@ -163,7 +163,8 @@ def OptimizeForTarget(mod: IRModule, target: Target) -> IRModule:
     mod = tilelang.transform.ThreadSync("global")(mod)
     mod = tilelang.transform.AnnotateDeviceRegions()(mod)
     mod = tir.transform.SplitHostDevice()(mod)
-
+    # MergeSharedMemoryAllocations must be applied after SplitHostDevice
+    # because the merged allocation site is at the beginning of each device function
     enable_aggressive_merge = should_enable_aggressive_merge(pass_ctx=pass_ctx, target=target)
     # Hopper Swizzling requires dynamic shared memory address to be aligned to 1024 bytes
     # For other devices, we align to 16 bytes
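
The ordering constraint stated in the new comment could be guarded by a small sanity check. The sketch below is hypothetical: `runs_after` and the `PIPELINE` list are not part of TileLang or TVM, and the list only names the passes mentioned in this diff rather than reproducing the full pipeline.

```python
from typing import Sequence

# Illustrative pass order; names are taken from the diff, the list itself is an assumption.
PIPELINE = [
    "ThreadSync",
    "AnnotateDeviceRegions",
    "SplitHostDevice",
    "MergeSharedMemoryAllocations",
]


def runs_after(pipeline: Sequence[str], later: str, earlier: str) -> bool:
    """Return True if `later` first appears after the first occurrence of `earlier`."""
    if earlier not in pipeline or later not in pipeline:
        return False
    return pipeline.index(earlier) < pipeline.index(later)
```

A check such as `runs_after(PIPELINE, "MergeSharedMemoryAllocations", "SplitHostDevice")` would fail if the two passes were accidentally swapped during a later refactor.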
