Skip to content

Commit

Permalink
[AMDGPU] Remove one case of vmcnt loop header flushing for GFX12 (llv…
Browse files Browse the repository at this point in the history
…m#105550)

When a loop contains a VMEM load whose result is only used outside the
loop, do not bother to flush vmcnt in the loop head on GFX12. A wait for
vmcnt will be required inside the loop anyway, because VMEM instructions
can write their VGPR results out of order.
  • Loading branch information
jayfoad authored Aug 23, 2024
1 parent 96509bb commit fa2dccb
Show file tree
Hide file tree
Showing 2 changed files with 6 additions and 6 deletions.
2 changes: 1 addition & 1 deletion llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2390,7 +2390,7 @@ bool SIInsertWaitcnts::shouldFlushVmCnt(MachineLoop *ML,
}
if (!ST->hasVscnt() && HasVMemStore && !HasVMemLoad && UsesVgprLoadedOutside)
return true;
return HasVMemLoad && UsesVgprLoadedOutside;
return HasVMemLoad && UsesVgprLoadedOutside && ST->hasVmemWriteVgprInOrder();
}

bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
Expand Down
10 changes: 5 additions & 5 deletions llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir
Original file line number Diff line number Diff line change
Expand Up @@ -295,7 +295,7 @@ body: |
# GFX12-LABEL: waitcnt_vm_loop2
# GFX12-LABEL: bb.0:
# GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
# GFX12: S_WAIT_LOADCNT 0
# GFX12-NOT: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.1:
# GFX12: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.2:
Expand Down Expand Up @@ -342,7 +342,7 @@ body: |
# GFX12-LABEL: waitcnt_vm_loop2_store
# GFX12-LABEL: bb.0:
# GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
# GFX12: S_WAIT_LOADCNT 0
# GFX12-NOT: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.1:
# GFX12: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.2:
Expand Down Expand Up @@ -499,9 +499,9 @@ body: |
# GFX12-LABEL: waitcnt_vm_loop2_reginterval
# GFX12-LABEL: bb.0:
# GFX12: GLOBAL_LOAD_DWORDX4
# GFX12: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.1:
# GFX12-NOT: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.1:
# GFX12: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.2:
name: waitcnt_vm_loop2_reginterval
body: |
Expand Down Expand Up @@ -600,7 +600,7 @@ body: |
# GFX12-LABEL: bb.0:
# GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
# GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
# GFX12: S_WAIT_LOADCNT 0
# GFX12-NOT: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.1:
# GFX12: S_WAIT_LOADCNT 0
# GFX12-LABEL: bb.2:
Expand Down

0 comments on commit fa2dccb

Please sign in to comment.