bugfix: Fix SWA Implementation Hanging Issue (#673)
This PR resolves an issue in the SWA implementation introduced in #667,
where the process would hang under specific conditions.
yzh119 authored Dec 17, 2024
1 parent d4e8d79 commit 124daea
Showing 1 changed file with 2 additions and 5 deletions.
7 changes: 2 additions & 5 deletions include/flashinfer/attention/hopper/mainloop_mma.cuh
@@ -204,11 +204,8 @@ CUTLASS_DEVICE void mma_f16(const Params& mainloop_params, AttentionVariant& var
   }
 
   if constexpr (LEFT_SLIDING_WINDOW) {
-    constexpr int n_swa_masking_steps = cute::ceil_div(CTA_Q, CTA_KV) + 1;
-#pragma unroll
-    for (int masking_step = 0;
-         masking_step < n_swa_masking_steps && kv_tile_idx > swa_begin_kv_tile_idx;
-         ++masking_step, --kv_tile_idx) {
+#pragma unroll 1
+    for (; kv_tile_idx > swa_begin_kv_tile_idx; --kv_tile_idx) {
       Tensor tSrS = partition_fragment_C(tiled_mma_qk, select<0, 1>(TileShape_QKD{}));
       consumer_wait(pipeline_k, smem_pipe_read_k);
       WarpScheduler::barrier_sync();
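
The commit message only says the process "would hang under specific conditions". One way to see how the two loop headers differ is a minimal host-side sketch that counts how many KV tiles remain above `swa_begin_kv_tile_idx` when each loop exits. The helper names `ceil_div`, `tiles_left_old`, and `tiles_left_new` are hypothetical and not part of flashinfer. That leftover tiles are what stalls the pipeline is an assumption, not something this commit states.

```cpp
// Host-side model of the two loop headers in this diff. Nothing here touches
// the GPU; all function names are hypothetical and exist only for illustration.

// Stand-in for cute::ceil_div on plain ints.
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Old header: stops after ceil_div(CTA_Q, CTA_KV) + 1 iterations even if
// kv_tile_idx is still above swa_begin_kv_tile_idx.
constexpr int tiles_left_old(int kv_tile_idx, int swa_begin_kv_tile_idx,
                             int cta_q, int cta_kv) {
  const int n_swa_masking_steps = ceil_div(cta_q, cta_kv) + 1;
  for (int masking_step = 0;
       masking_step < n_swa_masking_steps && kv_tile_idx > swa_begin_kv_tile_idx;
       ++masking_step, --kv_tile_idx) {
    // The loop body (consumer_wait, MMA, masking) is omitted in this model.
  }
  return kv_tile_idx - swa_begin_kv_tile_idx;  // tiles this loop did not visit
}

// New header: runs until kv_tile_idx reaches swa_begin_kv_tile_idx.
constexpr int tiles_left_new(int kv_tile_idx, int swa_begin_kv_tile_idx) {
  for (; kv_tile_idx > swa_begin_kv_tile_idx; --kv_tile_idx) {
    // Loop body omitted, as above.
  }
  return kv_tile_idx - swa_begin_kv_tile_idx;
}

// With CTA_Q == CTA_KV == 128 the old cap is 2 steps, so starting 5 tiles
// above the window start leaves 3 tiles unvisited; the new loop leaves none.
static_assert(tiles_left_old(5, 0, 128, 128) == 3, "old loop exits early");
static_assert(tiles_left_new(5, 0) == 0, "new loop drains to the window start");
```

If the producer side loads every tile down to `swa_begin_kv_tile_idx` (again an assumption about code outside this hunk), an early exit in the consumer would leave loaded tiles that are never waited on or released, which is consistent with a hang. The change to `#pragma unroll 1` fits a loop whose trip count is no longer known at compile time.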