Commit 099599c

Apply suggestions from code review
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
1 parent 603c1ad commit 099599c

1 file changed, +4 -4 lines changed

hopper/flash_fwd_combine_kernel.h

Lines changed: 4 additions & 4 deletions
@@ -234,11 +234,11 @@ class FlashAttnFwdCombine {
 
 struct StaticVarlenTileScheduler {
 //
-// For varlen we have too Scheduling algos:
+// For varlen we have two Scheduling algos:
 // 1) STANDARD, same as StaticTileScheduler
-// 2) LINEARIZE_M_AND_BATCH, this to flattens the tiled M dimension and
-// batch dimension into a linear tile index. The grid is then a
-// 2D grid of (tile_id, k_block) we then map the linear tile id
+// 2) LINEARIZE_M_AND_BATCH, this flattens the tiled M dimension and
+// batch dimension into a linear tile index. The grid is then a
+// 2D grid of (tile_id, k_block). We then map the linear tile id
 // to (m_block, bidb) in the get_block_coord function. This mapping
 // is non-trivial since each batch element can have a different
 // number of m_blocks. This has overhead when computing the block
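
The comment in this hunk describes mapping a linear tile id to (m_block, bidb) when each batch element has a different number of m_blocks. As a rough illustration of why that mapping is non-trivial, here is a minimal host-side sketch. It is not the kernel's get_block_coord: the names linear_tile_to_block_coord and num_m_blocks, the seqlens vector, and the block_m parameter are all hypothetical, and the sequential scan is only the simplest way to do the lookup; the real scheduler may compute it differently.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: number of M tiles needed to cover one sequence.
inline int num_m_blocks(int seqlen, int block_m) {
    assert(block_m > 0);
    return (seqlen + block_m - 1) / block_m;  // ceiling division
}

// Hypothetical sketch: map a linear tile index to (m_block, bidb).
// seqlens[b] is the length of batch element b along the M dimension.
// Returns {-1, -1} when tile_idx lies past the last tile.
inline std::pair<int, int> linear_tile_to_block_coord(
        int tile_idx, const std::vector<int>& seqlens, int block_m) {
    int tiles_before = 0;  // tiles owned by batch elements already skipped
    for (int bidb = 0; bidb < static_cast<int>(seqlens.size()); ++bidb) {
        const int n = num_m_blocks(seqlens[bidb], block_m);
        if (tile_idx < tiles_before + n) {
            // The tile falls inside this batch element.
            return {tile_idx - tiles_before, bidb};
        }
        tiles_before += n;
    }
    return {-1, -1};  // out of range: no work for this tile
}
```

With sequence lengths {5, 0, 9} and block_m = 4, the batch elements own 2, 0, and 3 tiles, so linear tile 2 maps to (m_block = 0, bidb = 2). The scan over batch elements is the per-tile lookup cost that a uniform-length scheduler does not pay.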
