Commit a48be76

Reduce test size

Signed-off-by: shuw <shuw@nvidia.com>

1 parent 33d5969

File tree

1 file changed: +4 −0 lines changed

tests/kernels/attention/test_cache.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -239,6 +239,10 @@ def test_reshape_and_cache_flash(
     current_platform.seed_everything(seed)
     torch.set_default_device(device)
+    # fp8 conversion requires a contiguous memory buffer. Reduce the number
+    # of blocks and tokens to consume less memory.
+    num_tokens = num_tokens // 2
+    num_blocks = num_blocks // 2
     # Create a random slot mapping.
     num_slots = block_size * num_blocks
     slot_mapping_lst = random.sample(range(num_slots), num_tokens)
```
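Why halving both counts is safe: `random.sample` draws without replacement, so the test needs `num_tokens <= num_slots`, where `num_slots = block_size * num_blocks`. Halving `num_blocks` halves `num_slots`, and halving `num_tokens` in step keeps the invariant intact while roughly halving the cache buffer the fp8 path must allocate contiguously. A minimal sketch, using hypothetical parameter values (not taken from the commit's test parametrization):

```python
import random

# Hypothetical starting values, standing in for the test's parametrization.
num_tokens, num_blocks, block_size = 42, 128, 16

# The patch halves both counts so the fp8 conversion path works on a
# smaller contiguous buffer.
num_tokens = num_tokens // 2
num_blocks = num_blocks // 2

# Sampling without replacement still succeeds because num_tokens shrank
# in proportion to num_slots.
num_slots = block_size * num_blocks
slot_mapping_lst = random.sample(range(num_slots), num_tokens)

assert num_tokens <= num_slots
assert len(set(slot_mapping_lst)) == num_tokens  # all slots distinct
```

Note that halving only `num_blocks` (but not `num_tokens`) could make `random.sample` raise `ValueError` once `num_tokens` exceeded the reduced `num_slots`, which is presumably why the commit reduces both together.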
