Conversation

@bkryu bkryu commented Oct 9, 2025

📌 Description

trtllm-gen's attention kernels fail tests when the batch size is 1.

This PR adds batch size 1 cases to two tests:
test_trtllm_gen_prefill_deepseek: triggers an illegal memory access (IMA) with the newly added parameters

## Running pytest ./tests/attention/test_trtllm_gen_attention.py::test_trtllm_gen_prefill_deepseek -v
>           default_generator.manual_seed(seed)
E           torch.AcceleratorError: CUDA error: an illegal memory access was encountered
E           CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
E           For debugging consider passing CUDA_LAUNCH_BLOCKING=1
E           Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

/opt/conda/envs/py312/lib/python3.12/site-packages/torch/cuda/random.py:129: AcceleratorError

test_trtllm_batch_decode: produces incorrect outputs with the newly added parameters

## Running pytest ./tests/attention/test_trtllm_gen_attention.py::test_trtllm_batch_decode -v
>                   torch.testing.assert_close(
                        output.float(),
                        output_wrapper.float(),
                        rtol=1e-1,
                        atol=1e-1,
                    )
E                   AssertionError: Tensor-likes are not close!
E                   
E                   Mismatched elements: 1480 / 8192 (18.1%)
E                   Greatest absolute difference: 64.021484375 at index (0, 46, 106) (up to 0.1 allowed)
E                   Greatest relative difference: 1.625 at index (0, 56, 109) (up to 0.1 allowed)

These failing cases have been marked with pytest.xfail(). To avoid combinatorial growth of the test parameter grid, the batch size 1 cases were defined as separate test functions rather than added to the existing parametrization.
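The split can be sketched as follows. This is an illustrative assumption, not the PR's actual code: the function names, the xfail arguments, and the use of run=False for the IMA case are all hypothetical.

```python
import pytest

# Hypothetical name and arguments; the real tests take the kernel's full
# parameter set. Keeping batch size 1 in its own function avoids multiplying
# every existing parametrize combination by another axis.
@pytest.mark.xfail(
    run=False,  # an illegal memory access would poison the CUDA context for later tests
    reason="batch_size=1 triggers an IMA in trtllm-gen prefill (issue #1898)",
)
def test_trtllm_gen_prefill_deepseek_bs1():
    ...


@pytest.mark.xfail(
    reason="batch_size=1 produces incorrect decode outputs (issue #1898)",
)
def test_trtllm_batch_decode_bs1():
    ...
```

With this layout, pytest reports the cases as xfailed (matching the "3 xfailed" in the run summary below) and they will flip to xpassed once the kernel bug is fixed.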

B200 status before PR: 2052 passed, 264 skipped in 177.80s (0:02:57)
B200 status after PR: 2052 passed, 264 skipped, 3 xfailed in 195.14s (0:03:15)

Status is tracked in issue #1898.

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

@bkryu bkryu marked this pull request as ready for review October 9, 2025 18:21
@bkryu bkryu requested a review from yongwww October 9, 2025 18:22
@yzh119 yzh119 merged commit 40d3fea into flashinfer-ai:main Oct 9, 2025
3 checks passed
@bkryu bkryu deleted the trtllm-ima branch October 23, 2025 20:49