
[CI Failure]: Cascade attention E2E test fails with FlashInfer backend #25679

@MatthewBonanni

Description


Name of failing test

v1/e2e/test_cascade_attention.py::test_cascade_attention[FLASHINFER]

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

As discussed in #25489, removing the _VLLM_V1 suffixes revealed that this test is actually failing on main: there is a correctness issue with cascade attention on the FlashInfer backend (CI run). Previously, the test was trying to run with FLASHINFER_VLLM_V1, a backend that didn't exist, so it would fall back to FlashAttention and pass (recent nightly).
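For anyone trying to confirm the failure locally, a reproduction might look like the sketch below. This is an assumption-laden example, not from the issue itself: it assumes a vLLM source checkout, a FlashInfer-capable GPU with the test dependencies installed, and that the test path from the issue is resolved relative to the repository's tests/ directory. VLLM_ATTENTION_BACKEND is vLLM's environment variable for pinning the attention backend, which avoids the silent-fallback behavior described above.

```shell
# Hedged sketch of a local reproduction (assumes vLLM checkout + GPU).
# Pinning the backend ensures the test cannot silently fall back to
# FlashAttention the way FLASHINFER_VLLM_V1 did.
VLLM_ATTENTION_BACKEND=FLASHINFER \
  pytest "v1/e2e/test_cascade_attention.py::test_cascade_attention[FLASHINFER]"
```

If the backend name were misspelled or unavailable, pinning it this way should fail loudly rather than pass against the wrong backend.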

📝 History of failing test

Failure revealed in #25489, so the test was disabled.

CC List

No response

Metadata

Assignees

No one assigned

Labels

ci-failure: Issue about an unexpected test failure in CI
