[Tracking Issue] MLA performance tracking #897

@yzh119

This issue is a follow-up to #887. Per #892 (comment), we found that flashinfer's MLA implementation is slower than FlashMLA in many cases. We are creating this issue to track the remaining items for improving flashinfer MLA performance (mainly on Hopper):

Performance Tracking Table

Contributed by @abcdabcd987 :
https://docs.google.com/spreadsheets/d/1aGrBzaYeVExerCO4sGDnTN3zyluaXqwsbAbITkZTnug/edit?usp=sharing

Checklist
