[WIP] Support for cached multi-query attention towards speculative decoding #1679

Closed
wants to merge 14 commits into from

Commits on Nov 30, 2023

  1. add multi query attn stub

    skrider authored and Ubuntu committed Nov 30, 2023
    5b512da
  2. add test for cached multiquery attention

    skrider authored and Ubuntu committed Nov 30, 2023
    3e13363
  3. fix slot_mapping construction

    skrider authored and Ubuntu committed Nov 30, 2023
    3fe9911

Commits on Dec 1, 2023

  1. multi-query cached attention implementation passing tests

    skrider authored and Ubuntu committed Dec 1, 2023
    1f5e28c
  2. remove debugger breakpoints

    skrider authored and Ubuntu committed Dec 1, 2023
    9253ef0
  3. comments and minor improvement

    skrider authored and Ubuntu committed Dec 1, 2023
    7a8f8d9
  4. 55eee30
  5. 8c515d6
  6. ce2ae38

Commits on Dec 2, 2023

  1. 09de61c

Commits on Dec 23, 2023

  1. ad27c57
  2. flash attention stub

    skrider committed Dec 23, 2023
    1322191

Commits on Dec 24, 2023

  1. 7e0ee61

Commits on Jan 9, 2024

  1. c53a958