Support for KV caching and batched inference #1934

Open
wants to merge 3 commits into base: main

Set enable_gqa flag in scaled_dot_product_attention

d2e9e45

Annotations

1 error and 1 warning
cpu-tests (macOS-14, 3.10)
failed Feb 17, 2025 in 3m 45s