Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for KV caching and batched inference #1934

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Set enable_gqa flag in scaled_dot_product_attention

d2e9e45
Select commit
Loading
Failed to load commit list.
Open

Support for KV caching and batched inference #1934

Set enable_gqa flag in scaled_dot_product_attention
d2e9e45
Select commit
Loading
Failed to load commit list.
Azure Pipelines / litGPT failed Feb 17, 2025 in 8m 46s

Build #GPU tests failed

Annotations

Check failure on line 94 in Build log

See this annotation in the file changed.

@azure-pipelines azure-pipelines / litGPT

Build log #L94

Bash exited with code 137, which means it ran out of memory. Make sure the agent (container) host has sufficient memory configured.