New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for KV caching and batched inference #1934

Open

mseeger wants to merge 3 commits into Lightning-AI:main from mseeger:kvcache3

Open

Support for KV caching and batched inference #1934

Set enable_gqa flag in scaled_dot_product_attention

Azure Pipelines / litGPT failed Feb 17, 2025 in 8m 46s

Build #GPU tests failed

1 errors / 6 warnings

Annotations

Check failure on line 94 in Build log

azure-pipelines / litGPT

Build log #L94

Bash exited with code 137, which means it ran out of memory. Make sure the agent (container) host has sufficient memory configured.

View more details on Azure Pipelines