Some improvements for KV caching #1891

Open: mseeger wants to merge 1 commit into main from kvcache_improvements4

mseeger (Contributor) commented Dec 26, 2024

  • Ensure that KVCache buffers are only as large as config.n_query_groups
  • Shrink buffers returned by KVCache to just cover input_pos entries
  • Clean up child classes of the classes in model.py, in particular removing copies of forward
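As a rough illustration of the first two points, here is a shape-arithmetic sketch. It is not the code in this PR. The helper names `kv_buffer_shape` and `trimmed_length`, the config values, and the reading that "cover input_pos entries" means keeping positions up to `max(input_pos)` are all assumptions made for the example.

```python
from math import prod


def kv_buffer_shape(batch_size, n_groups, max_seq_length, head_size):
    """Shape of one key (or value) buffer with n_groups entries on the head axis."""
    return (batch_size, n_groups, max_seq_length, head_size)


def trimmed_length(input_pos):
    """Number of leading cache slots needed to cover every index in input_pos.

    Assumption: slots past max(input_pos) have not been written yet,
    so they do not need to be returned to the attention computation.
    """
    return max(input_pos) + 1


# Hypothetical config values, chosen only for illustration.
n_head, n_query_groups, head_size = 32, 8, 128
batch_size, max_seq_length = 1, 4096

# Element counts when the buffer is sized by n_head vs. by n_query_groups.
per_head = prod(kv_buffer_shape(batch_size, n_head, max_seq_length, head_size))
per_group = prod(kv_buffer_shape(batch_size, n_query_groups, max_seq_length, head_size))
saving_factor = per_head // per_group  # equals n_head / n_query_groups here

# After writing positions 0..4, only 5 of the 4096 slots need to be returned.
returned_shape = (batch_size, n_query_groups, trimmed_length(range(5)), head_size)

print(saving_factor, returned_shape)
```

With these made-up numbers the buffer sized by query groups is a quarter of the size, and the returned slice covers 5 positions instead of 4096.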

- Shrink buffers returned by KVCache to just cover input_pos entries
- Refactor child classes of model.py classes to avoid copy and paste
@mseeger mseeger force-pushed the kvcache_improvements4 branch from 69d6d6f to a65a96d Compare December 27, 2024 13:40
mseeger (Contributor, Author) commented Dec 27, 2024

Can somebody help with the failing tests? I don't understand why the tests fail on Windows but pass on all other systems, and I also don't understand why the GPU tests are failing.
