Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor KV cache, Rope , reduce common code #1148

Merged
merged 28 commits into from
Dec 3, 2024

Conversation

abhilash1910
Copy link
Contributor

What does this PR do?

Reduces common KVCache class code across generation models, refactors into modelling class.
cc @libinta @vidyasiv

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@abhilash1910 abhilash1910 requested a review from a user July 22, 2024 08:32
@abhilash1910 abhilash1910 requested a review from regisss as a code owner July 22, 2024 08:32
Copy link
Collaborator

@ssarkar2 ssarkar2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also make similar changes for modeling_mistral.py, modeling_clip.py, modeling_mixtral.py (and other files, which might have Matmul or KVCache)

@nprotasov
Copy link
Contributor

I think this PR can be affected by new PR in transformers:
huggingface/transformers#31999

@yafshar
Copy link
Contributor

yafshar commented Jul 29, 2024

@abhilash1910 can you also consider the point made by @ulivne in #1160 (comment) It makes sense to have them here.

@abhilash1910
Copy link
Contributor Author

Yes @yafshar will update this to include here, also seeing the #31999 - I guess it can added incrementally.

@abhilash1910 abhilash1910 changed the title Refactor KV cache , reduce common code Refactor KV cache, Rope , reduce common code Aug 1, 2024
@abhilash1910
Copy link
Contributor Author

@yafshar @ssarkar2 please help trigger CI and review. Thanks

@yafshar
Copy link
Contributor

yafshar commented Sep 21, 2024

@abhilash1910 would you please re-base the code. The CI is failing. Thanks

@abhilash1910
Copy link
Contributor Author

@yafshar please help to trigger the CI

@yafshar
Copy link
Contributor

yafshar commented Sep 25, 2024

@libinta would you please label this PR

@libinta
Copy link
Collaborator

libinta commented Nov 14, 2024

@abhilash1910 can you resolve conflict?

@abhilash1910
Copy link
Contributor Author

@libinta done, please help to review. Thanks

@emascarenhas
Copy link
Contributor

@abhilash1910 , Please run the CI test requested above and post results here in the pull request if you'd like this to be in 1.19 release, otherwise we can push it to 1.20.

@libinta libinta added the run-test Run CI for PRs from external contributors label Nov 27, 2024
Copy link
Collaborator

@regisss regisss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I left the same comment everywhere to push the use of relative imports.

Also, please sync your branch with main and run make style.

Copy link

github-actions bot commented Dec 3, 2024

The code quality check failed, please run make style.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss merged commit af82276 into huggingface:main Dec 3, 2024
4 checks passed
@abhilash1910
Copy link
Contributor Author

Thanks @regisss for attending this and apologies could not update PR in time 👍🏻

@abhilash1910 abhilash1910 deleted the refactor_attn_kv branch December 4, 2024 12:33
regisss added a commit that referenced this pull request Dec 5, 2024
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
yafshar added a commit to yafshar/optimum-habana that referenced this pull request Dec 9, 2024
- Correct the datatype during training used for rotary positional
  embedding calculation.
- Remove unused function

Note: The accuracy issue was identified to be caused by a previous
pull request (huggingface#1148)
yafshar added a commit to yafshar/optimum-habana that referenced this pull request Dec 10, 2024
- Correct the datatype during training used for rotary positional
  embedding calculation.
- Remove unused function

Note: The accuracy issue was identified to be caused by a previous
pull request (huggingface#1148)
imangohari1 pushed a commit to imangohari1/optimum-habana that referenced this pull request Dec 10, 2024
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
yafshar added a commit to yafshar/optimum-habana that referenced this pull request Dec 10, 2024
- Correct the datatype during training used for rotary positional
  embedding calculation.
- Remove unused function.

Note: The accuracy issue was identified to be caused by a previous
pull request (huggingface#1148)
Liangyx2 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jan 20, 2025
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-test Run CI for PRs from external contributors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants