Refactor KV cache, Rope , reduce common code #1148

abhilash1910 · 2024-07-22T08:32:33Z

What does this PR do?

Reduces duplicated KVCache class code across the generation models by refactoring it into a common modelling class.
cc @libinta @vidyasiv
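
A minimal sketch of the deduplication pattern described above, not the code from this PR. It assumes plain Python lists in place of tensors, and every name other than `KVCache` (for example `FalconAttentionSketch`, `LlamaAttentionSketch`, and the `allocate`/`update`/`forward` signatures) is hypothetical and chosen only for illustration.

```python
from typing import List, Optional


class KVCache:
    """Hypothetical shared cache, defined once in a common module."""

    def __init__(self) -> None:
        self.cache: Optional[List[float]] = None

    def allocate(self, max_len: int) -> None:
        # Preallocate a fixed-size buffer; a real cache would hold tensors.
        self.cache = [0.0] * max_len

    @staticmethod
    def update(prev: List[float], cur: List[float], idx: int) -> List[float]:
        # Copy `cur` into `prev` in place, starting at position `idx`.
        if idx < 0 or idx + len(cur) > len(prev):
            raise IndexError("update would write outside the preallocated cache")
        prev[idx : idx + len(cur)] = cur
        return prev

    def forward(self, cur: List[float], idx: int) -> List[float]:
        if self.cache is None:
            raise RuntimeError("call allocate() before forward()")
        return self.update(self.cache, cur, idx)


class FalconAttentionSketch:
    """Stand-in for one model's attention module (hypothetical name)."""

    def __init__(self) -> None:
        self.k_cache = KVCache()
        self.v_cache = KVCache()


class LlamaAttentionSketch:
    """A second model reuses the same class instead of redefining it."""

    def __init__(self) -> None:
        self.k_cache = KVCache()
        self.v_cache = KVCache()


attn = LlamaAttentionSketch()
attn.k_cache.allocate(max_len=4)
attn.k_cache.forward([1.0, 2.0], idx=0)    # prompt tokens
keys = attn.k_cache.forward([3.0], idx=2)  # one decoded token
```

Because both attention classes take the cache from a single definition, a fix to `update` lands in every model at once, which is the motivation for moving the class out of the per-model files.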

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

ssarkar2

Can you also make similar changes to modeling_mistral.py, modeling_clip.py, modeling_mixtral.py (and any other files that have Matmul or KVCache)?

optimum/habana/transformers/models/falcon/modeling_falcon.py

nprotasov · 2024-07-29T12:29:54Z

I think this PR may be affected by a new PR in transformers:
huggingface/transformers#31999

yafshar · 2024-07-29T14:04:16Z

@abhilash1910 can you also consider the point made by @ulivne in #1160 (comment)? It makes sense to include it here.

abhilash1910 · 2024-07-31T07:43:00Z

Yes @yafshar, I will update this PR to include it here. I have also seen #31999; I guess it can be added incrementally.

optimum/habana/transformers/models/modeling_all_models.py

abhilash1910 · 2024-08-05T05:48:21Z

@yafshar @ssarkar2 please help trigger CI and review. Thanks

yafshar · 2024-09-21T12:27:00Z

@abhilash1910 would you please rebase the code? The CI is failing. Thanks

abhilash1910 · 2024-09-25T12:10:51Z

@yafshar please help to trigger the CI

yafshar · 2024-09-25T13:22:55Z

@libinta would you please label this PR?

libinta · 2024-11-14T23:15:53Z

@abhilash1910 can you resolve the conflicts?

abhilash1910 · 2024-11-18T16:55:03Z

@libinta done, please help to review. Thanks

emascarenhas · 2024-11-21T23:36:47Z

@abhilash1910, please run the CI test requested above and post the results here in the pull request if you'd like this to be in the 1.19 release; otherwise we can push it to 1.20.

regisss

Nice! I left the same comment everywhere to push the use of relative imports.

Also, please sync your branch with main and run make style.

optimum/habana/transformers/models/clip/modeling_clip.py

optimum/habana/transformers/models/falcon/modeling_falcon.py

optimum/habana/transformers/models/gpt_neox/modeling_gpt_neox.py

optimum/habana/transformers/models/llama/modeling_llama.py

optimum/habana/transformers/models/mistral/modeling_mistral.py

optimum/habana/transformers/models/mixtral/modeling_mixtral.py

optimum/habana/transformers/models/phi/modeling_phi.py

optimum/habana/transformers/models/qwen2/modeling_qwen2.py

optimum/habana/transformers/models/starcoder2/modeling_starcoder2.py

github-actions · 2024-12-03T22:44:56Z

The code quality check failed, please run make style.

HuggingFaceDocBuilderDev · 2024-12-03T22:51:09Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

abhilash1910 · 2024-12-04T12:33:17Z

Thanks @regisss for attending to this, and apologies that I could not update the PR in time 👍🏻

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

- Correct the datatype during training used for rotary positional embedding calculation.
- Remove unused function.

Note: The accuracy issue was identified to be caused by a previous pull request (huggingface#1148)

abhilash1910 and others added 7 commits July 22, 2024 12:37

refactor

c1fe04f

Update __init__.py

6235a7c

add refactor sample

5ca1ee0

add module

f5cefd7

refacttor qwen

25665b6

refactor qwen falcon llama phi

0ec3ebf

revert commits

60dd26d

abhilash1910 requested review from mandy-li and libinta as code owners July 22, 2024 08:32

abhilash1910 requested a review from a user July 22, 2024 08:32

abhilash1910 requested a review from regisss as a code owner July 22, 2024 08:32

format fix

0592530

ssarkar2 reviewed Jul 23, 2024

optimum/habana/transformers/models/falcon/modeling_falcon.py

ssarkar2 mentioned this pull request Jul 23, 2024

Starcoder2 : KVCache and flash attention (FusedSDPA) enablement #1149

Merged

abhilash1910 added 4 commits July 24, 2024 11:00

refactor clip

c8400a9

refactor mistral

c5392f4

refactor mixtral

5b2be87

Merge branch 'huggingface:main' into refactor_attn_kv

1a7cd62

yafshar mentioned this pull request Jul 29, 2024

Set KV Cache update as static method #1160

Merged

abhilash1910 and others added 2 commits August 1, 2024 11:14

Merge branch 'huggingface:main' into refactor_attn_kv

85d58b4

refactor rope embeddings, use static update

55964bd

abhilash1910 requested a review from ZhaiFeiyue as a code owner August 1, 2024 07:55

abhilash1910 changed the title from "Refactor KV cache , reduce common code" to "Refactor KV cache, Rope , reduce common code" Aug 1, 2024

ulivne suggested changes Aug 1, 2024

optimum/habana/transformers/models/modeling_all_models.py (outdated)

abhilash1910 and others added 2 commits August 1, 2024 10:45

fix bugs

dd54ee7

Merge branch 'main' into refactor_attn_kv

798f99d

Merge branch 'main' into refactor_attn_kv

f2926bc

Merge branch 'main' into refactor_attn_kv

53fbe95

libinta added the run-test label (Run CI for PRs from external contributors) Nov 27, 2024

regisss reviewed Nov 28, 2024

abhilash1910 and others added 3 commits November 30, 2024 13:57

Merge branch 'huggingface:main' into refactor_attn_kv

8e47504

Switch to relative imports

baf7f9f

Merge remote-tracking branch 'optimum-habana/main' into refactor_attn_kv

c56b331

Make style

3ccb167

regisss approved these changes Dec 3, 2024

regisss merged commit af82276 into huggingface:main Dec 3, 2024
4 checks passed

abhilash1910 deleted the refactor_attn_kv branch December 4, 2024 12:33

regisss added a commit that referenced this pull request Dec 5, 2024

Refactor KV cache, Rope , reduce common code (#1148)

d49ca3b

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

yafshar mentioned this pull request Dec 10, 2024

Fix Accuracy Calculation Issue in GPT-NeoX #1583

Closed

3 tasks

imangohari1 pushed a commit to imangohari1/optimum-habana that referenced this pull request Dec 10, 2024

Refactor KV cache, Rope , reduce common code (huggingface#1148)

1bfc5df

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

yafshar mentioned this pull request Dec 10, 2024

Fix Accuracy Calculation Issue in GPT-NeoX #1591

Merged

3 tasks

jiminha mentioned this pull request Dec 12, 2024

Revert common KVCache not to check token_idx #1594

Merged

3 tasks

yeonsily mentioned this pull request Dec 23, 2024

Added missing parameter for llama function call #1663

Merged

3 tasks

Liangyx2 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jan 20, 2025

Refactor KV cache, Rope , reduce common code (huggingface#1148)

6adc9ea

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
