Qwen 2.5 #1280

Merged: 209 commits into main, Nov 12, 2024
Conversation

danielhanchen
Contributor

No description provided.

danielhanchen and others added 30 commits October 21, 2024 01:02
* Fix DPO, ORPO (#1177)

* Fix TRL

* Update mistral.py

* Patch processing_class

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Installation guide (#1165)

* chore: update chat_templates.py (#1166)

orginal -> original

* Disable Flex Attention

* Update tokenizer_utils.py

* Update _utils.py

* n_items

* Update cross_entropy_loss.py

* Fix DPO, ORPO

* Update _utils.py

---------

Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
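For context on the `processing_class` patch above: recent TRL releases renamed the `tokenizer` argument of their trainers to `processing_class`, so a compatibility shim has to pass whichever keyword the installed version accepts. A minimal sketch of that idea, assuming the public `DPOTrainer` API (this is illustrative, not the code merged here):

```python
# Hedged sketch, not Unsloth's actual patch: pick the trainer keyword that
# the installed TRL version understands ("processing_class" in newer TRL,
# "tokenizer" in older releases).
import inspect
from trl import DPOTrainer

def make_dpo_trainer(model, tokenizer, **kwargs):
    params = inspect.signature(DPOTrainer.__init__).parameters
    if "processing_class" in params:
        # Newer TRL renamed the argument
        return DPOTrainer(model=model, processing_class=tokenizer, **kwargs)
    return DPOTrainer(model=model, tokenizer=tokenizer, **kwargs)
```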

* Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
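The warning commit above guards against older Transformers versions that do not expose `Unpack` / `KwargsForCausalLM`. A minimal sketch of such a version guard; the import paths mirror recent Transformers releases and should be treated as assumptions rather than the merged code:

```python
# Hedged sketch: warn (instead of crashing) when the installed transformers
# version predates Unpack / KwargsForCausalLM support. The import locations
# below are assumptions based on recent transformers layouts.
import warnings

try:
    from transformers.processing_utils import Unpack  # noqa: F401
    from transformers.models.llama.modeling_llama import KwargsForCausalLM  # noqa: F401
    HAS_KWARGS_FOR_CAUSAL_LM = True
except ImportError:
    HAS_KWARGS_FOR_CAUSAL_LM = False
    warnings.warn(
        "This transformers version does not provide Unpack / KwargsForCausalLM; "
        "some keyword arguments will be ignored. Consider upgrading transformers."
    )
```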
* Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding

* Typo

* Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache

* Update llama.py

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
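The rotary embedding changes above are about not indexing past the cached cos/sin tables when the KV cache is short or the prompt is longer than what was precomputed. A rough sketch of that idea, assuming the `max_seq_len_cached` / `cos_cached` / `sin_cached` / `_set_cos_sin_cache` names used by older Hugging Face rotary embedding modules (illustrative only, not the merged implementation):

```python
# Hedged sketch: grow the rotary cos/sin cache before indexing it, so a
# sequence longer than the cached length does not raise an error.
import torch

def get_cos_sin(rotary_emb, x: torch.Tensor, position_ids: torch.Tensor):
    seq_len = int(position_ids.max()) + 1
    # Attribute and helper names below are assumptions about the module
    # being patched, mirroring older HF LlamaRotaryEmbedding code.
    if seq_len > rotary_emb.max_seq_len_cached:
        rotary_emb._set_cos_sin_cache(seq_len=seq_len, device=x.device, dtype=x.dtype)
    cos = rotary_emb.cos_cached[:seq_len].to(dtype=x.dtype)
    sin = rotary_emb.sin_cached[:seq_len].to(dtype=x.dtype)
    return cos, sin
```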
danielhanchen and others added 29 commits November 6, 2024 02:10
(#1254)

* Fix: cast logits to float32 in cross_entropy_forward to prevent errors

* Update cross_entropy_loss.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
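The commit above (#1254) upcasts half-precision logits before the loss to avoid overflow and precision errors. The merged change lives in the Triton cross-entropy kernel, so the following is only a plain PyTorch sketch of the same intent:

```python
# Hedged sketch: cast (b)float16 logits to float32 before cross entropy
# for a numerically stable loss. Not the actual Triton kernel change.
import torch
import torch.nn.functional as F

def cross_entropy_forward(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    if logits.dtype in (torch.float16, torch.bfloat16):
        logits = logits.float()  # upcast to float32
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
```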

* Throw error when inferencing longer than max_position_embeddings without rope scaling

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: root <root@ieeres.chu.cam.ac.uk>
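The guard above refuses to run inference past `max_position_embeddings` unless RoPE scaling is configured. A hedged sketch of such a check; the function name `check_sequence_length` is illustrative, not the merged helper:

```python
# Hedged sketch: raise a clear error instead of producing garbage when the
# sequence exceeds max_position_embeddings and no RoPE scaling is set.
def check_sequence_length(config, seq_len: int) -> None:
    max_pos = config.max_position_embeddings
    if seq_len > max_pos and getattr(config, "rope_scaling", None) is None:
        raise ValueError(
            f"Sequence length {seq_len} exceeds max_position_embeddings={max_pos} "
            "and no RoPE scaling is configured. Enable rope_scaling or shorten the input."
        )
```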
@danielhanchen merged commit 899caf0 into main on Nov 12, 2024
1 check passed

6 participants