Qwen 2.5 #1280

Merged: 209 commits into main, Nov 12, 2024
Conversation

danielhanchen
Contributor

No description provided.

danielhanchen and others added 30 commits October 21, 2024 01:02
* Fix DPO, ORPO (#1177)

* Fix TRL

* Update mistral.py

* Patch processing_class

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Installation guide (#1165)

* chore: update chat_templates.py (#1166)

orginal -> original

* Disable Flex Attention

* Update tokenizer_utils.py

* Update _utils.py

* n_items

* Update cross_entropy_loss.py

* Fix DPO, ORPO

* Update _utils.py

---------

Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
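For context on the `processing_class` patch above: recent TRL releases renamed the `tokenizer` argument of their trainers to `processing_class`, so a compatibility shim has to pass whichever keyword the installed version accepts. A minimal sketch of that idea, assuming the public `DPOTrainer` API (this is illustrative, not the code merged here):

```python
# Hedged sketch, not Unsloth's actual patch: pick the trainer keyword that
# the installed TRL version understands ("processing_class" in newer TRL,
# "tokenizer" in older releases).
import inspect
from trl import DPOTrainer

def make_dpo_trainer(model, tokenizer, **kwargs):
    params = inspect.signature(DPOTrainer.__init__).parameters
    if "processing_class" in params:
        # Newer TRL renamed the argument
        return DPOTrainer(model=model, processing_class=tokenizer, **kwargs)
    return DPOTrainer(model=model, tokenizer=tokenizer, **kwargs)
```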

* Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
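The warning commit above guards against older Transformers versions that do not expose `Unpack` / `KwargsForCausalLM`. A minimal sketch of such a version guard; the import paths mirror recent Transformers releases and should be treated as assumptions rather than the merged code:

```python
# Hedged sketch: warn (instead of crashing) when the installed transformers
# version predates Unpack / KwargsForCausalLM support. The import locations
# below are assumptions based on recent transformers layouts.
import warnings

try:
    from transformers.processing_utils import Unpack  # noqa: F401
    from transformers.models.llama.modeling_llama import KwargsForCausalLM  # noqa: F401
    HAS_KWARGS_FOR_CAUSAL_LM = True
except ImportError:
    HAS_KWARGS_FOR_CAUSAL_LM = False
    warnings.warn(
        "This transformers version does not provide Unpack / KwargsForCausalLM; "
        "some keyword arguments will be ignored. Consider upgrading transformers."
    )
```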
* Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding

* Typo

* Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache

* Update llama.py

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
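The rotary embedding changes above are about not indexing past the cached cos/sin tables when the KV cache is short or the prompt is longer than what was precomputed. A rough sketch of that idea, assuming the `max_seq_len_cached` / `cos_cached` / `sin_cached` / `_set_cos_sin_cache` names used by older Hugging Face rotary embedding modules (illustrative only, not the merged implementation):

```python
# Hedged sketch: grow the rotary cos/sin cache before indexing it, so a
# sequence longer than the cached length does not raise an error.
import torch

def get_cos_sin(rotary_emb, x: torch.Tensor, position_ids: torch.Tensor):
    seq_len = int(position_ids.max()) + 1
    # Attribute and helper names below are assumptions about the module
    # being patched, mirroring older HF LlamaRotaryEmbedding code.
    if seq_len > rotary_emb.max_seq_len_cached:
        rotary_emb._set_cos_sin_cache(seq_len=seq_len, device=x.device, dtype=x.dtype)
    cos = rotary_emb.cos_cached[:seq_len].to(dtype=x.dtype)
    sin = rotary_emb.sin_cached[:seq_len].to(dtype=x.dtype)
    return cos, sin
```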
danielhanchen and others added 29 commits November 6, 2024 02:10
(#1254)

* Fix: cast logits to float32 in cross_entropy_forward to prevent errors

* Update cross_entropy_loss.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
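The commit above (#1254) upcasts half-precision logits before the loss to avoid overflow and precision errors. The merged change lives in the Triton cross-entropy kernel, so the following is only a plain PyTorch sketch of the same intent:

```python
# Hedged sketch: cast (b)float16 logits to float32 before cross entropy
# for a numerically stable loss. Not the actual Triton kernel change.
import torch
import torch.nn.functional as F

def cross_entropy_forward(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    if logits.dtype in (torch.float16, torch.bfloat16):
        logits = logits.float()  # upcast to float32
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
```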

* Throw error when inferencing longer than max_position_embeddings without rope scaling

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: root <root@ieeres.chu.cam.ac.uk>
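The guard above refuses to run inference past `max_position_embeddings` unless RoPE scaling is configured. A hedged sketch of such a check; the function name `check_sequence_length` is illustrative, not the merged helper:

```python
# Hedged sketch: raise a clear error instead of producing garbage when the
# sequence exceeds max_position_embeddings and no RoPE scaling is set.
def check_sequence_length(config, seq_len: int) -> None:
    max_pos = config.max_position_embeddings
    if seq_len > max_pos and getattr(config, "rope_scaling", None) is None:
        raise ValueError(
            f"Sequence length {seq_len} exceeds max_position_embeddings={max_pos} "
            "and no RoPE scaling is configured. Enable rope_scaling or shorten the input."
        )
```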
@danielhanchen merged commit 899caf0 into main on Nov 12, 2024
1 check passed

6 participants