Bug fixes #1259

Merged: 197 commits merged into main on Nov 7, 2024
Conversation

danielhanchen
Contributor

Fixes #1257 #1250

danielhanchen and others added 30 commits October 21, 2024 01:02
* Fix DPO, ORPO (#1177)

* Fix TRL

* Update mistral.py

* Patch processing_class

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Update tokenizer_utils.py

* Installation guide (#1165)

* chore: update chat_templates.py (#1166)

orginal -> original

* Disable Flex Attention

* Update tokenizer_utils.py

* Update _utils.py

* n_items

* Update cross_entropy_loss.py

* Fix DPO, ORPO

* Update _utils.py

---------

Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>

* Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
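
For context, a hedged sketch of what such a version guard could look like; the import paths, flag name, and warning text here are assumptions for illustration, not the actual unsloth patch:

```python
# Illustrative sketch only: warn when the installed Transformers release predates
# Unpack / KwargsForCausalLM support, so keyword forwarding can degrade gracefully.
import warnings

try:
    # These names only exist in newer Transformers releases; exact import paths may differ.
    from transformers.processing_utils import Unpack
    from transformers.models.llama.modeling_llama import KwargsForCausalLM
    HAS_KWARGS_FOR_CAUSAL_LM = True
except ImportError:
    HAS_KWARGS_FOR_CAUSAL_LM = False
    warnings.warn(
        "This Transformers version does not provide Unpack / KwargsForCausalLM; "
        "some keyword-argument forwarding will be skipped. Consider upgrading.",
        stacklevel=2,
    )
```
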
* Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding

* Typo

* Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache

* Update llama.py

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
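
The rotary-embedding commits above address lookups past the cached cos/sin tables when the KV cache is short. A minimal sketch of the general technique, using hypothetical names rather than the LlamaAttention / LongRopeRotaryEmbedding code in this PR:

```python
# Illustrative sketch only: grow the cached cos/sin tables when the requested
# positions exceed what was precomputed, instead of indexing past the cache end.
import torch


class SimpleRotaryEmbedding(torch.nn.Module):
    def __init__(self, dim: int, base: float = 10000.0, max_seq_len: int = 2048):
        super().__init__()
        self.dim, self.base = dim, base
        self._build_cache(max_seq_len)

    def _build_cache(self, seq_len: int, device=None, dtype=torch.float32):
        # Standard RoPE frequencies: one inverse frequency per pair of dimensions.
        inv_freq = 1.0 / (self.base ** (torch.arange(0, self.dim, 2, device=device).float() / self.dim))
        t = torch.arange(seq_len, device=device, dtype=torch.float32)
        freqs = torch.outer(t, inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.max_seq_len_cached = seq_len
        self.register_buffer("cos_cached", emb.cos().to(dtype), persistent=False)
        self.register_buffer("sin_cached", emb.sin().to(dtype), persistent=False)

    def forward(self, x: torch.Tensor, seq_len: int):
        # Key point of the fix: re-extend the cache rather than erroring out
        # when seq_len has outgrown the precomputed tables.
        if seq_len > self.max_seq_len_cached:
            self._build_cache(seq_len, device=x.device, dtype=x.dtype)
        return self.cos_cached[:seq_len].to(x.dtype), self.sin_cached[:seq_len].to(x.dtype)
```
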
danielhanchen and others added 29 commits November 5, 2024 14:43
* Fix: cast logits to float32 in cross_entropy_forward to prevent errors (#1254)

* Update cross_entropy_loss.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
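
The float32 cast above follows a standard pattern: do the log-softmax reduction in float32 even when the model emits float16/bfloat16 logits, so the loss does not overflow or lose precision. A minimal PyTorch sketch of the idea, not the fused Triton kernel in cross_entropy_loss.py:

```python
# Illustrative sketch only: upcast logits before the loss so the reduction
# runs in float32 regardless of the dtype the model produced.
import torch
import torch.nn.functional as F


def stable_cross_entropy(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    return F.cross_entropy(
        logits.float().reshape(-1, logits.shape[-1]),
        labels.reshape(-1),
        ignore_index=-100,  # conventional padding/label-mask value
    )
```
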
* Throw error when running inference longer than max_position_embeddings without rope scaling

* Update llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: root <root@ieeres.chu.cam.ac.uk>
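
A hedged sketch of the guard described in this commit, with illustrative names rather than the exact check added to llama.py:

```python
# Illustrative sketch only: refuse to run inference past the model's trained
# context length when no RoPE scaling is configured.
def check_sequence_length(config, seq_len: int) -> None:
    rope_scaling = getattr(config, "rope_scaling", None)
    max_len = getattr(config, "max_position_embeddings", None)
    if rope_scaling is None and max_len is not None and seq_len > max_len:
        raise ValueError(
            f"Requested sequence length {seq_len} exceeds max_position_embeddings "
            f"({max_len}) and no rope_scaling is set. Enable RoPE scaling or "
            f"shorten the input."
        )
```
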
danielhanchen merged commit 8d6d78f into main on Nov 7, 2024
1 check passed