[ORPO] fix orpo chosen-nll loss #2502

kashif · 2024-12-19T10:09:22Z

What does this PR do?

Calculate the ORPO chosen NLL loss with respect to the chosen completion only, rather than the whole prompt+completion.

Also return the shifted logits when the model is decoder-only.
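A minimal sketch of the idea, not the trainer's actual code: in a decoder-only model the logits at position t predict token t+1, so logits and labels are shifted by one before the cross-entropy, and prompt positions are excluded by setting their labels to an ignore index. The helper name `chosen_nll_loss`, the toy tensors, and the value -100 for the ignore index are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # assumed value of label_pad_token_id


def chosen_nll_loss(logits, labels):
    """Hypothetical helper: NLL over non-ignored labels, plus the shifted logits."""
    # Decoder-only shift: the logits at position t are scored against token t+1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    loss = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=IGNORE_INDEX,  # ignored positions do not contribute
    )
    return loss, shift_logits


torch.manual_seed(0)
vocab_size = 11
prompt_len = 3
input_ids = torch.tensor([[5, 6, 7, 8, 9]])  # 3 prompt tokens + 2 completion tokens
logits = torch.randn(1, input_ids.size(1), vocab_size)  # stand-in for model output

# Labels covering prompt+completion: every token is supervised.
labels_full = input_ids.clone()

# Labels covering the completion only: prompt positions are ignored.
labels_completion = input_ids.clone()
labels_completion[:, :prompt_len] = IGNORE_INDEX

loss_full, _ = chosen_nll_loss(logits, labels_full)
loss_completion, shifted_logits = chosen_nll_loss(logits, labels_completion)
print(loss_full.item(), loss_completion.item(), tuple(shifted_logits.shape))
```

With completion-only labels, only the two completion tokens enter the average, so the loss no longer depends on how well the model predicts the prompt.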

HuggingFaceDocBuilderDev · 2024-12-19T10:13:19Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2024-12-19T10:32:23Z

trl/trainer/orpo_trainer.py

-            attention_mask = concatenated_batch["concatenated_attention_mask"]
-            labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)
-
+        labels = concatenated_batch["concatenated_labels"].clone()


Yes, we checked this together: if you do

labels = concatenated_batch["concatenated_input_ids"].clone()
attention_mask = concatenated_batch["concatenated_attention_mask"]
labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)

you don't ignore the prompt.
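To make the point concrete, here is a small sketch with made-up tensors. The token ids, the padding id 0, and the value -100 for `label_pad_token_id` are assumptions. It also assumes that `concatenated_labels` already carries the ignore index on prompt positions, which is what the comment above implies.

```python
import torch

LABEL_PAD_TOKEN_ID = -100  # assumed value of self.label_pad_token_id

# Toy sequence: 3 prompt tokens, 2 completion tokens, 2 padding tokens (id 0).
concatenated_batch = {
    "concatenated_input_ids": torch.tensor([[11, 12, 13, 21, 22, 0, 0]]),
    "concatenated_attention_mask": torch.tensor([[1, 1, 1, 1, 1, 0, 0]]),
    # Assumed: the prompt and the padding are already masked in the labels.
    "concatenated_labels": torch.tensor([[-100, -100, -100, 21, 22, -100, -100]]),
}

# Old approach: build labels from input_ids and mask only where attention is 0.
labels_old = concatenated_batch["concatenated_input_ids"].clone()
attention_mask = concatenated_batch["concatenated_attention_mask"]
labels_old = torch.where(attention_mask == 1, labels_old, LABEL_PAD_TOKEN_ID)

# New approach: reuse the labels that already ignore the prompt.
labels_new = concatenated_batch["concatenated_labels"].clone()

supervised_old = int((labels_old != LABEL_PAD_TOKEN_ID).sum())
supervised_new = int((labels_new != LABEL_PAD_TOKEN_ID).sum())
print(labels_old.tolist(), supervised_old)
print(labels_new.tolist(), supervised_new)
```

The attention mask only distinguishes real tokens from padding, so the old labels still supervise the three prompt tokens, while the new labels supervise the two completion tokens only.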

fix orpo chosen-nll loss

495bcac

kashif requested a review from qgallouedec December 19, 2024 10:09

kashif mentioned this pull request Dec 19, 2024

[Liger] add native liger-kernel orpo loss #2482

Open

qgallouedec reviewed Dec 19, 2024

qgallouedec approved these changes Dec 19, 2024

kashif merged commit 88ad1a0 into main Dec 19, 2024
14 checks passed

kashif deleted the orpo-nll-fix branch December 19, 2024 10:33
