Fix Preference Loss and Refactor for Readability #484

austin362667 · 2024-12-17T03:10:42Z

Summary

Thanks to @winglian and @shivam15s, who noticed and fixed this in #481.

This PR suggests negating the preference loss terms to align with the formulas in the docstrings, while maintaining the base preference structure as nll_loss + preference_loss. This would make our loss computations more consistent since both terms would represent losses to be minimized.
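As a rough illustration of why the negation matters: log-sigmoid is never positive, so the un-negated term is a quantity to maximize, while the negated term is a non-negative loss that shrinks as the chosen response is preferred more strongly. The sketch below uses plain Python floats instead of tensors; `log_sigmoid` and `preference_loss` are hypothetical helper names, and the margins are made-up inputs, not values from this repository.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)); the result is always <= 0."""
    # log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|))
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def preference_loss(logits: float, beta: float = 0.1, label_smoothing: float = 0.0) -> float:
    """Negated log-sigmoid preference term, so that lower is better.

    Hypothetical scalar stand-in for the tensor expression in the diff.
    """
    return -(
        log_sigmoid(beta * logits) * (1 - label_smoothing)
        + log_sigmoid(-beta * logits) * label_smoothing
    )


# Made-up chosen-minus-rejected log-probability margins.
for margin in (-5.0, 0.0, 5.0):
    print(f"margin={margin:+.1f}  loss={preference_loss(margin, beta=1.0):.4f}")
```

With the negation in place, both nll_loss and the preference term are non-negative quantities to minimize, so their sum can no longer be pushed negative by the preference term.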

[UPDATE: This appears to be addressed now in here]
This PR also tightens the tolerance to guard against similar issues in the future.

Testing Done

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

shivam15s · 2024-12-17T03:40:39Z

test/chunked_loss/test_cpo_loss.py

+            losses = -(
                F.logsigmoid(self.beta * logits) * (1 - self.label_smoothing)
                + F.logsigmoid(-self.beta * logits) * self.label_smoothing
            )
        elif self.loss_type == "simpo":
            logits = logits - (self.simpo_gamma / self.beta)
-            losses = (
+            losses = -(
                F.logsigmoid(self.beta * logits) * (1 - self.label_smoothing)
                + F.logsigmoid(-self.beta * logits) * self.label_smoothing
            )


nit: can we move the - sign inside the brackets, similar to
https://github.com/huggingface/trl/blob/0fe73a8af5ff660becc79bff88d9e8b090dd004f/trl/trainer/dpo_trainer.py#L949
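For reference, a minimal sketch of the two spellings, assuming the nit means distributing the negation over each term rather than wrapping the whole sum. It uses plain floats instead of tensors, and the function names are hypothetical; the two forms are algebraically identical, so the choice is purely about readability.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def loss_outer_negation(logits: float, beta: float, label_smoothing: float) -> float:
    """Negation applied once, outside the brackets (as in the diff above)."""
    return -(
        log_sigmoid(beta * logits) * (1 - label_smoothing)
        + log_sigmoid(-beta * logits) * label_smoothing
    )


def loss_inner_negation(logits: float, beta: float, label_smoothing: float) -> float:
    """Negation applied to each term, inside the brackets."""
    return (
        -log_sigmoid(beta * logits) * (1 - label_smoothing)
        - log_sigmoid(-beta * logits) * label_smoothing
    )


# Made-up inputs: both spellings should agree up to floating-point rounding.
for logits in (-3.0, 0.0, 1.5):
    a = loss_outer_negation(logits, beta=0.1, label_smoothing=0.1)
    b = loss_inner_negation(logits, beta=0.1, label_smoothing=0.1)
    print(logits, a, b, math.isclose(a, b))
```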

Sure! Thank you for reviewing.

Signed-off-by: Austin Liu <austin362667@gmail.com>

shivam15s requested changes Dec 19, 2024

winglian and others added 3 commits December 20, 2024 10:10

preference loss sign is inverted and leads to negative loss

0fe5f6c

fix test sign too

60f85bb

Fix readability

9e7d497

Signed-off-by: Austin Liu <austin362667@gmail.com>

austin362667 force-pushed the preference-loss-sign branch from 526bf4e to a12c2f1 Compare December 20, 2024 02:14

austin362667 added 2 commits December 20, 2024 13:57

Format

fac222d

Signed-off-by: Austin Liu <austin362667@gmail.com>

Fix codestyle

ce74888

Signed-off-by: Austin Liu <austin362667@gmail.com>

austin362667 force-pushed the preference-loss-sign branch from a12c2f1 to ce74888 Compare December 20, 2024 06:01

Merge branch 'main' into preference-loss-sign

ecb2b3e

yundai424 approved these changes Dec 20, 2024

shivam15s approved these changes Dec 20, 2024

shivam15s merged commit 15a2f58 into main Dec 20, 2024
5 checks passed

shivam15s deleted the preference-loss-sign branch December 20, 2024 23:17
