feat: anchored pref optimization #1928
Nice, thanks. I'll take the opportunity to update the documentation for the losses we already support (in another PR that I'd like to merge first).
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@karel-contextual can you kindly run:
in the root dir of TRL to fix up the formatting?
LGTM now, thanks @karel-contextual!
Add the APO objectives, specifically equations 7 and 8 of the APO paper (https://huggingface.co/papers/2408.06266).
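
For readers who want a feel for the two objectives, here is a minimal scalar sketch in plain Python. It is based on my reading of equations 7 and 8 and is not the code from this PR: the function names, the `beta` default, and the use of per-example log-ratios (policy log-prob minus reference log-prob) as inputs are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function (fine for the moderate inputs used in this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def apo_zero_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.1) -> float:
    """Hypothetical helper: APO-zero as I read eq. 7.

    Pushes the chosen log-ratio up and the rejected log-ratio down.
    """
    loss_chosen = 1.0 - sigmoid(beta * chosen_logratio)
    loss_rejected = sigmoid(beta * rejected_logratio)
    return loss_chosen + loss_rejected


def apo_down_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.1) -> float:
    """Hypothetical helper: APO-down as I read eq. 8.

    Pushes the chosen log-ratio down, and the rejected log-ratio down further.
    """
    loss_chosen = sigmoid(beta * chosen_logratio)
    loss_rejected = 1.0 - sigmoid(beta * (chosen_logratio - rejected_logratio))
    return loss_chosen + loss_rejected


if __name__ == "__main__":
    # Log-ratio = log pi_theta(y|x) - log pi_ref(y|x) for one example.
    print(apo_zero_loss(chosen_logratio=5.0, rejected_logratio=-5.0))
    print(apo_down_loss(chosen_logratio=-5.0, rejected_logratio=-10.0))
```

Under these assumptions, both losses equal 1.0 when the policy matches the reference (all log-ratios zero), and each decreases when the policy moves in the direction its variant rewards.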