
Let DPOTrainer Support padding_free #2422

Open
fzyzcjy opened this issue Dec 1, 2024 · 10 comments · May be fixed by #2437
Labels
🏋 DPO Related to DPO ✨ enhancement New feature or request 🧒 good second issue Good for contributors with basic project familiarity 🙋 help from community wanted Open invitation for community members to contribute

Comments

@fzyzcjy
Contributor

fzyzcjy commented Dec 1, 2024

Feature request

Hi, thanks for the library! It seems that https://huggingface.co/blog/packing-with-FA2 introduces a way to avoid many pad tokens in SFT, which makes training faster. It would therefore be great if the same technique could be used for DPO.

Motivation

(see above)

Your contribution

n/a

@qgallouedec
Member

Thanks @fzyzcjy! Can you elaborate a bit? What is this padding-free method?

@fzyzcjy
Contributor Author

fzyzcjy commented Dec 1, 2024

Oh sorry, I provided the wrong link; it is now updated to point to the correct "padding_free" article.

@qgallouedec
Member

Thanks for the pointer. This would be a nice addition! Any contribution is welcome. I'll mark this one as a good second issue.

@qgallouedec qgallouedec added ✨ enhancement New feature or request 🙋 help from community wanted Open invitation for community members to contribute 🧒 good second issue Good for contributors with basic project familiarity 🏋 DPO Related to DPO labels Dec 1, 2024
@qgallouedec
Member

The guideline is basically to:

  1. Update PreferenceCollator to add a padding_free option, as in add arg padding_free to DataCollatorForCompletionOnlyLM #1887
  2. Update concatenated_inputs to (a) make xxx_attention_mask optional and (b) add support for xxx_position_ids
  3. Add a test
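A minimal sketch of what steps 1 and 2 could amount to, assuming a hypothetical collate function (the actual PreferenceCollator fields and signatures in TRL may differ): instead of padding each sequence to the batch max length, all tokens are concatenated into a single packed row, and position_ids that restart at 0 for every sequence let FlashAttention-2 infer the sequence boundaries.

```python
# Hypothetical padding_free collation sketch (not the actual TRL API).
# Each example holds already-tokenized prompt and completion ids.

def collate_padding_free(examples):
    """Pack a batch into one row with per-sequence position_ids, no padding."""
    input_ids, position_ids = [], []
    for example in examples:
        seq = example["prompt_input_ids"] + example["completion_input_ids"]
        input_ids.extend(seq)
        # Positions restart at 0 for each sequence, marking its boundary.
        position_ids.extend(range(len(seq)))
    # Shape (1, total_tokens): a single packed row; no attention_mask needed.
    return {"input_ids": [input_ids], "position_ids": [position_ids]}
```

In a real implementation these lists would be torch tensors and the collator would also pack the chosen/rejected labels, but the boundary-via-position_ids idea is the same.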

@fzyzcjy
Contributor Author

fzyzcjy commented Dec 1, 2024

Thank you!

@dame-cell

  • Should padding_free in PreferenceCollator be an optional argument, or should it be enabled by default?

But why make `xxx_attention_mask` optional? Is it because padding-free sequences might not use attention masks at all?
For example:

  • In regular training with padding, attention_mask is needed to tell the model which tokens are real and which are padding (1s for real tokens, 0s for padding)
  • In padding-free training, since all padding tokens are removed, every token is a real token, so no explicit mask is needed to distinguish real tokens from padding

Does this make sense?
Thank you for your patience. I wanted to verify that I understand these concepts correctly.
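The contrast in the two bullets above can be sketched with toy helpers (hypothetical names, not TRL APIs): the padded path must emit an attention_mask with 0s over pad positions, while the packed path drops the mask entirely and encodes sequence boundaries via restarting position_ids.

```python
# Toy contrast between padded and padding-free batching (hypothetical helpers).

PAD = 0

def pad_batch(seqs):
    """Pad to max length; attention_mask marks real (1) vs pad (0) tokens."""
    max_len = max(len(s) for s in seqs)
    input_ids = [s + [PAD] * (max_len - len(s)) for s in seqs]
    attention_mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

def pack_batch(seqs):
    """Concatenate all tokens; boundaries are implied by restarting positions."""
    flat = [tok for s in seqs for tok in s]
    pos = [p for s in seqs for p in range(len(s))]
    return {"input_ids": [flat], "position_ids": [pos]}  # no attention_mask
```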

@qgallouedec
Member

I think it makes sense yes.

@dame-cell

@fzyzcjy @qgallouedec If no one is working on this, I would like to help.

@fzyzcjy
Contributor Author

fzyzcjy commented Dec 1, 2024

@dame-cell I don't have time for this at the moment, so a PR from you would be great. Many thanks!

@zwhe99

zwhe99 commented Dec 2, 2024

Is it possible for PPO to support padding_free?

@dame-cell dame-cell linked a pull request Dec 4, 2024 that will close this issue
@qgallouedec qgallouedec linked a pull request Dec 13, 2024 that will close this issue