Tool fine-tuning support DPO #2479
Before adding it to all the trainers, what do you think of the overall structure? Is it okay to include the tools in each trainer configuration?
Thanks for this addition! Let's keep things as separate as possible, and keep this PR for DPO only. The code as is looks good to me. The only question is: can this type ( |
That's why I thought:
from trl import DPOConfig, TrlParser
parser = TrlParser((DPOConfig,))
parser.parse_args_and_config()
I'm not sure what the best way to handle it is right now; I'll sleep on it.
A different PR for each trainer, then?
Adding tools to the CLI would be quite complicated; it wouldn't be practical to pass every tool through the CLI. My best guess is to read the functions from another source, such as a separate script, if there's a request for it later.
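One way the "read the functions from another script" idea could look is sketched below. This is only an illustration, not something the PR implements: `load_tools_from_script` is a hypothetical helper name, and the rule of treating every public function defined in the script as a tool is an assumption.

```python
import importlib.util
import inspect
from pathlib import Path
from typing import Callable


def load_tools_from_script(path: str) -> list[Callable]:
    """Hypothetical helper: import a script by path and collect its public functions."""
    script = Path(path)
    spec = importlib.util.spec_from_file_location(script.stem, script)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load tools from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Keep only functions defined in the script itself (not ones it imported),
    # and skip private helpers whose names start with an underscore.
    return [
        fn
        for name, fn in inspect.getmembers(module, inspect.isfunction)
        if fn.__module__ == module.__name__ and not name.startswith("_")
    ]
```

With something like this, a CLI would only need a single path argument pointing at the script, and the resulting list of callables could then be handed to the trainer configuration.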
Does this need anything else? Tests or docs?
I also wanted to add it to |
What does this PR do?
Adds tool support for function-calling models to the DPOTrainer.
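For readers unfamiliar with what a "tool" looks like here, the sketch below shows one in the two forms that the Transformers chat-template API accepts: a Python function with type hints and a Google-style docstring, or an equivalent JSON-schema dict. The `get_weather` function and its schema are made up for illustration, and whether this PR forwards tools to the chat template in exactly this way is an assumption.

```python
def get_weather(city: str, unit: str = "celsius") -> str:
    """
    Return a short weather report for a city.

    Args:
        city: Name of the city to look up.
        unit: Temperature unit, either "celsius" or "fahrenheit".
    """
    # Placeholder body: a real tool would query a weather service here.
    return f"22 degrees {unit} in {city}"


# The same (hypothetical) tool written out by hand as a JSON-schema style dict.
get_weather_schema = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return a short weather report for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Name of the city to look up.",
                },
                "unit": {
                    "type": "string",
                    "description": 'Temperature unit, either "celsius" or "fahrenheit".',
                },
            },
            "required": ["city"],
        },
    },
}

# A list like this is what would be placed in the trainer configuration
# (assumed from the discussion above, not confirmed by the source).
tools = [get_weather]
```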
Who can review?
@qgallouedec