enable multiple eval datasets #1052
@younesbelkada |
I have the same problem. When multiple eval datasets are passed as a dict, it causes an error in dataset.map (trl/trl/trainer/sft_trainer.py, line 379 in a60ceef).
I think this PR is really useful. |
Yes, exactly this. And this error occurs naturally both when using packing and when not. |
Looks good to me but I'll let @younesbelkada have a look as well!
If you could add a test for this that would be awesome! |
@lvwerra |
@lvwerra |
@lvwerra @younesbelkada |
Looks great to me, thanks for this contribution!
@peter-sk the tests are taking forever. I think something went wrong with the test you designed; can you please have a quick look? |
@younesbelkada |
Thanks again!
* enable multiple eval datasets
* added test
* try to avoid infinite computation
* make sure eval set is not infinite
* downsizing the test
The standard Trainer class from the transformers library (and the documentation of SFTTrainer) allows multiple validation datasets to be passed as a dictionary mapping dataset names to datasets.
However, this does not work in the current SFTTrainer code, which fails when it tries to preprocess the dictionary as if it were a single dataset. This PR fixes that.
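To illustrate the idea, here is a minimal sketch of dispatching on the type of the eval set, so that a dict of datasets is preprocessed entry by entry. This is not the PR's actual diff: `prepare_eval_datasets` and `toy_prepare` are hypothetical names, and plain lists of strings stand in for real dataset objects.

```python
from typing import Callable, Dict, Optional, TypeVar, Union

D = TypeVar("D")


def prepare_eval_datasets(
    eval_dataset: Optional[Union[D, Dict[str, D]]],
    prepare_fn: Callable[[D], D],
) -> Optional[Union[D, Dict[str, D]]]:
    """Apply prepare_fn to one eval dataset, or to each dataset in a dict.

    Hypothetical helper for illustration; it is not part of trl.
    """
    if eval_dataset is None:
        return None
    if isinstance(eval_dataset, dict):
        # Preprocess every named dataset separately and keep the names.
        return {name: prepare_fn(ds) for name, ds in eval_dataset.items()}
    # A single dataset goes through the same preprocessing path as before.
    return prepare_fn(eval_dataset)


def toy_prepare(rows):
    # Stand-in for the real tokenization / packing step (assumption).
    return [text.lower() for text in rows]


single = prepare_eval_datasets(["Hello"], toy_prepare)
multiple = prepare_eval_datasets({"wiki": ["A B"], "code": ["X"]}, toy_prepare)
```

With this shape, a caller can pass either a single dataset or a name-to-dataset dict, and the preprocessing step never receives the dict itself.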