clean up dataset conversion readme #168
The finetuning stuff looks good to me, so I'm approving that part. But, as mentioned in a comment, I have caught some QOL issues with the preprocessing function that I think need to be addressed in a separate PR. That will require me to edit the finetuning section, but what you added here is a very useful starting point for that.
Could you add a link to streaming in the top-level description?
* clean up dataset conversion readme
* Update scripts/data_prep/README.md
  Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* Update scripts/data_prep/README.md
  Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* addresses feedback on PR
* add links to relevant preprocessing functions
* add link to streaming

---------

Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
Cleans up existing readme and adds finetuning dataset example