-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update dataset_formats.mdx #2222
Update dataset_formats.mdx #2222
Conversation
docs/source/dataset_formats.mdx
Outdated
@@ -209,6 +209,7 @@ Choosing the right dataset format depends on the task you are working on and the | |||
| [`RewardTrainer`] | [Preference (implicit prompt recommended)](#preference) | | |||
| [`SFTTrainer`] | [Language modeling](#language-modeling) | | |||
| [`XPOTrainer`] | [Prompt-only](#prompt-only) | | |||
| [`GKDTrainer`] | [Conversational](#conversational-dataset-format) | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conversational is a type, not a format. I think GKD needs prompt-only but you'll need to double check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also sort it alphabeticaly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no GKD needs prompts as well as full conversations in the "messages" key, the prompts are typically the messages without the very last assistant reply... depending on the mode... we need the prompts to figure out labels for the completion in the full message and to generate from the teacher/student for the online case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think GKD needs prompt-only but you'll need to double check
no GKD needs prompts as well as full conversations in the "messages" key,
in GKDTrainer docs the expected data format is described as:
The dataset should be formatted as a list of “messages” where each message is a list of dictionaries with the following keys:
- role: either system, assistant or user
- content: the message content
like:
messages = [
{"role": "user", "content": "Hello, how are you?"},
{"role": "assistant", "content": "I'm doing great. How can I help you today?"},
{"role": "user", "content": "I'd like to show off how chat templating works!"},
]
which is why I wrote it as Conversational
is the GKDTrainer docs incorrect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no its good! that is fine and if there is no "prompt" then it makes it fro the messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is why I wrote it as Conversational
"conversational" is a type not a format, please refer to the doc.
Thanks for the clarification @kashif. From what I understand, the correct format is then "prompt-completion".
@kashif, to be 100% correct, we will have to modify GKDTrainer so that it concats prompt and completion into a new column "messages"
. Ideally we want to have
prompt = [{"content": "Why do stars last so long?", "role": "user"}]
completion = [{"content": "Stars last a long time due to the process of nuclear fusion that occurs []...]", "role": "assistant"}]
instead of
prompt = [{"content": "Why do stars last so long?", "role": "user"}]
messages = [
{"content": "Why do stars last so long?", "role": "user"},
{"content": "Stars last a long time due to the process of nuclear fusion that occurs []...]", "role": "assistant"},
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the GKDTrainer docs incorrect?
it's not incorrect but it has to be updated. Don't worry about that, we will make this clarification in a follow-up PR
When you write |
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Lgtm thanks @August-murr! |
What does this PR do?
Fixes #2219
The trainers on TRL docs from the website have links attached, but the markdown file in the repo didn't contain any of the links. So, I wasn't sure If I should add the
GKDTrainer
docs link to the table, please let me know if I need to add it to this PR.Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines.
Who can review?
@qgallouedec