Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dataset_formats.mdx #2222

Merged
merged 5 commits into from
Oct 11, 2024

Conversation

August-murr
Copy link
Collaborator

What does this PR do?

Fixes #2219
The trainers on TRL docs from the website have links attached, but the markdown file in the repo didn't contain any of the links. So, I wasn't sure If I should add the GKDTrainer docs link to the table, please let me know if I need to add it to this PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines.
  • Did you write any new necessary tests?

Who can review?

@qgallouedec

@@ -209,6 +209,7 @@ Choosing the right dataset format depends on the task you are working on and the
| [`RewardTrainer`] | [Preference (implicit prompt recommended)](#preference) |
| [`SFTTrainer`] | [Language modeling](#language-modeling) |
| [`XPOTrainer`] | [Prompt-only](#prompt-only) |
| [`GKDTrainer`] | [Conversational](#conversational-dataset-format) | |
Copy link
Member

@qgallouedec qgallouedec Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conversational is a type, not a format. I think GKD needs prompt-only but you'll need to double check

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also sort it alphabeticaly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no GKD needs prompts as well as full conversations in the "messages" key, the prompts are typically the messages without the very last assistant reply... depending on the mode... we need the prompts to figure out labels for the completion in the full message and to generate from the teacher/student for the online case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think GKD needs prompt-only but you'll need to double check

no GKD needs prompts as well as full conversations in the "messages" key,

in GKDTrainer docs the expected data format is described as:
The dataset should be formatted as a list of “messages” where each message is a list of dictionaries with the following keys:

  • role: either system, assistant or user
  • content: the message content

like:

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

which is why I wrote it as Conversational

is the GKDTrainer docs incorrect?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no its good! that is fine and if there is no "prompt" then it makes it fro the messages

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is why I wrote it as Conversational

"conversational" is a type not a format, please refer to the doc.

Thanks for the clarification @kashif. From what I understand, the correct format is then "prompt-completion".
@kashif, to be 100% correct, we will have to modify GKDTrainer so that it concats prompt and completion into a new column "messages". Ideally we want to have

prompt = [{"content": "Why do stars last so long?", "role": "user"}]
completion = [{"content": "Stars last a long time due to the process of nuclear fusion that occurs []...]", "role": "assistant"}]

instead of

prompt = [{"content": "Why do stars last so long?", "role": "user"}]
messages = [
    {"content": "Why do stars last so long?", "role": "user"},
    {"content": "Stars last a long time due to the process of nuclear fusion that occurs []...]", "role": "assistant"},
]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the GKDTrainer docs incorrect?

it's not incorrect but it has to be updated. Don't worry about that, we will make this clarification in a follow-up PR

@qgallouedec
Copy link
Member

qgallouedec commented Oct 11, 2024

The trainers on TRL docs from the website have links attached, but the markdown file in the repo didn't contain any of the links. So, I wasn't sure If I should add the GKDTrainer docs link to the table, please let me know if I need to add it to this PR.

When you write [`GKDTrainer`], the link is automatically created, no need to add it

August-murr and others added 3 commits October 11, 2024 20:45
@qgallouedec
Copy link
Member

Lgtm thanks @August-murr!

@qgallouedec qgallouedec merged commit b81a612 into huggingface:main Oct 11, 2024
@August-murr August-murr deleted the update-datasets_format-docs branch December 10, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GKD trainer missing from docs/dataset formats
3 participants