
[Question] add_generation_prompt=True on prompt #2346

Closed
Galaxy-Husky opened this issue Nov 11, 2024 · 5 comments

Comments

@Galaxy-Husky (Contributor)

Hi,

I noticed that starting from v0.11.0, maybe_apply_chat_template adds a generation prompt to the example's prompt, which differs from previous versions.

trl/trl/data_utils.py

Lines 86 to 88 in 0238d96

# Apply the chat template to the prompt, adding the generation prompt
if "prompt" in example:
    prompt = tokenizer.apply_chat_template(example["prompt"], tokenize=False, add_generation_prompt=True)

Is this a small bug? Will the generation prompt influence the final result?

A prompt reply would be appreciated.
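For context, here is a toy sketch (not TRL code) of what add_generation_prompt=True does for a ChatML-style chat template: it appends the opening header of an assistant turn, so that the model's generation continues as the assistant's reply. The render_chatml helper below is hypothetical and only mimics the real tokenizer.apply_chat_template behavior.

```python
# Illustrative sketch only: a toy renderer standing in for a real tokenizer's
# chat template, to show the effect of add_generation_prompt.
def render_chatml(messages, add_generation_prompt=False):
    text = ""
    for m in messages:
        # Render each message as a ChatML-style turn.
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # The "generation prompt" is the opening header of the assistant turn;
        # the model is then expected to continue with the assistant's content.
        text += "<|im_start|>assistant\n"
    return text

prompt = [{"role": "user", "content": "Hi!"}]
print(render_chatml(prompt, add_generation_prompt=True))
```

Without the flag, the rendered prompt ends at the user's closing tag; with it, the prompt ends mid-assistant-turn, which is what you want when the completion is the assistant's reply.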

@qgallouedec (Member)

Can you point to the "previous version" you are referring to?

@qgallouedec (Member)

I think it has been like this from the initial implementation (see #2020)

@Galaxy-Husky (Contributor, Author)

> I think it has been like this from the initial implementation (see #2020)

Sorry, I didn't say that right. I meant that before v0.11.0 there was no maybe_apply_chat_template. For example, the DPO dataset was preprocessed like this:

trl/examples/scripts/dpo.py

Lines 156 to 160 in 55cc4b1

def process(row):
    row["prompt"] = tokenizer.apply_chat_template(row["chosen"][:-1], tokenize=False)
    row["chosen"] = tokenizer.apply_chat_template([row["chosen"][-1]], tokenize=False)
    row["rejected"] = tokenizer.apply_chat_template([row["rejected"][-1]], tokenize=False)
    return row

Since the code has been refactored, I'm not sure whether there was a generation prompt or not. If there was, could you please point out where it was implemented?

@qgallouedec (Member)

Yes, the example code was wrong; you need to add a generation prompt at the end of the prompt.
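A sketch of the corrected preprocessing, assuming a tokenizer whose apply_chat_template accepts add_generation_prompt (as in transformers). The process(row, tokenizer) signature here is a hypothetical variant of the example above, which captured the tokenizer from the enclosing scope:

```python
# Hypothetical corrected version of the old dpo.py example: the prompt is
# rendered WITH a generation prompt, while the chosen/rejected completions
# are rendered without one (they continue the assistant turn the prompt opens).
def process(row, tokenizer):
    row["prompt"] = tokenizer.apply_chat_template(
        row["chosen"][:-1], tokenize=False, add_generation_prompt=True
    )
    row["chosen"] = tokenizer.apply_chat_template([row["chosen"][-1]], tokenize=False)
    row["rejected"] = tokenizer.apply_chat_template([row["rejected"][-1]], tokenize=False)
    return row
```

This matches what maybe_apply_chat_template does in v0.11.0+: the prompt ends with the assistant header so that prompt + completion concatenates into one well-formed conversation.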

@Galaxy-Husky (Contributor, Author)

> Yes the example code was wrong, you need to add a generation prompt at the end of the prompt.

I see. Thanks a lot!
