-
Notifications
You must be signed in to change notification settings - Fork 27k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow apply_chat_template to pass kwargs to the template and support a dict of templates #29658
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
tokenizer_kwargs: Optional[Dict[str, Any]] = None, | ||
**kwargs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a breaking change? 👀
Previously passed "tokenizer kwargs" will now be passed to "kwargs"
template_dict = self.chat_template or self.default_chat_template | ||
if chat_template is not None and chat_template in template_dict: | ||
# The user can pass the name of a template to the chat template argument instead of an entire template | ||
chat_template = template_dict[chat_template] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, isn't chat_template
the actual chat template that's passed as input?
chat_template (str, *optional*): A Jinja template to use for this conversion.
It's being used as the key in the template_dict
, but what if the user had passed a chat_template
to be used as a template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like if self.chat_template
is a dict
, when passing in chat_template
it is always considered a key when it could be a template. If self.chat_template
isn't a dict and self.default_chat_template
isn't either, then chat_template
is now considered a chat template
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit hacky, but chat_template
is only used as a key if it exists as a key in the template_dict. If the user passes an actual Jinja template, that almost certainly will not exist as a key, and so chat_template
is treated as a template string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, this looks good to me! Thanks @Rocketknight1
This'll end up changing some templates on the Hub so let's let others that depend on it know.
…a dict of templates (#29658) * Allow apply_chat_template to pass kwargs to the template * Fix priority for template_kwargs * Fix docstring * style fix * Add the option for the model to have a dict of templates * Error message cleanup * Add test for chat template dicts * Simplify the chat template dict test and apply it to all tokenizers in self.get_tokenizers() * Save chat template dicts as lists with fixed key names * Add test for serialization/reloading * Add require_jinja just to be safe, even though I don't think we use it
As the title suggests, this is a simple PR that allows two new features:
kwargs
can be passed throughapply_chat_template
to the template renderer.apply_chat_template
(this is used by the new Command-R model)This PR is very slightly breaking (we used to pass kwargs to the tokenizer), but I don't think this was commonly used, and I think the new usage is more intuitive. Users can still pass kwargs through the method to the tokenizer with the
tokenizer_kwargs
argument.Other than that, it should have no effect on existing users/templates!