
Use jinja template for chat formatting (#730) #744

Merged

6 commits merged from feature/730_jinja_template into main on Apr 4, 2024

Conversation

@nsarrazin (Collaborator) commented Jan 26, 2024

  • Upgrade transformers version and fix the types in transformersjs.ts

  • Got rid of the legacy parameters userMessageToken, userMessageEndToken, assistantMessageToken, and assistantMessageEndToken. This is a breaking change, but we've been pushing chatPromptTemplate for a while, so I feel it makes sense.

We support both chatPromptTemplate and simply specifying the tokenizer. Priority goes chatPromptTemplate > tokenizer (the more specific option wins).

Did not change the templates in prod yet because they don't work well with system prompts.

(#730)
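For illustration, here is roughly what a model entry with both options could look like. This is only a sketch: the model name is hypothetical, the template is a toy one rather than anything shipped in prod, and the MODELS entry in .env.local is assumed to follow the usual chat-ui config shape. When both fields are present, the chatPromptTemplate drives prompt building and the tokenizer is still loaded for things like token counting.

MODELS=`[
  {
    "name": "example-org/example-7b-it",
    "tokenizer": "example-org/example-7b-it",
    "chatPromptTemplate": "{{preprompt}}{{#each messages}}{{#ifUser}}User: {{content}}\n{{/ifUser}}{{#ifAssistant}}Assistant: {{content}}\n{{/ifAssistant}}{{/each}}Assistant: "
  }
]`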

@nsarrazin nsarrazin added the documentation (Improvements or additions to documentation), enhancement (New feature or request), and back (This issue is related to the Svelte backend or the DB) labels Jan 26, 2024
@nsarrazin nsarrazin marked this pull request as draft January 26, 2024 16:46
@nsarrazin nsarrazin marked this pull request as ready for review April 3, 2024 13:33
@nsarrazin nsarrazin requested a review from mishig25 April 4, 2024 07:11
@nsarrazin (Collaborator, Author)

@mishig25 this should be good to go! I'm not using it in the prod config because most of the templates fail, but I tested locally with some other models and it should work well.

async function getChatPromptRender(
	m: z.infer<typeof modelConfig>
): Promise<ReturnType<typeof compileTemplate<ChatTemplateInput>>> {
	if (m.chatPromptTemplate) {
@mishig25 (Collaborator)

Shouldn't the order of the fallback be the other way around? First try to use transformers.js, then try to use chatPromptTemplate.

@nsarrazin (Collaborator, Author)

The downside of doing it that way is that in prod, for example, we specify the tokenizer for token counting but we actually want to override the chat template with our own template.

I think in terms of specificity it makes sense to have custom chat template > tokenizer "default" template, wdyt?

@mishig25 (Collaborator)

Yep, sounds good. If we want to change the order, let's do it in a follow-up PR; we would also need to handle #744 (comment) first.

@nsarrazin (Collaborator, Author)

Yes, agreed. Overall we should push towards using tokenizers rather than chat prompt templates, but indeed we will need to handle the edge cases first 😁
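To make the agreed-on precedence concrete, here is a simplified sketch of the idea (not the actual models.ts code; it assumes "@xenova/transformers" as the transformers.js package at the time of this PR, and uses plain Handlebars where the real compileTemplate also registers custom helpers such as ifUser/ifAssistant):

import { AutoTokenizer } from "@xenova/transformers";
import Handlebars from "handlebars";

type Message = { role: "system" | "user" | "assistant"; content: string };

interface ModelConfig {
	name: string;
	chatPromptTemplate?: string;
	tokenizer?: string;
	preprompt?: string;
}

async function getPromptRenderer(m: ModelConfig): Promise<(messages: Message[]) => string> {
	if (m.chatPromptTemplate) {
		// Most specific option wins: even when a tokenizer is also configured
		// (e.g. for token counting), the explicit template drives prompt building.
		const render = Handlebars.compile(m.chatPromptTemplate);
		return (messages) => render({ messages, preprompt: m.preprompt });
	}
	if (m.tokenizer) {
		// Otherwise fall back to the chat_template shipped in tokenizer_config.json on the Hub.
		const tokenizer = await AutoTokenizer.from_pretrained(m.tokenizer);
		return (messages) =>
			tokenizer.apply_chat_template(messages, {
				tokenize: false,
				add_generation_prompt: true,
			}) as string;
	}
	throw new Error(`No tokenizer or chat prompt template specified for model ${m.name}`);
}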

];
}

const output = tokenizer.apply_chat_template(formattedMessages, {
@mishig25 (Collaborator)

Right now, the "system" role would cause an error with templates that don't natively support "system", right?

@nsarrazin (Collaborator, Author)

Yep. I tried looking into it, but because the error messages are hardcoded in the jinja template, it's not obvious how to determine whether the template is failing because of a lack of system prompt support or for some other reason.

I guess you could retry without the "system" message and see if the prompt builds then? We could do it in a second PR if that sounds good to you.
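A rough sketch of that retry idea (not implemented in this PR, just to make the suggestion concrete; the PreTrainedTokenizer import assumes the "@xenova/transformers" typings of the time):

import type { PreTrainedTokenizer } from "@xenova/transformers";

type ChatMessage = { role: string; content: string };

function applyTemplateWithSystemFallback(
	tokenizer: PreTrainedTokenizer,
	messages: ChatMessage[]
): string {
	const opts = { tokenize: false, add_generation_prompt: true };
	try {
		return tokenizer.apply_chat_template(messages, opts) as string;
	} catch {
		// The error text is hardcoded in each jinja template, so we can't reliably tell a
		// "no system role supported" failure apart from anything else; just retry without it.
		const system = messages.find((msg) => msg.role === "system");
		const rest = messages.filter((msg) => msg.role !== "system");
		if (system && rest[0]?.role === "user") {
			// Fold the system prompt into the first user turn so its content isn't lost.
			rest[0] = { ...rest[0], content: `${system.content}\n\n${rest[0].content}` };
		}
		return tokenizer.apply_chat_template(rest, opts) as string;
	}
}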

@mishig25 mishig25 (Collaborator) left a comment

lgtm !

@nsarrazin nsarrazin merged commit 0819256 into main Apr 4, 2024
3 checks passed
@nsarrazin nsarrazin deleted the feature/730_jinja_template branch April 4, 2024 10:38
@flexchar (Contributor) commented Apr 7, 2024

So I pulled the latest changes and was met with the exciting greeting "No tokenizer specified and no chat prompt template specified for model gpt-3.5-turbo-0125". Not so nice :)

I debugged my way to this PR as the source of the error: https://github.com/huggingface/chat-ui/blame/a9c711026b13672328ce93abc3865f1823524b0d/src/lib/server/models.ts#L133

What is the migration path for this breaking change?

It would be nice to have breaking changes like this documented in a dedicated file. It would save us all a lot of time down the road. ✌️

@zacps (Contributor) commented Apr 7, 2024

I just ran into this as well; I'm not sure I understand why the default prompt template was removed?

It seems like keeping it would remove the need for a breaking change.

@nsarrazin (Collaborator, Author)

Hey, apologies for the breaking change; it's fixed in #985.

My reasoning was that we want people to use either a tokenizer or a chatPromptTemplate to ensure the chat is formatted correctly, but of course if you use an endpoint that relies on the chat completion API, you don't need a chat template 🤦 So you would get an error complaining that something was missing even though you didn't need it.

I've set a default that should be sane (ChatML), which should fix the issue. In the future I want to add a raw: true/false flag at the model level, so we can check whether we need a prompt template at all and use the right API routes.

Seems like most people are moving away from text completion APIs and moving towards chat completion anyway 👀
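For reference, ChatML wraps each turn in <|im_start|>/<|im_end|> markers, so a prompt rendered with that default looks roughly like the following (illustrative only; see #985 for the actual default template):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant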

@iChristGit

Hey @nsarrazin,
I've been following for a while!
I've had issues where some models don't respond correctly when generating the conversation "Title" or doing web search.
Does using the tokenizer instead fix the issue of chatPromptTemplate being hard to set up for local models that are not on the prompt list?
https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md

@nsarrazin (Collaborator, Author)

Yes, it should help! As long as your model has a chat_template in its tokenizer_config.json on the Hub (see google/gemma-1.1-7b-it for an example), you can just use the model name in the tokenizer field (for example tokenizer: "google/gemma-1.1-7b-it" in this case).
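For example, a minimal entry along these lines (a sketch only, assuming the usual MODELS setup in .env.local; endpoint settings omitted):

MODELS=`[
  {
    "name": "google/gemma-1.1-7b-it",
    "tokenizer": "google/gemma-1.1-7b-it"
  }
]`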

ice91 pushed a commit to ice91/chat-ui that referenced this pull request Oct 30, 2024
…#744)

* Use jinja template for chat formatting

* Add support for transformers js chat template

* update to latest transformers version

* Make sure to `add_generation_prompt`

* unindent