[Frontend] Chat template fallbacks for multimodal models #17805
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
LGTM!
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```python
chat_template: Optional[str],
tools: Optional[list[dict[str, Any]]],
*,
trust_remote_code: bool,
```
Removing args may break dependent code, e.g. ray-project/ray#52975.
Trying to fix in #18098.
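To make the breakage concrete, here is a minimal, self-contained sketch; the function below is a hypothetical stand-in for the vLLM helper whose signature changed in this hunk, not the actual code:

```python
from typing import Optional


def apply_template(prompt: str, *, chat_template: Optional[str] = None) -> str:
    """Hypothetical new signature: `trust_remote_code` has been removed."""
    return chat_template or prompt


try:
    # A caller written against the old signature still passes the removed kwarg.
    apply_template("hi", chat_template=None, trust_remote_code=True)
except TypeError as exc:
    print(f"Breaks as reported above: {exc}")
```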
Both this and the fix (#18098) seem to break lm_eval, though:
https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/vllm_causallms.py#L143
Can you open an issue in lm-eval to update their code? It looks like they already use version bounds, so they just have to add another case.
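The version-bounds approach could look roughly like this on the lm-eval side; the cutoff version and the keyword being gated are illustrative assumptions, not lm-eval's actual code:

```python
from importlib.metadata import version

from packaging.version import Version

# Placeholder boundary: the real cutoff is whichever vLLM release ships this PR.
_SIGNATURE_CHANGE = Version("0.9.0")

extra_kwargs = {}
if Version(version("vllm")) < _SIGNATURE_CHANGE:
    # Older vLLM releases still accept (and require) the keyword-only argument.
    extra_kwargs["trust_remote_code"] = True

# The chat-template helper would then be called with **extra_kwargs.
```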
The PR description of #18098 shows how to fix this; you can also refer to https://github.com/ray-project/ray/blob/c0564b74155c4b1b9fb4c1244eeec7aa763ccf23/python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_engine.py#L252-L262
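Another way for dependent code to stay compatible across both signatures, loosely in the spirit of the linked Ray change (the wrapper below is an illustrative sketch, not a copy of Ray's code):

```python
import inspect
from typing import Any, Callable


def call_with_supported_kwargs(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call `func`, silently dropping keyword arguments it no longer accepts."""
    params = inspect.signature(func).parameters
    if not any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **kwargs)
```

Calling the changed vLLM helper through such a wrapper keeps working on releases both before and after this PR, at the cost of hiding genuine typos in keyword names.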
Since multi-modal models must use the Chat Completions API, for user convenience I have added automatic fallbacks for supported multi-modal models. These fallbacks are used when no default chat template is available and the user did not provide one. They are located inside a new module, vllm.transformers_utils.chat_templates.

cc @mgoin since you considered moving chat templates into the main vLLM library. Without automatic chat template selection, this would cause some inconvenience to users, since they would have to specify a longer path to get to the chat template. We can consider moving more chat templates over to this new module once we can automatically fall back to them.
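A rough sketch of how the fallback lookup could work; the registry contents and helper name below are illustrative assumptions rather than the actual contents of vllm.transformers_utils.chat_templates:

```python
from pathlib import Path
from typing import Optional

_CHAT_TEMPLATES_DIR = Path(__file__).parent

# Hypothetical registry: HF model type -> bundled fallback template.
_MODEL_TYPE_TO_TEMPLATE = {
    "llava": _CHAT_TEMPLATES_DIR / "template_llava.jinja",
}


def load_chat_template_fallback(model_type: str) -> Optional[str]:
    """Return a bundled chat template for supported multi-modal models, if any.

    Only consulted when the tokenizer provides no default chat template and
    the user did not pass one explicitly (e.g. via --chat-template).
    """
    path = _MODEL_TYPE_TO_TEMPLATE.get(model_type)
    return path.read_text() if path is not None else None
```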