
planning: Cortex supports chat_templates in model.yaml #1758

Open · 3 tasks · dan-homebrew opened this issue Dec 2, 2024 · 0 comments
Labels: category: model management (Model pull, yaml, model state)

dan-homebrew commented Dec 2, 2024

Goal

  • Change model.yaml to align with a generic chat_template that can support reasoning models
  • Reasoning models require System Prompts (e.g. qwq)
  • We need to adapt Cortex's model.yaml to accommodate a System Prompt
    • Bad: prompt_template (our own creation?) is not a standard field
    • Bad: prompt_template currently only takes a system_message variable
    • We need to define system_message, as well as things like %tools%, etc.
  • We should align with industry standards
    • We need to parse the existing chat_template (from HF Transformers) into GGUF's in-prefix and in-suffix (see the sketch below)
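A minimal sketch of that parsing step, assuming a ChatML-style template: derive_affixes is a hypothetical helper, and the prefix/suffix here correspond to llama.cpp's --in-prefix/--in-suffix runtime flags rather than named GGUF metadata keys.

# Hypothetical sketch: derive the user-turn prefix/suffix from an HF chat_template
from jinja2 import Template

def derive_affixes(chat_template: str) -> tuple[str, str]:
    """Render one probe conversation and read off what surrounds the user turn."""
    out = Template(chat_template).render(
        messages=[{"role": "system", "content": "SYS"},
                  {"role": "user", "content": "USR"}],
        add_generation_prompt=False,
        tools=None,
    )
    head, _, in_suffix = out.partition("USR")
    in_prefix = head.split("SYS", 1)[1]  # text between the system block and the user content
    return in_prefix, in_suffix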

Tasklist

  • How do we transition if we change the field to chat_template (to align with HF Transformers)?
    • We support both chat_template and prompt_template
    • chat_template overrides prompt_template (see the sketch after this list)
    • Mark prompt_template as deprecated (not supported for future models)
    • Update Model CI to support chat_template
  • Is there a scalable way to do this by leveraging existing standards?
    • Option 1: Add a second chat_template field to every model.yaml
    • Option 2: Copy over tokenizer_config.json
    • Option 3: model.yaml
  • Models
    • Marco-o1
    • qwq
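One possible shape for the transition logic above (a sketch only; the function name and exact behavior are assumptions, not the final design):

# Hypothetical sketch of the proposed precedence: chat_template wins,
# prompt_template still loads but is flagged as deprecated
import warnings

def resolve_template(model_yaml: dict) -> tuple[str, str]:
    """Return (field_used, template_string) for a parsed model.yaml."""
    if "chat_template" in model_yaml:    # new field: full Jinja2 template
        return "chat_template", model_yaml["chat_template"]
    if "prompt_template" in model_yaml:  # legacy field: {system_message}/{prompt} placeholders
        warnings.warn("prompt_template is deprecated; use chat_template",
                      DeprecationWarning)
        return "prompt_template", model_yaml["prompt_template"]
    raise ValueError("model.yaml defines neither chat_template nor prompt_template")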

Resources

GGUF

HF Transformers

  • Bigger question: should we just have tokenizer_config.json define chat_template and include it in model repos?
  • For everything else, we depend on GGUF metadata
  • We keep a lightweight model.yaml that can override this if needed (see the sketch below)
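Put together, the three bullets above suggest a lookup order like the following. This is a sketch under those assumptions: read_gguf_metadata is a stub, though tokenizer.chat_template is the metadata key llama.cpp actually uses for embedded chat templates.

# Hypothetical sketch of the lookup order: model.yaml override,
# then tokenizer_config.json, then GGUF metadata
import json
from pathlib import Path

def read_gguf_metadata(model_dir: Path) -> dict:
    # stub: in practice this would parse the GGUF header,
    # e.g. with the `gguf` Python package from the llama.cpp repo
    return {}

def load_chat_template(model_dir: Path, model_yaml: dict) -> str | None:
    if tpl := model_yaml.get("chat_template"):  # lightweight model.yaml override
        return tpl
    cfg = model_dir / "tokenizer_config.json"
    if cfg.exists():                            # template shipped in the model repo
        if tpl := json.loads(cfg.read_text()).get("chat_template"):
            return tpl
    # for everything else, fall back to GGUF metadata
    return read_gguf_metadata(model_dir).get("tokenizer.chat_template")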

Example

qwq Chat Template prompt (jinja2)

# Excerpt from qwq's `tokenizer_config.json` (Jinja2 chat template)

{
"chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 32768,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null,
  "add_bos_token": false
}
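For reference, a template like the one above can be rendered with plain jinja2; HF Transformers does the equivalent via tokenizer.apply_chat_template. A minimal sketch (the file path is assumed):

# Minimal sketch: render the qwq chat_template with jinja2
import json
from jinja2 import Template

cfg = json.load(open("tokenizer_config.json"))
prompt = Template(cfg["chat_template"]).render(
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    add_generation_prompt=True,
    tools=None,
)
print(prompt)
# <|im_start|>system
# You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.<|im_end|>
# <|im_start|>user
# Why is the sky blue?<|im_end|>
# <|im_start|>assistant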


Current model.yaml format

# Our current qwq `model.yaml`

prompt_template: <|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n
ctx_len: 4096
ngl: 34
# END REQUIRED
# END MODEL LOAD PARAMETERS
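For comparison, a chat_template-based version of the same file might look like this. An illustrative sketch only: the field name and the trimmed-down ChatML template are assumptions, not the final format.

# Possible future qwq `model.yaml` (sketch; field name not final)

chat_template: |-
  {% for message in messages %}<|im_start|>{{ message.role }}
  {{ message.content }}<|im_end|>
  {% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
  {% endif %}
ctx_len: 4096
ngl: 34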

dan-homebrew converted this from a draft issue Dec 2, 2024
dan-homebrew changed the title from "planning: Cortex supports System Prompt" to "planning: Cortex supports chat_templates in model.yaml" Dec 2, 2024
dan-homebrew assigned namchuai and unassigned vansangpfiev Dec 2, 2024
louis-jan added the "category: model management" label Dec 4, 2024