Skip to content

Commit

Permalink
Add a section on writing tool templates to the chat template docs (hu…
Browse files Browse the repository at this point in the history
…ggingface#33924)

* Add a section on writing tool templates to the chat template docs

* Small cleanups
  • Loading branch information
Rocketknight1 authored and BernardZach committed Dec 5, 2024
1 parent 5f63144 commit 42d7211
Showing 1 changed file with 126 additions and 1 deletion.
127 changes: 126 additions & 1 deletion docs/source/en/chat_templating.md
Original file line number Diff line number Diff line change
Expand Up @@ -962,4 +962,129 @@ tokenizer.chat_template = open("template.jinja").read()

As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will
exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to
identify the source of issues.
identify the source of issues.

### Writing templates for tools

Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend
template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code
to be transferable across models, so deviating from the standard tools API means users will have to write
custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can
make the standard API work!

Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.

#### Tool definitions

Your template should expect that the variable `tools` will either be null (if no tools are passed), or is a list
of JSON schema dicts. Our chat template methods allow users to pass tools as either JSON schema or Python functions, but when
functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the
`tools` variable that your template receives will always be a list of JSON schema. Here is
a sample tool JSON schema:

```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "A function that multiplies two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "The first number to multiply"
},
"b": {
"type": "number",
"description": "The second number to multiply"
}
},
"required": ["a", "b"]
}
}
}
```

And here is some example code for handling tools in your chat template. Remember, this is just an example for a
specific format - your model will probably need different formatting!

```text
{%- if tools %}
{%- for tool in tools %}
{{- '<tool>' + tool['function']['name'] + '\n' }}
{%- for argument in tool['function']['parameters']['properties'] %}
{{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
{%- endfor %}
{{- '\n</tool>' }}
{%- endif %}
{%- endif %}
```

The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model
was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate
JSON schema into your model's format. For example, [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024)
was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema,
converts types internally and renders the input tools as Python headers. You can do a lot with templates!

#### Tool calls

Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is
always a list, even though most tool-calling models only support single tool calls at a time, which means
the list will usually only have a single element. Here is a sample message dict containing a tool call:

```json
{
"role": "assistant",
"tool_calls": [
{
"type": "function",
"function": {
"name": "multiply",
"arguments": {
"a": 5,
"b": 6
}
}
}
]
}
```

And a common pattern for handling them would be something like this:

```text
{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
{%- for tool_call in message['tool_calls'] %}
{{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
{%- endif %}
{%- endfor %}
{%- endif %}
```

Again, you should render the tool call with the formatting and special tokens that your model expects.

#### Tool responses

Tool responses have a simple format: They are a message dict with the "tool" role, a "name" key giving the name
of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:

```json
{
"role": "tool",
"name": "multiply",
"content": "30"
}
```

You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function
name to be included in the tool response, then rendering it can be as simple as:

```text
{%- if message['role'] == 'tool' %}
{{- "<tool_result>" + message['content'] + "</tool_result>" }}
{%- endif %}
```

Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care
to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!

0 comments on commit 42d7211

Please sign in to comment.