WIP - Tool Implementation improvements™️ #193
Closed
After #154, I've continued to think about and evaluate TabbyAPI's tool calling integration. This is a draft PR where I'll continue to iterate on and improve TabbyAPI's Tools implementation. Below are the issues I've observed and the actions I plan to take in this PR.
Many models have a hard time writing the correct dtype (and sometimes the arg names) when calling functions.
We will attempt to build a custom tool response schema from the tool spec provided by the client. This means we can not only ensure the format of the tool response is correct, but also guarantee that the function names, argument names, and argument types match the client's spec.
Status: In PR ✅ - Working MVP
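The idea above can be sketched roughly as follows: derive a JSON schema from the client's OpenAI-style tool spec, which can then constrain generation of the tool-call response. This is a hypothetical sketch, not the PR's actual code; the function name `build_tool_call_schema` and the exact schema shape are illustrative.

```python
# Illustrative sketch: turn an OpenAI-style tool spec into a JSON schema
# that constrains a model's tool-call output to the declared function
# names and parameter types.
import json


def build_tool_call_schema(tools: list[dict]) -> dict:
    """Build a schema whose variants each pin one tool's name (via
    `const`) and reuse that tool's declared parameter schema."""
    variants = []
    for tool in tools:
        fn = tool["function"]
        variants.append({
            "type": "object",
            "properties": {
                "name": {"const": fn["name"]},
                "arguments": fn.get("parameters", {"type": "object"}),
            },
            "required": ["name", "arguments"],
        })
    # A tool-call response is a list of calls, each matching one tool.
    return {"type": "array", "items": {"anyOf": variants}}


weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
schema = build_tool_call_schema([weather_tool])
print(json.dumps(schema, indent=2))
```

Because `name` is a `const` and `arguments` reuses the client's own parameter schema, a schema-constrained generator cannot emit a misspelled function name or a wrongly typed argument.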
Some model providers have worked hard to adopt a tool call/response schema that closely resembles OpenAI's spec. I appreciate you! These models naturally have better support in TabbyAPI's current tools implementation, but may need slight tweaks to the default tool calling prompt template.
After discussions with @bdashore3, I feel it's best to move Tabby's default tool calling template to support Hermes, rather than generically accommodating Llama 3.1's chat template. While tool calling with Llama 3.1 models works well in TabbyAPI, I think Hermes 3's finetuned models better exhibit how Tabby can deliver tool calling to its users. I intend to replace the default tool template with a Hermes 3-steered template (this should also continue to work quite well for Llama 3.1, since Llama 3.1 has no training on tool-calling-specific tokens). This will only require slight changes.
Status: Written in the just-in-time system prompt format described below. Has not been added to the PR just yet.
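For reference, a Hermes-steered template would build a ChatML system prompt along these lines. The `<tools>`/`<tool_call>` tag names follow the published Hermes function-calling format; the exact instruction wording here is illustrative, not the template this PR will ship.

```python
# Hedged sketch of a Hermes-style tool system prompt in ChatML.
# The wording of the instructions is an assumption for illustration.
import json


def hermes_system_prompt(tools: list[dict]) -> str:
    """Render an OpenAI-style tool list into a Hermes-flavored
    ChatML system message."""
    tool_json = "\n".join(json.dumps(t) for t in tools)
    return (
        "<|im_start|>system\n"
        "You are a function calling AI model. You may call one or more "
        "functions to assist with the user query. Available tools:\n"
        f"<tools>\n{tool_json}\n</tools>\n"
        "For each function call, return a JSON object with the function "
        "name and arguments within <tool_call></tool_call> tags."
        "<|im_end|>\n"
    )
```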
Tool calling models are strongly coupled to their system prompts
If you've read through the docs and the default tool calling template, you will see how we inform the model about the tools available to it, and, further into the conversation, remind the model of its tools/response schema after it indicates it wants to make a tool call via its tool_start token. There isn't any magic solution here: models trained for tool calling certainly use tools more effectively deep into the context window, but how can we help on the inference side? I'd like to experiment with some just-in-time system prompt based chat templates. These will more closely follow what Mistral does with their V3 template, but in ChatML. This is currently just an experiment with promising early results. This change falls into the "unknown" territory for a tools implementation, since we do not know how tools are conveyed to OpenAI models (or how the model is reminded of them). But the same currently applies to the tool reminder in the existing default tool chat template. I believe this approach should be considered.
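The just-in-time idea described above amounts to re-injecting the tool definitions near the end of the context rather than relying only on the opening system prompt. A minimal sketch, assuming hypothetical names (`inject_jit_tool_reminder` is not TabbyAPI code):

```python
# Illustrative "just-in-time" tool reminder: insert a system message
# restating the tool specs immediately before the latest user turn,
# so the tool schema stays close to the end of the context window.
import json


def inject_jit_tool_reminder(messages: list[dict], tools: list[dict]) -> list[dict]:
    reminder = {
        "role": "system",
        "content": "Available tools:\n" + json.dumps(tools),
    }
    # Walk backwards to find the most recent user message and insert
    # the reminder just before it.
    for i in range(len(messages) - 1, -1, -1):
        if messages[i]["role"] == "user":
            return messages[:i] + [reminder] + messages[i:]
    return messages + [reminder]
```

The appeal is that the reminder's position is recomputed on every request, so deep multi-turn conversations keep the schema recent without retraining anything.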
Translation layer between Model <--> TabbyAPI for tools™
As I mention above, certain models more closely follow OpenAI's tool call/response schema. We don't know whether the actual model follows this schema internally; we only know that the schema is used when the user communicates with the inference engine. Many models do not follow this structure when making tool calls. That doesn't mean those models are bad at making tool calls, but it does mean the responsibility falls to TabbyAPI to translate correctly between the model's tool speak and the OpenAI tool speak agreed with the user. Currently, we do not have a graceful way of accommodating a different structure for the model. To accommodate models from C4AI and others, we must build this translation layer until the community reaches consensus on model tool formatting.
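One side of such a translation layer might look like this: parse a model-native tool call (here, a Hermes-style `<tool_call>` block) and re-emit it as an OpenAI-style `tool_calls` entry. This is a sketch under that assumption; the regex, function name, and id format are illustrative, not the PR's implementation.

```python
# Illustrative model -> OpenAI translation: extract Hermes-style
# <tool_call>{...}</tool_call> blocks from raw model output and convert
# each into an OpenAI-shaped tool_calls entry.
import json
import re
import uuid

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def to_openai_tool_calls(model_output: str) -> list[dict]:
    calls = []
    for match in TOOL_CALL_RE.finditer(model_output):
        payload = json.loads(match.group(1))
        calls.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",
            "type": "function",
            "function": {
                "name": payload["name"],
                # OpenAI's spec serializes arguments as a JSON string.
                "arguments": json.dumps(payload.get("arguments", {})),
            },
        })
    return calls
```

A symmetric function would run in the other direction, rendering the client's OpenAI-style tool results back into whatever format the model was trained on; together the pair keeps the client-facing API stable no matter which model sits behind it.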
Notes on the code changes
tools.py