Server: add function calling API #5588
Comments
Research on MeetKai's implementation. My Python snippet: https://gist.github.com/ngxson/c477fd9fc8e0a25c52ff4aa6129dc7a1. Key things to notice:
Link to the OpenAI docs for tool_calls: https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools
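For readers unfamiliar with the OpenAI-style API linked above, a request with tools attached looks roughly like the sketch below. The field names (`tools`, `tool_choice`) follow the public API docs; the model name and the `get_weather` function are made-up examples, not anything from this issue.

```python
import json

# Illustrative OpenAI-style chat request carrying one tool definition.
# "get_weather" and the model name are invented for this example.
request_body = {
    "model": "gpt-4",  # placeholder model name
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # "auto" lets the model decide whether to call a tool at all
    "tool_choice": "auto",
}

print(json.dumps(request_body, indent=2))
```

In a reply, the server would echo a `tool_calls` array on the assistant message instead of plain `content` whenever the model decides to invoke a function.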
This issue was closed because it has been inactive for 14 days since being marked as stale.
I'm actually early and waiting on some things to finish up. I thought I would be busy until mid-April, but I may have free time sooner than expected. I mention this because, while some models are trained to use tools, I've noticed that some models are smart enough to do it on their own given the right prompting. I'm planning to implement the proof of concept in more detail, in a simplified and streamlined way.

There's also a fine-tuned Mistral model trained to do this. I don't think it needs the fine-tuning, but it probably helps reduce the amount of context needed to orient the model. @abetlen also has the functionary model. I was "discussing" this with the Mistral 7B v0.2 model quantized to Q4_0 and it understood exactly what I wanted, but only after I provided the appropriate context. It did surprisingly well regardless.

The only reason I really care about this is that I want models to have a "memory" via a SQLite database. It's something I've been working on for over a year, because I genuinely do not like "RAG", which is just Q&A with segmentation and language models. I never really liked it and have always felt dissatisfied with it.
Please let me cast my humble vote in favour of this issue. It seems that agent capability is going to be the next big thing in LLMs. Seriously, chat and RAG are supported by literally every toolkit, with all their simplicity and limitations, but to keep up with big tech the open-source community must move on. My goal is to be able to run (at the very least) this: https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent/ or this: https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb

I have yet to explore @ngxson's #5695 solution. It seems, though, that it is geared towards MeetKai (can anyone confirm this?), while we need a universal solution that can support the llama-3, OpenAI, etc. interfaces. To my intermediate understanding, the support boils down to a set of prompt templates appropriate for a particular model (can anyone confirm this, too?). I am particularly interested in llama-3-instruct model support. I have found a similar solution that works with the llama.cpp server (more or less), see: https://github.com/Maximilian-Winter/llama-cpp-agent
Yes, that's correct. Function calling is essentially just a more complicated chat template. When I first started this PR, MeetKai's was the only open-source model to implement this idea. Of course we have many new models now, but the problem with chat templates is still the same: there is no "standard" way; each model uses its own template. Also, because we have more visibility now (i.e. more models to see the pattern), I'm planning to re-make all of this, maybe as a dedicated side project: a wrapper for llama.cpp's server, because it will be quite messy. Then we will see if one day we can merge it back into llama.cpp.
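The "no standard template" problem described above is essentially a dispatch problem: a wrapper has to render the same function definitions differently per model family. A minimal sketch of that idea follows; the template strings are simplified placeholders, not the real formats shipped by any of these models.

```python
# Sketch of per-model prompt dispatch for function calling.
# The layouts below are invented placeholders, NOT the actual templates
# used by functionary or Mistral models.
def render_functions_prompt(model: str, system: str, functions_json: str) -> str:
    if "functionary" in model:
        # placeholder for a MeetKai-functionary-style layout
        return f"<|system|>{system}\n<|functions|>{functions_json}\n<|user|>"
    if "mistral" in model:
        # placeholder for an instruct-style layout
        return f"[INST] {system}\nAvailable tools: {functions_json} [/INST]"
    raise ValueError(f"no function-calling template known for {model}")

print(render_functions_prompt("mistral-7b", "You are helpful.", "[]"))
```

The point of a dedicated wrapper, as the comment suggests, is to keep this per-model branching out of llama.cpp's core `llama_chat_apply_template` until patterns stabilize.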
Hi @ngxson, thank you for getting back.
These models support function calling (without fine-tuning):
For Qwen, function calling can be implemented outside the inference application. I have implemented this in chatllm.cpp.
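Implementing function calling outside the inference application, as described above, amounts to post-processing the raw completion text: the prompt instructs the model to emit calls in a known syntax, and the client scans for it. A hedged sketch follows; the `<functioncall> {...}` marker is an invented convention for illustration, since each model (Qwen, functionary, ...) uses its own syntax.

```python
import json
import re

# Sketch: extract a function call from raw model output.
# The "<functioncall> {...}" marker is invented for this example.
CALL_RE = re.compile(r"<functioncall>\s*(\{.*\})", re.DOTALL)

def extract_call(output: str):
    """Return the parsed call as a dict, or None for a plain-text answer."""
    m = CALL_RE.search(output)
    if not m:
        return None
    return json.loads(m.group(1))

sample = 'Sure.\n<functioncall> {"name": "get_time", "arguments": {"tz": "UTC"}}'
print(extract_call(sample))
```

Because this runs entirely client-side, it works against any plain-completion endpoint without server changes, at the cost of re-doing the parsing for every model's call syntax.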
Motivation
This subject was already brought up in #4216, but my initial research failed.
Recently, I discovered a new line of models designed specifically for this usage: https://github.com/MeetKai/functionary
This model can decide whether to call functions (and which function to be called) in a given context. The chat template looks like this:
Example:
Possible implementation
Since this is the only model available publicly that can do this, it's quite risky to modify `llama_chat_apply_template` to support it (we may end up polluting the code base). The idea is to first keep the implementation in the server example; then, when the template becomes more mainstream, we can adopt it in `llama_chat_apply_template`.

Data passing in the direction user ==> model (input direction)
There is one system message for the usual instruction (e.g. "You are a helpful assistant") and one for the function definitions.

Data passing in the direction model ==> user (output direction)
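The two-system-message input layout described above can be sketched as follows. The roles and the serialization of the function definitions are assumptions for illustration, not the template actually proposed in this issue.

```python
# Sketch of the input direction: two system messages, one carrying the
# normal instruction and one carrying the function definitions.
# The exact roles and formatting here are assumptions.
def build_messages(functions_json: str, user_text: str):
    return [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "system", "content": f"Available functions:\n{functions_json}"},
        {"role": "user", "content": user_text},
    ]

msgs = build_messages('[{"name": "get_weather"}]', "Weather in Paris?")
for m in msgs:
    print(m["role"], ":", m["content"])
```

On the output direction, the server would then have to detect whether the generated text is a plain answer or a function call, and translate the latter into an OpenAI-style `tool_calls` response.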