Add basic function calling example using a llama-cli python wrapper #9592

Open · wants to merge 3 commits into base: master
51 changes: 51 additions & 0 deletions examples/function-calling/README.md
@@ -0,0 +1,51 @@
# llama.cpp/examples/function-calling

This example shows how to do basic function calling using llama-cli and a Python wrapper to declare and call functions.

## Options

Important options for llama-cli-function-runner.py:

- `-m FNAME, --model FNAME`: Specify the path to the function calling model (e.g., `-m "$(huggingface-cli download meetkai/functionary-small-v3.2-GGUF functionary-small-v3.2.Q4_0.gguf)"`).
- `--ctx-size N`: Set the size of the prompt context. The default is 1024.
- `--special`: Show special tokens and function calling details.
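As a sketch of what the wrapper expects, a tool is just a Python function with type annotations and a Sphinx-style `:param` docstring, like those in `functions.py`. The `get_stock_price` name and its fixed return value below are hypothetical, for illustration only:

```python
def get_stock_price(symbol: str):
    """Get the latest stock price for a ticker symbol
    :param symbol: the ticker symbol to look up
    """
    # A real tool would query a data source; a fixed value keeps the
    # example self-contained.
    return {"symbol": symbol, "price": "123.45"}
```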

## Example showing function call details

```
./examples/function-calling/llama-cli-function-runner.py -m `huggingface-cli download meetkai/functionary-small-v3.2-GGUF functionary-small-v3.2.Q4_0.gguf` -i --special
What is the weather in Phoenix?
Sure, I'll look that up for you. Let me just check the current weather conditions in Phoenix.>>>get_weather
{"location": "Phoenix"}<|eot_id|>
{"temperature": "30C"}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The current weather in Phoenix is 30C.<|eot_id|>
What is 38484 + 323?
Sure, let's calculate that.>>>calculate
{"expression": "38484 + 323"}<|eot_id|>
{"result": 38807}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The sum of 38484 and 323 is 38807.<|eot_id|>
What is 67 feet in meters?
To convert 67 feet into meters, we use the conversion factor: 1 foot is approximately 0.3048 meters. Let's calculate it.>>>calculate
{"expression": "67 * 0.3048"}<|eot_id|>
{"result": 20.4216}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
67 feet is approximately 20.4216 meters.<|eot_id|>
```
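The `>>>get_weather` lines in the transcript above use functionary's `>>>recipient` convention: a function name after the marker, then JSON arguments. A minimal sketch of parsing such output, using the same regular expression the wrapper configures in `function_tool.py`:

```python
import re
import json

# Regex for functionary-style output (from get_chat_tool_format).
function_re = r'>>>([^\n]*)\n(.*)<\|eot_id\|>'

sample = '>>>get_weather\n{"location": "Phoenix"}<|eot_id|>'
match = re.search(function_re, sample, re.S)
tool_name = match.group(1)              # 'get_weather'
tool_args = json.loads(match.group(2))  # {'location': 'Phoenix'}
```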

## Function calling example, hiding details
```
./examples/function-calling/llama-cli-function-runner.py -m `huggingface-cli download meetkai/functionary-small-v3.2-GGUF functionary-small-v3.2.Q4_0.gguf` -i
What is the weather in Phoenix?
To provide you with the current weather in Phoenix, Arizona, I will need to check the weather data for you. Let me get that information.
The current weather in Phoenix, Arizona is 30°C. If you have any more questions about weather in other locations, feel free to ask!
Is it colder in Vegas?
To determine if the current temperature in Las Vegas is colder than in Phoenix, which is currently 30°C, I will need to check the weather data for Las Vegas. Let's find out.
The current weather in Las Vegas, Nevada is also 30°C. Therefore, there is no difference in temperature between Phoenix and Las Vegas at the moment. If you have any more questions or need further assistance, please let me know!
What is 37234 times 39?
To calculate 37234 times 39, I'll perform the multiplication. Let's do that.
The result of multiplying 37234 by 39 is 1,452,126. If you have any more calculations or questions, feel free to ask!
```

## Function calling example, using Phi-3 function calling
```
./examples/function-calling/llama-cli-function-runner.py -m `huggingface-cli download nold/Phi-3-mini-4k-instruct-function-calling-GGUF Phi-3-mini-4k-instruct-function-calling_Q4_K_M.gguf` --special --display-prompt -i
```
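Phi-3-style function calling models emit a `<functioncall>` marker followed by a single JSON object instead of the `>>>recipient` convention. A minimal parsing sketch using the simple-format regex from `function_tool.py` (the sample string is illustrative):

```python
import re
import json

# Regex for the simple format (from get_chat_tool_format's else branch).
function_re = r'<functioncall> \n?(.*)<\|end\|>'

sample = '<functioncall> {"name": "get_weather", "arguments": {"location": "Phoenix"}}<|end|>'
tool = json.loads(re.search(function_re, sample, re.S).group(1))
# tool carries both the name and the arguments in one JSON object.
```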
110 changes: 110 additions & 0 deletions examples/function-calling/function_tool.py
@@ -0,0 +1,110 @@
# Generate function calling definitions and function schemas

import inspect
import re

import json

# Extract OpenAI function calling style definitions from functions
#
# Generated with: Create a python function to generate the OpenAI function calling definition from a given function, getting the description, parameter type and parameter description from the function documentation, assuming the function documentation contains Sphinx-style parameter descriptions, marked with :param.
def get_function_tool_json(func):
    typemap = { 'str': 'string' }
    def get_type(s):
        return typemap[s] if s in typemap else s

    function_name = func.__name__
    doc_parts = re.split(r'\n\s*:param[^:]*\s+', func.__doc__.rstrip())

    function_description = doc_parts[0]
    params_doc = [ re.split(r'\:\s*', param_doc, maxsplit=1) for param_doc in doc_parts[1:] ]
    params_doc = { param: desc for param, desc in params_doc }

    function_def = {
        'name': function_name,
        'description': function_description,
        'parameters': { 'type': 'object', 'properties': {}, 'required': [] }
    }

    for param_name, param in inspect.signature(func).parameters.items():
        function_def['parameters']['properties'][param_name] = {
            'type' : get_type(param.annotation.__name__) if param.annotation is not param.empty else '',
            'description': params_doc[param_name] if param_name in params_doc else ''
        }
        function_def['parameters']['required'].append(param_name)

    return function_def

# Generate function definition schema from function definitions
#
# This is from llama-cpp-python, llama_chat_format.py
def generate_functionary_schema_from_functions(functions, namespace="functions") -> str:
    schema = (
        "// Supported function definitions that should be called when necessary.\n"
    )
    schema += f"namespace {namespace} {{\n\n"

    for function in functions:
        function_name = function["name"]
        description = function.get("description", "")
        parameters = function.get("parameters", {})
        required_params = parameters.get("required", [])

        schema += f"// {description}\n"
        schema += f"type {function_name} = (_: {{\n"

        for param_name, param in parameters.get("properties", {}).items():
            param_description = param.get("description", "")
            param_type = param.get("type", "any")
            optional_indicator = "" if param_name in required_params else "?"
            schema += f"// {param_description}\n"
            schema += f"{param_name}{optional_indicator}: {param_type},\n"

        schema += "}) => any;\n\n"

    schema += "}} // namespace {}".format(namespace)
    return schema

def generate_simple_schema_from_functions(functions) -> str:
    return '\n'.join([json.dumps(function).replace('{', '{ ').replace('}', ' }') for function in functions])

functionary_prompt_start = """<|start_header_id|>system<|end_header_id|>

You are capable of executing available function(s) if required.
Execute function(s) as needed.
The function calls are not shown in the conversation and should be called covertly to answer questions.
Ask for the required input to:recipient==all
Use JSON for function arguments.
Respond in this format:
>>>${recipient}
${content}
Available functions:
"""
functionary_prompt_end = """<|eot_id|><|start_header_id|>system<|end_header_id|>

When you send a message containing Python code to python, it will be executed in a stateful Jupyter notebook environment. python will respond with the output of the execution or time out after 60.0 seconds. The drive at '/mnt/data' can be used to save and persist user files.<|eot_id|><|start_header_id|>user<|end_header_id|>
"""

simple_prompt_start = """<s><|user|> You are a helpful assistant with access to the following functions. Use them if required - """
simple_prompt_end = """<|end|>"""

def get_chat_tool_format(args, tools):
    if 'functionary' in args.model.lower():
        return {
            'prompt': functionary_prompt_start + generate_functionary_schema_from_functions(tools) + functionary_prompt_end,
            'function_marker': '>>>',
            'function_re': r'>>>([^\n]*)\n(.*)<\|eot_id\|>',
            'user_start': '<|start_header_id|>user<|end_header_id|>\n',
            'user_end': '<|eot_id|><|start_header_id|>assistant<|end_header_id|>' + '\n',
            'tool_start': '',
            'tool_end': '<|eot_id|><|start_header_id|>assistant<|end_header_id|>'
        }
    else:
        return {
            'prompt': simple_prompt_start + generate_simple_schema_from_functions(tools) + simple_prompt_end,
            'function_marker': '<functioncall>',
            'function_re': r'<functioncall> \n?(.*)<\|end\|>',
            'user_start': '<|user|> ',
            'user_end': '<|end|>' + '\n',
            'tool_start': '<|user|>',
            'tool_end': '<|end|> <|assistant|>'
        }
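To make the docstring-driven schema extraction concrete, here is a self-contained miniature of what `get_function_tool_json` does. It is re-implemented for illustration with a simplified type mapping, not imported from the module above:

```python
import inspect
import re

def docstring_to_schema(func):
    # Split the docstring on Sphinx-style ":param name:" markers, keeping
    # the parameter names via the capturing group.
    parts = re.split(r'\n\s*:param\s+([^:]+):\s*', inspect.getdoc(func))
    description, rest = parts[0].strip(), parts[1:]
    params = dict(zip(rest[0::2], [d.strip() for d in rest[1::2]]))
    properties = {
        name: {
            'type': 'string' if p.annotation is str else 'number',  # simplified mapping
            'description': params.get(name, ''),
        }
        for name, p in inspect.signature(func).parameters.items()
    }
    return {
        'name': func.__name__,
        'description': description,
        'parameters': {'type': 'object', 'properties': properties,
                       'required': list(properties)},
    }

def get_weather(location: str):
    """get the weather of a location
    :param location: where to get weather.
    """
    return {"temperature": "30C"}

schema = docstring_to_schema(get_weather)
```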
30 changes: 30 additions & 0 deletions examples/function-calling/functions.py
@@ -0,0 +1,30 @@
def calculate(expression: str):
    """Evaluate a mathematical expression
    :param expression: The mathematical expression to evaluate
    """
    try:
        result = eval(expression)
        return {"result": result}
    except Exception:
        return {"error": "Invalid expression"}

def get_weather(location: str):
    """get the weather of a location
    :param location: where to get weather.
    """
    return {"temperature": "30C"}

def _run_python(code):
    allowed_globals = { '__builtins__': None, '_': None }
    allowed_locals = {}

    code = code.splitlines()
    code[-1] = f"_ = {code[-1]}"
    code = '\n'.join(code)

    try:
        exec(code, allowed_globals, allowed_locals)
    except Exception:
        return None

    return {'result': allowed_locals.get('_', None)}
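The `_run_python` helper rewrites the last line of the snippet so its value is captured in `_`. A self-contained sketch of that trick, re-implemented for illustration rather than imported from `functions.py`:

```python
def run_snippet(code):
    # Rewrite the final line so its value lands in '_', the same
    # last-expression-capture trick _run_python uses.
    lines = code.splitlines()
    lines[-1] = f"_ = {lines[-1]}"
    scope = {}
    exec('\n'.join(lines), {'__builtins__': None}, scope)
    return scope.get('_')

# run_snippet("x = 6\nx * 7") evaluates the final expression: 42
```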
112 changes: 112 additions & 0 deletions examples/function-calling/llama-cli-function-runner.py
@@ -0,0 +1,112 @@
#!/usr/bin/env python3
# function calling using llama-cli

import subprocess
import sys
import select
import os
import re

import json

import functions
from function_tool import get_function_tool_json, get_chat_tool_format

function_name_list = [ name for name in dir(functions) if not name.startswith('_') ]
function_lookup = { name: getattr(functions, name) for name in function_name_list }
tools = [ get_function_tool_json(f) for (n, f) in function_lookup.items() ]

def main():
    import argparse

    parser = argparse.ArgumentParser(epilog='For more options: llama-cli --help')
    parser.add_argument('--display-prompt', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('--special', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('--reverse-prompt', type=str)
    parser.add_argument('-m', '--model', type=str, default='model.gguf')
    parser.add_argument('--ctx-size', type=int, default=1024)
    args, other_args = parser.parse_known_args()

    tool_format = get_chat_tool_format(args, tools)
    if args.reverse_prompt is None:
        args.reverse_prompt = tool_format['user_start']

    if args.display_prompt:
        print(tool_format['prompt'])

    command = [ './llama-cli', '-i', '-p', tool_format['prompt'], '--model', args.model, '--reverse-prompt', args.reverse_prompt, '--escape', '--special', '--no-display-prompt', '--log-disable', '--simple-io', '--ctx-size', str(args.ctx_size), *other_args ]

    process = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if process.stdout is not None:
        os.set_blocking(process.stdout.fileno(), False)

    try:
        run_loop(process, args, tool_format)
    except KeyboardInterrupt:
        print("\nInterrupted by user.")
    finally:
        process.terminate()
        process.wait()

def run_loop(process, args, tool_format):
    pbuffer = ''
    skip_output_until_result = False
    while True:
        readable, _, _ = select.select([process.stdout, process.stderr, sys.stdin], [], [])

        for stream in readable:
            if stream == process.stdout:
                pdata = process.stdout.read()
                if not pdata:
                    continue
                pbuffer += pdata

                if (match := re.search(tool_format['function_re'], pbuffer, re.S)):
                    if not args.special:
                        pdata = pdata[:match.pos]
                    pbuffer = ''
                    skip_output_until_result = False
                    try:
                        if 1 < len(match.groups()):
                            tool_name = match.group(1)
                            tool_args = json.loads(match.group(2))
                        else:
                            tool = json.loads(match.group(1))
                            tool_name = tool['name']
                            tool_args = tool['arguments']

                        if tool_name == 'python':
                            result = functions._run_python(tool_args)
                        else:
                            result = function_lookup[tool_name](**tool_args)
                    except ValueError:
                        result = {'error': 'unknown'}

                    result = tool_format['tool_start'] + json.dumps(result) + tool_format['tool_end']
                    process.stdin.write(result + '\n')
                    process.stdin.flush()
                    if args.special:
                        pdata += '\n' + result
                elif (n := pdata.find(tool_format['function_marker'])) >= 0:
                    if not args.special:
                        pdata = pdata[:n]
                        skip_output_until_result = True
                elif skip_output_until_result:
                    pdata = ''

                if not args.special:
                    pdata = re.sub(r'<\|[^\|>]*\|>', '', pdata)
                sys.stdout.write(pdata)
                sys.stdout.flush()

            elif stream == sys.stdin:
                user_input = sys.stdin.readline()
                if user_input:
                    user_input = user_input.rstrip()
                    process.stdin.write(user_input + tool_format['user_end'] + '\n')
                    process.stdin.flush()

if __name__ == '__main__':
    main()