83 changes: 80 additions & 3 deletions README.md
@@ -141,7 +141,83 @@ Because of their special behavior of being preserved on context window overflow,

The Prompt API supports **tool use** via the `tools` option, allowing you to define external capabilities that a language model can invoke in a model-agnostic way. Each tool is represented by an object that includes an `execute` member that specifies the JavaScript function to be called. When the language model initiates a tool use request, the user agent calls the corresponding `execute` function and sends the result back to the model.

There are two tool use modes: with automatic execution (closed loop) and without automatic execution (open loop).

Regardless of the mode, the session creation and appending signatures are the same. Here’s an example:

```js
const session = await LanguageModel.create({
initialPrompts: [
{
role: "system",
content: `You are a helpful assistant. You can use tools to help the user.`
}
],
tools: [
{
name: "getWeather",
description: "Get the weather in a location.",
inputSchema: {
type: "object",
properties: {
location: {
type: "string",
description: "The city to check for the weather condition.",
},
},
required: ["location"],
},
}
]
});
```

In this example, the `tools` array defines a `getWeather` tool, specifying its name, description and input schema.
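Since the IDL later in this change also exposes a static `LanguageModel.availability()` method that accepts the same core options (including `tools`), a page can check readiness before creating a tool-using session. A minimal sketch — the helper name and the null-fallback behavior are ours, not part of the API, and the `"unavailable"` string is one of the spec's `Availability` values:

```js
// Sketch (helper name is hypothetical): create a tool-using session only when
// the model is available for these options. Availability values other than
// "unavailable" (e.g. "downloadable", "downloading") still permit create(),
// though it may trigger a model download.
async function createSessionIfAvailable(tools) {
  if (typeof LanguageModel === "undefined") return null; // API not exposed here
  const availability = await LanguageModel.availability({ tools });
  if (availability === "unavailable") return null;
  return LanguageModel.create({
    initialPrompts: [
      { role: "system", content: "You are a helpful assistant. You can use tools to help the user." },
    ],
    tools,
  });
}
```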

Few-shot examples of tool use can be appended like so:

```js
await session.append([
  {role: "user", content: "What is the weather in Seattle?"},
  {role: "tool-call", content: {type: "tool-call", value: {callID: "get_weather_1", name: "getWeather", arguments: {location: "Seattle"}}}},
  {role: "tool-response", content: {type: "tool-response", value: {callID: "get_weather_1", name: "getWeather", result: [{type: "object", value: {temperature: "55F", humidity: "67%"}}]}}},
  {role: "assistant", content: "The temperature in Seattle is 55F and the humidity is 67%."},
]);
```

Note that `role` and `type` now also support `"tool-call"` and `"tool-response"`.
The `result` field is a list of dictionaries with `type` and `value` members, where `type` is one of `"text"`, `"image"`, `"audio"`, or `"object"`, and `value` is `any`.
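The message shapes above are verbose, so a few-shot history can be assembled with small builder functions. A sketch — the helper names are ours, not part of the API; only the produced message shapes come from the examples above:

```js
// Hypothetical helpers (not part of the API) that build the tool-call and
// tool-response message shapes used in few-shot examples.
function toolCallMessage(callID, name, args) {
  return { role: "tool-call", content: { type: "tool-call", value: { callID, name, arguments: args } } };
}

function toolResponseMessage(callID, name, resultParts) {
  // resultParts: array of { type: "text" | "image" | "audio" | "object", value: any }
  return { role: "tool-response", content: { type: "tool-response", value: { callID, name, result: resultParts } } };
}

const history = [
  { role: "user", content: "What is the weather in Seattle?" },
  toolCallMessage("get_weather_1", "getWeather", { location: "Seattle" }),
  toolResponseMessage("get_weather_1", "getWeather", [
    { type: "object", value: { temperature: "55F", humidity: "67%" } },
  ]),
  { role: "assistant", content: "The temperature in Seattle is 55F and humidity is 67%." },
];
```

A session could then take this history via `session.append(history)`.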

#### Open loop

Open loop is enabled by specifying `tool-call` in `expectedOutputs` when the session is created.

When a tool needs to be called, the API returns an object with `callID` (a unique identifier for this tool call), `name` (the name of the tool), and `arguments` (the inputs to the tool), and the client is expected to handle the tool execution and append the tool result back to the session. `arguments` is a dictionary conforming to the JSON input schema in the tool's declaration; if the root type of the input schema is not `"object"`, the value is wrapped in an object under a single key.

Example:

```js
const sessionOptions = structuredClone(options);
(sessionOptions.expectedOutputs ??= []).push("tool-call");
const session = await LanguageModel.create(sessionOptions);

let result = await session.prompt("What is the weather in Seattle?");
if (result.type === "tool-call") {
  if (result.name === "getWeather") {
    const toolResult = getWeather(result.arguments.location);
    result = await session.prompt([{
      role: "tool-response",
      content: {
        type: "tool-response",
        value: {callID: result.callID, name: result.name, result: [{type: "object", value: toolResult}]},
      },
    }]);
  }
} else {
  console.log(result);
}
```

Note that a `tool-response` is always required to immediately follow the `tool-call` generated by the model.
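In practice the open loop runs until the model returns something other than a tool call. A minimal driver sketch under the assumptions above — the function name, the tool-registry shape, and the call cap are ours, not part of the API:

```js
// Sketch of an open-loop driver. `session` is assumed to be a LanguageModel
// session created with "tool-call" in expectedOutputs; `toolRegistry` maps tool
// names to plain functions. The maxCalls cap is a client-side safety limit.
async function runOpenLoop(session, firstPrompt, toolRegistry, maxCalls = 5) {
  let result = await session.prompt(firstPrompt);
  for (let i = 0; i < maxCalls && result?.type === "tool-call"; i++) {
    const tool = toolRegistry[result.name];
    const toolResult = await tool(result.arguments);
    // The tool-response must immediately follow the model's tool-call.
    result = await session.prompt([{
      role: "tool-response",
      content: {
        type: "tool-response",
        value: { callID: result.callID, name: result.name, result: [{ type: "object", value: toolResult }] },
      },
    }]);
  }
  return result;
}
```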


#### Closed loop

To enable automatic execution, add an `execute` function to each tool as its implementation, and add a `toolUseConfig` to indicate that execution is enabled and to cap the number of tool calls invoked in a single generation:

```js
const session = await LanguageModel.create({
  // … (lines collapsed in the diff) …
return JSON.stringify(await res.json());
},
}
],
toolUseConfig: {enabled: true},
});

const result = await session.prompt("What is the weather in Seattle?");
```

When the language model determines that a tool call is needed, the user agent invokes the `getWeather` tool's `execute()` function with the provided arguments and returns the result to the model, which can then incorporate it into its response.
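The prose above says `toolUseConfig` can also cap the number of tool calls per generation, but the example only shows `{enabled: true}`. A sketch of such a cap — the member name `maxToolCalls` is a hypothetical placeholder, not a confirmed API name:

```js
// Sketch: closed-loop session with a cap on automatic tool calls.
// `maxToolCalls` is a hypothetical member name for the limit described above.
const session = await LanguageModel.create({
  tools: [/* tool declarations, each with an execute() implementation */],
  toolUseConfig: { enabled: true, maxToolCalls: 3 },
});
```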

#### Concurrent tool use

51 changes: 44 additions & 7 deletions index.bs
@@ -39,8 +39,11 @@ interface LanguageModel : EventTarget {
static Promise<Availability> availability(optional LanguageModelCreateCoreOptions options = {});
static Promise<LanguageModelParams?> params();

// The return type from prompt() method and those alike.
typedef (DOMString or sequence<LanguageModelMessageContent>) LanguageModelPromptResult;

// These will throw "NotSupportedError" DOMExceptions if role = "system"
Promise<LanguageModelPromptResult> prompt(
LanguageModelPrompt input,
optional LanguageModelPromptOptions options = {}
);
@@ -80,13 +83,11 @@ interface LanguageModelParams {
callback LanguageModelToolFunction = Promise<DOMString> (any... arguments);

// A description of a tool call that a language model can invoke.
dictionary LanguageModelToolDeclaration {
required DOMString name;
required DOMString description;
// JSON schema for the input parameters.
required object inputSchema;
};

dictionary LanguageModelCreateCoreOptions {
@@ -97,7 +98,7 @@ dictionary LanguageModelCreateCoreOptions {

sequence<LanguageModelExpected> expectedInputs;
sequence<LanguageModelExpected> expectedOutputs;
sequence<LanguageModelToolDeclaration> tools;
};

dictionary LanguageModelCreateOptions : LanguageModelCreateCoreOptions {
@@ -148,16 +149,52 @@ dictionary LanguageModelMessageContent {
required LanguageModelMessageValue value;
};

enum LanguageModelMessageRole { "system", "user", "assistant", "tool-call", "tool-response" };

enum LanguageModelMessageType { "text", "image", "audio", "tool-call", "tool-response" };

typedef (
ImageBitmapSource
or AudioBuffer
or BufferSource
or DOMString
or LanguageModelToolCall
or LanguageModelToolResponse
) LanguageModelMessageValue;

// The definitions of `LanguageModelToolCall` and `LanguageModelToolResponse` values
enum LanguageModelToolResultType { "text", "image", "audio", "object" };

dictionary LanguageModelToolResultContent {
required LanguageModelToolResultType type;
required any value;
};

// Represents a tool call requested by the language model.
dictionary LanguageModelToolCall {
required DOMString callID;
required DOMString name;
object arguments;
};

// Successful tool execution result.
dictionary LanguageModelToolSuccess {
required DOMString callID;
required DOMString name;
required sequence<LanguageModelToolResultContent> result;
};

// Failed tool execution result.
dictionary LanguageModelToolError {
required DOMString callID;
required DOMString name;
required DOMString errorMessage;
};

// The response from executing a tool call - either success or error.
typedef (LanguageModelToolSuccess or LanguageModelToolError) LanguageModelToolResponse;


</xmp>

<h3 id="prompt-processing">Prompt processing</h3>