diff --git a/docs/README.md b/docs/README.md
index ee8f8469..ca8a9e8d 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -4,4 +4,8 @@ The Foundry Local documentation is provided on [Microsoft Learn](https://learn.m
 
 ## API Reference
 
-- [Foundry Local C# SDK API Reference](./cs-api/Microsoft.AI.Foundry.Local.md)
\ No newline at end of file
+- [Foundry Local SDK Reference](https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/reference/reference-sdk?view=foundry-classic&tabs=windows&pivots=programming-language-javascript)
+
+## Integration Guides
+
+- [Using GitHub Copilot SDK with Foundry Local](./copilot-sdk-integration.md)
\ No newline at end of file
diff --git a/docs/copilot-sdk-integration.md b/docs/copilot-sdk-integration.md
new file mode 100644
index 00000000..a12aada2
--- /dev/null
+++ b/docs/copilot-sdk-integration.md
@@ -0,0 +1,270 @@
+# Using GitHub Copilot SDK with Foundry Local
+
+## Overview
+
+For **agentic workflows** — tool calling, multi-step planning, and multi-turn conversations — you can use [GitHub Copilot SDK](https://github.com/github/copilot-sdk) with Foundry Local as the on-device inference backend. Copilot SDK provides the agentic orchestration layer while Foundry Local handles local model execution.
+
+This approach requires **no changes** to Foundry Local or its APIs. Copilot SDK connects to Foundry Local's OpenAI-compatible endpoint via its [Bring Your Own Key (BYOK)](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) feature.
+
+## Architecture
+
+```
+Your Application
+    |
+    ├─ foundry-local-sdk ──→ Foundry Local service (model lifecycle)
+    |
+    └─ @github/copilot-sdk (CopilotClient)
+         |
+         ├─ JSON-RPC ──→ Copilot CLI (agent orchestration, tool execution)
+         |
+         └─ BYOK provider: { type: "openai", baseUrl: "http://localhost:5272/v1" }
+              |
+              └─ POST /v1/chat/completions ──→ Foundry Local (on-device inference)
+                   |
+                   └─ Local Model (e.g., phi-4-mini via ONNX Runtime)
+```
+
+**Key components:**
+
+- **Foundry Local SDK** (`foundry-local-sdk`) — Manages the local inference service lifecycle (start, model download/load)
+- **Copilot SDK** (`@github/copilot-sdk`) — Provides `CopilotClient` for agentic orchestration (sessions, tools, streaming, multi-turn)
+- **Copilot CLI** — Background process that the SDK communicates with over JSON-RPC. Handles agent orchestration and tool execution
+- **BYOK** — Routes inference requests from Copilot SDK to Foundry Local's OpenAI-compatible endpoint instead of GitHub Copilot's cloud
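+
+The last hop in this diagram is an ordinary OpenAI-style HTTP request. As a sanity check, you can call the endpoint directly — a minimal sketch, assuming the default endpoint `http://localhost:5272/v1` and the `phi-4-mini` alias used below (on your machine, read the real values from `FoundryLocalManager` as the examples in this guide do):
+
+```typescript
+// Direct call to Foundry Local's OpenAI-compatible endpoint — the same
+// request shape the Copilot SDK BYOK provider sends. The endpoint URL and
+// model id here are assumptions; query them via foundry-local-sdk in real code.
+const response = await fetch("http://localhost:5272/v1/chat/completions", {
+  method: "POST",
+  headers: { "Content-Type": "application/json" },
+  body: JSON.stringify({
+    model: "phi-4-mini",
+    messages: [{ role: "user", content: "Say hello in five words." }],
+  }),
+});
+const data = await response.json();
+console.log(data.choices[0].message.content);
+```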
+
+## Prerequisites
+
+1. **Install Foundry Local**
+   - Windows: `winget install Microsoft.FoundryLocal`
+   - macOS: `brew install microsoft/foundrylocal/foundrylocal`
+
+2. **Install GitHub Copilot CLI** and authenticate
+   - See [Copilot CLI installation guide](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli)
+   - Verify: `copilot --version`
+
+3. **Download a model**
+
+   ```bash
+   foundry model run phi-4-mini
+   ```
+
+## Quick Start: Node.js / TypeScript
+
+```typescript
+import { CopilotClient } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+
+// Bootstrap Foundry Local (starts service + loads model)
+const manager = new FoundryLocalManager();
+const modelInfo = await manager.init("phi-4-mini");
+
+// Create a Copilot SDK client (communicates with Copilot CLI over JSON-RPC)
+const client = new CopilotClient();
+
+// Create a session with BYOK pointing to Foundry Local
+const session = await client.createSession({
+  model: modelInfo.id,
+  provider: {
+    type: "openai",
+    baseUrl: manager.endpoint, // e.g., "http://localhost:5272/v1"
+    apiKey: manager.apiKey,
+    wireApi: "completions", // Foundry Local uses the Chat Completions API
+  },
+  streaming: true,
+});
+
+// Subscribe to streaming response chunks
+session.on("assistant.message_delta", (event) => {
+  process.stdout.write(event.data.deltaContent);
+});
+
+// Send a message and wait for the complete response (timeout in ms, default 60,000)
+await session.sendAndWait({ prompt: "What is the golden ratio?" }, 120_000);
+
+// Clean up
+await session.destroy();
+await client.stop();
+```
+
+Install and run:
+
+```bash
+npm install @github/copilot-sdk foundry-local-sdk tsx
+npx tsx app.ts
+```
+
+## Quick Start: Python
+
+```python
+import asyncio
+import sys
+from copilot import CopilotClient
+from copilot.generated.session_events import SessionEventType
+from foundry_local import FoundryLocalManager
+
+async def main():
+    # Bootstrap Foundry Local
+    manager = FoundryLocalManager("phi-4-mini")
+    model_info = manager.get_model_info("phi-4-mini")
+
+    # Create a Copilot SDK client
+    client = CopilotClient()
+    await client.start()
+
+    # Create a session with BYOK pointing to Foundry Local
+    session = await client.create_session({
+        "model": model_info.id,
+        "provider": {
+            "type": "openai",
+            "base_url": manager.endpoint,
+            "api_key": manager.api_key,
+            "wire_api": "completions",
+        },
+        "streaming": True,
+    })
+
+    # Subscribe to streaming response chunks
+    def on_event(event):
+        if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
+            sys.stdout.write(event.data.delta_content)
+            sys.stdout.flush()
+
+    session.on(on_event)
+
+    await session.send_and_wait({"prompt": "What is the golden ratio?"}, timeout=120_000)
+
+    await session.destroy()
+    await client.stop()
+
+asyncio.run(main())
+```
+
+Install and run:
+
+```bash
+pip install github-copilot-sdk foundry-local-sdk
+python app.py
+```
+
+## Adding Custom Tools
+
+Copilot SDK supports custom tools that the model can invoke during a conversation. This enables agentic workflows where the model can call your code:
+
+### Node.js / TypeScript
+
+```typescript
+import { CopilotClient, defineTool } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+
+const manager = new FoundryLocalManager();
+const modelInfo = await manager.init("phi-4-mini");
+
+// Define a tool the model can call
+const getSystemInfo = defineTool("get_system_info", {
+  description: "Get information about the local AI system",
+  parameters: {
+    type: "object",
+    properties: {
+      query: { type: "string", description: "What to look up: 'model' or 'endpoint'" },
+    },
+    required: ["query"],
+  },
+  handler: async (args: { query: string }) => {
+    if (args.query === "model") {
+      return { modelId: modelInfo.id, runtime: "ONNX Runtime" };
+    }
+    return { url: manager.endpoint, protocol: "OpenAI-compatible" };
+  },
+});
+
+const client = new CopilotClient();
+const session = await client.createSession({
+  model: modelInfo.id,
+  provider: {
+    type: "openai",
+    baseUrl: manager.endpoint,
+    apiKey: manager.apiKey,
+    wireApi: "completions",
+  },
+  streaming: true,
+  tools: [getSystemInfo],
+});
+
+session.on("assistant.message_delta", (event) => {
+  process.stdout.write(event.data.deltaContent);
+});
+
+await session.sendAndWait({
+  prompt: "What model am I running locally? Use the get_system_info tool to find out.",
+});
+
+await session.destroy();
+await client.stop();
+```
+
+### Python
+
+```python
+from copilot import CopilotClient
+from copilot.tools import define_tool
+from pydantic import BaseModel, Field
+
+class SystemInfoParams(BaseModel):
+    query: str = Field(description="What to look up: 'model' or 'endpoint'")
+
+@define_tool(description="Get information about the local AI system")
+async def get_system_info(params: SystemInfoParams) -> dict:
+    if params.query == "model":
+        return {"modelId": model_info.id, "runtime": "ONNX Runtime"}
+    return {"url": manager.endpoint, "protocol": "OpenAI-compatible"}
+
+# Pass tools when creating the session:
+session = await client.create_session({
+    "model": model_info.id,
+    "provider": { ... },  # BYOK config as above
+    "tools": [get_system_info],
+})
+```
+
+## BYOK Provider Configuration Reference
+
+The `provider` object in `createSession()` configures where Copilot SDK sends inference requests:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `type` | `"openai"` | Provider type. Use `"openai"` for Foundry Local (OpenAI-compatible) |
+| `baseUrl` | string | Foundry Local endpoint, e.g., `"http://localhost:5272/v1"` |
+| `apiKey` | string | API key (optional for local endpoints) |
+| `wireApi` | `"completions"` \| `"responses"` | API format. Use `"completions"` for Foundry Local |
+
+For the full BYOK reference including Azure, Anthropic, and other providers, see [Copilot SDK BYOK docs](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md).
+
+## When to Use Which Approach
+
+| Scenario | Recommended Approach |
+|----------|---------------------|
+| Simple chat completions | Foundry Local SDK + OpenAI client ([existing samples](../samples/)) |
+| **Agentic workflows** (tools, planning, multi-turn) | **Copilot SDK + Foundry Local** (this guide) |
+| Model management only (download, load, unload) | Foundry Local SDK directly |
+| Production cloud inference with agentic features | Copilot SDK with cloud providers |
+
+> **Note:** The existing Foundry Local SDKs (Python, JavaScript, C#, Rust) remain fully supported. This guide provides an additional option for developers who need agentic orchestration capabilities.
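+
+For the first row of the table above — plain chat completions with no agentic orchestration — pointing the OpenAI client at Foundry Local is enough. A minimal sketch along the lines of the existing samples (assumes the `openai` npm package is installed):
+
+```typescript
+import OpenAI from "openai";
+import { FoundryLocalManager } from "foundry-local-sdk";
+
+// Plain chat completion against Foundry Local — no Copilot SDK involved.
+const manager = new FoundryLocalManager();
+const modelInfo = await manager.init("phi-4-mini");
+
+const openai = new OpenAI({
+  baseURL: manager.endpoint, // Foundry Local's OpenAI-compatible endpoint
+  apiKey: manager.apiKey,    // not validated by the local endpoint
+});
+
+const completion = await openai.chat.completions.create({
+  model: modelInfo.id,
+  messages: [{ role: "user", content: "What is the golden ratio?" }],
+});
+console.log(completion.choices[0].message.content);
+```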
+
+## Limitations
+
+- **Timeouts**: Local inference is slower than cloud. `sendAndWait()` defaults to 60 s; pass a higher value (e.g. `120_000`) for on-device models, especially on CPU-only hardware. The [working sample](../samples/js/copilot-sdk-foundry-local/) uses a `FOUNDRY_TIMEOUT_MS` environment variable for easy tuning.
+- **Copilot CLI required**: The Copilot SDK requires the [Copilot CLI](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli) to be installed and authenticated. The SDK communicates with it over JSON-RPC.
+- **Tool calling**: Depends on model support. Not all Foundry Local models support function calling. Check model capabilities with `foundry model ls`.
+- **Preview APIs**: Both Foundry Local's REST API and Copilot SDK may have breaking changes during preview.
+- **Model size**: On-device models are smaller than cloud models. Agentic performance (multi-step planning, complex tool use) may vary compared to cloud-hosted models.
+- **Platform**: Foundry Local supports Windows (x64/arm64) and macOS (Apple Silicon).
+
+## Working Sample
+
+See the complete working sample at [`samples/js/copilot-sdk-foundry-local/`](../samples/js/copilot-sdk-foundry-local/), which demonstrates bootstrapping, BYOK configuration, tool calling, streaming, and multi-turn conversation.
+
+## Related Links
+
+- [GitHub Copilot SDK](https://github.com/github/copilot-sdk) — Multi-platform SDK for agentic workflows
+- [Copilot SDK Getting Started](https://github.com/github/copilot-sdk/blob/main/docs/getting-started.md) — Official tutorial
+- [Copilot SDK BYOK Documentation](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) — Full BYOK configuration reference
+- [Foundry Local Samples](../samples/) — Existing samples using Foundry Local SDK + OpenAI client
+- [Foundry Local Documentation (Microsoft Learn)](https://learn.microsoft.com/azure/ai-foundry/foundry-local/)
diff --git a/samples/js/copilot-sdk-foundry-local/README.md b/samples/js/copilot-sdk-foundry-local/README.md
new file mode 100644
index 00000000..e911f7f7
--- /dev/null
+++ b/samples/js/copilot-sdk-foundry-local/README.md
@@ -0,0 +1,130 @@
+# Copilot SDK + Foundry Local Sample
+
+This sample demonstrates using [GitHub Copilot SDK](https://github.com/github/copilot-sdk) with [Foundry Local](https://github.com/microsoft/Foundry-Local) for on-device agentic AI workflows.
+
+## What This Shows
+
+- Bootstrapping Foundry Local with the Foundry Local SDK (service lifecycle + model management)
+- Configuring Copilot SDK's **BYOK (Bring Your Own Key)** to use Foundry Local as the inference backend
+- Creating a Copilot session with a **custom tool** (agentic capability)
+- Streaming responses and multi-turn conversation via the Copilot SDK session API
+
+## Prerequisites
+
+1. **[Foundry Local](https://github.com/microsoft/Foundry-Local#installing)** installed
+2. **[GitHub Copilot CLI](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli)** installed and authenticated
+3. **Node.js 18+**
+
+Verify prerequisites:
+
+```bash
+foundry --version
+copilot --version
+node --version
+```
+
+## Setup and Run
+
+```bash
+cd samples/js/copilot-sdk-foundry-local
+npm install
+```
+
+### Basic example (streaming + multi-turn)
+
+```bash
+npm start
+```
+
+### Tool calling example (calculator, glossary lookup, system info)
+
+```bash
+npm run tools
+```
+
+## Examples
+
+### `app.ts` — Basic (npm start)
+
+Bootstraps Foundry Local, creates a BYOK session, and runs a two-turn streaming conversation.
+
+### `tool-calling.ts` — Tool Calling (npm run tools)
+
+Registers three tools the model can invoke during conversation:
+
+| Tool | What it does |
+|------|-------------|
+| `calculate` | Evaluates math expressions (e.g. `Math.sqrt(144) + 8 * 3`) |
+| `lookup_definition` | Looks up AI/programming terms (BYOK, ONNX, RAG, etc.) |
+| `get_system_info` | Returns OS, architecture, memory, CPU count, and running model |
+
+Runs three turns, each designed to trigger a specific tool. When a tool is called you'll see `[Tool called: ...]` in the output.
+
+## Configuration
+
+### Timeout
+
+Both examples default to **120 seconds** per model turn. On slower hardware (CPU-only, low RAM) you may need more time. Override via the `FOUNDRY_TIMEOUT_MS` environment variable:
+
+```bash
+# 3-minute timeout
+FOUNDRY_TIMEOUT_MS=180000 npm start
+
+# 5-minute timeout for tool-calling (tool round-trips take longer)
+FOUNDRY_TIMEOUT_MS=300000 npm run tools
+```
+
+The Copilot SDK's built-in `sendAndWait()` also accepts an optional `timeout` parameter (default 60,000 ms). The samples use a custom `sendMessage()` helper that wraps `session.send()` with its own timeout to work around a Foundry Local streaming quirk (missing `finish_reason`). The `FOUNDRY_TIMEOUT_MS` env var controls that helper's timeout.
+
+## What Happens
+
+1. **Foundry Local bootstrap** — Starts the local inference service (if not running) and downloads/loads the `phi-4-mini` model
+2. **Copilot SDK client creation** — Creates a `CopilotClient`, which communicates with the Copilot CLI over JSON-RPC
+3. **BYOK session** — Creates a session with `provider: { type: "openai", baseUrl: "<Foundry Local endpoint>" }`, routing all inference through Foundry Local instead of GitHub Copilot's cloud
+4. **Tool calling** — Tools are registered at session creation; the model can invoke them and receive results mid-conversation
+5. **Multi-turn conversation** — Multiple messages in the same session share conversational context
+
+## Architecture
+
+```
+Your App (this sample)
+    |
+    ├─ foundry-local-sdk ──→ Foundry Local service (model lifecycle)
+    |
+    └─ @github/copilot-sdk
+         |
+         ├─ JSON-RPC ──→ Copilot CLI (agent orchestration)
+         |
+         └─ BYOK provider config
+              |
+              └─ POST /v1/chat/completions ──→ Foundry Local (inference)
+                   |
+                   └─ Local Model (phi-4-mini via ONNX Runtime)
+```
+
+## Key Configuration: BYOK Provider
+
+The critical piece is the `provider` config in `createSession()`:
+
+```typescript
+const session = await client.createSession({
+  model: modelInfo.id,
+  provider: {
+    type: "openai",            // Foundry Local exposes an OpenAI-compatible API
+    baseUrl: manager.endpoint, // e.g., "http://localhost:5272/v1"
+    apiKey: manager.apiKey,
+    wireApi: "completions",    // Chat Completions API format
+  },
+  streaming: true,
+  tools: [getSystemInfo],
+});
+```
+
+This tells Copilot SDK to route inference requests to Foundry Local's endpoint instead of GitHub Copilot's cloud service. See the [Copilot SDK BYOK documentation](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) for all provider options.
+
+## Related
+
+- [Copilot SDK Integration Guide](../../../docs/copilot-sdk-integration.md) — Full integration guide with architecture details
+- [Copilot SDK Getting Started](https://github.com/github/copilot-sdk/blob/main/docs/getting-started.md) — Official Copilot SDK tutorial
+- [Copilot SDK BYOK Docs](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) — Full BYOK configuration reference
+- [Foundry Local hello-foundry-local sample](../hello-foundry-local/) — Simpler sample using the OpenAI client directly (no Copilot SDK)
diff --git a/samples/js/copilot-sdk-foundry-local/package.json b/samples/js/copilot-sdk-foundry-local/package.json
new file mode 100644
index 00000000..d01a25a9
--- /dev/null
+++ b/samples/js/copilot-sdk-foundry-local/package.json
@@ -0,0 +1,19 @@
+{
+  "name": "copilot-sdk-foundry-local-sample",
+  "version": "1.0.0",
+  "description": "Sample: Using GitHub Copilot SDK with Foundry Local for agentic workflows",
+  "type": "module",
+  "scripts": {
+    "start": "npx tsx src/app.ts",
+    "tools": "npx tsx src/tool-calling.ts"
+  },
+  "dependencies": {
+    "@github/copilot-sdk": "latest",
+    "foundry-local-sdk": "latest",
+    "zod": "^3.0.0"
+  },
+  "devDependencies": {
+    "tsx": "^4.0.0",
+    "typescript": "^5.0.0"
+  }
+}
diff --git a/samples/js/copilot-sdk-foundry-local/src/app.ts b/samples/js/copilot-sdk-foundry-local/src/app.ts
new file mode 100644
index 00000000..0e3bd2f3
--- /dev/null
+++ b/samples/js/copilot-sdk-foundry-local/src/app.ts
@@ -0,0 +1,106 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+import { CopilotClient, defineTool } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+import { z } from "zod";
+import * as os from "os";
+
+const alias = "phi-4-mini";
+
+// Timeout for each model turn (ms). Override with FOUNDRY_TIMEOUT_MS env var.
+// Local models on CPU can be slow — increase this on less powerful hardware.
+const TIMEOUT_MS = Number(process.env.FOUNDRY_TIMEOUT_MS) || 120_000;
+
+async function sendMessage(
+  session: Awaited<ReturnType<CopilotClient["createSession"]>>,
+  prompt: string,
+  timeoutMs = TIMEOUT_MS,
+) {
+  return new Promise<void>((resolve) => {
+    let settled = false;
+    let turnStarted = false;
+    const finish = () => {
+      if (!settled) {
+        settled = true;
+        unsub();
+        resolve();
+      }
+    };
+
+    const unsub = session.on((event: any) => {
+      if (event.type === "assistant.turn_start") turnStarted = true;
+      if (turnStarted && event.type === "session.idle") finish();
+      if (turnStarted && event.type === "session.error") finish();
+    });
+
+    session.send({ prompt }).catch(() => finish());
+    setTimeout(finish, timeoutMs);
+  });
+}
+
+async function main() {
+  console.log("Initializing Foundry Local...");
+  const manager = new FoundryLocalManager();
+  const modelInfo = await manager.init(alias);
+  console.log(`Model: ${modelInfo.id}`);
+  console.log(`Endpoint: ${manager.endpoint}\n`);
+
+  const client = new CopilotClient();
+
+  const getSystemInfo = defineTool("get_system_info", {
+    description:
+      "Get information about the current system including OS, architecture, memory, and CPU count",
+    parameters: z.object({}),
+    handler: async () => ({
+      platform: os.platform(),
+      arch: os.arch(),
+      cpus: os.cpus().length,
+      totalMemory: `${Math.round(os.totalmem() / (1024 ** 3))} GB`,
+      freeMemory: `${Math.round(os.freemem() / (1024 ** 3))} GB`,
+      nodeVersion: process.version,
+      model: modelInfo.id,
+      endpoint: manager.endpoint,
+    }),
+  });
+
+  const session = await client.createSession({
+    model: modelInfo.id,
+    provider: {
+      type: "openai",
+      baseUrl: manager.endpoint,
+      apiKey: manager.apiKey,
+      wireApi: "completions",
+    },
+    streaming: true,
+    tools: [getSystemInfo],
+    systemMessage: {
+      content:
+        "You are a helpful AI assistant running locally via Foundry Local. " +
+        "You have access to tools — use them when the user asks for system or runtime information.",
+    },
+  });
+
+  session.on("assistant.message_delta", (event) => {
+    process.stdout.write(event.data.deltaContent);
+  });
+  session.on("tool.execution_start", (event) => {
+    console.log(`\n  [Tool called: ${(event as any).data?.toolName ?? "unknown"}]`);
+  });
+
+  console.log("--- Turn 1: Ask about the local AI setup ---\n");
+  process.stdout.write("Assistant: ");
+  await sendMessage(session, "What AI model am I running locally and what are its capabilities?");
+  console.log("\n");
+
+  console.log("--- Turn 2: Follow-up conversation ---\n");
+  process.stdout.write("Assistant: ");
+  await sendMessage(session, "What is the golden ratio? Explain in one paragraph.");
+  console.log("\n");
+
+  await session.destroy();
+  await client.stop();
+  console.log("Done!");
+}
+
+main().catch(console.error);
diff --git a/samples/js/copilot-sdk-foundry-local/src/tool-calling.ts b/samples/js/copilot-sdk-foundry-local/src/tool-calling.ts
new file mode 100644
index 00000000..d1261f50
--- /dev/null
+++ b/samples/js/copilot-sdk-foundry-local/src/tool-calling.ts
@@ -0,0 +1,212 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+/**
+ * Tool Calling Example — Copilot SDK + Foundry Local
+ *
+ * Demonstrates multiple custom tools that the model can invoke:
+ * - calculate: Evaluate math expressions
+ * - get_system_info: Return local system details
+ * - lookup_definition: Look up programming term definitions
+ *
+ * Run: npm run tools
+ */
+
+import { CopilotClient, defineTool } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+import { z } from "zod";
+import * as os from "os";
+
+const alias = "phi-4-mini";
+
+// Timeout for each model turn (ms). Override with FOUNDRY_TIMEOUT_MS env var.
+// Local models on CPU can be slow — increase this on less powerful hardware.
+const TIMEOUT_MS = Number(process.env.FOUNDRY_TIMEOUT_MS) || 120_000;
+
+// ---------------------------------------------------------------------------
+// Helper: send a message and wait for the assistant's full reply.
+// Foundry Local streaming sometimes omits finish_reason, which causes a
+// session.error that can break sendAndWait(). This helper gates on
+// assistant.turn_start so stale events from previous turns are ignored.
+// ---------------------------------------------------------------------------
+async function sendMessage(
+  session: Awaited<ReturnType<CopilotClient["createSession"]>>,
+  prompt: string,
+  timeoutMs = TIMEOUT_MS,
+) {
+  return new Promise<void>((resolve) => {
+    let settled = false;
+    let turnStarted = false;
+    const finish = () => {
+      if (!settled) {
+        settled = true;
+        unsub();
+        resolve();
+      }
+    };
+
+    const unsub = session.on((event: any) => {
+      if (event.type === "assistant.turn_start") turnStarted = true;
+      if (turnStarted && event.type === "session.idle") finish();
+      if (turnStarted && event.type === "session.error") finish();
+    });
+
+    session.send({ prompt }).catch(() => finish());
+    setTimeout(finish, timeoutMs);
+  });
+}
+
+// ---------------------------------------------------------------------------
+// Tool definitions
+// ---------------------------------------------------------------------------
+
+function defineCalculateTool() {
+  return defineTool("calculate", {
+    description:
+      "Evaluate a math expression and return the numeric result. " +
+      "Supports +, -, *, /, parentheses, and Math.* functions like Math.sqrt, Math.pow.",
+    parameters: z.object({
+      expression: z.string().describe('Math expression to evaluate, e.g. "2 + 2" or "Math.sqrt(144)"'),
+    }),
+    handler: async (args) => {
+      try {
+        // Only allow safe math characters and Math.* calls. The Math.* branch
+        // must come first in the alternation; otherwise the character class
+        // consumes the letters of "Math" one at a time and strips the call.
+        const sanitized = args.expression.replace(/Math\.\w+|[^0-9+\-*/().,%\s]/g, (m) =>
+          m.startsWith("Math.") ? m : "",
+        );
m : "", + ); + const result = new Function(`"use strict"; return (${sanitized})`)(); + console.log(`\n → calculate("${args.expression}") = ${result}`); + return { expression: args.expression, result: Number(result) }; + } catch { + return { expression: args.expression, error: "Could not evaluate expression" }; + } + }, + }); +} + +function defineLookupTool() { + const glossary: Record = { + "byok": "Bring Your Own Key — a pattern where you supply your own API credentials to route requests to a custom endpoint instead of the default provider.", + "onnx": "Open Neural Network Exchange — an open format for representing machine learning models, enabling interoperability between frameworks.", + "rag": "Retrieval-Augmented Generation — a technique that combines a retrieval system with a generative model so responses are grounded in external documents.", + "json-rpc": "JSON Remote Procedure Call — a lightweight protocol for calling methods on a remote server using JSON-encoded messages.", + "streaming": "A technique where the server sends response tokens incrementally as they are generated, rather than waiting for the full response.", + }; + + return defineTool("lookup_definition", { + description: + "Look up the definition of a programming or AI term. " + + "Available terms: " + Object.keys(glossary).join(", "), + parameters: z.object({ + term: z.string().describe("The term to look up (case-insensitive)"), + }), + handler: async (args) => { + const key = args.term.toLowerCase().trim(); + const definition = glossary[key]; + console.log(`\n → lookup_definition("${args.term}") → ${definition ? "found" : "not found"}`); + if (definition) { + return { term: args.term, definition }; + } + return { term: args.term, error: `Term not found. Available: ${Object.keys(glossary).join(", ")}` }; + }, + }); +} + +function defineSystemInfoTool(modelId: string, endpoint: string) { + return defineTool("get_system_info", { + description: "Get information about the local system: OS, architecture, memory, CPU count, and the running model.", + parameters: z.object({}), + handler: async () => { + const info = { + platform: os.platform(), + arch: os.arch(), + cpus: os.cpus().length, + totalMemory: `${Math.round(os.totalmem() / 1024 ** 3)} GB`, + freeMemory: `${Math.round(os.freemem() / 1024 ** 3)} GB`, + nodeVersion: process.version, + model: modelId, + endpoint, + }; + console.log(`\n → get_system_info() → ${JSON.stringify(info)}`); + return info; + }, + }); +} + +// --------------------------------------------------------------------------- +// Main +// --------------------------------------------------------------------------- + +async function main() { + console.log("Initializing Foundry Local..."); + const manager = new FoundryLocalManager(); + const modelInfo = await manager.init(alias); + console.log(`Model: ${modelInfo.id}`); + console.log(`Endpoint: ${manager.endpoint}\n`); + + const calculate = defineCalculateTool(); + const lookupDefinition = defineLookupTool(); + const getSystemInfo = defineSystemInfoTool(modelInfo.id, manager.endpoint); + + const client = new CopilotClient(); + + const session = await client.createSession({ + model: modelInfo.id, + provider: { + type: "openai", + baseUrl: manager.endpoint, + apiKey: manager.apiKey, + wireApi: "completions", + }, + streaming: true, + tools: [calculate, lookupDefinition, getSystemInfo], + systemMessage: { + content: + "You are a helpful AI assistant running locally via Foundry Local. " + + "You have access to tools. 
+        "You have access to tools. ALWAYS use the appropriate tool when the user asks you to " +
+        "calculate something, look up a term, or get system information. " +
+        "Do not guess — call the tool and report its result.",
+    },
+  });
+
+  // Stream assistant text to stdout
+  session.on("assistant.message_delta", (event) => {
+    process.stdout.write(event.data.deltaContent);
+  });
+  session.on("tool.execution_start", (event) => {
+    console.log(`\n  [Tool called: ${(event as any).data?.toolName ?? "unknown"}]`);
+  });
+
+  // --- Turn 1: Calculator tool ---
+  console.log("=== Turn 1: Calculator ===\n");
+  process.stdout.write("User: What is the square root of 144 plus 8 times 3?\n\nAssistant: ");
+  await sendMessage(
+    session,
+    "Use the calculate tool to compute: Math.sqrt(144) + 8 * 3",
+  );
+  console.log("\n");
+
+  // --- Turn 2: Glossary lookup tool ---
+  console.log("=== Turn 2: Glossary Lookup ===\n");
+  process.stdout.write("User: What does BYOK mean? And what about RAG?\n\nAssistant: ");
+  await sendMessage(
+    session,
+    "Use the lookup_definition tool to look up 'byok' and 'rag', then explain both.",
+  );
+  console.log("\n");
+
+  // --- Turn 3: System info tool ---
+  console.log("=== Turn 3: System Info ===\n");
+  process.stdout.write("User: What system am I running on?\n\nAssistant: ");
+  await sendMessage(
+    session,
+    "Use the get_system_info tool to check what system this is running on, then summarize.",
+  );
+  console.log("\n");
+
+  await session.destroy();
+  await client.stop();
+  console.log("Done!");
+}
+
+main().catch(console.error);