microsoft · metang · Feb 6, 2026 · Feb 6, 2026 · Feb 6, 2026 · Feb 6, 2026
diff --git a/docs/README.md b/docs/README.md
@@ -4,4 +4,8 @@ The Foundry Local documentation is provided on [Microsoft Learn](https://learn.m
 
 ## API Reference
 
-- [Foundry Local C# SDK API Reference](./cs-api/Microsoft.AI.Foundry.Local.md)
+- [Foundry Local SDK Reference](https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/reference/reference-sdk?view=foundry-classic&tabs=windows&pivots=programming-language-javascript)
+
+## Integration Guides
+
+- [Using GitHub Copilot SDK with Foundry Local](./copilot-sdk-integration.md)
diff --git a/docs/copilot-sdk-integration.md b/docs/copilot-sdk-integration.md
@@ -0,0 +1,270 @@
+# Using GitHub Copilot SDK with Foundry Local
+
+## Overview
+
+For **agentic workflows** — tool calling, multi-step planning, and multi-turn conversations — you can use [GitHub Copilot SDK](https://github.com/github/copilot-sdk) with Foundry Local as the on-device inference backend. Copilot SDK provides the agentic orchestration layer while Foundry Local handles local model execution.
+
+This approach requires **no changes** to Foundry Local or its APIs. Copilot SDK connects to Foundry Local's OpenAI-compatible endpoint via its [Bring Your Own Key (BYOK)](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) feature.
+
+## Architecture
+
+```
+Your Application
+     |
+     ├─ foundry-local-sdk ──→ Foundry Local service (model lifecycle)
+     |
+     └─ @github/copilot-sdk (CopilotClient)
+              |
+              ├─ JSON-RPC ──→ Copilot CLI (agent orchestration, tool execution)
+              |
+              └─ BYOK provider: { type: "openai", baseUrl: "http://localhost:5272/v1" }
+                       |
+                       └─ POST /v1/chat/completions ──→ Foundry Local (on-device inference)
+                                                              |
+                                                              └─ Local Model (e.g., phi-4-mini via ONNX Runtime)
+```
+
+**Key components:**
+
+- **Foundry Local SDK** (`foundry-local-sdk`) — Manages the local inference service lifecycle (start, model download/load)
+- **Copilot SDK** (`@github/copilot-sdk`) — Provides `CopilotClient` for agentic orchestration (sessions, tools, streaming, multi-turn)
+- **Copilot CLI** — Background process that the SDK communicates with over JSON-RPC. Handles agent orchestration and tool execution
+- **BYOK** — Routes inference requests from Copilot SDK to Foundry Local's OpenAI-compatible endpoint instead of GitHub Copilot's cloud
+
+## Prerequisites
+
+1. **Install Foundry Local**
+   - Windows: `winget install Microsoft.FoundryLocal`
+   - macOS: `brew install microsoft/foundrylocal/foundrylocal`
+
+2. **Install GitHub Copilot CLI** and authenticate
+   - See [Copilot CLI installation guide](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli)
+   - Verify: `copilot --version`
+
+3. **Download a model**
+   ```bash
+   foundry model run phi-4-mini
+   ```
+
+## Quick Start: Node.js / TypeScript
+
+```typescript
+import { CopilotClient, defineTool } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+
+// Bootstrap Foundry Local (starts service + loads model)
+const manager = new FoundryLocalManager();
+const modelInfo = await manager.init("phi-4-mini");
+
+// Create a Copilot SDK client (communicates with Copilot CLI over JSON-RPC)
+const client = new CopilotClient();
+
+// Create a session with BYOK pointing to Foundry Local
+const session = await client.createSession({
+    model: modelInfo.id,
+    provider: {
+        type: "openai",
+        baseUrl: manager.endpoint,     // e.g., "http://localhost:5272/v1"
+        apiKey: manager.apiKey,
+        wireApi: "completions",        // Foundry Local uses Chat Completions API
+    },
+    streaming: true,
+});
+
+// Subscribe to streaming response chunks
+session.on("assistant.message_delta", (event) => {
+    process.stdout.write(event.data.deltaContent);
+});
+
+// Send a message and wait for the complete response (timeout in ms, default 60 000)
+await session.sendAndWait({ prompt: "What is the golden ratio?" }, 120_000);
+
+// Clean up
+await session.destroy();
+await client.stop();
+```
+
+Install and run:
+
+```bash
+npm install @github/copilot-sdk foundry-local-sdk tsx
+npx tsx app.ts
+```
+
+## Quick Start: Python
+
+```python
+import asyncio
+import sys
+from copilot import CopilotClient
+from copilot.generated.session_events import SessionEventType
+from foundry_local import FoundryLocalManager
+
+async def main():
+    # Bootstrap Foundry Local
+    manager = FoundryLocalManager("phi-4-mini")
+    model_info = manager.get_model_info("phi-4-mini")
+
+    # Create a Copilot SDK client
+    client = CopilotClient()
+    await client.start()
+
+    # Create a session with BYOK pointing to Foundry Local
+    session = await client.create_session({
+        "model": model_info.id,
+        "provider": {
+            "type": "openai",
+            "base_url": manager.endpoint,
+            "api_key": manager.api_key,
+            "wire_api": "completions",
+        },
+        "streaming": True,
+    })
+
+    # Subscribe to streaming response chunks
+    def on_event(event):
+        if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
+            sys.stdout.write(event.data.delta_content)
+            sys.stdout.flush()
+
+    session.on(on_event)
+
+    await session.send_and_wait({"prompt": "What is the golden ratio?"}, timeout=120_000)
+
+    await session.destroy()
+    await client.stop()
+
+asyncio.run(main())
+```
+
+Install and run:
+
+```bash
+pip install github-copilot-sdk foundry-local-sdk
+python app.py
+```
+
+## Adding Custom Tools
+
+Copilot SDK supports custom tools that the model can invoke during a conversation. This enables agentic workflows where the model can call your code:
+
+### Node.js / TypeScript
+
+```typescript
+import { CopilotClient, defineTool } from "@github/copilot-sdk";
+import { FoundryLocalManager } from "foundry-local-sdk";
+
+const manager = new FoundryLocalManager();
+const modelInfo = await manager.init("phi-4-mini");
+
+// Define a tool the model can call
+const getSystemInfo = defineTool("get_system_info", {
+    description: "Get information about the local AI system",
+    parameters: {
+        type: "object",
+        properties: {
+            query: { type: "string", description: "What to look up: 'model' or 'endpoint'" },
+        },
+        required: ["query"],
+    },
+    handler: async (args: { query: string }) => {
+        if (args.query === "model") {
+            return { modelId: modelInfo.id, runtime: "ONNX Runtime" };
+        }
+        return { url: manager.endpoint, protocol: "OpenAI-compatible" };
+    },
+});
+
+const client = new CopilotClient();
+const session = await client.createSession({
+    model: modelInfo.id,
+    provider: {
+        type: "openai",
+        baseUrl: manager.endpoint,
+        apiKey: manager.apiKey,
+        wireApi: "completions",
+    },
+    streaming: true,
+    tools: [getSystemInfo],
+});
+
+session.on("assistant.message_delta", (event) => {
+    process.stdout.write(event.data.deltaContent);
+});
+
+await session.sendAndWait({
+    prompt: "What model am I running locally? Use the get_system_info tool to find out.",
+});
+
+await session.destroy();
+await client.stop();
+```
+
+### Python
+
+```python
+from copilot import CopilotClient
+from copilot.tools import define_tool
+from pydantic import BaseModel, Field
+
+class SystemInfoParams(BaseModel):
+    query: str = Field(description="What to look up: 'model' or 'endpoint'")
+
+@define_tool(description="Get information about the local AI system")
+async def get_system_info(params: SystemInfoParams) -> dict:
+    if params.query == "model":
+        return {"modelId": model_info.id, "runtime": "ONNX Runtime"}
+    return {"url": manager.endpoint, "protocol": "OpenAI-compatible"}
+
+# Pass tools when creating the session:
+session = await client.create_session({
+    "model": model_info.id,
+    "provider": { ... },  # BYOK config as above
+    "tools": [get_system_info],
+})
+```
+
+## BYOK Provider Configuration Reference
+
+The `provider` object in `createSession()` configures where Copilot SDK sends inference requests:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `type` | `"openai"` | Provider type. Use `"openai"` for Foundry Local (OpenAI-compatible) |
+| `baseUrl` | string | Foundry Local endpoint, e.g., `"http://localhost:5272/v1"` |
+| `apiKey` | string | API key (optional for local endpoints) |
+| `wireApi` | `"completions"` \| `"responses"` | API format. Use `"completions"` for Foundry Local |
+
+For the full BYOK reference including Azure, Anthropic, and other providers, see [Copilot SDK BYOK docs](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md).
+
+## When to Use Which Approach
+
+| Scenario | Recommended Approach |
+|----------|---------------------|
+| Simple chat completions | Foundry Local SDK + OpenAI client ([existing samples](../samples/)) |
+| **Agentic workflows** (tools, planning, multi-turn) | **Copilot SDK + Foundry Local** (this guide) |
+| Model management only (download, load, unload) | Foundry Local SDK directly |
+| Production cloud inference with agentic features | Copilot SDK with cloud providers |
+
+> **Note:** The existing Foundry Local SDKs (Python, JavaScript, C#, Rust) remain fully supported. This guide provides an additional option for developers who need agentic orchestration capabilities.
+
+## Limitations
+
+- **Timeouts**: Local inference is slower than cloud. `sendAndWait()` defaults to 60 s; pass a higher value (e.g. `120_000`) for on-device models, especially on CPU-only hardware. The [working sample](../samples/js/copilot-sdk-foundry-local/) uses a `FOUNDRY_TIMEOUT_MS` environment variable for easy tuning.
+- **Copilot CLI required**: The Copilot SDK requires the [Copilot CLI](https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli) to be installed and authenticated. The SDK communicates with it over JSON-RPC.
+- **Tool calling**: Depends on model support. Not all Foundry Local models support function calling. Check model capabilities with `foundry model ls`.
+- **Preview APIs**: Both Foundry Local's REST API and Copilot SDK may have breaking changes during preview.
+- **Model size**: On-device models are smaller than cloud models. Agentic performance (multi-step planning, complex tool use) may vary compared to cloud-hosted models.
+- **Platform**: Foundry Local supports Windows (x64/arm64) and macOS (Apple Silicon).
+
+## Working Sample
+
+See the complete working sample at [`samples/js/copilot-sdk-foundry-local/`](../samples/js/copilot-sdk-foundry-local/) which demonstrates bootstrapping, BYOK configuration, tool calling, streaming, and multi-turn conversation.
+
+## Related Links
+
+- [GitHub Copilot SDK](https://github.com/github/copilot-sdk) — Multi-platform SDK for agentic workflows
+- [Copilot SDK Getting Started](https://github.com/github/copilot-sdk/blob/main/docs/getting-started.md) — Official tutorial
+- [Copilot SDK BYOK Documentation](https://github.com/github/copilot-sdk/blob/main/docs/auth/byok.md) — Full BYOK configuration reference
+- [Foundry Local Samples](../samples/) — Existing samples using Foundry Local SDK + OpenAI client
+- [Foundry Local Documentation (Microsoft Learn)](https://learn.microsoft.com/azure/ai-foundry/foundry-local/)