cryptoecc/openai-realtime-agents

Realtime API Agents Demo

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API. In particular, this demonstrates:

  • Sequential agent handoffs according to a defined agent graph (taking inspiration from OpenAI Swarm)
  • Background escalation to more intelligent models like o1-mini for high-stakes decisions
  • Prompting models to follow a state machine, for example to authenticate a user by collecting details such as names and phone numbers and confirming each one character by character
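To make the state-machine idea concrete, here is a minimal sketch of one way to describe conversation states as data and embed them in an agent's instructions. The `ConversationState` shape, the state ids, and the helpers `spellOut`, `findBrokenTransitions`, and `buildInstructions` are all hypothetical names chosen for illustration; they are not taken from this repo.

```typescript
// Hypothetical shape for one step of a prompted conversation flow.
interface ConversationState {
  id: string;
  description: string;
  instructions: string[];
  transitions: { nextStep: string; condition: string }[];
}

// Illustrative two-step authentication flow (assumed, not from the repo).
const authFlow: ConversationState[] = [
  {
    id: "1_get_phone_number",
    description: "Collect the user's phone number.",
    instructions: [
      "Ask for the phone number.",
      "Repeat it back digit by digit and wait for the user to confirm.",
    ],
    transitions: [
      { nextStep: "2_get_name", condition: "Once the phone number is confirmed." },
    ],
  },
  {
    id: "2_get_name",
    description: "Collect the user's name.",
    instructions: [
      "Ask for the name.",
      "Spell it back letter by letter and wait for the user to confirm.",
    ],
    transitions: [{ nextStep: "done", condition: "Once the name is confirmed." }],
  },
];

// Separate characters so a value can be read back one character at a time.
function spellOut(value: string): string {
  return value.split("").join("-");
}

// Report transitions that point at a state id that does not exist.
function findBrokenTransitions(states: ConversationState[]): string[] {
  const known: Record<string, boolean> = { done: true };
  for (const s of states) known[s.id] = true;
  const broken: string[] = [];
  for (const s of states) {
    for (const t of s.transitions) {
      if (!known[t.nextStep]) broken.push(`${s.id} -> ${t.nextStep}`);
    }
  }
  return broken;
}

// Embed the states in a plain-text prompt for the model to follow.
function buildInstructions(states: ConversationState[]): string {
  return [
    "Follow the conversation states below in order. Never skip a step.",
    JSON.stringify(states, null, 2),
  ].join("\n\n");
}
```

Keeping the states as data means the flow can be checked (for example with `findBrokenTransitions`) before it is turned into a prompt, which is easier than proofreading a long free-form instruction string.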

You should be able to use this repo to prototype your own multi-agent realtime voice app in less than 20 minutes!

Screenshot of the Realtime API Agents Demo

Setup

  • This is a Next.js TypeScript app
  • Install dependencies with npm i
  • Add your OPENAI_API_KEY to your env
  • Start the server with npm run dev
  • Open your browser to http://localhost:3000 to see the app. It should automatically connect to the simpleExample Agent Set.

Configuring Agents

Configuration in src/app/agentConfigs/simpleExample.ts

import { AgentConfig } from "@/app/types";
import { injectTransferTools } from "./utils";

// Define agents
const haiku: AgentConfig = {
  name: "haiku",
  publicDescription: "Agent that writes haikus.", // Context for the agent_transfer tool
  instructions:
    "Ask the user for a topic, then reply with a haiku about that topic.",
  tools: [],
};

const greeter: AgentConfig = {
  name: "greeter",
  publicDescription: "Agent that greets the user.",
  instructions:
    "Please greet the user and ask them if they'd like a Haiku. If yes, transfer them to the 'haiku' agent.",
  tools: [],
  downstreamAgents: [haiku],
};

// add the transfer tool to point to downstreamAgents
const agents = injectTransferTools([greeter, haiku]);

export default agents;

This fully specifies the agent set that was used in the interaction shown in the screenshot above.
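The implementation of injectTransferTools is not shown here. As a rough illustration of the idea, the sketch below gives every agent that has downstreamAgents a function tool whose allowed destinations are those agents' names. Everything in it is an assumption made for illustration: the `AgentConfigSketch` and `ToolSketch` types, the tool name `agent_transfer` (borrowed from the comment in the config above), and the `destination_agent` parameter. The real helper in ./utils may differ.

```typescript
// Hypothetical, simplified stand-ins for the repo's real types.
interface ToolSketch {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required: string[];
  };
}

interface AgentConfigSketch {
  name: string;
  publicDescription: string;
  instructions: string;
  tools: ToolSketch[];
  downstreamAgents?: AgentConfigSketch[];
}

// Return copies of the agents; those with downstream agents gain a transfer tool.
function injectTransferToolsSketch(agents: AgentConfigSketch[]): AgentConfigSketch[] {
  return agents.map((agent) => {
    const targets = agent.downstreamAgents ?? [];
    if (targets.length === 0) return agent; // nothing to transfer to

    const transferTool: ToolSketch = {
      type: "function",
      name: "agent_transfer", // assumed tool name
      description:
        "Transfer the user to another agent. Available agents:\n" +
        targets.map((t) => `- ${t.name}: ${t.publicDescription}`).join("\n"),
      parameters: {
        type: "object",
        properties: {
          destination_agent: {
            type: "string",
            enum: targets.map((t) => t.name), // only valid handoff targets
            description: "Name of the agent to hand off to.",
          },
        },
        required: ["destination_agent"],
      },
    };
    return { ...agent, tools: [...agent.tools, transferTool] };
  });
}

// Usage with the same two agents as the config above.
const haikuSketch: AgentConfigSketch = {
  name: "haiku",
  publicDescription: "Agent that writes haikus.",
  instructions: "Ask the user for a topic, then reply with a haiku about that topic.",
  tools: [],
};
const greeterSketch: AgentConfigSketch = {
  name: "greeter",
  publicDescription: "Agent that greets the user.",
  instructions: "Greet the user and offer a haiku.",
  tools: [],
  downstreamAgents: [haikuSketch],
};
const wiredAgents = injectTransferToolsSketch([greeterSketch, haikuSketch]);
```

In this sketch the publicDescription of each downstream agent ends up in the tool description, which is why the config above calls it context for the transfer tool.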

Next steps

  • Check out the configs in src/app/agentConfigs. The example above is a minimal demo that illustrates the core concepts.
  • frontDeskAuthentication: guides the user through a step-by-step authentication flow, confirming each value character by character, authenticates the user with a tool call, and then transfers to another agent. Note that the second agent is intentionally "bored" to show how to prompt for personality and tone.
  • customerServiceRetail: also guides the user through an authentication flow, reads a long offer verbatim from a canned script, and then walks through a complex return flow that requires looking up orders and policies, gathering user context, and checking with o1-mini to ensure the return is eligible. To test this flow, say that you'd like to return your snowboard and go through the necessary prompts!
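The escalation step in customerServiceRetail can be pictured as tool logic that packages the gathered context into a single request for a stronger model and then interprets the reply. The following is a minimal sketch under stated assumptions: `ReturnContext`, `buildEscalationRequest`, `checkReturnEligibility`, and the injected `send` function are hypothetical names, and the request is only assumed to use a chat-style `{ model, messages }` shape. How the repo actually calls o1-mini is not shown here.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface EscalationRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical bundle of what the realtime agent has gathered so far.
interface ReturnContext {
  orderSummary: string;
  policyText: string;
  userReason: string;
}

// Build one self-contained prompt so the background model needs no prior turns.
function buildEscalationRequest(ctx: ReturnContext, model = "o1-mini"): EscalationRequest {
  const prompt = [
    "Decide whether this return is eligible under the policy.",
    "Answer 'eligible' or 'not eligible' on the first line, then give a one-sentence reason.",
    "",
    "Policy:",
    ctx.policyText,
    "",
    "Order:",
    ctx.orderSummary,
    "",
    "Customer's reason:",
    ctx.userReason,
  ].join("\n");
  return { model, messages: [{ role: "user", content: prompt }] };
}

// The transport is injected, so this sketch assumes nothing about the HTTP client.
type SendFn = (req: EscalationRequest) => Promise<string>;

async function checkReturnEligibility(
  ctx: ReturnContext,
  send: SendFn
): Promise<{ eligible: boolean; rationale: string }> {
  const reply = (await send(buildEscalationRequest(ctx))).trim();
  const firstLine = reply.split("\n")[0].toLowerCase();
  // "not eligible" contains "eligible", so rule out the negative form first.
  const eligible =
    firstLine.indexOf("not eligible") === -1 && firstLine.indexOf("eligible") !== -1;
  return { eligible, rationale: reply };
}
```

Injecting `send` keeps the decision logic testable with a stub and leaves the choice of API client to the application.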

Defining your own agents

UI

  • You can select agent scenarios in the Scenario dropdown, and automatically switch to a specific agent with the Agent dropdown.
  • The conversation transcript is on the left, including tool calls, tool call responses, and agent changes. Click to expand non-message elements.
  • The event log is on the right, showing both client and server events. Click to see the full payload.
  • At the bottom, you can disconnect, toggle between automatic voice-activity detection and push-to-talk (PTT), turn off audio playback, and toggle logs.

Core Contributors

About

Fork of the OpenAI Realtime Agents open-source project
