openai · MattVOLTA · May 4, 2025
diff --git a/.gitignore b/.gitignore
@@ -37,4 +37,7 @@ yarn-error.log*
 # typescript
 *.tsbuildinfo
 next-env.d.ts
-todo.md
+todo.md
+
+# MCP
+mcp.json
diff --git a/Interview_agent_dev_spec.md b/Interview_agent_dev_spec.md
@@ -0,0 +1,138 @@
+# Interview Agent Development Specification
+
+## Overview
+
+The OpenAI Realtime Agents Interview Application is a Next.js web application that uses OpenAI's Realtime API to run voice-based agents that conduct structured interviews. It targets qualitative research and feedback collection for startup support engagements, and each agent follows a predefined conversation flow.
+
+## Core Technologies
+
+- **Frontend**: Next.js, React, TypeScript, Tailwind CSS
+- **Backend**: Next.js API routes
+- **Database**: Supabase (PostgreSQL)
+- **Real-time Communication**: WebSockets, RTCDataChannel
+- **AI**: OpenAI Realtime API
+- **Voice Processing**: WebRTC for audio streaming
+
+## System Architecture
+
+### Client-Server Model
+- **Client**: React application that manages WebRTC connections and user interactions
+- **Server**: Next.js API routes for database operations and OpenAI API interactions
+
+### Database Structure
+- **Tables**:
+  - `interviews`: Stores interview metadata and session information
+  - `questions`: Stores interview questions with ordinal positions
+  - `answers`: Stores participant responses to questions
+  - `companies`: Reference table for organization information
+  - `people`: Reference table for interviewee information
+  - `support_engagements`: Reference table for specific support instances
+
+## Key Features
+
+### 1. Agent Management
+- Pre-configured agent templates with customizable conversation flows
+- Dynamic agent configuration based on interview context
+- Voice customization (using "shimmer" voice)
+- Speech playback optimization (1.25x speed)
+
+### 2. Interview Process
+- Structured conversation states with transition rules
+- Context-aware questioning based on interviewee responses
+- Active listening with follow-up question generation
+- Real-time transcription of conversation
+
+### 3. Realtime Voice Interaction
+- Push-to-talk functionality
+- Real-time voice streaming
+- Voice activity detection (`semantic_vad` with high eagerness)
+- Audio playback controls
+
+### 4. Data Persistence
+- Interview session recording and storage
+- Question and answer tracking
+- Contextual metadata storage
+- Support engagement linking
+
+### 5. User Interface
+- Transcript visualization
+- Event logging and monitoring
+- Session control and management
+- Interview creation and management
+
+## Agent Configuration
+
+The application supports configurable interview agents with:
+
+1. **Personality & Tone**: Professional yet friendly researcher persona
+2. **Core Objectives**: Context-specific interview goals
+3. **Engagement Context**: Dynamic fields for company and support information
+4. **Conversation Flow**: Sequential question progression
+5. **Conversation States**: Structured interview phases with transition rules
+   - Introduction
+   - Context questions
+   - Challenge identification
+   - Impact assessment
+   - Conclusion
+
+## Data Flow
+
+1. **Interview Setup**:
+   - Agent configuration loaded with contextual information
+   - WebRTC connection established with OpenAI Realtime API
+   - Session metadata stored in Supabase
+
+2. **Interview Execution**:
+   - Voice data streamed bidirectionally
+   - Conversation transcribed in real-time
+   - Responses processed by agent logic
+   - Follow-up questions generated contextually
+
+3. **Data Persistence**:
+   - Interview responses stored in database
+   - Metadata updated throughout session
+   - Full transcript preserved
+
+## API Endpoints
+
+### Interview Management
+- `GET /api/interviews`: Retrieve all interviews with associated questions
+- `POST /api/interviews/create`: Create a new interview with questions
+- `GET /api/interviews/connect`: Connect to specific interview data
+
+### Session Management
+- `GET /api/session`: Generate ephemeral keys for OpenAI Realtime API
+
+### Data Access
+- Endpoints for companies, people, and support engagements
+
+## Deployment Considerations
+
+- Environment variables for API keys and database connections
+- WebRTC compatibility considerations
+- Audio processing requirements
+- Database migration scripts for schema updates
+
+## Security
+
+- Ephemeral key management for OpenAI API
+- Server-side data validation
+- Secure database access patterns
+- Client-side security measures
+
+## Future Enhancement Areas
+
+1. **Enhanced Analytics**: Interview data visualization and insights
+2. **Improved Agent Intelligence**: More contextual awareness and natural conversation
+3. **Multi-language Support**: Internationalization for global usage
+4. **Integration Capabilities**: API endpoints for external system connections
+5. **Advanced Question Generation**: Dynamic question creation based on previous responses
+
+## Development Guidelines
+
+1. Follow existing code conventions in the repository
+2. Maintain agent configuration patterns for consistency
+3. Use TypeScript interfaces to type request and response data; interfaces are erased at compile time, so also validate untrusted input at runtime
+4. Implement proper error handling for API endpoints
+5. Test WebRTC functionality across different environments
+6. Document new agent configurations thoroughly 
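Review note on `Interview_agent_dev_spec.md` ("Conversation States"): the spec lists five interview phases "with transition rules" but does not state the rules themselves. A minimal sketch of one possible reading follows, assuming a strictly linear progression. The state identifiers, `STATE_ORDER`, `nextState`, and `walkInterview` are hypothetical names chosen for illustration and are not taken from the repository.

```typescript
// Hypothetical identifiers for the five phases named in the spec.
type InterviewState =
  | "introduction"
  | "context_questions"
  | "challenge_identification"
  | "impact_assessment"
  | "conclusion";

// Assumption: phases run strictly in the order the spec lists them.
const STATE_ORDER: readonly InterviewState[] = [
  "introduction",
  "context_questions",
  "challenge_identification",
  "impact_assessment",
  "conclusion",
];

// Returns the phase that follows `current`, or null when `current` is the last phase.
function nextState(current: InterviewState): InterviewState | null {
  const index = STATE_ORDER.indexOf(current);
  return STATE_ORDER[index + 1] ?? null;
}

// Walks the whole flow from the first phase and returns the phases visited, in order.
function walkInterview(): InterviewState[] {
  const visited: InterviewState[] = [];
  let state: InterviewState | null = "introduction";
  while (state !== null) {
    visited.push(state);
    state = nextState(state);
  }
  return visited;
}
```

If the real agents allow branching or skipping phases, a transition table keyed by state would fit better than a fixed order; the spec as written does not say which applies.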
diff --git a/README.md b/README.md
@@ -17,6 +17,23 @@ You should be able to use this repo to prototype your own multi-agent realtime v
 - Start the server with `npm run dev`
 - Open your browser to [http://localhost:3000](http://localhost:3000) to see the app. It should automatically connect to the `simpleExample` Agent Set.
 
+## Supabase Integration
+
+This demo app integrates with Supabase to retrieve real support engagement data. To enable it:
+
+1. Create a Supabase project at [https://supabase.com](https://supabase.com)
+2. Set up your database with the following tables:
+   - `companies` - Information about companies receiving support
+   - `support_engagements` - Details about support engagements
+
+3. Update your `.env.development.local` file with your Supabase credentials:
+```bash
+NEXT_PUBLIC_SUPABASE_URL=https://your-project-id.supabase.co
+NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key
+```
+
+4. The app will automatically fetch and use real data when you select the "startupInterviewer" agent.
+
 ## Configuring Agents
 Configuration in `src/app/agentConfigs/simpleExample.ts`
 ```javascript

diff --git a/components.json b/components.json
@@ -0,0 +1,21 @@
+{
+  "$schema": "https://ui.shadcn.com/schema.json",
+  "style": "new-york",
+  "rsc": true,
+  "tsx": true,
+  "tailwind": {
+    "config": "tailwind.config.ts",
+    "css": "src/app/globals.css",
+    "baseColor": "neutral",
+    "cssVariables": true,
+    "prefix": ""
+  },
+  "aliases": {
+    "components": "@/components",
+    "utils": "@/lib/utils",
+    "ui": "@/components/ui",
+    "lib": "@/lib",
+    "hooks": "@/hooks"
+  },
+  "iconLibrary": "lucide"
+}
diff --git a/interviewee_ux_spec.md b/interviewee_ux_spec.md
@@ -0,0 +1,73 @@
+# Interviewee UX – Proof of Concept Specification
+
+_Last updated: {{DATE}}_
+
+## 1. Access & Security
+
+| Item | Decision |
+|------|----------|
+| Link format | `https://<domain>/i/{invite_token}` – token in path segment |
+| Auth | Route `/i/*` and `/app?candidate=1` exempt from auth middleware |
+| Token validity | Link works as long as associated interview status is **not** `completed` |
+| Re-use | Multiple openings allowed until marked completed |
+| Expiration | none for POC |
+
+## 2. Entry Flow
+1. Candidate clicks the invite link (`/i/{token}`).
+2. Server resolves `invite_token` → `interview.id`.
+3. If interview status is `completed`: redirect → `/invite-completed` (future page).  
+   If token not found: redirect → `/invite-not-found` (future page).
+4. Otherwise, redirect → `/app?interviewId={id}&candidate=1`.
+
+_No additional onboarding or mic-check screens for POC._
+
+## 3. Candidate UI inside `/app`
+
+| Element | Behaviour / Notes |
+|---------|-------------------|
+| Header | Minimal: "Interview Session" + product logo. No scenario/agent selectors. |
+| Main area | `InterviewExperience` component reused. Shows:<br>• Current question (medium size, centred left column)<br>• Progress text: "Question N of M"<br>• Horizontal progress bar<br>• Audio-wave visualisation canvas below<br>• Status pill (Live/Connecting) |
+| Agent state indicator | `Agent is speaking…` (green pulse) vs `Agent is listening…` (grey) |
+| Transcript & Events panes | **Hidden** in candidate view |
+| Bottom toolbar | Temporarily left visible for dev controls; will be hidden in prod. |
+| Typing fallback | Not implemented for POC |
+
+## 4. Completion & Thank-You
+
+Trigger: Client detects assistant's **final** message + session disconnect → sets `sessionStatus = DISCONNECTED`.
+
+Action:
+* After 500 ms debounce, if `isCandidateView && isInterviewMode && sessionStatus === DISCONNECTED` → `router.push("/i/thank-you")`.
+
+_New in v2 – agent-driven completion_
+
+* When the agent reaches its `wrap_up` conversation state it calls the function `markInterviewCompleted` with `{ "interview_id": <uuid> }`.
+* The front-end handles this function call via `toolLogic`, which sends `POST /api/interviews/complete` to mark the interview as completed in the DB.
+* The subsequent disconnect triggers the existing redirect logic above.
+
+### Thank-You screen (`/i/thank-you`)
+* Large headline: "Thank you for your time!"
+* Sub-text: "You may now close this tab or return to Volta."
+* Button `Return to Volta` → `https://voltaeffect.com` (opens new tab)
+* No other navigation.
+* Refreshing this page keeps the user on thank-you screen (static route).
+
+## 5. Edge Cases / Out-of-Scope
+* Mic permission failures → **not** handled (risk accepted).
+* Session resume after refresh during interview → deferred.
+* One-time / time-limited tokens → deferred.
+* Manual "Finish" button for candidate → deferred.
+* Legal/privacy blurb → delivered verbally by AI, no UI display for POC.
+
+## 6. Implementation Notes
+* `middleware.ts` updated: public routes `/i` & `/app` bypass auth.
+* `/i/[token]/page.tsx` handles token resolution & redirect logic.
+* Candidate mode detected via query param `candidate=1`.
+* UI conditional logic in `App.tsx`:
+  * Hides transcript/events
+  * Hides bottom toolbar once ready for prod
+  * Tracks agent-speaking state via transcript items
+* Thank-You page implemented at `src/app/i/thank-you/page.tsx`.
+
+---
+**Ready for developer hand-off.** 
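Review note on `interviewee_ux_spec.md` (sections 2 and 4): the two routing decisions in the spec can be expressed as pure functions, which keeps them testable apart from the Next.js page and router. The sketch below is an illustration under stated assumptions, not the code in this PR: `InterviewRecord`, `resolveInviteRedirect`, and `shouldRedirectToThankYou` are hypothetical names, the `SessionStatus` values other than `DISCONNECTED` are assumed, and the 500 ms debounce is left to the caller.

```typescript
// Minimal shape of an interview row; only the fields the entry flow reads.
interface InterviewRecord {
  id: string;
  status: string; // the spec only names "completed"; other values are assumed
}

// Assumed set of session states; the spec itself only names DISCONNECTED.
type SessionStatus = "DISCONNECTED" | "CONNECTING" | "CONNECTED";

// Section 2: maps the result of the invite-token lookup to a redirect target.
// `interview` is null when no row matches the token.
function resolveInviteRedirect(interview: InterviewRecord | null): string {
  if (interview === null) return "/invite-not-found";
  if (interview.status === "completed") return "/invite-completed";
  return `/app?interviewId=${encodeURIComponent(interview.id)}&candidate=1`;
}

// Section 4: the condition under which the client navigates to /i/thank-you.
// The caller is expected to evaluate this after the debounce described in the spec.
function shouldRedirectToThankYou(
  isCandidateView: boolean,
  isInterviewMode: boolean,
  sessionStatus: SessionStatus,
): boolean {
  return isCandidateView && isInterviewMode && sessionStatus === "DISCONNECTED";
}
```

Keeping these checks free of side effects means the page component only has to perform the lookup and call `redirect` or `router.push` with the returned value.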
diff --git a/next.config.ts b/next.config.ts
@@ -1,7 +1,15 @@
 import type { NextConfig } from "next";
 
 const nextConfig: NextConfig = {
-  /* config options here */
+  images: {
+    remotePatterns: [
+      {
+        protocol: "https",
+        hostname: "*.supabase.co",
+        pathname: "/storage/v1/object/public/**",
+      },
+    ],
+  },
 };
 
 export default nextConfig;