feat: Add voice preview feature to podcast generation #316
This change introduces a new voice preview feature to the podcast generation modal, allowing users to listen to a sample of each voice before making a selection.

Key changes:
- Replaced the voice selection dropdowns with a new interactive `VoiceSelector` component.
- Added a play/pause button next to each voice option to trigger an audio preview.
- Implemented a new backend endpoint at `/api/podcasts/voice-preview/{voice_id}` to generate and serve the voice previews.
- Added unit and integration tests for the new backend functionality.
- Addressed a potential memory leak in the frontend audio player.
- Improved frontend error handling for voice previews.

Note:
- Frontend verification was skipped due to a Docker permission error that prevented the development server from starting. This is a known issue mentioned in the original PR that this work is based on.
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me.

For security, I will only act on instructions from the user who triggered this task.
🚀 Development Environment Options
This repository supports Dev Containers for a consistent development environment.

Option 1: GitHub Codespaces (Recommended)
Create a cloud-based development environment:

Option 2: VS Code Dev Containers (Local)
Use Dev Containers on your local machine:

Option 3: Traditional Local Setup
Set up the development environment manually:

# Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout feature/voice-preview

# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validate

Available Commands
Once in your development environment:

make help # Show all available commands
make dev-validate # Validate environment setup
make test-atomic # Run atomic tests
make test-unit # Run unit tests
make lint # Run linting

Services Available
When running

This automated message helps reviewers quickly set up the development environment.
Code Review for Voice Preview Feature
Thank you for implementing the voice preview feature! This is a valuable UX enhancement. I've conducted a thorough review and found several issues that need to be addressed before this can be merged.

🐛 Critical Issues
1. Broken Abstract Method in Base Class (
- Fix malformed abstract method in base.py
- Remove HTTPException from voice preview service layer
- Fix memory leak in VoiceSelector component
- Improve error handling in router with proper exception chaining
- Add accessibility attributes to VoiceSelector
- Implement generate_speech_from_text in OllamaAudioProvider
- Add docstrings to test_podcast_api.py
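The second and fourth items above describe a layering rule: the service raises only domain errors, and the router translates them into HTTP errors while keeping the original cause attached. A minimal sketch of that pattern follows. `VoicePreviewError`, `HTTPError`, `generate_preview`, and `voice_preview_route` are hypothetical names for illustration, not taken from the PR, and `HTTPError` stands in for the web framework's own HTTP exception.

```python
class VoicePreviewError(Exception):
    """Hypothetical domain error raised by the service layer."""


class HTTPError(Exception):
    """Stand-in for the web framework's HTTP exception (assumption)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def generate_preview(voice_id: str) -> bytes:
    """Service layer: knows nothing about HTTP status codes."""
    if not voice_id:
        raise VoicePreviewError("voice_id must not be empty")
    return b"audio-bytes-for-" + voice_id.encode()


def voice_preview_route(voice_id: str) -> bytes:
    """Router layer: maps domain errors to HTTP errors, keeping the cause."""
    try:
        return generate_preview(voice_id)
    except VoicePreviewError as exc:
        # "from exc" stores the original error on __cause__ for tracebacks.
        raise HTTPError(status_code=400, detail=str(exc)) from exc
```

Keeping HTTP concerns out of the service means it can be called from tests or background jobs without a web framework in the loop.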
The VoiceId type was defined as a restrictive literal union ('alloy' | 'echo' | ...), but voices are dynamically fetched from the backend API and may include provider-specific voices beyond the OpenAI defaults.

Changes:
- Changed VoiceId from literal union to string type
- Fixes TypeScript compilation error in VoiceSelector.tsx
- Backend validates voice IDs, so client-side validation is redundant
- Allows support for voices from multiple providers (OpenAI, Ollama, etc.)

Fixes compilation error: TS2345: Argument of type 'string' is not assignable to parameter of type 'VoiceId'

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
All other routers in the application use /api prefix (e.g., /api/auth, /api/collections, /api/search), but podcast_router was missing it, causing 404 errors.

Changes:
- Changed router prefix from "/podcasts" to "/api/podcasts"
- Fixes 404 errors when accessing voice preview endpoint
- Maintains consistency with application routing patterns

Before: /podcasts/voice-preview/{voice_id} (404)
After: /api/podcasts/voice-preview/{voice_id} (working)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Two fixes in podcast service:
1. Changed `collection_service.get_by_id()` to `get_collection()`
2. Fixed NotFoundError to use proper constructor arguments

Changes:
- Use get_collection() instead of non-existent get_by_id()
- Removed user_id parameter (not needed for get_collection)
- Fixed NotFoundError to include resource_type and resource_id
- Removed incorrect type: ignore comments

Fixes error: 'CollectionService' object has no attribute 'get_by_id'

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
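To illustrate the shape of this fix, here is a minimal sketch using stand-in classes. Only the names `get_collection`, `NotFoundError`, `resource_type`, and `resource_id` come from the commit message. The constructor signature, the message format, and the dictionary-backed storage are assumptions for illustration.

```python
class NotFoundError(Exception):
    """Stand-in for the project's NotFoundError (signature is an assumption)."""

    def __init__(self, resource_type: str, resource_id: str) -> None:
        super().__init__(f"{resource_type} not found: {resource_id}")
        self.resource_type = resource_type
        self.resource_id = resource_id


class CollectionService:
    """Stand-in service that exposes get_collection() and no get_by_id()."""

    def __init__(self, collections: dict) -> None:
        self._collections = collections

    def get_collection(self, collection_id: str) -> dict:
        """Look up a collection by id only; no user_id argument is taken."""
        try:
            return self._collections[collection_id]
        except KeyError as exc:
            # Structured fields let callers report what was missing.
            raise NotFoundError("Collection", collection_id) from exc
```

Carrying `resource_type` and `resource_id` as attributes, rather than only in the message string, lets an error handler build a consistent response without parsing text.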
The podcast service was calling count_documents() which doesn't exist on CollectionService. Instead, use the files array from CollectionOutput.

Changes:
- Replaced `collection_service.count_documents()` with `len(collection.files)`
- Removed incorrect type: ignore comment

Fixes: 'CollectionService' object has no attribute 'count_documents'

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
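The replacement can be sketched as follows. The `CollectionOutput` dataclass here is a stand-in: only its `files` attribute is confirmed by the commit message, and the `id` field, the element type of `files`, and the `document_count` helper are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionOutput:
    """Stand-in schema; only the `files` attribute comes from the commit."""

    id: str
    files: list = field(default_factory=list)


def document_count(collection: CollectionOutput) -> int:
    """Count documents from the already-loaded files list.

    This avoids a separate service call such as count_documents().
    """
    return len(collection.files)
```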
Removed collection existence and document count validation that was causing sync/async SQLAlchemy session conflicts. Collection validation will happen during RAG search instead.

Root cause: CollectionService uses sync Session but PodcastRepository uses AsyncSession, causing ChunkedIteratorResult errors when mixing calls.

Temporary solution: Skip collection validation in podcast service
Long-term: Migrate all repositories to AsyncSession

Fixes: object ChunkedIteratorResult can't be used in 'await' expression

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Root cause: get_db() returns sync Session but podcast components were typed/implemented for AsyncSession, causing ChunkedIteratorResult errors.

Changes:
- PodcastRepository: Accept Session|AsyncSession, make count_active_for_user sync
- PodcastService: Use Session instead of AsyncSession
- podcast_router: Fix type annotation from AsyncSession to Session
- Removed await from count_active_for_user call

This aligns with how SearchService and CollectionService handle sessions.

Fixes: object ChunkedIteratorResult can't be used in 'await' expression

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
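The failure mode can be reproduced without a database. The sketch below uses stand-in classes (`SyncSession`, `SyncResult`, and the helper functions are hypothetical, not the project's or SQLAlchemy's types) to show why awaiting the return value of a synchronous session call raises a `TypeError`, and how dropping `await` resolves it.

```python
import asyncio


class SyncResult:
    """Stand-in for the plain result object a sync session returns."""

    def scalar(self) -> int:
        return 3


class SyncSession:
    """Stand-in for a synchronous database session."""

    def execute(self, statement: str) -> SyncResult:
        return SyncResult()


async def count_awaiting(session: SyncSession) -> int:
    # Bug: execute() already returned a plain object, which is not awaitable.
    result = await session.execute("SELECT count(*)")
    return result.scalar()


def count_sync(session: SyncSession) -> int:
    # Fix: call the sync session directly, with no await.
    return session.execute("SELECT count(*)").scalar()


def awaiting_fails(session: SyncSession) -> bool:
    """Return True if awaiting the sync result raises TypeError."""
    try:
        asyncio.run(count_awaiting(session))
    except TypeError:
        return True
    return False
```

The exact wording of the `TypeError` depends on the Python version, so the sketch checks only the exception type.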
Removed doc_count variable reference from validation logger since collection validation was removed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added is_async flag to detect session type and handle commit/refresh operations appropriately for both sync and async sessions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
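One way such a flag could work is sketched below, under stated assumptions. The fake session classes and the `Repository` class are hypothetical, and detecting the session type with `inspect.iscoroutinefunction` is a choice made for this sketch. The actual commit may use a different check, such as an `isinstance` test against the async session class.

```python
import asyncio
import inspect


class FakeSyncSession:
    """Hypothetical sync session that records the calls it receives."""

    def __init__(self) -> None:
        self.calls = []

    def commit(self) -> None:
        self.calls.append("commit")

    def refresh(self, obj: object) -> None:
        self.calls.append("refresh")


class FakeAsyncSession:
    """Hypothetical async session with coroutine commit/refresh."""

    def __init__(self) -> None:
        self.calls = []

    async def commit(self) -> None:
        self.calls.append("commit")

    async def refresh(self, obj: object) -> None:
        self.calls.append("refresh")


class Repository:
    """Hypothetical repository that accepts either session type."""

    def __init__(self, session) -> None:
        self.session = session
        # Assumption: detect the session type by inspecting commit().
        self.is_async = inspect.iscoroutinefunction(session.commit)

    async def save(self, obj: object) -> object:
        if self.is_async:
            await self.session.commit()
            await self.session.refresh(obj)
        else:
            self.session.commit()
            self.session.refresh(obj)
        return obj
```

Computing the flag once in the constructor keeps each method to a single branch instead of re-checking the session type on every call.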
Closing as duplicate. Voice preview feature was already successfully implemented and merged in PR #306.
PR created automatically by Jules for task 1449928868930487865