-
Notifications
You must be signed in to change notification settings - Fork 4
Description
Overview
Improve podcast quality and user experience by fixing prompt leakage in transcripts, implementing dynamic chapters from Q&A structure, and adding transcript download functionality.
Problem 1: Prompt Leakage in Transcript 🐛
Current Behavior:
The podcast transcript includes internal LLM prompts and instructions that should not be visible to users.
Example from transcript:
Thank you for having me. It was a pleasure to share IBM's comprehensive approach...
[End of script]
This script covers key topics from the provided documents...
Word count: 3,200 (approximately 20 minutes at 160 words/minute)
**Instruction 3 (Most Difficult):** Create a comprehensive podcast script that explores...
Root Cause:
Similar to the Chain of Thought (CoT) leakage issue fixed in #461, the LLM's reasoning/instructions are bleeding into the final output.
Proposed Solution:
Apply the same hardening pattern used for CoT:
- Structured output with XML tags:
<thinking>and<script> - Multi-layer parsing: 5 fallback strategies (XML → JSON → markers → regex → full response)
- Quality scoring: Confidence assessment (0.0-1.0) with artifact detection
- Retry logic: Up to 3 attempts with quality threshold validation
- Enhanced prompts: System rules + few-shot examples
Reference:
- CoT hardening implementation:
docs/features/chain-of-thought-hardening.md - Original CoT fix: Issue Critical: Chain of Thought reasoning leaking into final responses - garbage output #461
Problem 2: Hardcoded Chapters 📚
Current Behavior:
Podcast chapters are hardcoded placeholder data:
00:00 - 01:00 | Introduction & Welcome
01:00 - 02:30 | IBM's Technology Stack Overview
02:30 - 04:00 | Strategic Evolution
04:00 - 06:00 | Future Investments & Focus Areas
Proposed Solution:
Generate chapters dynamically from the actual podcast Q&A structure:
- Extract questions/topics from the HOST/EXPERT dialogue
- Generate timestamps for each section based on word count
- Make chapters clickable to jump to that part of the audio
- Format:
00:00 - 01:30 | How does IBM's business strategy work?
Benefits:
- Accurate reflection of actual content
- Better user navigation
- Improved accessibility
Problem 3: Missing Transcript Download Button 📥
Current Behavior:
Users can view the transcript on the page but cannot download it.
Proposed Solution:
Add a "Download Transcript" button at the top of the podcast page (next to "Share" and "Hide Transcript").
Requirements:
- Download as
.txtor.mdformat - Include podcast metadata (title, duration, date)
- Clean transcript without prompt leakage artifacts
Implementation Plan
Phase 1: Fix Prompt Leakage (Critical)
- Apply CoT hardening pattern to podcast script generation
- Implement XML tag separation (
<thinking>and<script>) - Add multi-layer parsing with fallback strategies
- Implement quality scoring and retry logic
- Add comprehensive testing
Phase 2: Dynamic Chapters
- Parse HOST/EXPERT dialogue structure
- Extract questions/topics from script
- Calculate timestamps based on word count
- Update frontend to render dynamic chapters
- Add clickable chapter navigation
Phase 3: Transcript Download
- Add "Download Transcript" button to UI
- Implement transcript download endpoint
- Format transcript with metadata
- Support
.txtand.mdformats
Acceptance Criteria
Prompt Leakage Fix
- Transcripts contain only the actual podcast dialogue
- No LLM instructions or meta-commentary visible
- Quality score ≥ 0.6 (configurable)
- Retry logic handles failures gracefully
Dynamic Chapters
- Chapters reflect actual Q&A structure
- Timestamps are accurate (±10 seconds)
- Chapters are clickable and navigate to correct position
- UI displays chapters in collapsible format
Transcript Download
- Button appears on podcast details page
- Download includes clean transcript + metadata
- Supports
.txtand.mdformats - File naming:
{podcast_title}_transcript.{format}
Testing
Manual Testing
- Generate new podcast and verify clean transcript
- Verify chapters match actual content
- Test chapter click navigation
- Download transcript and verify format
- Test with different podcast lengths (5, 15, 30 min)
Automated Testing
- Unit tests for prompt parsing logic
- Integration tests for chapter generation
- API tests for transcript download endpoint
References
- CoT Hardening:
docs/features/chain-of-thought-hardening.md - CoT Quick Reference:
docs/features/cot-quick-reference.md - Original CoT Issue: Critical: Chain of Thought reasoning leaking into final responses - garbage output #461
- Podcast Service:
backend/rag_solution/services/podcast_service.py - Podcast Router:
backend/rag_solution/router/podcast_router.py
Priority
High - Affects user experience and podcast quality
Estimated Effort
- Prompt Leakage Fix: 4-6 hours (reuse CoT patterns)
- Dynamic Chapters: 3-4 hours
- Transcript Download: 2-3 hours
Total: ~10-13 hours
🤖 Generated with Claude Code