Skip to content

Podcast Quality Improvements: Dynamic Chapters, Transcript Download, and Prompt Leakage Fix #602

@manavgup

Description

@manavgup

Overview

Improve podcast quality and user experience by fixing prompt leakage in transcripts, implementing dynamic chapters from Q&A structure, and adding transcript download functionality.


Problem 1: Prompt Leakage in Transcript 🐛

Current Behavior:
The podcast transcript includes internal LLM prompts and instructions that should not be visible to users.

Example from transcript:

Thank you for having me. It was a pleasure to share IBM's comprehensive approach...
[End of script]

This script covers key topics from the provided documents...
Word count: 3,200 (approximately 20 minutes at 160 words/minute)

**Instruction 3 (Most Difficult):** Create a comprehensive podcast script that explores...

Root Cause:
Similar to the Chain of Thought (CoT) leakage issue fixed in #461, the LLM's reasoning/instructions are bleeding into the final output.

Proposed Solution:
Apply the same hardening pattern used for CoT:

  1. Structured output with XML tags: <thinking> and <script>
  2. Multi-layer parsing: 5 fallback strategies (XML → JSON → markers → regex → full response)
  3. Quality scoring: Confidence assessment (0.0-1.0) with artifact detection
  4. Retry logic: Up to 3 attempts with quality threshold validation
  5. Enhanced prompts: System rules + few-shot examples

Reference:


Problem 2: Hardcoded Chapters 📚

Current Behavior:
Podcast chapters are hardcoded placeholder data:

00:00 - 01:00 | Introduction & Welcome
01:00 - 02:30 | IBM's Technology Stack Overview
02:30 - 04:00 | Strategic Evolution
04:00 - 06:00 | Future Investments & Focus Areas

Proposed Solution:
Generate chapters dynamically from the actual podcast Q&A structure:

  1. Extract questions/topics from the HOST/EXPERT dialogue
  2. Generate timestamps for each section based on word count
  3. Make chapters clickable to jump to that part of the audio
  4. Format: 00:00 - 01:30 | How does IBM's business strategy work?

Benefits:

  • Accurate reflection of actual content
  • Better user navigation
  • Improved accessibility

Problem 3: Missing Transcript Download Button 📥

Current Behavior:
Users can view the transcript on the page but cannot download it.

Proposed Solution:
Add a "Download Transcript" button at the top of the podcast page (next to "Share" and "Hide Transcript").

Requirements:

  • Download as .txt or .md format
  • Include podcast metadata (title, duration, date)
  • Clean transcript without prompt leakage artifacts

Implementation Plan

Phase 1: Fix Prompt Leakage (Critical)

  • Apply CoT hardening pattern to podcast script generation
  • Implement XML tag separation (<thinking> and <script>)
  • Add multi-layer parsing with fallback strategies
  • Implement quality scoring and retry logic
  • Add comprehensive testing

Phase 2: Dynamic Chapters

  • Parse HOST/EXPERT dialogue structure
  • Extract questions/topics from script
  • Calculate timestamps based on word count
  • Update frontend to render dynamic chapters
  • Add clickable chapter navigation

Phase 3: Transcript Download

  • Add "Download Transcript" button to UI
  • Implement transcript download endpoint
  • Format transcript with metadata
  • Support .txt and .md formats

Acceptance Criteria

Prompt Leakage Fix

  • Transcripts contain only the actual podcast dialogue
  • No LLM instructions or meta-commentary visible
  • Quality score ≥ 0.6 (configurable)
  • Retry logic handles failures gracefully

Dynamic Chapters

  • Chapters reflect actual Q&A structure
  • Timestamps are accurate (±10 seconds)
  • Chapters are clickable and navigate to correct position
  • UI displays chapters in collapsible format

Transcript Download

  • Button appears on podcast details page
  • Download includes clean transcript + metadata
  • Supports .txt and .md formats
  • File naming: {podcast_title}_transcript.{format}

Testing

Manual Testing

  • Generate new podcast and verify clean transcript
  • Verify chapters match actual content
  • Test chapter click navigation
  • Download transcript and verify format
  • Test with different podcast lengths (5, 15, 30 min)

Automated Testing

  • Unit tests for prompt parsing logic
  • Integration tests for chapter generation
  • API tests for transcript download endpoint

References


Priority

High - Affects user experience and podcast quality

Estimated Effort

  • Prompt Leakage Fix: 4-6 hours (reuse CoT patterns)
  • Dynamic Chapters: 3-4 hours
  • Transcript Download: 2-3 hours

Total: ~10-13 hours

🤖 Generated with Claude Code

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions