A FastAPI-based project for fetching YouTube video transcripts and processing them with OpenAI's GPT models for summarization and analysis.
- Fetch YouTube video transcripts using the
youtube-transcript-api. - Summarize and analyze transcripts using OpenAI's GPT models.
- Expose APIs for transcript retrieval and optional processing.
- Python 3.12+
- pip (Python package manager)
git clone https://github.com/your-repo/youtube-transcript-api.git
cd youtube-transcript-apipython -m venv venv
source venv/bin/activate # On Linux/Mac
.\venv\Scripts\activate # On Windowspip install -r requirements.txtCreate a .env file in the root directory:
OPENAI_API_KEY=your_openai_api_key_here
Replace your_openai_api_key_here with your actual OpenAI API key.
Ensure you have Docker and Docker Compose installed on your machine. Then, run the following command to set up the necessary infrastructure:
docker-compose up -dThis will start the required services defined in your docker-compose.yml file.
Run the following command to apply database migrations using yoyo-migrations:
yoyo applyReplace username, password, and youtube_transcripts with your actual database credentials.
Run the FastAPI application using Uvicorn:
uvicorn app.__init__:app --reload- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
-
Fetch Transcript
GET /transcript/{video_id}- Parameters:
video_id(str): The YouTube video ID.
- Response:
{ "video_id": "example_id", "transcript": "Transcript content..." }
- Parameters:
-
Fetch and Summarize Transcript
GET /transcript/{video_id}?summarize=true- Query Parameter:
summarize(bool): Set totrueto include a summary.
- Response:
{ "video_id": "example_id", "transcript": "Transcript content...", "summary": "Summarized content..." }
- Query Parameter:
youtube-transcript-api/
│
├── app/
│ ├── __init__.py # FastAPI app entry point
│ ├── api/ # API routes
│ │ ├── endpoints/
│ │ │ ├── transcript.py # Transcript-related endpoints
│ ├── core/ # Core configurations
│ │ ├── config.py # Config file for environment variables
│ ├── services/ # Business logic
│ │ ├── openai_service.py # Integration with OpenAI API
│ │ ├── transcript_service.py # Logic for transcript fetching
│ ├── utils/ # Utility functions
│
├── tests/ # Tests for your application
├── .env # Environment variables
├── requirements.txt # Dependencies
├── README.md # Project documentation
pytestContributions are welcome! Feel free to open issues or submit pull requests.
This project is licensed under the MIT License.
Josep Servat
For any queries, contact at DM on x @servatj or linkedin https://www.linkedin.com/in/servatj/.