Python script to poll a YouTube playlist, download new videos as audio and make them available as an audio podcast.
Designed to run on AWS Lambda, the code downloads new clips to S3 and updates a podcast.rss file suitable for podcast apps.
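At a high level the Lambda does something like the sketch below. This is illustrative only - the function and option names here are assumptions rather than the repo's actual code, and the real script also skips already-processed videos and regenerates the RSS feed and index:

```python
import os
import boto3
import yt_dlp

# Configuration comes from the Lambda environment variables described below.
BUCKET = os.environ["BUCKET_NAME"]
PREFIX = os.environ.get("CONTENT_PATH", "")
PLAYLIST = os.environ["PLAYLIST_URL"]

def lambda_handler(event, context):
    # Download each playlist entry and extract audio with yt-dlp;
    # ffmpeg comes from the attached Lambda layer.
    opts = {
        "format": "bestaudio/best",
        "outtmpl": "/tmp/%(id)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "m4a"}
        ],
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(PLAYLIST, download=True)

    s3 = boto3.client("s3")
    for entry in info.get("entries", []):
        # Upload the audio next to the feed; the real code would also
        # rewrite podcast.rss to include the new episode.
        s3.upload_file(f"/tmp/{entry['id']}.m4a", BUCKET, f"{PREFIX}{entry['id']}.m4a")
```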
- Create a YouTube playlist to contain the videos you'd like to add to your podcast feed. For example, create a playlist called 'Podcast this later'
- Set the playlist visibility to `Unlisted`
- Go to the playlist and copy its URL, something like https://www.youtube.com/playlist?list=AbCdEFgHiJkLmNoPqRsTuVwXyZ
- Create an AWS S3 bucket
- Optionally, create an obscure folder in the root of the bucket to reduce visibility
- Apply a policy to the bucket to allow public read access and ListBucket for your AWS user while minimising other rights. See `sample_bucket_policy.json`.
- Add a logo file to the bucket/folder, e.g. `logo.png`
- Create a new AWS Lambda function.
- Create a new execution role and then attach a policy to allow access to the S3 bucket (read and write are needed). See `sample_execution_policy.json` as an example and remember to update your bucket name.
- Set the execution timeout to a few minutes.
- Create a Lambda layer for ffmpeg (see `create_ffmpeg_layer.sh`) and attach it to the Lambda function
- Use `build.sh` to package youtube2podcast into a `deployment_package.zip` and upload it as the Lambda code
- Edit the Lambda's environment variables, optionally including a webhook URL for notifications:
```
AWS_REGION="yr-region-1"
BUCKET_NAME="my-podcast-bucket-01"
CONTENT_PATH="some-obscure-folder-name/"
PLAYLIST_URL="https://www.youtube.com/playlist?list=AbCdEFgHiJkLmNoPqRsTuVwXyZ"
WEBHOOK_TARGET="https://discord.com/api/webhooks/1234567890/somelongstring"
```
- Test the code:
- Add a video to your YouTube playlist
- Deploy and hit 'Test' on the Lambda.
- Your S3 bucket should now contain an m4a audio file, an RSS file for the podcast and an index file.
- Schedule the Lambda. I used EventBridge scheduled events to trigger the Lambda every 15 minutes (a boto3 sketch follows this list). When there are no new videos the code only runs for a few seconds, so you could schedule it more frequently, but I don't know if YouTube has any rate limits.
- To be able to access the audio files and RSS file you need to make your S3 bucket public and grant public read access. See `sample_bucket_policy.json` in this repo and the second sketch after this list, but ensure you understand what you're configuring - refer to the AWS documentation as required.
- Go to your podcast app (I recommend Overcast) and add the URL of your RSS file as a private podcast feed.
- Browse YouTube and find a video you'd like to listen to
- Add the video to your 'Podcast this later' playlist
- Wait around 15 mins (depending on your EventBridge schedule)
- Refresh your podcast app
- Listen to your new podcast episode
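As referenced in the scheduling step above, the 15-minute EventBridge trigger can also be created with boto3 instead of through the console. This is a sketch - the rule name, function name, account and region are placeholders:

```python
import boto3

# Placeholder ARN - substitute your own function name, account and region.
FUNCTION_ARN = "arn:aws:lambda:yr-region-1:123456789012:function:youtube2podcast"

events = boto3.client("events")
rule = events.put_rule(
    Name="youtube2podcast-every-15-min",
    ScheduleExpression="rate(15 minutes)",
)

# Allow EventBridge to invoke the function, then point the rule at it.
boto3.client("lambda").add_permission(
    FunctionName="youtube2podcast",
    StatementId="eventbridge-schedule",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule["RuleArn"],
)
events.put_targets(
    Rule="youtube2podcast-every-15-min",
    Targets=[{"Id": "1", "Arn": FUNCTION_ARN}],
)
```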
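Similarly, the public-read bucket policy can be applied from code. The sketch below only mirrors the rough shape of `sample_bucket_policy.json` (treat the repo file as authoritative - it also scopes ListBucket to your AWS user), and the bucket's Block Public Access settings must allow public bucket policies for this to succeed:

```python
import json
import boto3

BUCKET = "my-podcast-bucket-01"  # placeholder bucket name

# Grants anonymous read access to objects only.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadForPodcastFiles",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{BUCKET}/*",
    }],
}
boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```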
The code will post `{"content": "[video title]"}` to a webhook URL if one is defined in the WEBHOOK_TARGET environment variable. This approach works with Discord, which lets you set up a private server for free, giving you an easy way to get push notifications on your phone.
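The posting logic amounts to something like this sketch, here using `urllib` from the standard library (the actual script may use a different HTTP client):

```python
import json
import os
import urllib.request

def notify(video_title: str) -> None:
    """Post the new episode's title to the configured webhook, if any."""
    url = os.environ.get("WEBHOOK_TARGET")
    if not url:
        return  # notifications are optional
    body = json.dumps({"content": video_title}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)
```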
- Currently incompatible with Apple Podcasts (works fine with Overcast) - likely an RSS format issue
- YouTube can change often, so [yt-dlp](https://github.com/yt-dlp/yt-dlp) is updated frequently to handle breaking changes - for this reason, you may need to upgrade yt-dlp (`pip install --upgrade yt-dlp`) and re-upload the deployment package
Check CloudWatch for logs output by the Python script. The sample execution policy includes permissions that let the Lambda create a log group in CloudWatch and write to it.
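The script's actual logging setup may differ, but in the Python Lambda runtime something as small as this is enough for output to land in CloudWatch Logs:

```python
import logging

# Lambda preconfigures a handler on the root logger, so setting the
# level is all that's needed for records to reach CloudWatch Logs.
logger = logging.getLogger()
logger.setLevel(logging.INFO)

logger.info("Checking playlist for new videos")
```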
You can also run this code locally and debug using your IDE of choice. Use the AWS CLI to authenticate before execution.
- Clone the repo
- Create a src/.env file containing the relevant environment variables (or populate them some other way)
- `pip install -r requirements.txt`, ideally having created a [venv](https://docs.python.org/3/library/venv.html) first
- Run `lambda_function.py` (a local runner sketch follows this list)
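A minimal local entry point might look like this, assuming the handler is named `lambda_handler` in `lambda_function.py` and that you use `python-dotenv` to read `src/.env` (both are assumptions; adapt to the actual code):

```python
# run_local.py - hypothetical helper, not part of the repo
from dotenv import load_dotenv

load_dotenv("src/.env")  # populate the same variables the Lambda uses

# Import after load_dotenv so any module-level os.environ reads succeed.
import lambda_function

# Invoke the handler with an empty event, much as EventBridge does.
lambda_function.lambda_handler({}, None)
```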
The `.github/workflows/release.yml` workflow file uses `build.sh` to produce a deployment package zip suitable for updating the Lambda. The workflow runs on push where there's a semver-based tag:
```yaml
on:
  push:
    tags: [ 'v*.*.*' ]
```
When ready to create a release, tag your commit (e.g. `git tag v1.0.0` and `git push origin v1.0.0`). This will trigger the workflow, run your `build.sh` script, and publish the resultant `deployment_package.zip` as a release asset.
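If you'd rather push that zip without clicking through the console, the upload step can also be scripted with boto3; a sketch with a placeholder function name:

```python
import boto3

# Placeholder function name - use whatever you called your Lambda.
with open("deployment_package.zip", "rb") as f:
    boto3.client("lambda").update_function_code(
        FunctionName="youtube2podcast",
        ZipFile=f.read(),
    )
```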