Python script to poll a YouTube playlist, download new videos as audio and make them available as an audio podcast.
Designed to run on AWS Lambda, the code downloads new clips to S3 and updates a podcast.rss file suitable for podcast apps.
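At a high level the Lambda does something like the sketch below. This is illustrative only - the function and option names here are assumptions rather than the repo's actual code, and the real script also skips already-processed videos and regenerates the RSS feed and index:

```python
import os
import boto3
import yt_dlp

# Configuration comes from the Lambda environment variables described below.
BUCKET = os.environ["BUCKET_NAME"]
PREFIX = os.environ.get("CONTENT_PATH", "")
PLAYLIST = os.environ["PLAYLIST_URL"]

def lambda_handler(event, context):
    # Download each playlist entry and extract audio with yt-dlp;
    # ffmpeg comes from the attached Lambda layer.
    opts = {
        "format": "bestaudio/best",
        "outtmpl": "/tmp/%(id)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "m4a"}
        ],
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(PLAYLIST, download=True)

    s3 = boto3.client("s3")
    for entry in info.get("entries", []):
        # Upload the audio next to the feed; the real code would also
        # rewrite podcast.rss to include the new episode.
        s3.upload_file(f"/tmp/{entry['id']}.m4a", BUCKET, f"{PREFIX}{entry['id']}.m4a")
```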
- Create a YouTube playlist to contain the videos you'd like to add to your podcast feed. For example, create a playlist called 'Podcast this later'
- Set the playlist visibility to `Unlisted`
- Go to the playlist and copy its URL, something like https://www.youtube.com/playlist?list=AbCdEFgHiJkLmNoPqRsTuVwXyZ
- Create an AWS S3 bucket
- Optionally, create an obscure folder in the root of the bucket to reduce visibility
- Apply a policy to the bucket to allow public read access and ListBucket for your AWS user while minimising other rights. See `sample_bucket_policy.json`.
- Add a logo file to the bucket/folder, e.g. `logo.png`
- Create a new AWS Lambda function.
- Create a new execution role and then attach a policy to allow access to the S3 bucket (read and write are needed). See `sample_execution_policy.json` as an example and remember to update your bucket name.
- Set the execution timeout to a few minutes.
- Create a Lambda layer for ffmpeg (see `create_ffmpeg_layer.sh`) and attach it to the Lambda function
- Use `build.sh` to package youtube2podcast into a `deployment_package.zip` and upload it as the Lambda code
- Edit the Lambda's environment variables, optionally including a webhook URL for notifications:
```
AWS_REGION="yr-region-1"
BUCKET_NAME="my-podcast-bucket-01"
CONTENT_PATH="some-obscure-folder-name/"
PLAYLIST_URL="https://www.youtube.com/playlist?list=AbCdEFgHiJkLmNoPqRsTuVwXyZ"
WEBHOOK_TARGET="https://discord.com/api/webhooks/1234567890/somelongstring"
```
- Test the code:
- Add a video to your YouTube playlist
- Deploy and hit 'Test' on the Lambda.
- Your S3 bucket should now contain an m4a audio file, an RSS file for the podcast and an index file.
- Schedule the Lambda. I used EventBridge scheduled events to trigger the Lambda every 15 minutes (a boto3 sketch follows this list). When there are no new videos the code only runs for a few seconds, so you could schedule it more frequently, but I don't know if YouTube has any rate limits.
- To be able to access the audio files and RSS file you need to make your S3 bucket public and grant public read access. See `sample_bucket_policy.json` in this repo and the second sketch after this list, but ensure you understand what you're configuring - refer to the AWS documentation as required.
- Go to your podcast app (I recommend Overcast) and add the URL of your RSS file as a private podcast feed.
- Browse YouTube and find a video you'd like to listen to
- Add the video to your 'Podcast this later' playlist
- Wait around 15 mins (depending on your EventBridge schedule)
- Refresh your podcast app
- Listen to your new podcast episode
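As referenced in the scheduling step above, the 15-minute EventBridge trigger can also be created with boto3 instead of through the console. This is a sketch - the rule name, function name, account and region are placeholders:

```python
import boto3

# Placeholder ARN - substitute your own function name, account and region.
FUNCTION_ARN = "arn:aws:lambda:yr-region-1:123456789012:function:youtube2podcast"

events = boto3.client("events")
rule = events.put_rule(
    Name="youtube2podcast-every-15-min",
    ScheduleExpression="rate(15 minutes)",
)

# Allow EventBridge to invoke the function, then point the rule at it.
boto3.client("lambda").add_permission(
    FunctionName="youtube2podcast",
    StatementId="eventbridge-schedule",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule["RuleArn"],
)
events.put_targets(
    Rule="youtube2podcast-every-15-min",
    Targets=[{"Id": "1", "Arn": FUNCTION_ARN}],
)
```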
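Similarly, the public-read bucket policy can be applied from code. The sketch below only mirrors the rough shape of `sample_bucket_policy.json` (treat the repo file as authoritative - it also scopes ListBucket to your AWS user), and the bucket's Block Public Access settings must allow public bucket policies for this to succeed:

```python
import json
import boto3

BUCKET = "my-podcast-bucket-01"  # placeholder bucket name

# Grants anonymous read access to objects only.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadForPodcastFiles",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{BUCKET}/*",
    }],
}
boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```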
The code will post `{"content": "[video title]"}` to a webhook URL if one is defined in the WEBHOOK_TARGET environment variable. This approach works with Discord, which lets you set up a private server for free, giving you an easy way to get push notifications on your phone.
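The posting logic amounts to something like this sketch, here using `urllib` from the standard library (the actual script may use a different HTTP client):

```python
import json
import os
import urllib.request

def notify(video_title: str) -> None:
    """Post the new episode's title to the configured webhook, if any."""
    url = os.environ.get("WEBHOOK_TARGET")
    if not url:
        return  # notifications are optional
    body = json.dumps({"content": video_title}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)
```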
- Currently incompatible with Apple Podcasts (works fine with Overcast) - likely an RSS format issue
- YouTube can change often, so [yt-dlp](https://github.com/yt-dlp/yt-dlp) is updated frequently to handle breaking changes - for this reason, you may need to upgrade yt-dlp (`pip install --upgrade yt-dlp`) and re-upload the deployment package
Check CloudWatch for logs output by the Python script. The sample execution policy includes permissions that let the Lambda create a log group in CloudWatch and write to it.
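The script's actual logging setup may differ, but in the Python Lambda runtime something as small as this is enough for output to land in CloudWatch Logs:

```python
import logging

# Lambda preconfigures a handler on the root logger, so setting the
# level is all that's needed for records to reach CloudWatch Logs.
logger = logging.getLogger()
logger.setLevel(logging.INFO)

logger.info("Checking playlist for new videos")
```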
You can also run this code locally and debug using your IDE of choice. Use the AWS CLI to authenticate before execution.
- Clone the repo
- Create a src/.env file containing the relevant environment variables (or populate them some other way)
- `pip install -r requirements.txt`, ideally having created a [venv](https://docs.python.org/3/library/venv.html) first
- Run `lambda_function.py` (a local runner sketch follows this list)
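A minimal local entry point might look like this, assuming the handler is named `lambda_handler` in `lambda_function.py` and that you use `python-dotenv` to read `src/.env` (both are assumptions; adapt to the actual code):

```python
# run_local.py - hypothetical helper, not part of the repo
from dotenv import load_dotenv

load_dotenv("src/.env")  # populate the same variables the Lambda uses

# Import after load_dotenv so any module-level os.environ reads succeed.
import lambda_function

# Invoke the handler with an empty event, much as EventBridge does.
lambda_function.lambda_handler({}, None)
```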
The `.github/workflows/release.yml` workflow file uses `build.sh` to produce a deployment package zip suitable for updating the Lambda. The workflow runs on push where there's a semver-based tag:
```yaml
on:
  push:
    tags: [ 'v*.*.*' ]
```
When ready to create a release, tag your commit (e.g. `git tag v1.0.0` and `git push origin v1.0.0`). This will trigger the workflow, run your `build.sh` script, and publish the resultant `deployment_package.zip` as a release asset.
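If you'd rather push that zip without clicking through the console, the upload step can also be scripted with boto3; a sketch with a placeholder function name:

```python
import boto3

# Placeholder function name - use whatever you called your Lambda.
with open("deployment_package.zip", "rb") as f:
    boto3.client("lambda").update_function_code(
        FunctionName="youtube2podcast",
        ZipFile=f.read(),
    )
```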