This project uses Apache Airflow (a workflow management tool) to build a pipeline for downloading podcasts from the internet.

jasgithub101/Podcasts-Pipeline

Podcasts-Pipeline

This project uses Apache Airflow to build a pipeline for downloading podcast episodes. The episodes are stored in a SQLite database.
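
As a rough illustration of the storage step, here is a minimal sketch using Python's standard `sqlite3` module. The table name `episodes`, its columns, and the helper `store_episodes` are assumptions made for this example and are not taken from the repository. `INSERT OR IGNORE` keeps a daily re-run from adding the same episode twice.

```python
import sqlite3

# Hypothetical schema: the README only states that episodes go into SQLite.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS episodes (
    link TEXT PRIMARY KEY,
    title TEXT,
    published TEXT
)
"""


def count_rows(conn):
    """Return the number of rows currently in the episodes table."""
    return conn.execute("SELECT COUNT(*) FROM episodes").fetchone()[0]


def store_episodes(conn, episodes):
    """Insert episodes, skipping links already stored; return how many were new."""
    conn.execute(CREATE_TABLE)
    before = count_rows(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO episodes (link, title, published) VALUES (?, ?, ?)",
        [(e["link"], e["title"], e["published"]) for e in episodes],
    )
    conn.commit()
    return count_rows(conn) - before


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    sample = [
        {"link": "https://example.com/ep1", "title": "Episode 1", "published": "2022-01-03"},
        {"link": "https://example.com/ep2", "title": "Episode 2", "published": "2022-01-04"},
    ]
    print(store_episodes(conn, sample))  # 2 on the first run
    print(store_episodes(conn, sample))  # 0 on a re-run, nothing is duplicated
```

Using the episode link as the primary key is one possible design choice; any field that uniquely identifies an episode would work the same way.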

Pros of using Airflow:

  • The pipeline runs daily, downloading new episodes every day.
  • Each task runs independently, and its logs can be monitored.
  • Tasks run in a defined order and can also run in parallel.
  • The project can be extended easily using Airflow.

This is the DAG (Directed Acyclic Graph) that shows how the tasks are ordered in the pipeline: [image]
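
To make the idea of task ordering concrete, the sketch below uses Python's standard-library `graphlib` (Python 3.9+) rather than Airflow itself. It is not the project's DAG: the task names and dependencies are hypothetical. It only shows how a dependency graph determines which tasks must wait and which can run side by side. In Airflow, the same relationships would typically be declared between tasks with the `>>` operator.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
DEPENDENCIES = {
    "create_table": set(),
    "get_episodes": set(),
    "load_episodes": {"create_table", "get_episodes"},
    "download_episodes": {"get_episodes"},
}


def execution_batches(deps):
    """Group tasks into batches; tasks within one batch have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(execution_batches(DEPENDENCIES), start=1):
        print(number, batch)
    # 1 ['create_table', 'get_episodes']
    # 2 ['download_episodes', 'load_episodes']
```

Tasks in the same batch do not depend on each other, which is what allows a scheduler to run them in parallel.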

The data is taken from this XML file.
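
A podcast feed is usually an RSS document, so extracting episodes could look roughly like the sketch below, which uses the standard-library `xml.etree.ElementTree`. The inline `SAMPLE_FEED`, the function `parse_episodes`, and the chosen fields are assumptions for illustration. The real feed's URL and exact structure are not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened RSS feed used only for this example.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <link>https://example.com/ep1</link>
      <pubDate>Mon, 03 Jan 2022 10:00:00 +0000</pubDate>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <link>https://example.com/ep2</link>
      <pubDate>Tue, 04 Jan 2022 10:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_episodes(xml_text):
    """Return one dict per <item> element found in an RSS feed string."""
    root = ET.fromstring(xml_text)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")  # holds the audio file URL, if present
        episodes.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "published": item.findtext("pubDate", default=""),
                "audio_url": enclosure.get("url") if enclosure is not None else None,
            }
        )
    return episodes


if __name__ == "__main__":
    for episode in parse_episodes(SAMPLE_FEED):
        print(episode["title"], episode["audio_url"])
```

In a real pipeline the XML text would come from an HTTP request to the feed rather than from an inline string.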
