rykiprince/Movies-ETL

Movies-ETL

Overview

Create one function that takes in three files (Wikipedia data, Kaggle metadata, and MovieLens rating data) and performs the ETL process, adding the data to a PostgreSQL database. Then create an automated pipeline that takes in new data, performs the appropriate transformations, and loads the data into existing tables.

Write an ETL Function to Read Three Data Files

  • The wiki_movies_df DataFrame

wiki_movies_df

  • The kaggle_metadata DataFrame

kaggle_metadata

  • The ratings DataFrame

ratings
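The read step above can be sketched as follows. This is a minimal illustration, not the repository's actual function: the name `extract_files` and its parameters are hypothetical, and it assumes the Wikipedia data is a JSON array of records while the Kaggle metadata and MovieLens ratings are CSV files.

```python
import json

import pandas as pd


def extract_files(wiki_file, kaggle_file, ratings_file):
    """Read the three source files and return them as DataFrames.

    Hypothetical helper: file formats (JSON for Wikipedia, CSV for the
    other two) are assumptions, not confirmed by the repository.
    """
    # Wikipedia data: load the JSON array, then build a DataFrame from it.
    with open(wiki_file, mode="r", encoding="utf-8") as f:
        wiki_movies_raw = json.load(f)
    wiki_movies_df = pd.DataFrame(wiki_movies_raw)

    # Kaggle metadata: low_memory=False avoids mixed-dtype chunk guessing.
    kaggle_metadata = pd.read_csv(kaggle_file, low_memory=False)

    # MovieLens ratings.
    ratings = pd.read_csv(ratings_file)

    return wiki_movies_df, kaggle_metadata, ratings
```

Returning all three DataFrames from one call keeps the later transform steps independent of where the files live.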

Extract and Transform the Wikipedia Data

  • Cleaned Wikipedia movies data as a DataFrame

cleaned_wiki_movies

  • The columns from the wiki_movies_df DataFrame added to a list

cleaned_wiki_movies_columns
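One way the Wikipedia cleaning and column listing could look is sketched below. The function name `clean_wiki_movies` and the `imdb_link` and `imdb_id` column names are assumptions for illustration; the repository's actual cleaning steps are not shown on this page and are likely more extensive.

```python
import pandas as pd


def clean_wiki_movies(wiki_movies_df):
    """Return a cleaned copy of the Wikipedia data and its column list.

    Assumes a hypothetical 'imdb_link' column holding IMDb URLs.
    """
    # Keep only rows that have an IMDb link at all.
    df = wiki_movies_df.dropna(subset=["imdb_link"]).copy()

    # Pull the IMDb ID (for example 'tt0111161') out of the link.
    df["imdb_id"] = df["imdb_link"].str.extract(r"(tt\d{7,8})", expand=False)

    # Drop rows where no ID was found, then drop repeated movies.
    df = df.dropna(subset=["imdb_id"]).drop_duplicates(subset="imdb_id")

    # Collect the remaining column names in a list.
    columns = df.columns.to_list()
    return df, columns
```

Using the IMDb ID as the de-duplication key also gives a stable column to join against the Kaggle metadata later.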

Extract and Transform the Kaggle Data

  • The movies_with_ratings_df DataFrame

movies_with_ratings_df

  • The movies_df DataFrame

movies_df
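As a rough sketch of how movies_with_ratings_df might be derived from movies_df, the ratings can be counted per movie and rating value, then left-joined onto the movies. The function name `add_rating_counts` and the `kaggle_id` key are hypothetical; `movieId` and `rating` are the usual MovieLens column names.

```python
import pandas as pd


def add_rating_counts(movies_df, ratings):
    """Attach per-rating-value counts to each movie.

    Assumes movies_df has a hypothetical 'kaggle_id' column that matches
    the 'movieId' column in the ratings data.
    """
    # Count ratings per (movie, rating value), one column per rating value.
    counts = (
        ratings.groupby(["movieId", "rating"])
        .size()
        .unstack(fill_value=0)
    )
    counts.columns = ["rating_" + str(col) for col in counts.columns]

    # Left join so movies without any ratings are kept.
    merged = movies_df.merge(
        counts, how="left", left_on="kaggle_id", right_index=True
    )

    # Movies with no ratings get 0 instead of NaN.
    merged[counts.columns] = merged[counts.columns].fillna(0)
    return merged
```

A left join is used here so that the movie table keeps its full row count even when a movie has no ratings.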

Create the Movie Database

  • Row count returned by movies_query

movies_query

  • Row count returned by ratings_query

ratings_query
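The load and row-count check can be sketched as below. To keep the example runnable without a database server, it uses an in-memory SQLite connection as a stand-in for PostgreSQL; the function name `load_and_count` and the sample data are hypothetical.

```python
import sqlite3

import pandas as pd


def load_and_count(df, table_name, connection):
    """Append df to table_name and return the table's row count.

    table_name is interpolated into the SQL text, so it must come from
    trusted code, never from user input.
    """
    # 'append' adds rows to an existing table (or creates it if missing).
    df.to_sql(name=table_name, con=connection, if_exists="append", index=False)

    # Count the rows now stored in the table.
    result = pd.read_sql_query(
        f"SELECT COUNT(*) AS row_count FROM {table_name}", connection
    )
    return int(result["row_count"].iloc[0])


# SQLite stand-in for the PostgreSQL database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
sample_movies = pd.DataFrame({"imdb_id": ["tt0111161", "tt0068646"]})
print(load_and_count(sample_movies, "movies", conn))
```

For PostgreSQL, the same `to_sql` and `read_sql_query` calls accept a SQLAlchemy engine in place of the SQLite connection. For a large ratings file, passing `chunksize` to `pd.read_csv` and appending chunk by chunk keeps memory use bounded.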
