Movies-ETL

Overview

Create one function that takes in three files (Wikipedia data, Kaggle metadata, and the MovieLens rating data) and performs the ETL process, loading the results into a PostgreSQL database. Then, create an automated pipeline that takes in new data, performs the appropriate transformations, and loads the data into existing tables.

Write an ETL Function to Read Three Data Files

The wiki_movies_df DataFrame

The kaggle_metadata DataFrame

The ratings DataFrame
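
As a rough illustration of the extract step, the sketch below reads three sources into the wiki_movies_df, kaggle_metadata, and ratings DataFrames. The function name `extract_transform_load`, the helper `_load_json`, and the tiny in-memory sample data are assumptions made for this example; the notebooks in this repository may organize the step differently.

```python
import io
import json

import pandas as pd


def _load_json(source):
    """Load JSON from an open file-like object or from a file path (hypothetical helper)."""
    if hasattr(source, "read"):
        return json.load(source)
    with open(source, mode="r") as file:
        return json.load(file)


def extract_transform_load(wiki_file, kaggle_file, ratings_file):
    """Read the three sources into DataFrames (extract step only in this sketch)."""
    # The Kaggle metadata and MovieLens ratings are assumed to be CSV files.
    kaggle_metadata = pd.read_csv(kaggle_file, low_memory=False)
    ratings = pd.read_csv(ratings_file)
    # The Wikipedia data is assumed to be a JSON array of movie records.
    wiki_movies_raw = _load_json(wiki_file)
    wiki_movies_df = pd.DataFrame(wiki_movies_raw)
    return wiki_movies_df, kaggle_metadata, ratings


# Made-up sample data standing in for the real files.
wiki_json = io.StringIO(
    '[{"title": "Movie A", "year": 1995}, {"title": "Movie B", "year": 2001}]'
)
kaggle_csv = io.StringIO("id,title\n1,Movie A\n2,Movie B\n")
ratings_csv = io.StringIO("userId,movieId,rating\n10,1,4.0\n11,1,3.5\n10,2,5.0\n")

wiki_movies_df, kaggle_metadata, ratings = extract_transform_load(
    wiki_json, kaggle_csv, ratings_csv
)
print(wiki_movies_df.head())
print(kaggle_metadata.head())
print(ratings.head())
```

With real data, file paths can be passed in place of the in-memory buffers.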

Extract and Transform the Wikipedia Data

The cleaned Wikipedia movies data as a DataFrame

The columns from the wiki_movies_df DataFrame added to a list
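
A minimal sketch of what cleaning the Wikipedia data and listing its columns might look like follows. The field names (`Director`, `Directed by`, `imdb_link`, `No. of episodes`), the filtering rules, the 90% null threshold, and the sample records are assumptions for illustration, not a record of the exact rules used in the notebooks.

```python
import pandas as pd

# Made-up raw records standing in for the Wikipedia JSON data.
wiki_movies_raw = [
    {"title": "Movie A", "Director": "Dir 1",
     "imdb_link": "https://www.imdb.com/title/tt0000001/"},
    {"title": "Movie B", "Directed by": "Dir 2",
     "imdb_link": "https://www.imdb.com/title/tt0000002/"},
    {"title": "Show C", "Director": "Dir 3",
     "imdb_link": "https://www.imdb.com/title/tt0000003/", "No. of episodes": 12},
    {"title": "Movie D"},
]

# Keep records that have a director and an IMDb link and are not TV shows.
wiki_movies = [
    movie for movie in wiki_movies_raw
    if ("Director" in movie or "Directed by" in movie)
    and "imdb_link" in movie
    and "No. of episodes" not in movie
]

wiki_movies_df = pd.DataFrame(wiki_movies)

# Pull the IMDb ID out of the link and drop duplicate movies.
wiki_movies_df["imdb_id"] = wiki_movies_df["imdb_link"].str.extract(
    r"(tt\d{7})", expand=False
)
wiki_movies_df = wiki_movies_df.drop_duplicates(subset="imdb_id")

# Keep only columns where fewer than 90% of the values are null.
columns_to_keep = [
    column for column in wiki_movies_df.columns
    if wiki_movies_df[column].isnull().sum() < len(wiki_movies_df) * 0.9
]
wiki_movies_df = wiki_movies_df[columns_to_keep]

# Add the remaining column names to a list.
wiki_columns = wiki_movies_df.columns.tolist()
print(wiki_columns)
```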

Extract and Transform the Kaggle Data

The movies_with_ratings_df DataFrame

The movies_df DataFrame
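
The two DataFrames named above can be sketched roughly as follows: movies_df as a merge of the Wikipedia and Kaggle data, and movies_with_ratings_df as movies_df joined with per-movie rating counts. The join key `imdb_id`, the Kaggle `id` column, the `rating_` column prefix, and the sample data are assumptions made for this example.

```python
import pandas as pd

# Made-up sample data standing in for the cleaned inputs.
wiki_movies_df = pd.DataFrame({
    "imdb_id": ["tt0000001", "tt0000002", "tt0000003"],
    "title": ["Movie A", "Movie B", "Movie C"],
})
kaggle_metadata = pd.DataFrame({
    "id": [1, 2, 3],
    "imdb_id": ["tt0000001", "tt0000002", "tt0000003"],
    "title": ["Movie A", "Movie B", "Movie C"],
})
ratings = pd.DataFrame({
    "userId": [10, 11, 12, 10],
    "movieId": [1, 1, 1, 2],
    "rating": [4.0, 4.0, 3.5, 5.0],
})

# Merge the Wikipedia and Kaggle data on the shared IMDb ID.
movies_df = pd.merge(
    wiki_movies_df, kaggle_metadata, on="imdb_id", suffixes=("_wiki", "_kaggle")
)

# Count how many times each movie received each rating value.
rating_counts = ratings.groupby(["movieId", "rating"]).size().unstack(fill_value=0)
rating_counts.columns = [f"rating_{float(col)}" for col in rating_counts.columns]

# Left-join the counts so that movies without ratings are kept.
movies_with_ratings_df = pd.merge(
    movies_df, rating_counts, left_on="id", right_index=True, how="left"
)
rating_columns = list(rating_counts.columns)
movies_with_ratings_df[rating_columns] = (
    movies_with_ratings_df[rating_columns].fillna(0)
)
print(movies_with_ratings_df)
```

A left join is used here so that a movie with no ratings still appears, with zero counts, instead of being dropped.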

Create the Movie Database

The row count returned by movies_query

The row count returned by ratings_query
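
The load-and-count step can be sketched as below. To keep the example self-contained, it swaps in an in-memory SQLite database for PostgreSQL; with PostgreSQL, the connection would typically come from a SQLAlchemy engine instead. The table names `movies` and `ratings`, the `row_count` alias, the chunk size, and the sample data are assumptions for this example.

```python
import sqlite3

import pandas as pd

# Made-up sample data standing in for the transformed DataFrames.
movies_df = pd.DataFrame({
    "imdb_id": ["tt0000001", "tt0000002"],
    "title": ["Movie A", "Movie B"],
})
ratings = pd.DataFrame({
    "userId": [10, 11, 12, 10],
    "movieId": [1, 1, 1, 2],
    "rating": [4.0, 4.0, 3.5, 5.0],
})

# SQLite stands in for PostgreSQL here so the sketch runs without a server.
connection = sqlite3.connect(":memory:")

# Load each DataFrame into its own table; large files can be written in chunks.
movies_df.to_sql("movies", connection, if_exists="replace", index=False)
ratings.to_sql("ratings", connection, if_exists="replace", index=False, chunksize=2)

# Count the rows in each table to confirm the load.
movies_query = "SELECT COUNT(*) AS row_count FROM movies;"
ratings_query = "SELECT COUNT(*) AS row_count FROM ratings;"
movies_count = pd.read_sql_query(movies_query, connection)
ratings_count = pd.read_sql_query(ratings_query, connection)
connection.close()

print(movies_count)
print(ratings_count)
```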

Folders and files

Resources
.DS_Store
.gitignore
ETL_clean_kaggle_data.ipynb
ETL_clean_wiki_movies.ipynb
ETL_create_database.ipynb
ETL_function_test.ipynb
ETL_wiki_movies.ipynb
LICENSE
README.md

rykiprince/Movies-ETL