GitHub - tariq452/Data-Warehouse: Build a data warehouse for analytical operations. to allow Sparkify wants to analyze the data they've been collecting on songs and user activity on their new music streaming app Benefits

Purpose:

Build a data warehouse for analytical operations. to allow Sparkify wants to analyze the data they've been collecting on songs and user activity on their new music streaming app Benefits

Design schema there is two table stage, this table will copy data from S3 to the table stage. Star schema because it's more effective for handling queries

Fact Table

songplays - records in event data associated with song plays i.e. records with page NextSong songplay_id, start_time, user_id, level, song_id, artist_id, session_id, location, user_agent Dimension Tables users - users in the app user_id, first_name, last_name, gender, level songs - songs in music database song_id, title, artist_id, year, duration artists - artists in music database artist_id, name, location, lattitude, longitude time - timestamps of records in songplays broken down into specific units start_time, hour, day, week, month, year, weekday

Etl

the ETL file is tor drop tables if exist and create tables, after that, the script will copy and insert data in tables

load_staging_tables function: this function to load two-stage table from S3 to redshift.

insert_tables this function inserts data from stage tables to star schema tables fact and daimantion.

to start this project follow the below step:

1-create cluster redshift in AWS and add value in dwh.cfg. 2-run create_tables to create tables. 2-run est file to load data to tables

SQL example from songplays

select * from public."songplays" limit 10;

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
README.md		README.md
create_tables.py		create_tables.py
dwh.cfg		dwh.cfg
etl.py		etl.py
sql_queries.py		sql_queries.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

tariq452/Data-Warehouse

Folders and files

Latest commit

History

Repository files navigation

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages