This repository contains Projects that I did while pursuing Data Engineering Nanodegree by Udacity

moni2096/Data-Engineering-Nanodegree-Udacity

Introduction

This repository contains my implementations of the projects in the Data Engineering Nanodegree by Udacity.

Getting Started

The program has five hands-on projects and one capstone project.

  1. Data Modeling with Postgres: In this project, we model data for a music streaming company called Sparkify, choosing an appropriate schema and data types for the tables that support its analytics workflow in Postgres.
  2. Data Modeling in Apache Cassandra: In Apache Cassandra, tables are modeled around the queries they will serve, so we follow that query-first workflow to model tables for a set of queries.
  3. Cloud Data Warehouse: In this project, we model tables in Amazon Redshift using the same schema as in Project 1, because Sparkify's analytics requirements and scaling needs have changed.
  4. Data Lakes with Spark: In this project, we learn why data lakes matter and which organizational needs make a data lake a good choice.
  5. Data Pipelines with Airflow: In this project, we learn how to ingest data on a schedule, which becomes important for an organization as data volume grows and data must arrive at fixed intervals. We model tables in Redshift and use Airflow to schedule ingestion into those tables at a set interval. We also cover the use case of generating a downstream report by designing a pipeline in Airflow.
  6. Capstone Project: In this project, we combine what we have learned and put it into practice by solving a real-world data problem. Here, I have built a data pipeline on AWS to ingest cryptocurrency data from an API and other sources.
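
To give a feel for the kind of modeling done in Project 1, here is a minimal sketch of a fact table joined to dimension tables. It is an illustration only, not the code from the project: the table names (`users`, `songs`, `songplays`), the columns, and the sample rows are all hypothetical, and Python's built-in `sqlite3` stands in for Postgres so the example runs without a database server.

```python
import sqlite3

# Hypothetical star schema: one fact table (songplays) and two dimension
# tables (users, songs). These names are assumptions for illustration.
# sqlite3 is used as a stand-in for Postgres so no server is required.
SCHEMA = """
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE songs (song_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    song_id INTEGER NOT NULL REFERENCES songs(song_id),
    played_at TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?)", [(10, "Song A"), (11, "Song B")]
    )
    conn.executemany(
        "INSERT INTO songplays (user_id, song_id, played_at) VALUES (?, ?, ?)",
        [
            (1, 10, "2018-11-01T10:00:00"),
            (2, 10, "2018-11-01T10:05:00"),
            (1, 11, "2018-11-02T09:00:00"),
        ],
    )
    conn.commit()
    return conn


def plays_per_song(conn):
    """Analytics query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT s.title, COUNT(*) AS plays
        FROM songplays AS sp
        JOIN songs AS s ON s.song_id = sp.song_id
        GROUP BY s.title
        ORDER BY plays DESC, s.title
        """
    ).fetchall()


if __name__ == "__main__":
    for title, plays in plays_per_song(build_demo_db()):
        print(title, plays)
```

Keeping the events in a narrow fact table and the descriptive attributes in dimension tables keeps analytics queries like `plays_per_song` to a single join and an aggregate.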

License

Distributed under the MIT License. See LICENSE for more information.
