rahul13289/Applied-data-engineering-with-Uber-data

APPLYING ETL PIPELINE AND DATA ENGINEERING WITH UBER DATA

Introduction

The goal of this project is to perform data analytics on Uber data using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and Looker Studio.

Technology Used

  • Programming Language - Python

Google Cloud Platform

  1. Google Cloud Storage
  2. Compute Engine instance
  3. BigQuery
  4. Looker Studio

GCP login details - https://console.cloud.google.com/welcome?project=serious-azimuth-358905

Modern Data Pipeline Tool - https://www.mage.ai/

Looker Studio for dashboard creation - https://lookerstudio.google.com/u/0/navigation/reporting

Mage web interface used to build the ETL pipeline - https://35.244.17.11/6789/pipeline

ER model design tool - https://lucid.app/

Uber ER model as designed - https://lucid.app/publicSegments/view/167b1b16-b9e6-4f1e-ac33-156a052cc5fa/image.png

Dataset Used

TLC Trip Record Data: yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

More information about the dataset can be found here:

  1. Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  2. Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
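
To make the field descriptions above more concrete, here is a minimal sketch of a transform step that derives a few analytics-friendly columns from trip records. It is not the pipeline used in this project. It uses only the Python standard library, the two sample rows are made up for illustration, and the column names are assumed to follow the yellow taxi data dictionary linked above. The `transform` function and its output field names are hypothetical.

```python
import csv
import io
from datetime import datetime

# Illustrative sample only: two made-up rows. Column names are assumed to
# match the yellow taxi data dictionary; real files contain more columns.
SAMPLE_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,payment_type,fare_amount
2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,1,9.0
2016-03-01 08:15:00,2016-03-01 08:45:30,2,7.10,2,24.5
"""

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def transform(row):
    """Parse one raw CSV row and derive time-based attributes (hypothetical helper)."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], TS_FORMAT)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TS_FORMAT)
    return {
        "pickup_hour": pickup.hour,
        "pickup_weekday": pickup.weekday(),  # Monday == 0
        "duration_minutes": round((dropoff - pickup).total_seconds() / 60, 2),
        "passenger_count": int(row["passenger_count"]),
        "trip_distance": float(row["trip_distance"]),
        "payment_type": int(row["payment_type"]),
        "fare_amount": float(row["fare_amount"]),
    }


# Read the sample rows and apply the transform to each one.
trips = [transform(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

for trip in trips:
    print(trip)
```

In a real run, the same kind of logic would likely live in a transform block of the pipeline tool and operate on the full dataset rather than an in-memory sample.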

NOTE: REFER TO THE WORD FILE IN THE REPOSITORY FOR THE STEPS TO COMPLETE THE PROJECT
