tasyamla/Online-Shopping-Analysis

Online Shopping Analysis


This repository contains the code and documentation for a data ETL & analytics project aimed at optimizing online shopping business strategies. The project focuses on understanding customer behavior and identifying factors contributing to sales revenue during 2019.

Overview

The project works from online shopping data in CSV format. Python handles data manipulation in the ETL process, and Kibana handles data visualization. The goal is to generate actionable insights that improve business strategies.

Objectives

  • Automate the ETL process using Apache Airflow with pipelines scheduled for daily execution at 6:30 AM.
  • Load data from a CSV file into PostgreSQL, then fetch it back from PostgreSQL and save it as a CSV file.
  • Clean and preprocess data, saving the cleaned data back as a CSV file.
  • Import the cleaned data into Elasticsearch for advanced querying and visualization.
  • Validate the data using Great Expectations.
  • Process and visualize the data using Kibana.
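
The daily 6:30 AM schedule from the first objective is commonly written as the cron string `30 6 * * *` and passed as the DAG's schedule. The sketch below is not the project's DAG. It uses only the standard library to show what that schedule means, and the helper name `next_daily_run` is hypothetical.

```python
from datetime import datetime, time, timedelta

# Cron fields: minute hour day-of-month month day-of-week.
# "30 6 * * *" means 06:30 every day.
DAILY_0630_CRON = "30 6 * * *"


def next_daily_run(now: datetime, at: time = time(6, 30)) -> datetime:
    """Return the first occurrence of `at` strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(DAILY_0630_CRON)
    print(next_daily_run(datetime(2019, 1, 1, 7, 0)))
```

Airflow interprets cron schedules in UTC unless a timezone is configured, so a local 6:30 AM run may need a timezone-aware start date.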

Workflow

1. Extract

Data Collection: Gather data from online shopping activities.
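
The objectives describe a round trip: CSV into a relational table, then back out to CSV. A minimal sketch of that round trip follows. It swaps in the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The table name `shopping_raw` and the sample columns are hypothetical and do not come from the dataset.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real dataset's columns are not listed here.
RAW_CSV = "order_id,category,amount\n1,Apparel,19.99\n2,Office,5.50\n"


def load_csv_to_table(conn, csv_text, table="shopping_raw"):
    """Create `table` from the CSV header and insert every row as text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Identifiers are interpolated, so only use trusted table/column names.
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    conn.commit()
    return header


def dump_table_to_csv(conn, table="shopping_raw"):
    """Read the whole table back and serialize it as CSV text."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])  # column names
    writer.writerows(cur)
    return out.getvalue()


conn = sqlite3.connect(":memory:")
load_csv_to_table(conn, RAW_CSV)
exported = dump_table_to_csv(conn)
```

With PostgreSQL the same shape applies, but the connection would come from a PostgreSQL driver and the placeholder syntax would differ.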

2. Transform

  • Clean and preprocess the data.
  • Validate data quality using Great Expectations.
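
The two transform steps can be sketched as below. This is an illustration under assumptions, not the project's code: the column names `order_id` and `amount` are hypothetical, and the checks are written in plain Python rather than with the Great Expectations API. They mirror the kind of rule that tool expresses, such as `expect_column_values_to_be_unique` and `expect_column_values_to_be_between`.

```python
def clean_rows(rows):
    """Normalize column names, trim values, drop incomplete and duplicate rows."""
    cleaned, seen = [], set()
    for row in rows:
        # "Order ID" -> "order_id"; None or padded values -> trimmed strings.
        norm = {
            k.strip().lower().replace(" ", "_"): (v or "").strip()
            for k, v in row.items()
        }
        if any(v == "" for v in norm.values()):
            continue  # skip rows with a missing value
        key = tuple(sorted(norm.items()))
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cleaned.append(norm)
    return cleaned


def validate(rows):
    """Return the names of the checks that failed (empty list means all passed)."""
    failures = []
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("order_id values are unique")
    if not all(float(r["amount"]) >= 0 for r in rows):
        failures.append("amount values are non-negative")
    return failures


# Hypothetical sample: one padded row, one duplicate, one incomplete row.
sample = [
    {"Order ID": " 1 ", "Amount": "19.99"},
    {"Order ID": "1", "Amount": "19.99"},
    {"Order ID": "2", "Amount": ""},
    {"Order ID": "3", "Amount": "5.5"},
]
cleaned = clean_rows(sample)
failed_checks = validate(cleaned)
```

Running validation after cleaning means a failed check points to a problem the cleaning rules did not anticipate, rather than to raw-data noise.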

3. Load

Import cleaned data into Elasticsearch for indexing and search capabilities.
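
Elasticsearch's bulk endpoint (`POST /_bulk`, content type `application/x-ndjson`) accepts newline-delimited JSON in which each document is preceded by an action line. The sketch below only builds that request body with the standard library. The index name `online_shopping` and the field names are hypothetical, and sending the request is left out.

```python
import json


def to_bulk_ndjson(docs, index, id_field=None):
    """Build a bulk request body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index}}
        if id_field is not None:
            # A stable _id makes a re-run overwrite documents instead of duplicating them.
            action["index"]["_id"] = str(doc[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


body = to_bulk_ndjson(
    [{"order_id": "1", "amount": 19.99}, {"order_id": "3", "amount": 5.5}],
    index="online_shopping",
    id_field="order_id",
)
```

Using a document field as `_id` is a design choice that suits a pipeline scheduled to run daily, since repeated loads of the same rows stay idempotent.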

4. Analyze

Visualize data using Kibana to extract meaningful insights.

5. Conclusion

Draw conclusions and provide recommendations based on the analysis.

Tools and Technologies

  • Apache Airflow: To create and schedule data pipelines.
  • Python: For data manipulation and processing.
  • PostgreSQL: As a relational database to store and manage data.
  • Elasticsearch: To enable fast and scalable data retrieval.
  • Kibana: For interactive data visualization and analysis.
  • Great Expectations: For data validation to ensure data quality and integrity.

Link

Dataset


Contact
