yoerisamwel/GWU-Project-4

Furniture Ecommerce

By: Alaa Aleryani and Yoeri Samwel

Built With:

  • Python
  • Jupyter Notebook
  • Tableau
  • scikit-learn
  • Prophet (time series forecasting)

Overview

The purpose of this project is to:

  • Understand Regional Markets and Enhance Personalization: analyze past purchase history to understand the unique characteristics, preferences, and behaviors of customers in different geographic areas.
  • Identify Growth Opportunities: identify trends and patterns in sales performance and market demand.
  • Understand Seasonal Influences: understand how seasonal factors and local events affect consumer behavior and purchasing patterns in different regions.

Note: our data comes from Kaggle.

Visualizations

For some insights into our dataset, feel free to check our visualization dashboard at Tableau Public

Analysis

We used three machine learning models: Linear Regression, Ridge Regression, and a time series forecast built with Prophet.

Results 📈 📉

Model 1 👇:

Our first step in the analysis used a Linear Regression model to predict the Original Price from several other factors. We chose this model because it is simple and easy to interpret, and it helped us see how different variables influence the price. However, its accuracy score of 64% in our tests shows that its predictions are only moderately reliable, which suggests that we may need more advanced methods to improve them in the future.

Model 1 Result

This graph visualizes the linear regression model's predictions, showing how the target variable (e.g., 'Original_Price') changes as a function of the 'Quantity' feature, alongside actual data points for comparison.
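To make the 64% score more concrete: if it was computed with scikit-learn's `LinearRegression.score`, it is the coefficient of determination (R²), the share of the variance in the target that the model explains. The sketch below is a minimal, dependency-free illustration of fitting a line to a single feature and computing R². It uses plain Python in place of scikit-learn, and the `quantity` and `original_price` values are made up for illustration, not taken from our dataset.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up illustrative values (assumption), not the project's data.
quantity = [1, 2, 3, 4, 5, 6]
original_price = [120.0, 95.0, 110.0, 70.0, 85.0, 60.0]

slope, intercept = fit_line(quantity, original_price)
predicted = [slope * q + intercept for q in quantity]
score = r_squared(original_price, predicted)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={score:.2f}")
```

An R² well below 1, as here, means a straight line in one feature leaves a sizeable part of the price variation unexplained.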

Model 2 👇:

The Ridge Regression model scored exceptionally well on the test data when predicting the Original Price, with a mean absolute error (MAE) of 0.0007296897175481192 and an R² of 0.9999999999955044. Results this close to perfect are a warning sign rather than a clear success. They suggest possible overfitting, where the model learns the noise and outliers in the training data instead of the underlying pattern, and such models often struggle to generalize to new, unseen data. Because these scores were measured on test data, another explanation worth ruling out is that one of the input features nearly determines the Original Price. Either way, the high score raises questions about whether the model would perform consistently on other datasets.

Model 2 Result

This graph illustrates the Ridge Regression model's predictions, demonstrating the relationship between the target variable (e.g., 'Original_Price') and the 'Quantity' feature, juxtaposed with actual data points for contextual comparison.
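One common way to probe the overfitting concern is to compare the error on the training data with the error on held-out data: a model that has overfit typically scores noticeably better on the data it was trained on. The sketch below illustrates that check with a single-feature ridge fit written in plain Python (a stand-in for scikit-learn's `Ridge`, with the penalty applied to the slope only). The function names, the `alpha` value, and all data values are assumptions made for illustration; none come from our notebook.

```python
from statistics import mean


def fit_ridge_1d(xs, ys, alpha):
    """Ridge regression for one feature with an unpenalized intercept.

    Minimizes sum((y - b0 - b1 * x) ** 2) + alpha * b1 ** 2.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)  # alpha > 0 shrinks the slope toward zero
    return slope, y_bar - slope * x_bar


def evaluate(slope, intercept, xs, ys):
    """Return (MAE, R^2) of the fitted line on the given data."""
    preds = [slope * x + intercept for x in xs]
    mae = mean(abs(y - p) for y, p in zip(ys, preds))
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return mae, 1 - ss_res / ss_tot


# Made-up illustrative values (assumption), not the project's data.
train_x = [1, 2, 3, 4, 5, 6, 7, 8]
train_y = [118.0, 104.0, 99.0, 92.0, 80.0, 77.0, 65.0, 58.0]
test_x = [2, 5, 7]
test_y = [108.0, 84.0, 69.0]

slope, intercept = fit_ridge_1d(train_x, train_y, alpha=1.0)
train_mae, train_r2 = evaluate(slope, intercept, train_x, train_y)
test_mae, test_r2 = evaluate(slope, intercept, test_x, test_y)

print(f"train: MAE={train_mae:.3f}, R^2={train_r2:.4f}")
print(f"test:  MAE={test_mae:.3f}, R^2={test_r2:.4f}")
```

If the training and test scores are both near-perfect and close to each other, overfitting is a less likely explanation, and it becomes worth checking whether a feature closely related to the target was included among the inputs.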

Model 3 👇:

For the last model, we built a time series forecast using Prophet, Facebook's forecasting library, which was open-sourced in February 2017. We chose it for its ease of use and because it automatically handles missing data, outliers, and holidays. However, because of its limited features, it may not be effective for many forecasting tasks.

Forecasting graph: Model 3 Result1
This graph shows a one-year sales forecast based on historical data, giving us the expected trend along with the high and low amounts we could expect.
Trend graph: Model 3 Result2
In this trend graph, the trend line shows a subtle decline with a high level of certainty for the first couple of months, after which the uncertainty boundaries widen over time.
Seasonality trend graph: Model 3 Result3
For the day-of-week trend, Tuesdays are the high days and Wednesdays are the low days. For the yearly trend, February tends to have the highest sales, followed by April and then August.
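A day-of-week pattern like the one described above can be sanity-checked independently of Prophet by averaging sales per weekday. The sketch below does this with the standard library only. It is not how Prophet estimates seasonality internally, and the dates and amounts are made-up values chosen for illustration, not our sales data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_by_weekday(daily_sales):
    """Map weekday name -> mean of the sales recorded on that weekday."""
    buckets = defaultdict(list)
    for day, amount in daily_sales:
        buckets[WEEKDAYS[day.weekday()]].append(amount)
    return {name: mean(values) for name, values in buckets.items()}


# Two made-up weeks of daily sales (assumption), starting on a Monday.
start = date(2024, 1, 1)
amounts = [200, 260, 150, 210, 220, 190, 180,
           210, 280, 140, 200, 230, 185, 175]
daily_sales = [(start + timedelta(days=i), a) for i, a in enumerate(amounts)]

averages = average_by_weekday(daily_sales)
best_day = max(averages, key=averages.get)
worst_day = min(averages, key=averages.get)
print(f"highest average: {best_day}, lowest average: {worst_day}")
```

If the raw per-weekday averages disagree strongly with the seasonality plot, that is a hint to revisit the data preparation or the model settings.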

Summary

This project aimed to explore how different regions have their own unique shopping trends and customer behaviors. The main goal was to figure out what makes each area different, spot where there might be chances to sell more, and make shopping more tailored and enjoyable for customers. By looking at what people have bought in the past, the project tried to help businesses offer more of what customers in specific regions like. It also looked into how changes in seasons or local events affect what people buy in different places.

To do this, we used three models. The first was a simple model that tried to predict prices from other information we know. The second was a bit more complex and was designed to keep us from relying too much on any one piece of information, with the aim of making our predictions more stable. The third was built for looking at how things change over time, such as predicting how sales might go up or down at different times of the year. We also made an online dashboard where anyone can explore our findings in a clear and interactive way.

In conclusion, our analysis has provided valuable insights into regional purchasing behaviors, offering a foundational understanding for businesses aiming to tailor their strategies to diverse geographical markets. However, integrating profitability metrics for individual products and categories emerges as a compelling direction for future research. By adopting a more granular approach towards category-specific profitability, businesses could gain a deeper understanding of financial performance across different regions. This refined analysis would not only illuminate top-selling products but also highlight those contributing most significantly to revenue. Consequently, such an approach would enable businesses to craft more nuanced strategies, prioritizing not just volume of sales but also optimizing for higher profit margins.


Contact Info:

Alaa Aleryani:
Alaa's email Alaa's LinkedIn Page

Yoeri Samwel:
Yoeri's email Yoeri's LinkedIn Page