Customer segmentation: predicting what customers will buy on their next visit, based on their purchase history.
► InvoiceNo : Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with letter 'c', it indicates a cancellation.
► StockCode : Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product.
► Description : Product (item) name. Nominal.
► Quantity : The quantities of each product (item) per transaction. Numeric.
► InvoiceDate : Invoice date and time. Numeric, the day and time when each transaction was generated.
► UnitPrice : Unit price. Numeric, product price per unit in sterling.
► CustomerID : Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer.
► Country : Country name. Nominal, the name of the country where each customer resides.
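For reference, a minimal sketch of loading the dataset into pandas; the file name, encoding, and cleaning steps are assumptions about the setup, not part of the original description:

```python
import pandas as pd

# File name and encoding are assumptions about the local setup.
df = pd.read_csv('data.csv', encoding='ISO-8859-1',
                 dtype={'CustomerID': str}, parse_dates=['InvoiceDate'])

# Invoices starting with 'C' are cancellations; drop them along with
# rows that have no CustomerID.
df = df[df['CustomerID'].notna()
        & ~df['InvoiceNo'].astype(str).str.startswith('C')]
```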
We cluster the products using the TF-IDF technique, which gives each term a score based on its importance within a product's description, and build a matrix to apply the K-Means clustering algorithm to.
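A minimal sketch of this step, reusing df from the loading snippet above; the choice of six product clusters mirrors the six categories referenced later, but is an assumption here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# One entry per distinct product description.
descriptions = df['Description'].dropna().unique()

# TF-IDF scores each term by how important it is to a given description
# relative to the whole product catalogue.
tfidf = TfidfVectorizer(stop_words='english')
X_products = tfidf.fit_transform(descriptions)

# Cluster the TF-IDF matrix into product categories; n_clusters=6 matches
# the six categories referenced later, but is an assumption here.
product_km = KMeans(n_clusters=6, n_init=10, random_state=42)
product_labels = product_km.fit_predict(X_products)
```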
We add these product categories to the main dataframe, so that each customer gets a defined category. Some new features were also added to the dataframe, such as (see the sketch after this list):
- Number of visits
- Total cash paid
- Min/max cash paid
- Average cash paid
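A sketch of how these features could be derived with pandas; the column names and the per-invoice grouping are assumptions:

```python
# Amount paid per line item (quantity times unit price).
df['TotalPrice'] = df['Quantity'] * df['UnitPrice']

# Total paid per invoice, then basket statistics per customer.
basket = df.groupby(['CustomerID', 'InvoiceNo'])['TotalPrice'].sum()
customers = basket.groupby(level='CustomerID').agg(
    count='count',  # number of visits (invoices)
    sum='sum',      # total cash paid
    mean='mean',    # average cash paid per visit
    min='min',      # minimum cash paid on a visit
    max='max',      # maximum cash paid on a visit
)
```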
After that we begin clustering the customers with the K-Means algorithm, which yields 11 customer categories.
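A sketch of this clustering step, assuming the customers feature table from above and standard scaling before K-Means:

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Scale the features so that no single one (e.g. total spend)
# dominates the distance metric.
X_customers = StandardScaler().fit_transform(customers)

# k=11 is the cluster count the project settled on.
customer_km = KMeans(n_clusters=11, n_init=10, random_state=42)
customers['cluster'] = customer_km.fit_predict(X_customers)
```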
The final dataframe contains all the information for each customer cluster, with dimensions (11, 13):
- 11 clusters
- 13 features (cluster, [count, sum, avg, min, max], [6 product categories])
This dataframe is the key input for the following radar visualization. The chart indicates how inclined each cluster is to buy from each category (a plotting sketch follows the observations below):
- Customers of cluster 0 are strongly biased toward buying products from category 0
- Customers of cluster 1 likewise favor category 1 products
- Customers of cluster 2 buy more products from category 3 than from any other category
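A sketch of such a radar chart with matplotlib; the profile matrix below is a synthetic placeholder standing in for the per-cluster category shares from the summary dataframe:

```python
import numpy as np
import matplotlib.pyplot as plt

n_clusters, n_categories = 11, 6

# Placeholder profile: rows = clusters, columns = product-category shares.
rng = np.random.default_rng(0)
profile = rng.random((n_clusters, n_categories))
profile /= profile.sum(axis=1, keepdims=True)

# Radar charts close the polygon by repeating the first angle at the end.
angles = np.linspace(0, 2 * np.pi, n_categories, endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={'polar': True})
for i, row in enumerate(profile):
    values = row.tolist() + [row[0]]
    ax.plot(angles, values, label=f'cluster {i}')
    ax.fill(angles, values, alpha=0.1)
ax.set_xticks(angles[:-1])
ax.set_xticklabels([f'category {c}' for c in range(n_categories)])
ax.legend(loc='upper right', bbox_to_anchor=(1.35, 1.1), fontsize='small')
plt.show()
```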
Models used:
- Support Vector Machine
- Logistic Regression
- k-Nearest Neighbors
- Decision Tree
- Random Forest
- Gradient Boosting
- AdaBoost

This ends with a prediction accuracy of 87.38 %.
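A sketch of the model comparison with scikit-learn, assuming the task is predicting a customer's cluster from the features above; the 87.38 % figure is the project's reported result, not something this sketch reproduces:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier, AdaBoostClassifier)

# X_customers and the cluster labels are assumed from the steps above.
X_train, X_test, y_train, y_test = train_test_split(
    X_customers, customers['cluster'], test_size=0.2, random_state=42)

models = {
    'Support Vector Machine': SVC(),
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'k-Nearest Neighbors': KNeighborsClassifier(),
    'Decision Tree': DecisionTreeClassifier(),
    'Random Forest': RandomForestClassifier(),
    'Gradient Boosting': GradientBoostingClassifier(),
    'AdaBoost': AdaBoostClassifier(),
}
for name, model in models.items():
    score = model.fit(X_train, y_train).score(X_test, y_test)
    print(f'{name}: {score:.2%}')
```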