Skip to content

This project predicts customer revenue using website traffic data. It classifies customer purchase intent with Linear Discriminant Analysis (LDA) and estimates total spending using Multivariate Adaptive Regression Splines (MARS). The target variable is the logarithm of customer-level sales.

Notifications You must be signed in to change notification settings

Biswas-N/CustomerRevenuePrediction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

CustomerRevenuePrediction

In many businesses, identifying which customers will make a purchase (and when) and how much will they spend, is a critical exercise. This is true for both brick-and-mortar outlets and online stores. This project's data is website traffic data acquired from an online retailer.

Data URL: Kaggle Link

The challenge: Predict total sales

The data provides information on customer's website site visit behavior. Customers may visit the store multiple times, on multiple days, with or without making a purchase. The variable $revenue$ lists the amount of money that a customer spends on a given visit in the dataset. My main goal for this project is to predict how much money a customer will spend, in total, across all visits to the site, during the allotted one-year time frame (August 2016 to August 2017).

Prediction target

More specifically, I am predicting the transformation of the aggregate customer-level sales value based on the natural log. That is, if a customer has multiple revenue transactions, then the sum of all the revenue generated across all of the transactions, i.e.,:

$$ custRevenue_i = \sum_{j=1}^{k_i} revenue_{ij} \ \ \ \forall i \in customers $$

$$ \text{where } k_i \text{ denotes the number of revenue transactions for customer } i $$

And then transform this variable as follows:

$$ targetRevenue_i= \ln(custRevenue_i + 1) \ \ \ \forall i \in customers $$

Modelling

For this project, I have used $Linear Discriminant Analysis$ and $Multivariate Adaptive Regression Splines$ to predict the following:

  1. $LDA$ for classifying the customer if they will buy something or not.
  2. $MARS$ for predicting how much they might spend, in terms of $logarithmic$ value.

About

This project predicts customer revenue using website traffic data. It classifies customer purchase intent with Linear Discriminant Analysis (LDA) and estimates total spending using Multivariate Adaptive Regression Splines (MARS). The target variable is the logarithm of customer-level sales.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages