Improve MMM Docs #612

Merged
merged 6 commits into from
Apr 5, 2024
11 changes: 5 additions & 6 deletions README.md
@@ -17,7 +17,6 @@

## Marketing Analytics Tools from [PyMC Labs](https://www.pymc-labs.com)


Unlock the power of **Marketing Mix Modeling (MMM)** and **Customer Lifetime Value (CLV)** analytics with PyMC-Marketing. This open-source marketing analytics tool empowers businesses to make smarter, data-driven decisions for maximizing ROI in marketing campaigns.

## Quick Installation Guide for Marketing Mix Modeling (MMM) & CLV
@@ -42,11 +41,11 @@ Leverage our Bayesian MMM API to tailor your marketing strategies effectively. B
- **Adstock Transformation**: Optimize the carry-over effects in your marketing channels.
- **Saturation Effects**: Understand the diminishing returns in media investments.
- **Budget Optimization**: Allocate your marketing spend efficiently across various channels for maximum ROI.
- **Experiment Calibration**: Fine-tune your model based on empirical experiments for more unified view of marketing.
- **Experiment Calibration**: Fine-tune your model based on empirical experiments for a more unified view of marketing.

Explore a hands-on [simulated example](https://pymc-marketing.readthedocs.io/en/stable/notebooks/mmm/mmm_example.html) for more insights into MMM with PyMC-Marketing.

### Essential Reading for Marketing Mix Modeling (MMM):
### Essential Reading for Marketing Mix Modeling (MMM)

- [Bayesian Media Mix Modeling for Marketing Optimization](https://www.pymc-labs.com/blog-posts/bayesian-media-mix-modeling-for-marketing-optimization/)
- [Improving the Speed and Accuracy of Bayesian Marketing Mix Models](https://www.pymc-labs.com/blog-posts/reducing-customer-acquisition-costs-how-we-helped-optimizing-hellofreshs-marketing-budget/)
@@ -77,7 +76,7 @@ Explore our detailed CLV examples using data from the [`lifetimes`](https://gith

PyMC-Marketing is and will always be free for commercial use, licensed under [Apache 2.0](LICENSE). Developed by core developers behind the popular PyMC package and marketing experts, it provides state-of-the-art measurements and analytics for marketing teams.

Due to its open source nature and active contributor base, new features get added constantly. Missing a feature or want to contribute? Fork our repository and submit a pull request. For any questions, feel free to [open an issue](https://github.com/your-repo/issues).
Due to its open-source nature and active contributor base, new features are added constantly. Missing a feature or want to contribute? Fork our repository and submit a pull request. For any questions, feel free to [open an issue](https://github.com/your-repo/issues).

## Marketing AI Assistant: MMM-GPT with PyMC-Marketing

@@ -95,7 +94,7 @@ For businesses looking to integrate PyMC-Marketing into their operational framew

We provide the following professional services:

- **Custom Models**: We tailor niche marketing anayltics models to fit your organization's unique needs.
- **Build Within PyMC-Marketing**: Our team are experts leveraging the capabilities of PyMC-Marketing to create robust marketing models for precise insights.
- **Custom Models**: We tailor niche marketing analytics models to fit your organization's unique needs.
- **Build Within PyMC-Marketing**: Our team members are experts leveraging the capabilities of PyMC-Marketing to create robust marketing models for precise insights.
- **SLA & Coaching**: Get guaranteed support levels and personalized coaching to ensure your team is well-equipped and confident in using our tools and approaches.
- **SaaS Solutions**: Harness the power of our state-of-the-art software solutions to streamline your data-driven marketing initiatives.
4 changes: 4 additions & 0 deletions docs/source/guide/mmm/mmm_intro.md
@@ -7,6 +7,7 @@ One approach might be to use heuristics, i.e. sensible rules of thumb, about wha
Fortunately, with Bayesian modeling, we can do better than this! So-called Media Mix Modeling (MMM) can estimate how effective each advertising channel is in driving our outcome measure of interest, whether that is sales, new customer acquisitions, or any other key performance indicator (KPI). Once we have estimated each channel's effectiveness we can optimize our budget allocation to maximize our KPI.

## What can you do with Media Mix Modeling?

Media Mix Modeling gives rich insights and is used in many ways, but here are some of the highlights:

1. Understand the effectiveness of different media channels in driving customer acquisition. Not only can you learn from data about the most influential media channels for your business, but you can update this understanding over time. By incorporating new marketing and customer acquisition data on an ongoing basis, you can learn about the changing effectiveness of each channel over time.
@@ -19,6 +20,7 @@ Media Mix Modeling gives rich insights and is used in many ways, but here are so
![](bayesian_mmm_workflow2.png)

## How does Media Mix Modeling work?

In simple terms, we can understand MMMs as regression modeling applied to business data. The goal is to estimate the impact of marketing activities and other drivers on a metric of interest, such as the number of new customers per week.

To do this, we use two main types of predictor variables:
@@ -28,11 +30,13 @@ To do this, we use two main types of predictor variables:
The basic approach to MMMs uses linear regression to estimate a set of coefficients for the relative importance of each of these predictors, but real-world MMMs commonly also incorporate non-linear factors to more accurately capture the effect of marketing activities on consumer behaviour:

### The reach (or saturation) function

Rather than model our KPI as a linear function of marketing spend, the reach function models the potential saturation of different channels: while the initial money spent on an advertising channel might have a big impact on customer acquisition, further investment will often lead to diminishing returns as people get used to the message. When we think about optimization, modeling this effect is critical. Some channels may be nowhere close to being saturated and yield significant increases in customer acquisitions for additional spending on that channel. Knowing the saturation of each channel is vital in making future marketing spend decisions.

![](reach-function.png)
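
To make the diminishing-returns idea concrete, here is a minimal NumPy sketch of one common saturation curve, the logistic form. The function name `logistic_saturation`, the parameter `lam`, and the spend values are illustrative assumptions, not necessarily what the library uses internally.

```python
import numpy as np


def logistic_saturation(spend, lam=0.5):
    """Map non-negative spend to [0, 1) with diminishing returns.

    `lam` (an assumed name) controls how quickly the channel saturates.
    """
    spend = np.asarray(spend, dtype=float)
    return (1.0 - np.exp(-lam * spend)) / (1.0 + np.exp(-lam * spend))


# Hypothetical spend levels for a single channel.
spend = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
response = logistic_saturation(spend)

# Average gain per extra unit of spend between consecutive levels.
marginal_gain = np.diff(response) / np.diff(spend)
```

The response keeps increasing with spend, but `marginal_gain` shrinks at every step, which is the saturation effect described above.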

### The adstock function

The marketing spend for a given channel may have a short-term effect or long-term impact. Remember that jingle from a TV ad you saw 20 years ago? That's a great long-term impact. The adstock function captures these time-course effects of different advertising channels. Knowing this is crucial - if we know some channels have short-term effects that quickly decay over time, we could plan to do more frequent marketing. But suppose another channel has a long, drawn-out impact on driving customer acquisitions. In that case, it may be more effective to use that channel more infrequently.

![](adstock_function.png)
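
The carry-over idea can be sketched with a geometric adstock, where each period's spend keeps contributing in later periods with geometrically decaying weight. This is a simplified, unnormalized illustration; the names `geometric_adstock`, `alpha`, and `max_lag` are assumptions for this example and it is not presented as the library's implementation.

```python
import numpy as np


def geometric_adstock(spend, alpha=0.5, max_lag=4):
    """Spread each period's spend over the following `max_lag` periods.

    `alpha` in [0, 1) is the assumed retention rate: the fraction of the
    effect that carries over from one period to the next.
    """
    spend = np.asarray(spend, dtype=float)
    weights = alpha ** np.arange(max_lag + 1)  # 1, alpha, alpha**2, ...
    # Full convolution, truncated to the original series length.
    return np.convolve(spend, weights)[: len(spend)]


# A single burst of spend in the first week, then nothing.
spend = np.array([100.0, 0.0, 0.0, 0.0, 0.0, 0.0])
carried = geometric_adstock(spend, alpha=0.5, max_lag=4)
# carried is 100, 50, 25, 12.5, 6.25 and then 0 once max_lag is exceeded.
```

A small `alpha` corresponds to a channel whose effect fades quickly, while an `alpha` close to 1 corresponds to the long, drawn-out impact described above.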
2 changes: 2 additions & 0 deletions pymc_marketing/mmm/base.py
@@ -1,3 +1,5 @@
"""Base class for Marketing Mix Models (MMM)."""

import warnings
from inspect import (
getattr_static,
3 changes: 2 additions & 1 deletion pymc_marketing/mmm/budget_optimizer.py
@@ -1,4 +1,5 @@
# optimization_utils.py
"""Budget optimization module."""

from typing import Dict, List, Optional, Tuple

import numpy as np
148 changes: 138 additions & 10 deletions pymc_marketing/mmm/delayed_saturated_mmm.py
@@ -1,3 +1,5 @@
"""Media Mix Model with delayed adstock and logistic saturation class."""

import json
from pathlib import Path
from typing import Any, Dict, List, Optional, Union
@@ -30,6 +32,13 @@


class BaseDelayedSaturatedMMM(MMM):
"""Base class for a media mix model with delayed adstock and logistic saturation (see [1]_).

References
----------
.. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017).
"""

_model_type = "DelayedSaturatedMMM"
version = "0.0.2"

@@ -45,7 +54,7 @@ def __init__(
yearly_seasonality: Optional[int] = None,
**kwargs,
) -> None:
"""Media Mix Model with delayed adstock and logistic saturation class (see [1]_).
"""Constructor method.

Parameters
----------
@@ -65,10 +74,6 @@ def __init__(
Number of lags to consider in the adstock transformation, by default 4
yearly_seasonality : Optional[int], optional
Number of Fourier modes to model yearly seasonality, by default None.

References
----------
.. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017).
"""
self.control_columns = control_columns
self.adstock_max_lag = adstock_max_lag
@@ -96,9 +101,9 @@ def output_var(self):
def _generate_and_preprocess_model_data( # type: ignore
self, X: Union[pd.DataFrame, pd.Series], y: Union[pd.Series, np.ndarray]
) -> None:
"""
Applies preprocessing to the data before fitting the model.
if validate is True, it will check if the data is valid for the model.
"""Applies preprocessing to the data before fitting the model.

If validate is True, it will check if the data is valid for the model.
Sets self.model_coords based on the provided dataset.

Parameters
@@ -385,6 +390,7 @@ def build_model(
)

mu_var = intercept + channel_contributions.sum(axis=-1)

if (
self.control_columns is not None
and len(self.control_columns) > 0
@@ -412,6 +418,7 @@ def build_model(
)

mu_var += control_contributions.sum(axis=-1)

if (
hasattr(self, "fourier_columns")
and self.fourier_columns is not None
@@ -489,10 +496,12 @@ def channel_contributions_forward_pass(
self, channel_data: npt.NDArray[np.float_]
) -> npt.NDArray[np.float_]:
"""Evaluate the channel contribution for the given channel data and a fitted model, i.e. the forward pass.

Parameters
----------
channel_data : array-like
Input channel data. Result of all the preprocessing steps.

Returns
-------
array-like
@@ -706,13 +715,129 @@ class DelayedSaturatedMMM(
ValidateControlColumns,
BaseDelayedSaturatedMMM,
):
...
"""Media Mix Model with delayed adstock and logistic saturation class (see [1]_).

Given a time series target variable :math:`y_{t}` (e.g. sales or conversions), media variables
:math:`x_{m, t}` (e.g. impressions, clicks or costs) and a set of control covariates :math:`z_{c, t}` (e.g. holidays, special events)
we consider a Bayesian linear model of the form:

.. math::
y_{t} = \\alpha + \\sum_{m=1}^{M}\\beta_{m}f(x_{m, t}) + \\sum_{c=1}^{C}\\gamma_{c}z_{c, t} + \\varepsilon_{t},

where :math:`\\alpha` is the intercept, :math:`f` is a media transformation function and :math:`\\varepsilon_{t}` is the error term
which we assume is normally distributed. The function :math:`f` encodes the contribution of media on the target variable.
Typically we consider two types of transformation: adstock (carry-over) and saturation effects.

Notes
-----
Here are some important notes about the model:

1. Before fitting the model, we scale the target variable and the media channels using the maximum absolute value of each variable.
This enables us to have a more stable model and better convergence. If control variables are present, we do not scale them!
If needed, please scale them before passing the data to the model.

2. You can add yearly seasonality controls as Fourier modes. Use the `yearly_seasonality` parameter to specify the number of Fourier modes to include.

3. This class also allows us to calibrate the model using:

- Custom priors for the parameters via the `model_config` parameter. You can also set the likelihood distribution.
- Adding lift tests to the likelihood function via the :meth:`add_lift_test_measurements <pymc_marketing.mmm.delayed_saturated_mmm.DelayedSaturatedMMM.add_lift_test_measurements>` method.

For details on a vanilla implementation in PyMC, see [2]_.

Examples
--------
Here is an example of how to instantiate the model with the default configuration:

.. code-block:: python

    import numpy as np
    import pandas as pd

    from pymc_marketing.mmm import DelayedSaturatedMMM

    data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/datasets/mmm_example.csv"
    data = pd.read_csv(data_url, parse_dates=["date_week"])

    mmm = DelayedSaturatedMMM(
        date_column="date_week",
        channel_columns=["x1", "x2"],
        control_columns=[
            "event_1",
            "event_2",
            "t",
        ],
        adstock_max_lag=8,
        yearly_seasonality=2,
    )

Now we can fit the model with the data:

.. code-block:: python

    # Set features and target
    X = data.drop("y", axis=1)
    y = data["y"]

    # Fit the model
    idata = mmm.fit(X, y)

We can also define custom priors for the model:

.. code-block:: python

    my_model_config = {
        "beta_channel": {
            "dist": "LogNormal",
            "kwargs": {"mu": np.array([2, 1]), "sigma": 1},
        },
        "likelihood": {
            "dist": "Normal",
            "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}},
        },
    }

    mmm = DelayedSaturatedMMM(
        model_config=my_model_config,
        date_column="date_week",
        channel_columns=["x1", "x2"],
        control_columns=[
            "event_1",
            "event_2",
            "t",
        ],
        adstock_max_lag=8,
        yearly_seasonality=2,
    )

The `model_config` parameter lets us choose the prior distribution of each parameter as well as the likelihood of the model.

The `fit` method accepts keyword arguments that are passed to the PyMC sampling method.
For example, to change the number of samples and chains and to use a JAX implementation of NUTS, we can do:

.. code-block:: python

    sampler_kwargs = {
        "draws": 2_000,
        "target_accept": 0.9,
        "chains": 5,
        "random_seed": 42,
    }

    idata = mmm.fit(X, y, nuts_sampler="numpyro", **sampler_kwargs)

References
----------
.. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017).
.. [2] Orduz, J. `"Media Effect Estimation with PyMC: Adstock, Saturation & Diminishing Returns" <https://juanitorduz.github.io/pymc_mmm/>`_.
"""

def channel_contributions_forward_pass(
self, channel_data: npt.NDArray[np.float_]
) -> npt.NDArray[np.float_]:
"""Evaluate the channel contribution for the given channel data and a fitted model, i.e. the forward pass.
We return the contribution in the original scale of the target variable.

Parameters
----------
channel_data : array-like
@@ -735,7 +860,8 @@ def channel_contributions_forward_pass(
def get_channel_contributions_forward_pass_grid(
self, start: float, stop: float, num: int
) -> DataArray:
"""Generate a grid of scaled channel contributions for a given grid of share values.
"""Generate a grid of scaled channel contributions for a given grid of share values.

Parameters
----------
start : float
@@ -782,6 +908,7 @@ def plot_channel_contributions_grid(
**plt_kwargs: Any,
) -> plt.Figure:
"""Plots a grid of scaled channel contributions for a given grid of share values.

Parameters
----------
start : float
@@ -793,6 +920,7 @@ def plot_channel_contributions_grid(
absolute_xrange : bool, optional
If True, the x-axis is in absolute values (input units), otherwise it is in
relative percentage values, by default False.

Returns
-------
plt.Figure
2 changes: 2 additions & 0 deletions pymc_marketing/mmm/preprocessing.py
@@ -1,3 +1,5 @@
"""Preprocessing methods for the Marketing Mix Model."""

from typing import Any, Callable, List, Tuple, Union

import numpy as np
2 changes: 2 additions & 0 deletions pymc_marketing/mmm/transformers.py
@@ -1,3 +1,5 @@
"""Media transformation functions for Marketing Mix Models."""

from enum import Enum
from typing import Any, NamedTuple, Union

2 changes: 2 additions & 0 deletions pymc_marketing/mmm/utils.py
@@ -1,3 +1,5 @@
"""Utility functions for the Marketing Mix Modeling module."""

import re
from typing import Any, Callable, Dict, List, Optional, Tuple, Union

2 changes: 2 additions & 0 deletions pymc_marketing/mmm/validating.py
@@ -1,3 +1,5 @@
"""Validating methods for MMM classes."""

from typing import Callable, List, Optional, Tuple, Union

import pandas as pd