diff --git a/README.md b/README.md index edbf21da8..ebd119327 100644 --- a/README.md +++ b/README.md @@ -17,7 +17,6 @@ ## Marketing Analytics Tools from [PyMC Labs](https://www.pymc-labs.com) - Unlock the power of **Marketing Mix Modeling (MMM)** and **Customer Lifetime Value (CLV)** analytics with PyMC-Marketing. This open-source marketing analytics tool empowers businesses to make smarter, data-driven decisions for maximizing ROI in marketing campaigns. ## Quick Installation Guide for Marketing Mix Modeling (MMM) & CLV @@ -42,11 +41,11 @@ Leverage our Bayesian MMM API to tailor your marketing strategies effectively. B - **Adstock Transformation**: Optimize the carry-over effects in your marketing channels. - **Saturation Effects**: Understand the diminishing returns in media investments. - **Budget Optimization**: Allocate your marketing spend efficiently across various channels for maximum ROI. -- **Experiment Calibration**: Fine-tune your model based on empirical experiments for more unified view of marketing. +- **Experiment Calibration**: Fine-tune your model based on empirical experiments for a more unified view of marketing. Explore a hands-on [simulated example](https://pymc-marketing.readthedocs.io/en/stable/notebooks/mmm/mmm_example.html) for more insights into MMM with PyMC-Marketing. -### Essential Reading for Marketing Mix Modeling (MMM): +### Essential Reading for Marketing Mix Modeling (MMM) - [Bayesian Media Mix Modeling for Marketing Optimization](https://www.pymc-labs.com/blog-posts/bayesian-media-mix-modeling-for-marketing-optimization/) - [Improving the Speed and Accuracy of Bayesian Marketing Mix Models](https://www.pymc-labs.com/blog-posts/reducing-customer-acquisition-costs-how-we-helped-optimizing-hellofreshs-marketing-budget/) @@ -77,7 +76,7 @@ Explore our detailed CLV examples using data from the [`lifetimes`](https://gith PyMC-Marketing is and will always be free for commercial use, licensed under [Apache 2.0](LICENSE). 
Developed by core developers behind the popular PyMC package and marketing experts, it provides state-of-the-art measurements and analytics for marketing teams. -Due to its open source nature and active contributor base, new features get added constantly. Missing a feature or want to contribute? Fork our repository and submit a pull request. For any questions, feel free to [open an issue](https://github.com/your-repo/issues). +Due to its open-source nature and active contributor base, new features are added constantly. Missing a feature or want to contribute? Fork our repository and submit a pull request. For any questions, feel free to [open an issue](https://github.com/your-repo/issues). ## Marketing AI Assistant: MMM-GPT with PyMC-Marketing @@ -95,7 +94,7 @@ For businesses looking to integrate PyMC-Marketing into their operational framew We provide the following professional services: -- **Custom Models**: We tailor niche marketing anayltics models to fit your organization's unique needs. -- **Build Within PyMC-Marketing**: Our team are experts leveraging the capabilities of PyMC-Marketing to create robust marketing models for precise insights. +- **Custom Models**: We tailor niche marketing analytics models to fit your organization's unique needs. +- **Build Within PyMC-Marketing**: Our team members are experts leveraging the capabilities of PyMC-Marketing to create robust marketing models for precise insights. - **SLA & Coaching**: Get guaranteed support levels and personalized coaching to ensure your team is well-equipped and confident in using our tools and approaches. - **SaaS Solutions**: Harness the power of our state-of-the-art software solutions to streamline your data-driven marketing initiatives. 
diff --git a/docs/source/guide/mmm/mmm_intro.md b/docs/source/guide/mmm/mmm_intro.md index 910c18753..daf49dd6b 100644 --- a/docs/source/guide/mmm/mmm_intro.md +++ b/docs/source/guide/mmm/mmm_intro.md @@ -7,6 +7,7 @@ One approach might be to use heuristics, i.e. sensible rules of thumb, about wha Fortunately, with Bayesian modeling, we can do better than this! So-called Media Mix Modeling (MMM) can estimate how effective each advertising channel is in driving our outcome measure of interest, whether that is sales, new customer acquisitions, or any other key performance indicator (KPI). Once we have estimated each channel's effectiveness we can optimize our budget allocation to maximize our KPI. ## What can you do with Media Mix Modeling? + Media Mix Modeling gives rich insights and is used in many ways, but here are some of the highlights: 1. Understand the effectiveness of different media channels in driving customer acquisition. Not only can you learn from data about the most influential media channels for your business, but you can update this understanding over time. By incorporating new marketing and customer acquisition data on an ongoing basis, you can learn about the changing effectiveness of each channel over time. @@ -19,6 +20,7 @@ Media Mix Modeling gives rich insights and is used in many ways, but here are so ![](bayesian_mmm_workflow2.png) ## How does Media Mix Modeling work? + In simple terms, we can understand MMMs as regression modeling applied to business data. The goal is to estimate the impact of marketing activities and other drivers on a metric of interest, such as the number of new customers per week. 
To do this, we use two main types of predictor variables: @@ -28,11 +30,13 @@ To do this, we use two main types of predictor variables: The basic approach to MMMs uses linear regression to estimate a set of coefficients for the relative importance of each of these predictors, but real-world MMMs commonly incorporate also non-linear factors to more accurately capture the effect of marketing activities on consumer behaviour: ### The reach (or saturation) function + Rather than model our KPI as a linear function of marketing spend, the reach function models the potential saturation of different channels: While the initial money spent on an advertising channel might have a big impact on customer acquisition, further investment will often lead to diminishing returns as people get used to the message. When we think about optimization, modeling this effect is critical. Some channels may be nowhere close to being saturated and yield significant increases in customer acquisitions for spending for that channel. Knowing the saturation of each channel is vital in making future marketing spend decisions. ![](reach-function.png) ### The adstock function + The marketing spend for a given channel may have a short-term effect or long-term impact. Remember that jingle from a TV ad you've seen 20 years ago? That's a great long-term impact. The adstock function captures these time-course effects of different advertising channels. Knowing this is crucial - if we know some channels have short-term effects that quickly decay over time, we could plan to do more frequent marketing. But suppose another channel has a long, drawn-out impact on driving customer acquisitions. In that case, it may be more effective to use that channel more infrequently. 
![](adstock_function.png) diff --git a/pymc_marketing/mmm/base.py b/pymc_marketing/mmm/base.py index 08d1c1b3a..3043e6f01 100644 --- a/pymc_marketing/mmm/base.py +++ b/pymc_marketing/mmm/base.py @@ -1,3 +1,5 @@ +"""Base class for Marketing Mix Models (MMM).""" + import warnings from inspect import ( getattr_static, diff --git a/pymc_marketing/mmm/budget_optimizer.py b/pymc_marketing/mmm/budget_optimizer.py index fae66ee07..fc18126ad 100644 --- a/pymc_marketing/mmm/budget_optimizer.py +++ b/pymc_marketing/mmm/budget_optimizer.py @@ -1,4 +1,5 @@ -# optimization_utils.py +"""Budget optimization module.""" + from typing import Dict, List, Optional, Tuple import numpy as np diff --git a/pymc_marketing/mmm/delayed_saturated_mmm.py b/pymc_marketing/mmm/delayed_saturated_mmm.py index 7bdeffb26..2942666b9 100644 --- a/pymc_marketing/mmm/delayed_saturated_mmm.py +++ b/pymc_marketing/mmm/delayed_saturated_mmm.py @@ -1,3 +1,5 @@ +"""Media Mix Model with delayed adstock and logistic saturation class.""" + import json from pathlib import Path from typing import Any, Dict, List, Optional, Union @@ -30,6 +32,13 @@ class BaseDelayedSaturatedMMM(MMM): + """Base class for a media mix model with delayed adstock and logistic saturation class (see [1]_). + + References + ---------- + .. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017). + """ + _model_type = "DelayedSaturatedMMM" version = "0.0.2" @@ -45,7 +54,7 @@ def __init__( yearly_seasonality: Optional[int] = None, **kwargs, ) -> None: - """Media Mix Model with delayed adstock and logistic saturation class (see [1]_). + """Constructor method. Parameters ---------- @@ -65,10 +74,6 @@ def __init__( Number of lags to consider in the adstock transformation, by default 4 yearly_seasonality : Optional[int], optional Number of Fourier modes to model yearly seasonality, by default None. - - References - ---------- - .. [1] Jin, Yuxue, et al. 
“Bayesian methods for media mix modeling with carryover and shape effects.” (2017). """ self.control_columns = control_columns self.adstock_max_lag = adstock_max_lag @@ -96,9 +101,9 @@ def output_var(self): def _generate_and_preprocess_model_data( # type: ignore self, X: Union[pd.DataFrame, pd.Series], y: Union[pd.Series, np.ndarray] ) -> None: - """ - Applies preprocessing to the data before fitting the model. - if validate is True, it will check if the data is valid for the model. + """Applies preprocessing to the data before fitting the model. + + If validate is True, it will check if the data is valid for the model. sets self.model_coords based on provided dataset Parameters @@ -385,6 +390,7 @@ def build_model( ) mu_var = intercept + channel_contributions.sum(axis=-1) + if ( self.control_columns is not None and len(self.control_columns) > 0 @@ -412,6 +418,7 @@ def build_model( ) mu_var += control_contributions.sum(axis=-1) + if ( hasattr(self, "fourier_columns") and self.fourier_columns is not None @@ -489,10 +496,12 @@ def channel_contributions_forward_pass( self, channel_data: npt.NDArray[np.float_] ) -> npt.NDArray[np.float_]: """Evaluate the channel contribution for a given channel data and a fitted model, ie. the forward pass. + Parameters ---------- channel_data : array-like Input channel data. Result of all the preprocessing steps. + Returns ------- array-like @@ -706,13 +715,129 @@ class DelayedSaturatedMMM( ValidateControlColumns, BaseDelayedSaturatedMMM, ): - ... + """Media Mix Model with delayed adstock and logistic saturation class (see [1]_). + + Given a time series target variable :math:`y_{t}` (e.g. sales or conversions), media variables + :math:`x_{m, t}` (e.g. impressions, clicks or costs) and a set of control covariates :math:`z_{c, t}` (e.g. holidays, special events) + we consider a Bayesian linear model of the form: + + ..
math:: + y_{t} = \\alpha + \\sum_{m=1}^{M}\\beta_{m}f(x_{m, t}) + \\sum_{c=1}^{C}\\gamma_{c}z_{c, t} + \\varepsilon_{t}, + + where :math:`\\alpha` is the intercept, :math:`f` is a media transformation function and :math:`\\varepsilon_{t}` is the error term + which we assume is normally distributed. The function :math:`f` encodes the contribution of media on the target variable. + Typically we consider two types of transformation: adstock (carry-over) and saturation effects. + + Notes + ----- + Here are some important notes about the model: + + 1. Before fitting the model, we scale the target variable and the media channels using the maximum absolute value of each variable. + This enables us to have a more stable model and better convergence. If control variables are present, we do not scale them! + If needed, please scale them before passing the data to the model. + + 2. Yearly seasonality controls can be added as Fourier modes. You can use the `yearly_seasonality` parameter to specify the number of Fourier modes to include. + + 3. This class also allows us to calibrate the model using: + + - Custom priors for the parameters via the `model_config` parameter. You can also set the likelihood distribution. + - Adding lift tests to the likelihood function via the :meth:`add_lift_test_measurements ` method. + + For details on a vanilla implementation in PyMC, see [2]_. + + Examples + -------- + Here is an example of how to instantiate the model with the default configuration: + + ..
code-block:: python + + import numpy as np + import pandas as pd + + from pymc_marketing.mmm import DelayedSaturatedMMM + + data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/datasets/mmm_example.csv" + data = pd.read_csv(data_url, parse_dates=["date_week"]) + + mmm = DelayedSaturatedMMM( + date_column="date_week", + channel_columns=["x1", "x2"], + control_columns=[ + "event_1", + "event_2", + "t", + ], + adstock_max_lag=8, + yearly_seasonality=2, + ) + + Now we can fit the model with the data: + + .. code-block:: python + + # Set features and target + X = data.drop("y", axis=1) + y = data["y"] + + # Fit the model + idata = mmm.fit(X, y) + + We can also define custom priors for the model: + + .. code-block:: python + + my_model_config = { + "beta_channel": { + "dist": "LogNormal", + "kwargs": {"mu": np.array([2, 1]), "sigma": 1}, + }, + "likelihood": { + "dist": "Normal", + "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}}, + }, + } + + mmm = DelayedSaturatedMMM( + model_config=my_model_config, + date_column="date_week", + channel_columns=["x1", "x2"], + control_columns=[ + "event_1", + "event_2", + "t", + ], + adstock_max_lag=8, + yearly_seasonality=2, + ) + + As you can see, we can configure all prior and likelihood distributions via the `model_config`. + + The `fit` method accepts keyword arguments that are passed to the PyMC sampling method. + For example, to change the number of samples and chains, and using a JAX implementation of NUTS we can do: + + .. code-block:: python + + sampler_kwargs = { + "draws": 2_000, + "target_accept": 0.9, + "chains": 5, + "random_seed": 42, + } + + idata = mmm.fit(X, y, nuts_sampler="numpyro", **sampler_kwargs) + + References + ---------- + .. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017). + .. [2] Orduz, J. `"Media Effect Estimation with PyMC: Adstock, Saturation & Diminishing Returns" `_. 
+ """ def channel_contributions_forward_pass( self, channel_data: npt.NDArray[np.float_] ) -> npt.NDArray[np.float_]: """Evaluate the channel contribution for a given channel data and a fitted model, ie. the forward pass. We return the contribution in the original scale of the target variable. + Parameters ---------- channel_data : array-like @@ -735,7 +860,8 @@ def channel_contributions_forward_pass( def get_channel_contributions_forward_pass_grid( self, start: float, stop: float, num: int ) -> DataArray: - """Generate a grid of scaled channel contributions for a given grid of share values. + """Generate a grid of scaled channel contributions for a given grid of shared values. + Parameters ---------- start : float @@ -782,6 +908,7 @@ def plot_channel_contributions_grid( **plt_kwargs: Any, ) -> plt.Figure: """Plots a grid of scaled channel contributions for a given grid of share values. + Parameters ---------- start : float @@ -793,6 +920,7 @@ def plot_channel_contributions_grid( absolute_xrange : bool, optional If True, the x-axis is in absolute values (input units), otherwise it is in relative percentage values, by default False. 
+ Returns ------- plt.Figure diff --git a/pymc_marketing/mmm/preprocessing.py b/pymc_marketing/mmm/preprocessing.py index 125728cd1..ddb768657 100644 --- a/pymc_marketing/mmm/preprocessing.py +++ b/pymc_marketing/mmm/preprocessing.py @@ -1,3 +1,5 @@ +"""Preprocessing methods for the Marketing Mix Model.""" + from typing import Any, Callable, List, Tuple, Union import numpy as np diff --git a/pymc_marketing/mmm/transformers.py b/pymc_marketing/mmm/transformers.py index 07ccda6bf..a4e8b515d 100644 --- a/pymc_marketing/mmm/transformers.py +++ b/pymc_marketing/mmm/transformers.py @@ -1,3 +1,5 @@ +"""Media transformation functions for Marketing Mix Models.""" + from enum import Enum from typing import Any, NamedTuple, Union diff --git a/pymc_marketing/mmm/utils.py b/pymc_marketing/mmm/utils.py index e879933db..3d9b89398 100644 --- a/pymc_marketing/mmm/utils.py +++ b/pymc_marketing/mmm/utils.py @@ -1,3 +1,5 @@ +"""Utility functions for the Marketing Mix Modeling module.""" + import re from typing import Any, Callable, Dict, List, Optional, Tuple, Union diff --git a/pymc_marketing/mmm/validating.py b/pymc_marketing/mmm/validating.py index 1753fa30e..3532c586c 100644 --- a/pymc_marketing/mmm/validating.py +++ b/pymc_marketing/mmm/validating.py @@ -1,3 +1,5 @@ +"""Validating methods for MMM classes.""" + from typing import Callable, List, Optional, Tuple, Union import pandas as pd