Minor improvements [MMM] (#735)
juanitorduz authored Jun 12, 2024
1 parent b6a938f commit 5296f0f
Showing 6 changed files with 80 additions and 34 deletions.
15 changes: 11 additions & 4 deletions README.md
@@ -54,13 +54,15 @@ We provide a `Dockerfile` to build a Docker image for PyMC-Marketing so that is

## In-depth Bayesian Marketing Mix Modeling (MMM) in PyMC

Leverage our Bayesian MMM API to tailor your marketing strategies effectively. Based on the research [Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017)](https://research.google/pubs/pub46001/), and integrating the expertise from core PyMC developers, our API provides:
Leverage our Bayesian MMM API to tailor your marketing strategies effectively. Leveraging on top of the research article [Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017)](https://research.google/pubs/pub46001/), and extending it by integrating the expertise from core PyMC developers, our API provides:

- **Custom Priors and Likelihoods**: Tailor your model to your specific business needs by including domain knowledge via prior distributions.
- **Adstock Transformation**: Optimize the carry-over effects in your marketing channels.
- **Saturation Effects**: Understand the diminishing returns in media investments.
- **Customize adstock and saturation functions:** You can select from a variety of adstock and saturation functions. You can even implement your own custom functions.
- **Time-varying Intercept:** Capture time-varying baseline contributions in your model (using modern and efficient Gaussian processes approximation methods).
- **Visualization and Model Diagnostics**: Get a comprehensive view of your model's performance and insights.
- **Choose among many inference algorithms**: We provide the option to choose between various NUTS samplers (e.g. BlackJax, NumPyro and Nutpie). See the [example notebook](https://www.pymc-marketing.io/en/stable/notebooks/general/other_nuts_samplers.html) for more details.
- **Out-of-sample Predictions**: Forecast future marketing performance with credible intervals. Use this for simulations and scenario planning.
- **Budget Optimization**: Allocate your marketing spend efficiently across various channels for maximum ROI.
- **Experiment Calibration**: Fine-tune your model based on empirical experiments for a more unified view of marketing.
@@ -69,12 +71,14 @@ Leverage our Bayesian MMM API to tailor your marketing strategies effectively. B

```python
import pandas as pd
from pymc_marketing.mmm import DelayedSaturatedMMM
from pymc_marketing.mmm import MMM

data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/mmm_example.csv"
data = pd.read_csv(data_url, parse_dates=['date_week'])
data = pd.read_csv(data_url, parse_dates=["date_week"])

mmm = DelayedSaturatedMMM(
mmm = MMM(
adstock="geometric",
saturation="logistic",
date_column="date_week",
channel_columns=["x1", "x2"],
control_columns=[
@@ -106,6 +110,9 @@ Once the model is fitted, we can further optimize our budget allocation as we ar

Explore a hands-on [simulated example](https://pymc-marketing.readthedocs.io/en/stable/notebooks/mmm/mmm_example.html) for more insights into MMM with PyMC-Marketing.

${\color{red}\textbf{Warning!}}$ We will deprecate the `DelayedSaturatedMMM` class in the next releases.
Please use the `MMM` class instead.

### Essential Reading for Marketing Mix Modeling (MMM)

- [Bayesian Media Mix Modeling for Marketing Optimization](https://www.pymc-labs.com/blog-posts/bayesian-media-mix-modeling-for-marketing-optimization/)
8 changes: 4 additions & 4 deletions docs/source/notebooks/mmm/mmm_budget_allocation_example.ipynb
@@ -252,9 +252,9 @@
"### Sigmoid Function\n",
"The sigmoid function is formulated as:\n",
"\n",
"$\n",
"$$\n",
"\\beta \\cdot \\frac{\\exp(-\\lambda x)}{1 + \\exp(-\\lambda x)}\n",
"$\n",
"$$\n",
"\n",
"Key Elements:\n",
"* β (beta): Denotes the Asymptotic Maximum or Ceiling Value. It is the point that the function approaches as the input x becomes immense.\n",
@@ -263,9 +263,9 @@
"### Michaelis-Menten Function\n",
"The Michaelis-Menten function is formulated as:\n",
"\n",
"$\n",
"$$\n",
"\\frac{\\alpha \\times x}{\\lambda + x}\n",
"$\n",
"$$\n",
"\n",
"Key Elements:\n",
"* α (Alpha or Vmax): It represents the maximum contribution (y) a channel can make, also recognized as the plateau point.\n",
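
The two saturation curves described in this notebook hunk can be sketched in plain Python. This is an illustrative sketch, not the library's implementation; the function names and the argument name `lam` (for λ) are assumptions. Note that the sigmoid expression as printed in the hunk, β·exp(-λx)/(1 + exp(-λx)), decays toward 0 as x grows, so the sketch assumes the saturating variant β·(1 - exp(-λx))/(1 + exp(-λx)), which matches the description of β as the ceiling value.

```python
import math


def michaelis_menten(x: float, alpha: float, lam: float) -> float:
    """alpha * x / (lam + x): approaches alpha as x grows, equals alpha / 2 at x == lam."""
    return alpha * x / (lam + x)


def sigmoid_saturation(x: float, beta: float, lam: float) -> float:
    """Assumed saturating form: beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x))."""
    decay = math.exp(-lam * x)
    return beta * (1.0 - decay) / (1.0 + decay)
```

Under these assumptions both curves start at 0 for zero spend and flatten out at their ceiling (α or β), which is the diminishing-returns behaviour the budget allocation example relies on.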
2 changes: 1 addition & 1 deletion pymc_marketing/constants.py
@@ -11,4 +11,4 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DAYS_IN_YEAR = 365.25
DAYS_IN_YEAR: float = 365.25
38 changes: 36 additions & 2 deletions pymc_marketing/mmm/base.py
@@ -251,6 +251,12 @@ def preprocess(
return data_cp

def get_target_transformer(self) -> Pipeline:
"""Return the target transformer pipeline used for preprocessing the target variable.
Returns
-------
Pipeline
"""
try:
return self.target_transformer # type: ignore
except AttributeError:
@@ -528,6 +534,16 @@ def _format_model_contributions(self, var_contribution: str) -> DataArray:
return contributions.sum(contracted_dims) if contracted_dims else contributions

def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
"""Plot the target variable and the posterior predictive model components in
the scaled space.
**plt_kwargs
Additional keyword arguments to pass to `plt.subplots`.
Returns
-------
plt.Figure
"""
channel_contributions = self._format_model_contributions(
var_contribution="channel_contributions"
)
@@ -610,6 +626,7 @@ def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
ax.plot(
np.asarray(self.X[self.date_column]),
np.asarray(self.preprocessed_data["y"]), # type: ignore
label="scaled target",
color="black",
)
ax.legend(title="components", loc="center left", bbox_to_anchor=(1, 0.5))
@@ -621,6 +638,12 @@ def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
return fig

def compute_channel_contribution_original_scale(self) -> DataArray:
"""Compute the channel contributions in the original scale of the target variable.
Returns
-------
DataArray
"""
channel_contribution = az.extract(
data=self.fit_result, var_names=["channel_contributions"], combined=False
)
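
The idea behind moving contributions from the scaled space back to the original scale of the target can be sketched as follows. This assumes the target is max-abs scaled (the diff imports `MaxAbsScaleTarget`), so the inverse is a multiplication by the stored scale; the helper names are hypothetical and the actual method works through the fitted target transformer rather than these functions.

```python
def max_abs_scale(values: list[float]) -> tuple[list[float], float]:
    """Divide by the largest absolute value so the result lies in [-1, 1]."""
    scale = max(abs(v) for v in values)
    return [v / scale for v in values], scale


def to_original_scale(scaled: list[float], scale: float) -> list[float]:
    """Invert max-abs scaling by multiplying by the stored scale."""
    return [v * scale for v in scaled]
```

For example, with a target of `[50.0, 200.0, 100.0]` the scale is `200.0`, and a scaled contribution of `0.25` corresponds to `50.0` in the original units.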
@@ -838,6 +861,19 @@ def _get_channel_contributions_share_samples(self) -> DataArray:
def plot_channel_contribution_share_hdi(
self, hdi_prob: float = 0.94, **plot_kwargs: Any
) -> plt.Figure:
"""Plot the share of channel contributions in a forest plot.
Parameters
----------
hdi_prob : float, optional
HDI value to be displayed, by default 0.94
**plot_kwargs
Additional keyword arguments to pass to `az.plot_forest`.
Returns
-------
plt.Figure
"""
channel_contributions_share: DataArray = (
self._get_channel_contributions_share_samples()
)
@@ -981,5 +1017,3 @@ class BaseValidateMMM(
ValidateChannelColumns,
):
"""Base class with some validation of the inputs."""

pass
40 changes: 24 additions & 16 deletions pymc_marketing/mmm/delayed_saturated_mmm.py
@@ -44,20 +44,15 @@
add_lift_measurements_to_likelihood,
scale_lift_measurements,
)
from pymc_marketing.mmm.preprocessing import (
MaxAbsScaleChannels,
MaxAbsScaleTarget,
)
from pymc_marketing.mmm.preprocessing import MaxAbsScaleChannels, MaxAbsScaleTarget
from pymc_marketing.mmm.tvp import create_time_varying_intercept, infer_time_index
from pymc_marketing.mmm.utils import (
_get_distribution_from_dict,
apply_sklearn_transformer_across_dim,
create_new_spend_data,
generate_fourier_modes,
)
from pymc_marketing.mmm.validating import (
ValidateControlColumns,
)
from pymc_marketing.mmm.validating import ValidateControlColumns

__all__ = ["BaseMMM", "MMM", "DelayedSaturatedMMM"]

@@ -71,8 +66,9 @@ class BaseMMM(BaseValidateMMM):
.. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017).
"""

_model_type = "DelayedSaturatedMMM"
version = "0.0.2"
_model_name: str = "BaseMMM"
_model_type: str = "BaseValidateMMM"
version: str = "0.0.3"

def __init__(
self,
@@ -593,6 +589,16 @@ def default_model_config(self) -> dict:
def _get_fourier_models_data(self, X) -> pd.DataFrame:
"""Generates fourier modes to model seasonality.
Parameters
----------
X : Union[pd.DataFrame, pd.Series], shape (n_obs, n_features)
Input data for the model. To generate the Fourier modes, it must contain a date column.
Returns
-------
pd.DataFrame
Fourier modes (sin and cos with different frequencies) as columns in a dataframe.
References
----------
https://www.pymc.io/projects/examples/en/latest/time_series/Air_passengers-Prophet_with_Bayesian_workflow.html
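
As a rough illustration of what "Fourier modes (sin and cos with different frequencies)" means for yearly seasonality, here is a minimal standard-library sketch. It is not the code of `_get_fourier_models_data` or `generate_fourier_modes`; the function name, its signature, and the column names are assumptions, and only the `DAYS_IN_YEAR` value is taken from this commit.

```python
import math

DAYS_IN_YEAR: float = 365.25  # value from pymc_marketing/constants.py in this commit


def fourier_modes(day_of_year: list[float], n_order: int) -> dict[str, list[float]]:
    """Return sin/cos columns of orders 1..n_order with a one-year period."""
    columns: dict[str, list[float]] = {}
    for order in range(1, n_order + 1):
        # Each order completes `order` full cycles per year.
        angles = [2.0 * math.pi * order * d / DAYS_IN_YEAR for d in day_of_year]
        columns[f"sin_order_{order}"] = [math.sin(a) for a in angles]
        columns[f"cos_order_{order}"] = [math.cos(a) for a in angles]
    return columns
```

Higher orders add faster oscillations, so a model with more orders can fit sharper seasonal patterns at the cost of more parameters.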
@@ -680,7 +686,7 @@ def load(cls, fname: str):
If the inference data that is loaded doesn't match with the model.
"""

filepath = Path(str(fname))
filepath = Path(fname)
idata = az.from_netcdf(filepath)
model_config = cls._model_config_formatting(
json.loads(idata.attrs["model_config"])
@@ -958,7 +964,7 @@ class MMM(
""" # noqa: E501

_model_type = "MMM"
version = "0.0.2"
version = "0.0.1"

def channel_contributions_forward_pass(
self, channel_data: npt.NDArray[np.float_]
@@ -1095,6 +1101,8 @@ def plot_channel_contributions_grid(
absolute_xrange : bool, optional
If True, the x-axis is in absolute values (input units), otherwise it is in
relative percentage values, by default False.
**plt_kwargs
Keyword arguments to pass to `plt.subplots()`
Returns
-------
@@ -2024,14 +2032,13 @@ def allocate_budget_to_maximize_response(
inverse_scaled_channel_spend = self.channel_transformer.inverse_transform(
np.array([list(self.optimal_allocation_dict.values())])
)
original_scale_allocation_dict = {
k: v
for k, v in zip(
original_scale_allocation_dict = dict(
zip(
self.optimal_allocation_dict.keys(),
inverse_scaled_channel_spend[0],
strict=False,
)
}
)

synth_dataset = self._create_synth_dataset(
df=self.X,
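
The refactor in this hunk replaces an identity dict comprehension with a direct `dict(zip(...))` call. A small sketch with made-up values suggests why the two are interchangeable; the channel names follow the README example and the spend numbers are hypothetical. The commit also passes `strict=False` to `zip`, which is the default behaviour (and requires Python 3.10+), so it is omitted here.

```python
channels = ["x1", "x2"]   # channel names, as in the README example
spend = [120.0, 80.0]     # hypothetical inverse-scaled spend values

# Old form: a comprehension that only re-emits each (key, value) pair.
as_comprehension = {k: v for k, v in zip(channels, spend)}

# New form: dict() consumes the (key, value) pairs directly.
as_dict_call = dict(zip(channels, spend))
```

Both forms build the same mapping in the same order; the `dict(zip(...))` form simply drops the redundant loop variables.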
@@ -2209,8 +2216,9 @@ def plot_allocated_contribution_by_channel(


class DelayedSaturatedMMM(MMM):
_model_type = "MMM"
_model_name = "DelayedSaturatedMMM"
version = "0.0.2"
version = "0.0.3"

def __init__(
self,
11 changes: 4 additions & 7 deletions tests/mmm/test_delayed_saturated_mmm.py
@@ -23,11 +23,7 @@

from pymc_marketing.mmm.components.adstock import DelayedAdstock
from pymc_marketing.mmm.components.saturation import MichaelisMentenSaturation
from pymc_marketing.mmm.delayed_saturated_mmm import (
MMM,
BaseMMM,
DelayedSaturatedMMM,
)
from pymc_marketing.mmm.delayed_saturated_mmm import MMM, BaseMMM, DelayedSaturatedMMM

seed: int = sum(map(ord, "pymc_marketing"))
rng: np.random.Generator = np.random.default_rng(seed=seed)
@@ -340,8 +336,9 @@ def test_fit(self, toy_X: pd.DataFrame, toy_y: pd.Series) -> None:
adstock="geometric",
saturation="logistic",
)
assert mmm.version == "0.0.2"
assert mmm._model_type == "DelayedSaturatedMMM"
assert mmm.version == "0.0.3"
assert mmm._model_type == "BaseValidateMMM"
assert mmm._model_name == "BaseMMM"
assert mmm.model_config is not None
n_channel: int = len(mmm.channel_columns)
n_control: int = len(mmm.control_columns)
