Minor improvements [MMM] #735

Merged 15 commits on Jun 12, 2024
15 changes: 11 additions & 4 deletions README.md
@@ -54,13 +54,15 @@ We provide a `Dockerfile` to build a Docker image for PyMC-Marketing so that is

## In-depth Bayesian Marketing Mix Modeling (MMM) in PyMC

Leverage our Bayesian MMM API to tailor your marketing strategies effectively. Based on the research [Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017)](https://research.google/pubs/pub46001/), and integrating the expertise from core PyMC developers, our API provides:
Leverage our Bayesian MMM API to tailor your marketing strategies effectively. Leveraging on top of the research article [Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017)](https://research.google/pubs/pub46001/), and extending it by integrating the expertise from core PyMC developers, our API provides:

- **Custom Priors and Likelihoods**: Tailor your model to your specific business needs by including domain knowledge via prior distributions.
- **Adstock Transformation**: Optimize the carry-over effects in your marketing channels.
- **Saturation Effects**: Understand the diminishing returns in media investments.
- **Customize adstock and saturation functions:** You can select from a variety of adstock and saturation functions. You can even implement your own custom functions.
- **Time-varying Intercept:** Capture time-varying baseline contributions in your model (using modern and efficient Gaussian processes approximation methods).
- **Visualization and Model Diagnostics**: Get a comprehensive view of your model's performance and insights.
- **Choose among many inference algorithms**: We provide the option to choose between various NUTS samplers (e.g. BlackJax, NumPyro and Nutpie). See the [example notebook](https://www.pymc-marketing.io/en/stable/notebooks/general/other_nuts_samplers.html) for more details.
- **Out-of-sample Predictions**: Forecast future marketing performance with credible intervals. Use this for simulations and scenario planning.
- **Budget Optimization**: Allocate your marketing spend efficiently across various channels for maximum ROI.
- **Experiment Calibration**: Fine-tune your model based on empirical experiments for a more unified view of marketing.
@@ -69,12 +71,14 @@ Leverage our Bayesian MMM API to tailor your marketing strategies effectively. B

```python
import pandas as pd
from pymc_marketing.mmm import DelayedSaturatedMMM
from pymc_marketing.mmm import MMM

data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/mmm_example.csv"
data = pd.read_csv(data_url, parse_dates=['date_week'])
data = pd.read_csv(data_url, parse_dates=["date_week"])

mmm = DelayedSaturatedMMM(
mmm = MMM(
adstock="geometric",
saturation="logistic",
date_column="date_week",
channel_columns=["x1", "x2"],
control_columns=[
@@ -106,6 +110,9 @@ Once the model is fitted, we can further optimize our budget allocation as we ar

Explore a hands-on [simulated example](https://pymc-marketing.readthedocs.io/en/stable/notebooks/mmm/mmm_example.html) for more insights into MMM with PyMC-Marketing.

${\color{red}\textbf{Warning!}}$ We will deprecate the `DelayedSaturatedMMM` class in the next releases.
Please use the `MMM` class instead.
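
A minimal sketch of how such a deprecation might be surfaced to users, assuming a `warnings.warn` shim in the legacy constructor. Both classes below are hypothetical stand-ins written for illustration; they are not the library's `MMM` or `DelayedSaturatedMMM` implementations.

```python
import warnings


class MMM:
    """Hypothetical stand-in for the new class (illustration only)."""

    def __init__(self, adstock: str, saturation: str) -> None:
        self.adstock = adstock
        self.saturation = saturation


class DelayedSaturatedMMM(MMM):
    """Hypothetical legacy alias: behaves like MMM but warns on construction."""

    def __init__(self, *args, **kwargs) -> None:
        warnings.warn(
            "DelayedSaturatedMMM is deprecated; use MMM instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this shim
        )
        super().__init__(*args, **kwargs)


# Record warnings so the deprecation can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = DelayedSaturatedMMM(adstock="geometric", saturation="logistic")

# The replacement call takes the same arguments and emits no warning.
new = MMM(adstock="geometric", saturation="logistic")
```

Under this assumption, migrating is a matter of swapping the class name; the constructor arguments stay the same.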

### Essential Reading for Marketing Mix Modeling (MMM)

- [Bayesian Media Mix Modeling for Marketing Optimization](https://www.pymc-labs.com/blog-posts/bayesian-media-mix-modeling-for-marketing-optimization/)
8 changes: 4 additions & 4 deletions docs/source/notebooks/mmm/mmm_budget_allocation_example.ipynb
@@ -252,9 +252,9 @@
"### Sigmoid Function\n",
"The sigmoid function is formulated as:\n",
"\n",
"$\n",
"$$\n",
"\\beta \\cdot \\frac{\\exp(-\\lambda x)}{1 + \\exp(-\\lambda x)}\n",
"$\n",
"$$\n",
"\n",
"Key Elements:\n",
"* β (beta): Denotes the Asymptotic Maximum or Ceiling Value. It is the point that the function approaches as the input x becomes immense.\n",
@@ -263,9 +263,9 @@
"### Michaelis-Menten Function\n",
"The Michaelis-Menten function is formulated as:\n",
"\n",
"$\n",
"$$\n",
"\\frac{\\alpha \\times x}{\\lambda + x}\n",
"$\n",
"$$\n",
"\n",
"Key Elements:\n",
"* α (Alpha or Vmax): It represents the maximum contribution (y) a channel can make, also recognized as the plateau point.\n",
Expand Down
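
The two saturation curves described in this notebook hunk can be sketched in plain Python. The function names are hypothetical. The sigmoid is written here as β(1 − e^(−λx)) / (1 + e^(−λx)), an assumed form chosen so that β is the ceiling approached for large x, as the Key Elements describe; it is not copied from the notebook cell.

```python
import math


def sigmoid_saturation(x: float, beta: float, lam: float) -> float:
    """Assumed sigmoid form: 0 at x = 0, approaches beta as x grows."""
    return beta * (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))


def michaelis_menten(x: float, alpha: float, lam: float) -> float:
    """alpha * x / (lam + x): approaches alpha as x grows, equals alpha / 2 at x = lam."""
    return alpha * x / (lam + x)


# Hypothetical parameter values, used only to show the shape of each curve.
sigmoid_at_zero = sigmoid_saturation(0.0, beta=2.0, lam=1.0)
sigmoid_at_large_x = sigmoid_saturation(50.0, beta=2.0, lam=1.0)
mm_at_half_saturation = michaelis_menten(3.0, alpha=10.0, lam=3.0)
mm_at_large_x = michaelis_menten(1e9, alpha=10.0, lam=3.0)
```

In the Michaelis-Menten form, λ is the spend at which the channel reaches half of its plateau α, which makes it straightforward to interpret in input units.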
2 changes: 1 addition & 1 deletion pymc_marketing/constants.py
@@ -11,4 +11,4 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
DAYS_IN_YEAR = 365.25
DAYS_IN_YEAR: float = 365.25
38 changes: 36 additions & 2 deletions pymc_marketing/mmm/base.py
@@ -251,6 +251,12 @@ def preprocess(
return data_cp

def get_target_transformer(self) -> Pipeline:
"""Return the target transformer pipeline used for preprocessing the target variable.

Returns
-------
Pipeline
"""
try:
return self.target_transformer # type: ignore
except AttributeError:
@@ -528,6 +534,16 @@ def _format_model_contributions(self, var_contribution: str) -> DataArray:
return contributions.sum(contracted_dims) if contracted_dims else contributions

def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
"""Plot the target variable and the posterior predictive model components in
the scaled space.

**plt_kwargs
Additional keyword arguments to pass to `plt.subplots`.

Returns
-------
plt.Figure
"""
channel_contributions = self._format_model_contributions(
var_contribution="channel_contributions"
)
@@ -610,6 +626,7 @@ def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
ax.plot(
np.asarray(self.X[self.date_column]),
np.asarray(self.preprocessed_data["y"]), # type: ignore
label="scaled target",
color="black",
)
ax.legend(title="components", loc="center left", bbox_to_anchor=(1, 0.5))
@@ -621,6 +638,12 @@ def plot_components_contributions(self, **plt_kwargs: Any) -> plt.Figure:
return fig

def compute_channel_contribution_original_scale(self) -> DataArray:
"""Compute the channel contributions in the original scale of the target variable.

Returns
-------
DataArray
"""
channel_contribution = az.extract(
data=self.fit_result, var_names=["channel_contributions"], combined=False
)
@@ -838,6 +861,19 @@ def _get_channel_contributions_share_samples(self) -> DataArray:
def plot_channel_contribution_share_hdi(
self, hdi_prob: float = 0.94, **plot_kwargs: Any
) -> plt.Figure:
"""Plot the share of channel contributions in a forest plot.

Parameters
----------
hdi_prob : float, optional
HDI value to be displayed, by default 0.94
**plot_kwargs
Additional keyword arguments to pass to `az.plot_forest`.

Returns
-------
plt.Figure
"""
channel_contributions_share: DataArray = (
self._get_channel_contributions_share_samples()
)
@@ -981,5 +1017,3 @@ class BaseValidateMMM(
ValidateChannelColumns,
):
"""Base class with some validation of the inputs."""

pass
40 changes: 24 additions & 16 deletions pymc_marketing/mmm/delayed_saturated_mmm.py
@@ -44,20 +44,15 @@
add_lift_measurements_to_likelihood,
scale_lift_measurements,
)
from pymc_marketing.mmm.preprocessing import (
MaxAbsScaleChannels,
MaxAbsScaleTarget,
)
from pymc_marketing.mmm.preprocessing import MaxAbsScaleChannels, MaxAbsScaleTarget
from pymc_marketing.mmm.tvp import create_time_varying_intercept, infer_time_index
from pymc_marketing.mmm.utils import (
_get_distribution_from_dict,
apply_sklearn_transformer_across_dim,
create_new_spend_data,
generate_fourier_modes,
)
from pymc_marketing.mmm.validating import (
ValidateControlColumns,
)
from pymc_marketing.mmm.validating import ValidateControlColumns

__all__ = ["BaseMMM", "MMM", "DelayedSaturatedMMM"]

@@ -71,8 +66,9 @@
.. [1] Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017).
"""

_model_type = "DelayedSaturatedMMM"
version = "0.0.2"
_model_name: str = "BaseMMM"
_model_type: str = "BaseValidateMMM"
version: str = "0.0.3"

def __init__(
self,
@@ -593,6 +589,16 @@
def _get_fourier_models_data(self, X) -> pd.DataFrame:
"""Generates fourier modes to model seasonality.

Parameters
----------
X : Union[pd.DataFrame, pd.Series], shape (n_obs, n_features)
Input data for the model. To generate the Fourier modes, it must contain a date column.

Returns
-------
pd.DataFrame
Fourier modes (sin and cos with different frequencies) as columns in a dataframe.

References
----------
https://www.pymc.io/projects/examples/en/latest/time_series/Air_passengers-Prophet_with_Bayesian_workflow.html
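
The Fourier modes this docstring describes can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's `generate_fourier_modes`: the function name, the column naming, and the dict-of-lists return type (in place of a `pd.DataFrame`) are all hypothetical. `DAYS_IN_YEAR` mirrors the constant in `pymc_marketing/constants.py`.

```python
import math
from datetime import date

DAYS_IN_YEAR: float = 365.25


def fourier_modes_sketch(dates: list[date], n_order: int) -> dict[str, list[float]]:
    """Return sin/cos columns for orders 1..n_order, keyed by column name."""
    # Position of each date within its year, expressed as a fraction of a year.
    periods = [d.timetuple().tm_yday / DAYS_IN_YEAR for d in dates]
    columns: dict[str, list[float]] = {}
    for order in range(1, n_order + 1):
        angle = [2.0 * math.pi * order * p for p in periods]
        columns[f"sin_order_{order}"] = [math.sin(a) for a in angle]
        columns[f"cos_order_{order}"] = [math.cos(a) for a in angle]
    return columns


# Hypothetical weekly-style dates, two Fourier orders.
modes = fourier_modes_sketch([date(2024, 1, 1), date(2024, 7, 1)], n_order=2)
```

Each order contributes one sine and one cosine column, so `n_order` orders yield `2 * n_order` seasonality features per observation.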
@@ -680,7 +686,7 @@
If the inference data that is loaded doesn't match with the model.
"""

filepath = Path(str(fname))
filepath = Path(fname)
idata = az.from_netcdf(filepath)
model_config = cls._model_config_formatting(
json.loads(idata.attrs["model_config"])
@@ -958,7 +964,7 @@
""" # noqa: E501

_model_type = "MMM"
version = "0.0.2"
version = "0.0.1"

def channel_contributions_forward_pass(
self, channel_data: npt.NDArray[np.float_]
@@ -1095,6 +1101,8 @@
absolute_xrange : bool, optional
If True, the x-axis is in absolute values (input units), otherwise it is in
relative percentage values, by default False.
**plt_kwargs
Keyword arguments to pass to `plt.subplots()`

Returns
-------
@@ -2024,14 +2032,13 @@
inverse_scaled_channel_spend = self.channel_transformer.inverse_transform(
np.array([list(self.optimal_allocation_dict.values())])
)
original_scale_allocation_dict = {
k: v
for k, v in zip(
original_scale_allocation_dict = dict(
zip(
self.optimal_allocation_dict.keys(),
inverse_scaled_channel_spend[0],
strict=False,
)
}
)

synth_dataset = self._create_synth_dataset(
df=self.X,
@@ -2209,8 +2216,9 @@


class DelayedSaturatedMMM(MMM):
_model_type = "MMM"
_model_name = "DelayedSaturatedMMM"
version = "0.0.2"
version = "0.0.3"

def __init__(
self,
11 changes: 4 additions & 7 deletions tests/mmm/test_delayed_saturated_mmm.py
@@ -23,11 +23,7 @@

from pymc_marketing.mmm.components.adstock import DelayedAdstock
from pymc_marketing.mmm.components.saturation import MichaelisMentenSaturation
from pymc_marketing.mmm.delayed_saturated_mmm import (
MMM,
BaseMMM,
DelayedSaturatedMMM,
)
from pymc_marketing.mmm.delayed_saturated_mmm import MMM, BaseMMM, DelayedSaturatedMMM

seed: int = sum(map(ord, "pymc_marketing"))
rng: np.random.Generator = np.random.default_rng(seed=seed)
@@ -340,8 +336,9 @@ def test_fit(self, toy_X: pd.DataFrame, toy_y: pd.Series) -> None:
adstock="geometric",
saturation="logistic",
)
assert mmm.version == "0.0.2"
assert mmm._model_type == "DelayedSaturatedMMM"
assert mmm.version == "0.0.3"
assert mmm._model_type == "BaseValidateMMM"
assert mmm._model_name == "BaseMMM"
assert mmm.model_config is not None
n_channel: int = len(mmm.channel_columns)
n_control: int = len(mmm.control_columns)
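
The updated assertions follow from how the class-level attributes in this PR resolve through inheritance. The stub hierarchy below is a sketch that copies only the attribute values from the diff; it assumes `MMM` derives from `BaseMMM`, which the hunks shown here do not display, and it omits every mixin and method.

```python
class BaseMMM:
    """Stub carrying the attribute values set on BaseMMM in this PR."""

    _model_name: str = "BaseMMM"
    _model_type: str = "BaseValidateMMM"
    version: str = "0.0.3"


class MMM(BaseMMM):  # assumed parent; not shown in the diff
    _model_type = "MMM"
    version = "0.0.1"


class DelayedSaturatedMMM(MMM):
    _model_type = "MMM"
    _model_name = "DelayedSaturatedMMM"
    version = "0.0.3"


# Attributes a subclass does not override are looked up on its parents.
inherited_name = MMM._model_name
```

Under that assumption, `MMM` inherits `_model_name` from `BaseMMM`, while `DelayedSaturatedMMM` overrides the name and keeps the `"MMM"` model type.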