I'm currently using PyMC Marketing for a Marketing Mix Modeling project and noticed a potential issue related to scaling and contribution attribution in the final model. Here's the situation:
Issue:
In the final model, channels with very low spend contribute disproportionately to sales, whereas high-spend channels contribute far less than their spend would suggest. For instance:
A high-spend channel is ranked 5th out of 9 in terms of contribution to sales.
Meanwhile, a low-spend channel is ranked 1st in contribution to sales.
Scaling Context:
I noticed that the MaxAbsScaler is applied independently at the channel level during preprocessing (as per the default max_abs_scale_channel_data function in the codebase). This scales each channel by its own maximum absolute value, which means the total spend proportions across channels are not preserved.
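To illustrate what I mean, here is a minimal sketch in plain NumPy. It is not the library's actual code; the two channels and their weekly spend values are made up, and I am assuming that per-channel max-abs scaling amounts to dividing each column by its own maximum absolute value.

```python
import numpy as np

# Hypothetical weekly spend (made-up numbers, not my real data).
# Column 0: a high-spend channel, column 1: a low-spend channel.
spend = np.array([
    [100_000.0, 500.0],
    [200_000.0, 1_000.0],
    [150_000.0, 250.0],
])

# Per-channel max-abs scaling: each column is divided by its own
# maximum absolute value, independently of the other columns.
per_channel_max = np.abs(spend).max(axis=0)
scaled = spend / per_channel_max

# Both channels now peak at 1.0, so the 200:1 difference in spend
# level between them is no longer visible in the scaled inputs.
print(scaled.max(axis=0))
print(scaled)
```

After this transformation the model sees two inputs on the same 0 to 1 range, even though one channel spends orders of magnitude more than the other.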
Question:
Could the independent scaling of channels (via MaxAbsScaler) be causing a distortion in the contribution estimates, especially for channels with very low spend?
If so, how should I modify the scaling approach to ensure that the percent share of spend across channels is preserved and reflected in the contribution estimates?
Would switching to global scaling (e.g., scaling relative to the highest spend across all channels or total spend across all channels) improve the interpretation and reliability of the contributions?
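For concreteness, this is the kind of global scaling I have in mind, again as a plain NumPy sketch with made-up numbers rather than anything from the PyMC Marketing codebase. The single `global_max` divisor is my own assumption about how such a scaler could look.

```python
import numpy as np

# Hypothetical weekly spend (made-up numbers): high-spend vs. low-spend channel.
spend = np.array([
    [100_000.0, 500.0],
    [200_000.0, 1_000.0],
    [150_000.0, 250.0],
])

# Global scaling: one divisor shared by every channel, here the largest
# absolute spend value observed anywhere in the matrix.
global_max = np.abs(spend).max()
scaled_global = spend / global_max

# Share of total spend per channel, before and after scaling.
original_share = spend.sum(axis=0) / spend.sum()
scaled_share = scaled_global.sum(axis=0) / scaled_global.sum()

print(original_share)
print(scaled_share)  # identical to original_share, since one constant was used
```

This only shows that the spend shares survive the transformation. It says nothing yet about how the contribution estimates would change, which is what I am hoping to understand.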
Example Data:
Here’s a simplified snapshot of my original spend data vs. scaled spend data:
Original Spend
Cost_CTV 3,058,184
Cost_Display 2,426,354
Cost_E-Commerce 815,907
Cost_OLV 9,671,693
Cost_Paid_Search 938,310
Cost_Streaming_Audio 667,339
Cost_Social 15,844,830
Cost_OOH 2,248,606
Cost_Radio 79,950
Scaled Spend Values (MaxAbsScaler):
Cost_CTV 25.77
Cost_Display 24.08
Cost_E-Commerce 64.73
Cost_OLV 41.73
Cost_Paid_Search 41.76
Cost_Streaming_Audio 6.96
Cost_Social 37.56
Cost_OOH 17.98
Cost_Radio 2.58
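The scaled values do not follow the ordering of the original spend (for example, Cost_Social has the highest original spend but Cost_E-Commerce has the highest scaled value). This discrepancy seems to disproportionately impact low-spend channels, amplifying their contributions while diminishing the role of high-spend channels.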
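For reference, the percent share of spend that I would like the contributions to be compared against can be computed from the original spend figures listed above. This is just arithmetic on those totals; the variable names are my own.

```python
# Original spend per channel, copied from the figures in this post.
original_spend = {
    "Cost_CTV": 3_058_184,
    "Cost_Display": 2_426_354,
    "Cost_E-Commerce": 815_907,
    "Cost_OLV": 9_671_693,
    "Cost_Paid_Search": 938_310,
    "Cost_Streaming_Audio": 667_339,
    "Cost_Social": 15_844_830,
    "Cost_OOH": 2_248_606,
    "Cost_Radio": 79_950,
}

total_spend = sum(original_spend.values())

# Each channel's fraction of total spend.
spend_share = {ch: value / total_spend for ch, value in original_spend.items()}

# Print channels from largest to smallest share.
for ch, share in sorted(spend_share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ch:<22}{share:7.2%}")
```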
Your expertise and guidance would mean a lot as I try to better understand this behavior. Thank you for your time and support!