-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Covariates to ParetoNBDModel
#463
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #463 +/- ##
==========================================
- Coverage 91.36% 90.07% -1.30%
==========================================
Files 21 21
Lines 2073 2025 -48
==========================================
- Hits 1894 1824 -70
- Misses 179 201 +22 ☔ View full report in Codecov by Sentry. |
Tests are all passing, albeit incomplete. The CI seems to have timed out on an MMM test. |
Ready for review? |
I'm gonna save the covariate scaling and new customer modifications for future PRs. Not all tests are parametrized yet for covariates, and predictive methods will fail for out-of-sample predictions. |
@ricardoV94 I could use your help with an
Otherwise, this is ready for review. |
""" | ||
Utility function to process covariates into model parameters. | ||
""" | ||
# TODO: broadcasting needs to be resolved for out-of-sample data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to add a Warning to handle this TODO in case is not part of the PR?
@ColtAllen This is very cool! I need (and will!) to catch up with the development! I left some basic comments on the PR. I hope I can contribute more once I read the model details 🙏 |
tests/clv/models/test_pareto_nbd.py
Outdated
@@ -265,6 +356,8 @@ def test_expected_probability_alive(self): | |||
est_prob_alive_t = self.model.expected_probability_alive(future_t=4.5) | |||
assert est_prob_alive.mean() > est_prob_alive_t.mean() | |||
|
|||
self.covar_model.expected_probability_alive() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test something about this. Doesn't the same question of covariates show up here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xarray
issue needs to be resolved first (see my latest comment).
tests/clv/models/test_pareto_nbd.py
Outdated
@@ -290,6 +383,8 @@ def test_expected_purchase_probability(self, test_n, test_t): | |||
rtol=0.001, | |||
) | |||
|
|||
self.model.expected_purchase_probability(test_n, test_t, self.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean self.covar_model
? Also need to test something about the output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xarray
issue needs to be resolved first (see my latest comment).
I don't like option 2. I would say 1 if you want to get this merged quickly, and 3 otherwise (or later) |
What code can I look at that runs this error? The dev notebook/ a specific test? |
Tests are failing at these lines: |
I don't know how important this is to resolve, but PyCharm is highlighting this inconsistency:
|
Description
This PR provides the option of using time-invariant covariates with
ParetoNBDModel
. This is a draft PR for now because I still need to parametrize some tests and add a few other things, but more importantly get opinions on two points:Covariates require scaling for model convergence, preferably with
MinMaxScaler
so the coefficients are interpretable. This can easily be added as a utility function, and the covariate Min/Max values saved asmodel.idata
attributes for out-of-sample predictions. Should a scaler be added in this PR, or a future one? Should it be added toParetoNBDModel
, or the baseCLVModel
?Covariates for existing customers will also require modifications to the "new customer" methods. There are three options for this:
ValueError
indicating these methods aren't supported with covariates (easy, but not my preference.)alpha
andbeta
parameters together for new customer estimates, which is a subset of the next option.Regardless of the approach, should it added to this PR, or another one? This one is already rather lengthy.
Related Issue
ParetoNBDModel
#183, Add Time-Invariant Covariates to ParetoNBD and BG/NBD Models #134Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--463.org.readthedocs.build/en/463/