-
Hi everyone, I used the CLV quickstart tutorial from the pymc-marketing website (and also skimmed the Lifetimes library documentation) to build my first CLV model. Now I would like to check whether it's accurate enough to base decisions on, but I cannot find a way to validate it with the current library. I'm also using MMMGPT to help me, and I took the introductory course at intuitivebayes.com. So I would like to use the trace in the InferenceData object to create a PPC plot and compute WAIC, both to see whether my trained model is good enough and to improve it by iteration. My issue is that the trace isn't available with the BetaGeoModel model or its idata. (I suppose I will hit the same issue with the Gamma-Gamma model, because its idata object doesn't have the trace either.) What should my next steps be? Is there another way to validate those models? PS: I'm more of a hacker than a coder, and more of a business guy than a data scientist... but I'm willing to learn! Thanks
-
Hello @lojacobs, I'm not sure if there is full support …
-
Hey @lojacobs,
This is related to #352. The only CLV models supporting PPCs are …
A distribution block needs to be built for … PPCs for CLV models are also rather nuanced. I'm planning to open a PR soon for this, but in the meantime, if you wish to hack out a PPC for …

```python
import arviz as az
import pandas as pd
from pymc_marketing import clv

# `data` is your own customer-level DataFrame, prepared as in the quickstart
model = clv.ParetoNBDModel(data)
model.build_model()
model.fit()

# Take one simulated draw and the observed data, keeping index 1 of the
# last axis (used here as the per-customer purchase count)
ppc_freq = model.distribution_customer_population(random_seed=45)[0][0][..., 1]
obs_freq = model.idata.observed_data["likelihood"][..., 1]

# Side-by-side bar chart of customers per purchase count (first 15 counts)
pd.DataFrame(
    {
        "model estimations": ppc_freq.to_pandas().value_counts().sort_index(),
        "observed": obs_freq.to_pandas().value_counts().sort_index(),
    }
).head(15).plot(kind="bar", title="Histogram of Purchase Counts per Customer")
```

These …

```python
# ECDF of simulated vs. observed purchase counts, then their difference
az.plot_ecdf(ppc_freq, obs_freq, confidence_bands=True).set_title("Posterior Predictive ECDF Plot")
az.plot_ecdf(ppc_freq, obs_freq, confidence_bands=True, difference=True).set_title("Posterior Predictive Difference Plot")
```
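For anyone who wants to see what those two comparisons boil down to before fitting anything, here is a minimal standard-library sketch. It is not part of pymc-marketing: `count_table`, `ecdf`, and `max_ecdf_gap` are hypothetical helper names, and the two frequency lists are synthetic stand-ins generated on the spot, not real customer data. With a fitted model you would pass your own simulated and observed purchase counts instead.

```python
from collections import Counter
import random


def count_table(frequencies, max_count=14):
    """Number of customers at each purchase count from 0 to max_count."""
    counts = Counter(frequencies)
    return [counts.get(k, 0) for k in range(max_count + 1)]


def ecdf(frequencies, max_count=14):
    """Empirical CDF of the purchase counts, evaluated at 0..max_count."""
    n = len(frequencies)
    running, out = 0, []
    for c in count_table(frequencies, max_count):
        running += c
        out.append(running / n)
    return out


def max_ecdf_gap(simulated, observed, max_count=14):
    """Largest absolute gap between the two ECDFs over 0..max_count."""
    pairs = zip(ecdf(simulated, max_count), ecdf(observed, max_count))
    return max(abs(s - o) for s, o in pairs)


# Synthetic stand-ins (assumption): both lists are random draws made up for
# illustration, so they agree closely by construction.
rng = random.Random(45)
obs_freq = [min(int(rng.expovariate(0.5)), 30) for _ in range(2000)]
ppc_freq = [min(int(rng.expovariate(0.5)), 30) for _ in range(2000)]

gap = max_ecdf_gap(ppc_freq, obs_freq)
print("observed counts 0..4: ", count_table(obs_freq)[:5])
print("simulated counts 0..4:", count_table(ppc_freq)[:5])
print(f"max ECDF gap: {gap:.3f}")
```

The count tables are what the bar chart displays, and the ECDF gap is a one-number summary of what the difference plot shows. A large gap at low purchase counts would suggest the model misrepresents how many customers buy only once or twice.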
-
What do you mean? Both the BG/NBD and Pareto/NBD models have …
I'm working on a PR for an …
-
No; both models are for the same use case. In fact, the whole reason the BG/NBD model was originally developed back in 2005 was because the math under the hood for Pareto/NBD was too complex to implement until fairly recently.
Pareto/NBD takes more time to fit than BG/NBD, but has more functionality. Both models will perform similarly if you're only interested in `expected_purchases`, but BG/NBD assumes all one-time customers are 100% still active. If this is not a valid assumption for your use case, use Pareto/NBD, which also has a parameter to predict if customers will still be active x …
For true indicators, you would use …
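To make the point about one-time customers concrete, here is a small sketch of the standard closed-form P(alive) expression for BG/NBD. It is written from the published formula, not taken from pymc-marketing's code. The function name `bgnbd_prob_alive` and the parameter values are illustrative assumptions, not fitted results. Here `x` is the number of repeat purchases, `t_x` the time of the last purchase, and `T` the customer's age.

```python
def bgnbd_prob_alive(x, t_x, T, r, alpha, a, b):
    """P(alive | x, t_x, T) under BG/NBD, from the standard closed form.

    In BG/NBD a customer can only drop out right after a repeat purchase,
    so a customer with no repeat purchases (x == 0) is alive with
    probability 1, however long ago the first purchase was.
    """
    if x == 0:
        return 1.0
    ratio = (a / (b + x - 1.0)) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + ratio)


# Illustrative parameter values (assumption, not estimated from any data here)
params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)

one_time_buyer = bgnbd_prob_alive(x=0, t_x=0.0, T=30.0, **params)
lapsed_repeat_buyer = bgnbd_prob_alive(x=2, t_x=5.0, T=30.0, **params)
recent_repeat_buyer = bgnbd_prob_alive(x=2, t_x=29.0, T=30.0, **params)

print(one_time_buyer, lapsed_repeat_buyer, recent_repeat_buyer)
```

Under these assumptions, a one-time customer who has been silent for 30 periods is still scored as certainly active, while a repeat buyer who has been silent nearly as long is not. If that looks wrong for your business, that is the case for Pareto/NBD described above.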