Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename clv_summary to rfm_summary and extend functionality #479

Merged
merged 10 commits into from
Jan 15, 2024
Merged

Rename clv_summary to rfm_summary and extend functionality #479

merged 10 commits into from
Jan 15, 2024

Conversation

ColtAllen
Copy link
Collaborator

@ColtAllen ColtAllen commented Jan 8, 2024

Description

This PR adds some enhancements to the existing clv_summary function and renames it to rfm_summary, which is a better descriptor for what this function does. Specific changes are described in #469.

The RFM Analysis interpretation of recency was not added in this PR. Instead of creating an additional column unnecessary for predictive modeling, it would make more sense to create that column (which is simply T-recency within a future model for Bayesian RFM segmentation.

Related Issue

Checklist

Modules affected

  • MMM
  • CLV

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc-marketing--479.org.readthedocs.build/en/479/

Copy link

codecov bot commented Jan 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (49e1e80) 90.83% compared to head (d416a08) 90.87%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #479      +/-   ##
==========================================
+ Coverage   90.83%   90.87%   +0.04%     
==========================================
  Files          21       21              
  Lines        1974     1983       +9     
==========================================
+ Hits         1793     1802       +9     
  Misses        181      181              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ricardoV94 ricardoV94 changed the title rfm_summary Utility Function Rename clv_summary to rfm_summary and extend functionality Jan 8, 2024
Copy link
Contributor

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small suggestion to simplify the docstring.

Should we keep clv_summary with a deprecation warning so we don't break people's code? Something like:

https://github.com/pymc-devs/pytensor/blob/c5b96d925d2005fe5f7be3883f7022f05cd29cc3/pytensor/graph/replace.py#L325-L327

Also is this not used in any of our public notebooks? Those would need to be updated.

pymc_marketing/clv/utils.py Outdated Show resolved Hide resolved
@ricardoV94 ricardoV94 added major API breaking changes and removed priority: high labels Jan 8, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@ColtAllen
Copy link
Collaborator Author

Should we keep clv_summary with a deprecation warning so we don't break people's code?

Also is this not used in any of our public notebooks? Those would need to be updated.

Done and done 👍

Using a different dataset required updating all the charts and results in the notebook. I also used the opportunity to fix #466.

Copy link

review-notebook-app bot commented Jan 11, 2024

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2024-01-11T14:48:36Z
----------------------------------------------------------------

We should put and read the dataset from here: https://github.com/pymc-labs/pymc-marketing/tree/main/datasets


ColtAllen commented on 2024-01-11T15:44:42Z
----------------------------------------------------------------

If rfm_summary is to be included in this Quickstart (and I think it should because it's an important data preprocessing step) then we can't use that dataset. The purpose of this function is to format raw transactions for modeling, and that dataset is already formatted.

ricardoV94 commented on 2024-01-11T15:48:44Z
----------------------------------------------------------------

Just replace the one that's there with the unformatted one? The dataset is there just for the purpose of being used in examples.

Copy link

review-notebook-app bot commented Jan 11, 2024

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2024-01-11T14:48:37Z
----------------------------------------------------------------

There may be an error in the rfm function. The estimates of a and alpha are rather different than what we had before


ColtAllen commented on 2024-01-11T15:45:51Z
----------------------------------------------------------------

No error - just a different dataset (or rather, a different subset of CDNOW_MASTER.txt, which is rather large for testing).

ricardoV94 commented on 2024-01-11T15:49:13Z
----------------------------------------------------------------

Okay. Is this subset fixed or does it change when the NB is rerun?

ricardoV94 commented on 2024-01-11T15:50:59Z
----------------------------------------------------------------

I see you have a random_seed, perfect!

Copy link

review-notebook-app bot commented Jan 11, 2024

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2024-01-11T14:48:38Z
----------------------------------------------------------------

Here we can also see large differences compared to before


ColtAllen commented on 2024-01-11T15:45:59Z
----------------------------------------------------------------

See above

Copy link

review-notebook-app bot commented Jan 11, 2024

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2024-01-11T14:48:39Z
----------------------------------------------------------------

Line break in representin/ng


ColtAllen commented on 2024-01-11T15:51:09Z
----------------------------------------------------------------

This linebreak does not appear in Jupyter, PyCharm, or even the docs build preview. I'm not sure why ReviewNB isn't displaying properly here.

ricardoV94 commented on 2024-01-11T15:52:38Z
----------------------------------------------------------------

Must be a glitch then. The docs preview should be enough

Copy link

review-notebook-app bot commented Jan 11, 2024

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2024-01-11T14:48:40Z
----------------------------------------------------------------

Estimates also changed considerably in this model


ColtAllen commented on 2024-01-11T15:46:45Z
----------------------------------------------------------------

Different dataset; see above.

@ricardoV94
Copy link
Contributor

@ColtAllen I seem some large differences in the NB results. I suspect there may be a bug in the new function, or in what the model is expecting

Copy link
Collaborator Author

If rfm_summary is to be included in this Quickstart (and I think it should because it's an important data preprocessing step) then we can't use that dataset. The purpose of this function is to format raw transactions for modeling, and that dataset is already formatted.


View entire conversation on ReviewNB

Copy link
Collaborator Author

No error - just a different dataset (or rather, a different subset of CDNOW_MASTER.txt, which is rather large for testing).


View entire conversation on ReviewNB

Copy link
Collaborator Author

See above


View entire conversation on ReviewNB

Copy link
Collaborator Author

Different dataset; see above.


View entire conversation on ReviewNB

Copy link
Contributor

Just replace the one that's there with the unformatted one? The dataset is there just for the purpose of being used in examples.


View entire conversation on ReviewNB

Copy link
Contributor

Okay. Is this subset fixed or does it change when the NB is rerun?


View entire conversation on ReviewNB

Copy link
Contributor

I see you have a random_seed, perfect!


View entire conversation on ReviewNB

Copy link
Collaborator Author

This linebreak does not appear in Jupyter, PyCharm, or even the docs build preview. I'm not sure why ReviewNB isn't displaying properly here.


View entire conversation on ReviewNB

Copy link
Contributor

Must be a glitch then. The docs preview should be enough


View entire conversation on ReviewNB

@ricardoV94 ricardoV94 added docs Improvements or additions to documentation and removed major API breaking changes labels Jan 11, 2024
@ricardoV94
Copy link
Contributor

ricardoV94 commented Jan 11, 2024

Pre-commit is also complaining. Otherwise besides adding the dataset to a more user-facing location, everything looks on spot!

@ColtAllen
Copy link
Collaborator Author

@ColtAllen
Copy link
Collaborator Author

Pre-commit

Did isort auto-correct the order of imports? ruff-format only seems to be working properly in the Git CI. I'm not getting any corrections or exceptions raised prior to pushing to the repo.

@ricardoV94
Copy link
Contributor

Did isort auto-correct the order of imports? ruff-format only seems to be working properly in the Git CI. I'm not getting any corrections or exceptions raised prior to pushing to the repo.

No idea, but all looks green now!

@ricardoV94
Copy link
Contributor

Did you change the import in the NB?

@ColtAllen
Copy link
Collaborator Author

Did you change the import in the NB?

It's fixed now.

@ricardoV94 ricardoV94 merged commit 103878d into pymc-labs:main Jan 15, 2024
12 checks passed
@ColtAllen ColtAllen deleted the rfm_summary branch February 9, 2024 16:21
juanitorduz added a commit that referenced this pull request Mar 8, 2024
* current status as method

* format

* Update version.txt

* Implement different convolution modes (#454)

* Add PR template

* Update pull_request_template.md

* Fix issues in index example

* Update .pre-commit-config.yaml

* Update .pre-commit-config.yaml

* move from other PR

* put legend on side

* Optimisation in customer_lifetime_value when discount_rate == 0 (#468)

* Optimisation in customer_lifetime_value when discount_rate == 0

cf #467

* Update utils.py

* Update README.md

* add support for pre-commit-ci

* add isort

* modify autosummary templates

* Rename `clv_summary` to `rfm_summary` and extend functionality (#479)

* clv_summary adapted into rfm_summary

* added clv_summary with warning

* moved dataset from testing folder

* Update version.txt

* improve ruff

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 → v0.1.14](astral-sh/ruff-pre-commit@v0.1.11...v0.1.14)
- [github.com/pre-commit/pre-commit-hooks: v3.2.0 → v4.5.0](pre-commit/pre-commit-hooks@v3.2.0...v4.5.0)

* resolve conflict

* Add baselined saturation (#498)

* add baselined saturation with test and plots

* refactor docs

* add the reparam

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* verify parametrization is equivalent under change of baseline

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add a note for setting x0

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* make it clear how r_ref is calculated

* fix typo

* fix docstrings

* improve test by making sure transform is gives identical saturation and cac0

* add comment in the docstring

* add blank line in the code-block

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Swap Before and After convolution modes as per #489 (#501)

* Add support for string mode args

* Swap before and after and make mode explicit

* Use Union due Python 3.9

* Style

* resolve conflict

* add dim_name arg

* add seed to tests and test methods

* add slice as type hint

* use slice in docstring

* defaults to mean for each channel

* add non-negative check

* ax as last arg

* change weeks -> time

* parameterize quantiles

* separate out and add to docs

* rerun the baseline images

* mock the prior

* add new images from latest env

* migrate to toml instead of ci/cd

* test only is axes

* remove the images

---------

Co-authored-by: Juan Orduz <juan.orduz@wolt.com>
Co-authored-by: Abdalaziz Rashid <abdalaziz.rashid@outlook.com>
Co-authored-by: Ricardo Vieira <ricardo.vieira1994@gmail.com>
Co-authored-by: Ricardo Vieira <28983449+ricardoV94@users.noreply.github.com>
Co-authored-by: vincent-grosbois <vincent.grosbois@gmail.com>
Co-authored-by: juanitorduz <juanitorduz@gmail.com>
Co-authored-by: Oriol (ProDesk) <oriol.abril.pla@gmail.com>
Co-authored-by: Colt Allen <10178857+ColtAllen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Maxim Kochurov <max.kochurov@pymc-labs.com>
wd60622 added a commit that referenced this pull request Apr 8, 2024
* current status as method

* format

* Update version.txt

* Implement different convolution modes (#454)

* Add PR template

* Update pull_request_template.md

* Fix issues in index example

* Update .pre-commit-config.yaml

* Update .pre-commit-config.yaml

* move from other PR

* put legend on side

* Optimisation in customer_lifetime_value when discount_rate == 0 (#468)

* Optimisation in customer_lifetime_value when discount_rate == 0

cf #467

* Update utils.py

* Update README.md

* add support for pre-commit-ci

* add isort

* modify autosummary templates

* Rename `clv_summary` to `rfm_summary` and extend functionality (#479)

* clv_summary adapted into rfm_summary

* added clv_summary with warning

* moved dataset from testing folder

* Update version.txt

* improve ruff

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 → v0.1.14](astral-sh/ruff-pre-commit@v0.1.11...v0.1.14)
- [github.com/pre-commit/pre-commit-hooks: v3.2.0 → v4.5.0](pre-commit/pre-commit-hooks@v3.2.0...v4.5.0)

* resolve conflict

* Add baselined saturation (#498)

* add baselined saturation with test and plots

* refactor docs

* add the reparam

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* verify parametrization is equivalent under change of baseline

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add a note for setting x0

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* make it clear how r_ref is calculated

* fix typo

* fix docstrings

* improve test by making sure transform is gives identical saturation and cac0

* add comment in the docstring

* add blank line in the code-block

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Swap Before and After convolution modes as per #489 (#501)

* Add support for string mode args

* Swap before and after and make mode explicit

* Use Union due Python 3.9

* Style

* resolve conflict

* add dim_name arg

* add seed to tests and test methods

* add slice as type hint

* use slice in docstring

* defaults to mean for each channel

* add non-negative check

* ax as last arg

* change weeks -> time

* parameterize quantiles

* separate out and add to docs

* rerun the baseline images

* mock the prior

* add new images from latest env

* migrate to toml instead of ci/cd

* test only is axes

* remove the images

---------

Co-authored-by: Juan Orduz <juan.orduz@wolt.com>
Co-authored-by: Abdalaziz Rashid <abdalaziz.rashid@outlook.com>
Co-authored-by: Ricardo Vieira <ricardo.vieira1994@gmail.com>
Co-authored-by: Ricardo Vieira <28983449+ricardoV94@users.noreply.github.com>
Co-authored-by: vincent-grosbois <vincent.grosbois@gmail.com>
Co-authored-by: juanitorduz <juanitorduz@gmail.com>
Co-authored-by: Oriol (ProDesk) <oriol.abril.pla@gmail.com>
Co-authored-by: Colt Allen <10178857+ColtAllen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Maxim Kochurov <max.kochurov@pymc-labs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLV docs Improvements or additions to documentation enhancement New feature or request maintenance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong monetary value
2 participants