Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix for woe_1d, plus minor cosmetic and test issues #98

Merged
merged 4 commits into from
Jul 6, 2023

Conversation

lorenjan
Copy link
Contributor

A few minor fixes:

  • The woe_1d function wasn't running y.reset_index() in the case where its y input was already a pd.Series. This meant that it would return incorrect values when indexes in X and y were noncontinuous. Added a check for this condition to tests/test_metrics.py.
  • The scikit-learn WOEEncoder uses ln(%bad / %good) instead of ln(%good / %bad) for some reason, which flips all of the signs. A previous fix switched the computation of WOE in the summary table to match. However, this fix caused the IV values to be negative in summary tables, which I found surprising. I added an abs() call to the IV calculation to correct this cosmetic defect.
  • Tests were failing in Pandas 2.0 due to the use of DataFrame.append, which has been removed. I fixed it by changing this to a pd.concat call.

The woe_1d function wasn't running y.reset_index() in
the case where its y input was already a pd.Series. This
meant that it would return incorrect values when indexes
in X and y were noncontinuous.

Added a check for this condition to tests/test_metrics.py
as well.
The scikit-learn WOEEncoder uses ln(%bad / %good) instead of
ln(%good / %bad) for some reason, which flips all of the signs.
A previous fix switched the computation of WOE in the summary
table to match. However, this fix caused the IV values to be
negative in summary tables, which I found surprising.

I added an abs() call to the IV calculation to correct this
cosmetic defect.
Tests were failing in Pandas 2.0 due to the use of
DataFrame.append, which has been removed. I fixed it
by changing this to a pd.concat call.
@ReinierKoops
Copy link

Hi @lorenjan , thank you for your PR! Could you sync your changes with the master branch?

@ReinierKoops
Copy link

Fix is in the process of being merged #109

Copy link

@ReinierKoops ReinierKoops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a great solution for now. Thanks @lorenjan

@ReinierKoops ReinierKoops requested review from silaozen and removed request for silaozen July 6, 2023 08:40
@ReinierKoops ReinierKoops merged commit 439b7e0 into ing-bank:main Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants