Created benford_correlation(x) function #689
Hi @gubertoli!
Thank you very much for your pull request! We are always happy about new code, especially when it adds more feature calculators! The PR is well documented and adds a very reasonable feature calculator.
So, welcome to the open-source community and welcome to tsfresh!
As you said that this is your first commit to a "major" package, I was especially nitpicky :-) So do not be discouraged by the number of review comments.
One thing that is crucial, however, is that you add a test for your calculator. Please have a look at the file test_feature_calculators.py and add one for your calculator there. Test it with small, large, negative, nan, floating-point, integer, whatever-you-can-think-of numbers. I will not approve the PR without tests ;-)
Nice, looks very good now!
Once I understand why the result for the equally distributed numbers is nan, and whether this is expected, we can merge!
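One plausible explanation for the nan, assuming the calculator computes a Pearson correlation between the observed first-digit frequencies and the Benford frequencies (the use of `np.corrcoef` here is an assumption for illustration, not taken from the PR's code): when every first digit occurs equally often, the observed vector is constant, its variance is zero, and the correlation is undefined. A minimal sketch:

```python
import warnings

import numpy as np

# Expected Benford frequencies for the leading digits 1..9.
benford = np.log10(1 + 1 / np.arange(1, 10))

# Hypothetical observation: every leading digit was seen exactly 10 times.
# Counts are used instead of relative frequencies; Pearson correlation is
# scale-invariant, so this does not change the result.
equal_counts = np.full(9, 10.0)

with warnings.catch_warnings():
    # NumPy emits a RuntimeWarning for the 0/0 division; silence it here.
    warnings.simplefilter("ignore", RuntimeWarning)
    r = np.corrcoef(benford, equal_counts)[0, 1]

print(np.isnan(r))
```

Under that assumption, nan would be the mathematically expected outcome for perfectly equal digit counts and not a sign of a bug.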
Thanks again @gubertoli for your work! Appreciate the changes!
Thank you @nils-braun for the patience and all the process knowledge shared during this PR.
Hi,
I have been using tsfresh for about 2 months. One feature that came to mind for my use cases is the deviation from the Newcomb-Benford distribution, specifically the correlation between the first-digit distribution of a data array and the expected distribution.
The Newcomb-Benford distribution describes a logarithmic distribution of the first digits in data: the leading digit d (1 to 9) is expected with frequency log10(1 + 1/d).
This analysis has been used for fraud and anomaly detection. More uses - https://scholar.google.com/scholar?q=benford+law+time+series
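To make the idea concrete, here is a rough sketch of such a feature. It is not necessarily the implementation in this PR; the helper names `benford_expected`, `first_digits` and `benford_correlation_sketch` are made up for illustration, and dropping zeros and non-finite values is an assumption of this sketch.

```python
import numpy as np


def benford_expected():
    """Expected first-digit frequencies P(d) = log10(1 + 1/d) for d = 1..9."""
    digits = np.arange(1, 10)
    return np.log10(1 + 1 / digits)


def first_digits(x):
    """Leading non-zero decimal digit of each finite, non-zero value in x."""
    x = np.asarray(x, dtype=float)
    x = np.abs(x[np.isfinite(x) & (x != 0)])
    # Scientific notation always starts with the leading digit, e.g. '4.5e-02'.
    return np.array([int(f"{v:.17e}"[0]) for v in x], dtype=int)


def benford_correlation_sketch(x):
    """Pearson correlation between observed and expected digit frequencies."""
    digits = first_digits(x)
    if digits.size == 0:
        return np.nan  # nothing to compare (assumption of this sketch)
    observed = np.array([np.mean(digits == d) for d in range(1, 10)])
    return np.corrcoef(benford_expected(), observed)[0, 1]


# Powers of two are a classic example of data that follows the distribution closely.
print(benford_correlation_sketch(2.0 ** np.arange(100)))
```

A value close to 1 indicates first digits that follow the Newcomb-Benford distribution; lower values indicate a deviation from it.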
This is my first pull request to a major package, so if anything is wrong, please let me know.