ENH: use same value counts for roll_var to remove floating point artifacts #46431
Looks like there's a network issue in the test.
We need a battery of tests that exercise this code.
You can use something like the doc examples as tests; they should fail before the change and pass after.
The tests need to exercise np.inf and np.nan as well (as you are including these in the code).
Please parametrize over both std and var.
Isn't there logic that needs to be added to
Here are my thoughts: Suppose we have x number of same values, then the window enters into some nans. Now if the new number is still the same, So in summary, I think it's OK that
I will write some tests in the upcoming days, but I'm not sure where I should put them. Do I need to add some in If only doc tests are needed, do I just have to write some examples in the rolling.var and std docs (in fact I already updated the old examples in the docs)?
Gotcha. Not sure if there's a condition where
It would be good to have an example like
But the
The case where only Basically, the counting needn't care about removed values, as long as we know how many And we only need to make sure every time we see a new number in
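For readers following the counting argument, here is a minimal pure-Python sketch of the idea. It is not the Cython code from this PR. It assumes a fixed-size window, treats NaN as missing, does not handle infinities, and emits a result whenever the number of observations exceeds `ddof` (a simplified stand-in for `min_periods`). The name `rolling_var_sketch` and all variable names are hypothetical.

```python
import math


def rolling_var_sketch(values, window, ddof=1):
    """Rolling variance via online add/remove updates, plus a counter of
    consecutive equal values (illustrative sketch, not the pandas code)."""
    out = []
    nobs = 0          # non-NaN observations currently in the window
    mean = 0.0        # running mean of the window
    ssqdm = 0.0       # running sum of squared differences from the mean
    same_count = 0    # length of the current run of equal non-NaN values
    prev = math.nan   # last non-NaN value seen

    for i, val in enumerate(values):
        # Remove the value that leaves the window. The counter is not touched
        # here: it only tracks the run of equal values at the tail.
        if i >= window:
            old = values[i - window]
            if not math.isnan(old):
                nobs -= 1
                if nobs:
                    delta = old - mean
                    mean -= delta / nobs
                    ssqdm -= delta * (old - mean)
                else:
                    mean = 0.0
                    ssqdm = 0.0

        # Add the incoming value; NaN is skipped and does not reset the run.
        if not math.isnan(val):
            nobs += 1
            delta = val - mean
            mean += delta / nobs
            ssqdm += delta * (val - mean)
            same_count = same_count + 1 if val == prev else 1
            prev = val

        if nobs > ddof:
            if same_count >= nobs:
                # Every observation in the window is the same value, so the
                # variance is exactly zero regardless of accumulated error.
                out.append(0.0)
            else:
                out.append(max(ssqdm, 0.0) / (nobs - ddof))
        else:
            out.append(math.nan)
    return out


# A large value followed by repeated values: once the window holds only
# the repeated value, the counter forces an exact zero.
result = rolling_var_sketch([1e10, 3.0, 3.0, 3.0, 3.0], window=3)
```

Because the last `nobs` non-NaN values are exactly the observations in the window, a run length of at least `nobs` means all of them are equal, which is why removals need no bookkeeping in this sketch.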
The test failed in
I'm not so familiar with git. After I clicked "sync fork" in PyCharm, commits from others were added to my branch, which may be the cause of these test failures(?) 😓 My last version failed in
I would recommend following the contributing guide to set up git correctly (not sure how PyCharm decided to do this): https://pandas.pydata.org/docs/development/contributing.html#working-with-the-code Namely, you probably need to merge the main branch into your branch.
4a976e5 to ac1cec4
…ating_point_artifacts merge upsteam
@mroeschke I reset my branch to the earliest state, merged with main, and then force-pushed it. It really works.
…ating_point_artifacts
@mroeschke looks like
d1b6f9b to 7909ffe
Do not merge to anything bigger; wait for review and merge here.
The former tests here require "all rolling kurt for all equal values should return Nan" (which referenced a wrong issue id? #18804): pandas/pandas/tests/window/test_rolling_skew_kurt.py, lines 221 to 230 in b3b5e2a
This looked pretty good, just needs to resolve the merge conflict
…ating_point_artifacts # Conflicts: # doc/source/whatsnew/v1.5.0.rst
I put these changes in the "Other enhancements" section; not sure if that's right.
Can you add a test for something like [5, 7, 5, 7, 5, 7], e.g. nothing is repeated yet it's likely degenerate?
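One way to read this request: in the alternating sequence no value equals its predecessor, so a consecutive-equal-value counter never reaches the window size and an exact-zero shortcut must not trigger. Below is a small stdlib-only sketch of that expectation, assuming a window of 2 and the sample variance (ddof=1); both are chosen here for illustration, and the helper name `trailing_run_lengths` is hypothetical.

```python
import math
from statistics import variance


def trailing_run_lengths(values):
    """Length of the run of equal values ending at each position."""
    runs = []
    for i, v in enumerate(values):
        if i > 0 and v == values[i - 1]:
            runs.append(runs[-1] + 1)
        else:
            runs.append(1)
    return runs


values = [5.0, 7.0, 5.0, 7.0, 5.0, 7.0]
window = 2  # assumed window size for this illustration

runs = trailing_run_lengths(values)

# Reference result: sample variance of each full window, computed directly
# from the window contents rather than with online updates.
expected = [
    variance(values[i - window + 1 : i + 1])
    for i in range(window - 1, len(values))
]

# The run never fills the window, so no window counts as "all equal",
# and every full window {5, 7} keeps its non-zero variance.
shortcut_never_fires = all(r < window for r in runs)
all_windows_nonzero = all(math.isclose(v, 2.0) for v in expected)
```

A test along these lines would compare the rolling result against such directly computed per-window values rather than against zero.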
2258681 to 738c45b
…ating_point_artifacts
LGTM on green
@auderson could you merge main one more time?
lgtm @mroeschke merge on green
…ating_point_artifacts # Conflicts: # doc/source/whatsnew/v1.5.0.rst
12f123d to 4110084
Awesome, thanks @auderson (failure unrelated)!
Nice👍
…facts (pandas-dev#46431)
* fix merge
* bug fix: nobs == 0 == min_periods
* use Py_ssize_t in place of int64_t
* use tm.assert_series_equal
* add more checks in roll numba methods
* add more checks in roll numba methods
* cancel check exact
* test 32bit: undo Py_ssize_t
* add explanation for test_rolling_var_same_value_count_logic
* comments updated
* add equal-zero test for test_rolling_var_numerical_issues & test_rolling_var_floating_artifact_precision
* update docs
* add more tests
* update whats new
* update whats new
* update doc
Co-authored-by: auderson <liao.renjie@techfin.ai>
Co-authored-by: Jeff Reback <jeff@reback.net>
ENH: use same value counts for roll_var to remove floating point artifacts
Please refer to #42064 (comment) for the explanation of this new method.
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.