fixed bug in DataFrame.diff - issue #10907 #10930
This is my first PR here. Could someone help me pass this Travis test? After I made the changes in the code, what should I do? |
try running the tests locally first, e.g. nosetests pandas/test_frame.py
the consolidate has to go in core/internals.py (where diff is defined) |
@@ -10771,6 +10771,11 @@ def test_diff(self):
assert_series_equal(the_diff['A'],
tf['A'] - tf['A'].shift(1))

df = pd.DataFrame({'y': pd.Series([2]), 'z': pd.Series([3])})
df.insert(0, 'x', 1)
self.assertEqual(df.diff(axis=1)['x']==np.nan, False)
construct an expected frame, then use assert_frame_equal
When I run tests by calling Is there anything else I should do before test it? |
see the docs here: http://pandas.pydata.org/pandas-docs/stable/contributing.html |
@@ -2513,7 +2513,7 @@ def putmask(self, **kwargs):
return self.apply('putmask', **kwargs)

def diff(self, **kwargs):
- return self.apply('diff', **kwargs)
+ return self.consolidate().apply('diff', **kwargs)
instead of this, add to .apply an argument consolidate=True, and:
    if consolidate:
        self._consolidate_inplace()
This should fix this issue and not break anything else.
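For readers unfamiliar with the suggestion above, here is a minimal sketch of the "consolidate inside apply" pattern. It is a toy model only: `ToyBlockManager` and its block layout are hypothetical stand-ins invented for illustration, not pandas' actual `BlockManager`; only the names `apply`, `consolidate` and `_consolidate_inplace` come from the review comment.

```python
class ToyBlockManager:
    """Hypothetical stand-in for a block manager (NOT pandas' real class)."""

    def __init__(self, blocks):
        # Each block is a (dtype_name, {column_name: values}) pair.
        self.blocks = list(blocks)

    def _consolidate_inplace(self):
        # Merge all blocks that share a dtype into a single block.
        merged = {}
        for dtype, cols in self.blocks:
            merged.setdefault(dtype, {}).update(cols)
        self.blocks = list(merged.items())

    def apply(self, func, consolidate=True):
        # Consolidate first, so func sees same-dtype columns together.
        if consolidate:
            self._consolidate_inplace()
        return [func(dtype, cols) for dtype, cols in self.blocks]


def columns_in_block(dtype, cols):
    # Report which columns a block-wise operation would see together.
    return sorted(cols)


# Two int64 blocks, as if one column had been inserted after construction.
unconsolidated = [("int64", {"y": [2], "z": [3]}), ("int64", {"x": [1]})]

split_view = ToyBlockManager(unconsolidated).apply(columns_in_block, consolidate=False)
merged_view = ToyBlockManager(unconsolidated).apply(columns_in_block)
```

In this toy, a block-wise operation on the unconsolidated manager never sees `x` next to `y` and `z`, which is the kind of situation a column-wise diff cannot handle one block at a time.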
@@ -10771,6 +10771,13 @@ def test_diff(self):
assert_series_equal(the_diff['A'],
tf['A'] - tf['A'].shift(1))

df = pd.DataFrame({'y': pd.Series([2]), 'z': pd.Series([3])})
df.insert(0, 'x', 1)
the_diff = df.consolidate().diff(axis=1)
just do
result = df.diff(axis=1)
expected = DataFrame(....)
assert_frame_equal(result, expected)
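Filled in for the frame from this PR, the result/expected pattern might look like the sketch below. The expected values are derived by hand (y - x = 1, z - y = 1, nothing to subtract from the first column), and `check_dtype=False` is an assumption made here because the dtype that `diff(axis=1)` returns for integer columns may differ between pandas versions; a real test in the pandas suite would likely pin the dtype instead.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

# issue 10907: build the frame via insert, as in the original report
df = pd.DataFrame({"y": pd.Series([2]), "z": pd.Series([3])})
df.insert(0, "x", 1)

# Do not post-process the result; compare it against an explicit frame.
result = df.diff(axis=1)
expected = pd.DataFrame({"x": [np.nan], "y": [1.0], "z": [1.0]})

# Values only: dtype checking is relaxed here (assumption, see above).
assert_frame_equal(result, expected, check_dtype=False)
```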
Thanks so much for your help! I have made the changes. Now I understand the code structure better. Hopefully I'll be able to contribute more in the future. |
@@ -10771,6 +10771,12 @@ def test_diff(self):
assert_series_equal(the_diff['A'],
tf['A'] - tf['A'].shift(1))

df = pd.DataFrame({'x': pd.Series([1]),'y': pd.Series([2]), 'z': pd.Series([3])})
result = df.diff(axis=1).astype(float)
don't modify result
the astype might be hiding something
Done. Should be able to merge now since it passes tests locally. There are too many Travis build tasks pending now. |
needs a whatsnew note |
This should do it now, right? |
@@ -10771,6 +10771,12 @@ def test_diff(self):
assert_series_equal(the_diff['A'],
tf['A'] - tf['A'].shift(1))

df = pd.DataFrame({'x': pd.Series([1]),'y': pd.Series([2]), 'z': pd.Series([3])})
result = df.diff(axis=1)
add the issue number as a comment here.
pls squash as well. |
…t consolidated (+1 squashed commit) Squashed commits: [6fe71d3] moved changes to correct place and fixed test_diff (+1 squashed commit) Squashed commits: [2bf3c2b] moved change to where diff is defined and updated test (+1 squashed commit) Squashed commits: [6715d7f] added unit test to test this fix (+1 squashed commit) Squashed commits: [f06fa5e] fixed bug in DataFrame.diff
Thanks. Done. |
ok, pls squash to a single commit. ping when green. |
It has conflicts that I don't know how to solve if I squash...could we leave it as it is this time? Both commits are descriptive to me. |
ping me when green. typically we use just a single commit on things like this. this is just the way pandas has always worked. |
@jreback could you merge? |
will get to soon |
@@ -10771,6 +10771,13 @@ def test_diff(self):
assert_series_equal(the_diff['A'],
tf['A'] - tf['A'].shift(1))

# issue 10907
df = pd.DataFrame({'x': pd.Series([1]),'y': pd.Series([2]), 'z': pd.Series([3])})
this doesn't actually test anything (it works, but it doesn't replicate the problem from the original issue). IIRC you had that there, but it has changed.
This is the same as the original. In the testing code, I now create directly the same DataFrame that was originally built via DataFrame.insert. The core problem arises from DataFrame.diff, not DataFrame.insert. Please compare the outdated diff in this PR with the new push. Thanks.
If you do an assertDataFrameEqual on the original test code and the current test code, you'll see those DataFrames are equivalent.
your test passes on 0.16.2, so you haven't proven that you fixed the bug (you did), but your test doesn't work
Ok didn't know that. I'll revise the test soon. Thanks for double checking.
…+1 squashed commit) Squashed commits: [810cbda] DOC: Added a note in whatsnew and doc-string for fixing issue 10907 (+1 squashed commit) Squashed commits: [f9220a2] DOC: Added a note in whatsnew and doc-string for fixing issue 10907 (+1 squashed commit) Squashed commits: [0f1836f] added consolidate as param in doc-string
@jreback Could you double check new push? |
@jreback I don't think this fix works if you have mixed types, since consolidate can't consolidate those blocks. This is much less common, but I could see someone with mixed ints and floats that wants to shift |
@hhuuggoo yes, this is prob just wrong. it needs to create the indexer then do a take. Can you create another issue with an example? thxs |
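One possible reading of "create the indexer then do a take", sketched with plain NumPy. This is an illustration under stated assumptions and not the pandas implementation: `diff_via_take` is a hypothetical helper, and it sidesteps the mixed int/float case by casting everything to float64 up front.

```python
import numpy as np


def diff_via_take(values, periods=1):
    """Column-wise diff of a 2-D array using an indexer plus take.

    Hypothetical helper for illustration; casts to float64 so that
    missing positions can hold NaN.
    """
    values = np.asarray(values, dtype="float64")
    ncols = values.shape[1]

    # For each output column, the source column to subtract.
    indexer = np.arange(ncols) - periods
    valid = (indexer >= 0) & (indexer < ncols)

    # take needs in-bounds positions; out-of-range ones are masked below.
    shifted = values.take(np.where(valid, indexer, 0), axis=1)
    shifted[:, ~valid] = np.nan

    return values - shifted


out = diff_via_take([[1, 2, 3]])
```

Because the indexer works on column positions of the whole array, it does not depend on how the columns happen to be grouped into blocks.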
yup! |
Fixed issue #10907