-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Maintain inplace operation on series #36498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maintain inplace operation on series #36498
Conversation
Hello @samilAyoub! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-25 19:30:33 UTC |
….com/samilAyoub/pandas into maintain_inplace_operation_on_series
@@ -93,8 +93,6 @@ def _wrap_inplace_method(method): | |||
|
|||
def f(self, other): | |||
result = method(self, other) | |||
# Delete cacher | |||
self._reset_cacher() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added to fix an existing bug (see #30501) so removing it would likely cause a regression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this have been the incorrect location for the original fix? it appears there were a couple of iterations in #30501.
@@ -1297,3 +1297,10 @@ def test_dataframe_div_silenced(): | |||
) | |||
with tm.assert_produces_warning(None): | |||
pdf1.div(pdf2, fill_value=0) | |||
|
|||
|
|||
def test_inplace_arithmetic_operation_on_series_updating_parent_dataframe(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test passes for me on master. is there another case that fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I think need to test df["A"] is s
which is also True on 1.1.2
>>> pd.__version__
'1.1.2'
>>>
>>> import pandas.testing as tm
>>>
>>> df = pd.DataFrame({"A": [1, 2, 3]})
>>> s = df["A"]
>>> s += 1
>>> tm.assert_series_equal(df["A"], s)
>>>
>>> s
0 2
1 3
2 4
Name: A, dtype: int64
>>>
>>> df["A"]
0 2
1 3
2 4
Name: A, dtype: int64
>>>
>>> df
A
0 1
1 2
2 3
>>> df["A"] is s
True
>>>
but the output of __repr__ and also other ops is incorrect
so should also compare using assert_frame_equal
>>> tm.assert_frame_equal(df, pd.DataFrame({"A": [2, 3, 4]}))
Traceback (most recent call last):
...
AssertionError: DataFrame.iloc[:, 0] (column name="A") are different
DataFrame.iloc[:, 0] (column name="A") values are different (100.0 %)
[index]: [0, 1, 2]
[left]: [1, 2, 3]
[right]: [2, 3, 4]
>>>
but this should pass with the fix here
It looks like the underlying problem is
|
but that also occurs in 1.0.5 (and on 0.25.3)
so maybe not directly related to regression? |
@@ -888,22 +888,6 @@ def test_identity_slice_returns_new_object(self): | |||
original_series[:3] = [7, 8, 9] | |||
assert all(sliced_series[:3] == [7, 8, 9]) | |||
|
|||
def test_loc_copy_vs_view(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #15631 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
umm pls revert this test.
@jbrockmendel I think ready if you can have a look. Thanks. |
So fixing this would fix two bugs instead of one. Have you taken a look at what it would take to fix this? I suspect it may be simpler than the implementation here. |
df = pd.DataFrame({"A": [1, 2, 3]}) | ||
s = df["A"] | ||
s += 1 | ||
assert s is df["A"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also assert df['A'].to_numpy().base is s.to_numpy().base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if we do this we no longer need the cacher machinery. At first glance his appears a big change so will open a separate PR as an alternative to this one.
@@ -888,22 +888,6 @@ def test_identity_slice_returns_new_object(self): | |||
original_series[:3] = [7, 8, 9] | |||
assert all(sliced_series[:3] == [7, 8, 9]) | |||
|
|||
def test_loc_copy_vs_view(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
umm pls revert this test.
@simonjayhawkins this may need some more work, don't block on for 1.1.3 |
It appears to me that the issue is that an inplace operation on the Series deletes the weakref to the dataframe, i.e. the fix in #30501 was incorrect
This is simple. revert the incorrect fix. remove the test that has since been added for a issue that was not actually fixed. and finally ensure that dropped columns are not added back to original dataframe (i.e. what imo should have been done in #30501 originally). Including fixes for the deleted test, i.e. issue #15631 would only make the change more invasive? I'm not sure I understand how a more invasive change would be a backport candidate. |
@simonjayhawkins your appraisal seems reasonable for backporting purposes. i'll take a look a the |
can we just revert the original. (and pls merge master as well) |
@simonjayhawkins this PR is stale, but I think if we revert the original commit is best here. |
if we do that, I think we would still need to disable or remove test_loc_copy_vs_view see #15631 (comment) |
right that's what i mean revert that patch |
I opened a PR with a revert of the original patch: #37497 |
test_loc_copy_vs_view will fail. |
Ah, yes, thanks for the note, added an xfail |
This PR can probably be claused now with #37508 merged? @samilAyoub sorry about sidetracking this PR, but wanted to go fast yesterday to be able to get a fix into 1.1.4 release today |
Thanks @samilAyoub for working on this. |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff