-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REGR: setting column with setitem should not modify existing array inplace #35266
REGR: setting column with setitem should not modify existing array inplace #35266
Conversation
b03f45a
to
a5ae33a
Compare
@@ -705,6 +705,15 @@ def test_iloc_setitem_categorical_updates_inplace(self): | |||
expected = pd.Categorical(["C", "B", "A"]) | |||
tm.assert_categorical_equal(cat, expected) | |||
|
|||
# __setitem__ under the other hand does not work in-place |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you we split this test into 2 tests
# overwrite column with new array | ||
df["EA"] = pd.array([1, 2, 3], dtype="Int64") | ||
assert original_arr is not df.EA.array | ||
tm.assert_extension_array_equal(original_arr, pd.array([1, 2, None], dtype="Int64")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do
expected =
result =
@jbrockmendel quickly tried this, and so changing diff --git a/pandas/core/internals/managers.py b/pandas/core/internals/managers.py
index c82670106..859e2b149 100644
--- a/pandas/core/internals/managers.py
+++ b/pandas/core/internals/managers.py
@@ -1055,11 +1055,16 @@ class BlockManager(PandasObject):
unfit_mgr_locs = []
unfit_val_locs = []
removed_blknos = []
+ blocks = list(self.blocks)
for blkno, val_locs in libinternals.get_blkno_placements(blknos, group=True):
- blk = self.blocks[blkno]
+ blk = blocks[blkno]
blk_locs = blklocs[val_locs.indexer]
if blk.should_store(value):
- blk.set(blk_locs, value_getitem(val_locs))
+ if blk.is_extension:
+ new_blk = blk.make_block_same_class(value)
+ blocks[blkno] = new_blk
+ else:
+ blk.set(blk_locs, value_getitem(val_locs))
else:
unfit_mgr_locs.append(blk.mgr_locs.as_array[blk_locs])
unfit_val_locs.append(val_locs)
@@ -1081,8 +1086,10 @@ class BlockManager(PandasObject):
new_blknos[~is_deleted] = np.arange(self.nblocks - len(removed_blknos))
self._blknos = new_blknos[self._blknos]
self.blocks = tuple(
- blk for i, blk in enumerate(self.blocks) if i not in set(removed_blknos)
+ blk for i, blk in enumerate(blocks) if i not in set(removed_blknos)
)
+ else:
+ self.blocks = tuple(blocks) |
@@ -1589,7 +1589,7 @@ def should_store(self, value: ArrayLike) -> bool: | |||
|
|||
def set(self, locs, values): | |||
assert locs.tolist() == [0] | |||
self.values[:] = values | |||
self.values = values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isnt this going to break views?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be, but that would mean it was broken in 1.0 as well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming the [:]
was added after 1.0, then I guess so, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the [:]
(i.e. behavior in master) in a released 1.0.x?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is referenced in the issue and the top post: #32831 -> this is only in master
@jbrockmendel @jreback I also opened an alternative PR to consider: #35271, which is a more targeted revert of the original #32831 bug fix (simpler, but also not preserving the original fix that caused the regression). |
Given there are clearly some side effects here (failing tests related to |
I'll take fresh look at this. As of now this is the only item on my todo list before 1.1. LMK if there's anything else that needs my immediate attention. |
Closes #33457
This changes
ExtensionBlock.setitem
to how it was before #32831: updating the Block itself inplace, but not the array.@jbrockmendel I know you won't like that this update the Block inplace (it's also not exactly what you suggested at #33457 (comment)). But this was the smallest change I could find to fix the issue.
The alternative is probably changing
BlockManager.iset
to not use Block.set in case of an existing ExtensionBlock, butBlockManager.iset
is quite complex ..