Closed
Labels: Bug, Strings (String extension data type and string data), replace (replace method)
Description
For all other dtypes (I think; I only checked the integer example below), if the replacement value passed to replace() doesn't fit into the calling Series, we "upcast" to object dtype and do the replacement anyway.
Simple example with an integer series:
>>> ser = pd.Series([1, 2])
>>> ser.replace(1, "str")
0    str
1      2
dtype: object
However, with the future string dtype, replacing a value with a non-string currently does not cast to object dtype but raises instead:
>>> pd.options.future.infer_string = True
>>> ser = pd.Series(["a", "b"])
>>> ser.replace("a", 1)
...
File ~/scipy/repos/pandas/pandas/core/internals/blocks.py:713, in Block.replace(self, to_replace, value, inplace, mask)
709 elif self._can_hold_element(value):
710 # TODO(CoW): Maybe split here as well into columns where mask has True
711 # and rest?
712 blk = self._maybe_copy(inplace)
--> 713 putmask_inplace(blk.values, mask, value)
714 return [blk]
716 elif self.ndim == 1 or self.shape[0] == 1:
...
File ~/scipy/repos/pandas/pandas/core/arrays/string_.py:746, in __setitem__(self, key, value)
...
TypeError: Invalid value '1' for dtype 'str'. Value should be a string or missing value, got 'int' instead.
Making replace() strict (preserving the dtype) in general is a much bigger topic, so I think for now we should just keep the current behaviour of upcasting to object dtype when needed, for the string dtype as well.
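For illustration, the upcasting behaviour described above can be approximated from user code with an explicit cast. This is only a sketch: `replace_with_upcast` is a hypothetical helper name, not a pandas API, and it assumes the failure surfaces as a `TypeError`, as in the traceback above.

```python
import pandas as pd


def replace_with_upcast(ser, to_replace, value):
    """Hypothetical helper (not part of pandas): replace, upcasting if needed."""
    try:
        # Works directly whenever the dtype can hold `value`.
        return ser.replace(to_replace, value)
    except TypeError:
        # Assumption: the dtype rejected `value` with a TypeError, as shown
        # in the traceback above. Cast to object first, then replace.
        return ser.astype(object).replace(to_replace, value)


ser = pd.Series(["a", "b"])
result = replace_with_upcast(ser, "a", 1)
print(result)
```

Whether or not the string dtype is enabled, the result should hold the values `1` and `"b"` with object dtype, which matches what the integer example does when it is given a string.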