-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: fix concat of Sparse with non-sparse dtypes #34338
BUG: fix concat of Sparse with non-sparse dtypes #34338
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
I'm not totally sure about the expected behavior of concat([Sparse, Categorical])
. I suspect it was never properly discussed / intentionally implemented. Happy to just match 1.0.3 behavior for now htough.
Co-authored-by: Tom Augspurger <TomAugspurger@users.noreply.github.com>
OK, will go forward with this PR as is (matching the previous behaviour) to unblock the other concat PRs, but will open a set of follow-up issues. |
@@ -1063,7 +1063,8 @@ def astype(self, dtype=None, copy=True): | |||
""" | |||
dtype = self.dtype.update_dtype(dtype) | |||
subtype = dtype._subtype_with_str | |||
sp_values = astype_nansafe(self.sp_values, subtype, copy=copy) | |||
# TODO copy=False is broken for astype_nansafe with int -> float | |||
sp_values = astype_nansafe(self.sp_values, subtype, copy=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #34456 for this (and added link in the comment)
if is_sparse(arr.dtype) and not is_sparse(dtype): | ||
# problem case: SparseArray.astype(dtype) doesn't follow the specified | ||
# dtype exactly, but converts this to Sparse[dtype] -> first manually | ||
# convert to dense array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #34457 for this
And opened #34459 for the general "what should concat(sparse, categorical) do?" question |
pandas/core/dtypes/concat.py
Outdated
@@ -20,7 +21,7 @@ | |||
) | |||
from pandas.core.dtypes.generic import ABCCategoricalIndex, ABCRangeIndex, ABCSeries | |||
|
|||
from pandas.core.arrays import ExtensionArray | |||
from pandas.core.arrays import ExtensionArray, SparseArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I already pushed a fix.
I don't really understand why, though, as SparseArray
is included in the pandas.core.arrays
init, just as ExtensionArray.
Closes #34336
This adds tests with the behaviour as it was on 0.25.3 / 1.0.3, and some changes to get back to that behaviour.
(but whether this behaviour is fully "sane", I am not sure ..)