EA: _can_hold_element, _validate_fill_value #36226
Comments
What would be the difference between Also, before adding it to the EA interface, I think we should discuss for a moment the use case. I think right now In practice (and for the reasons that
which contrasts the default dtypes:
Personally, I kind of like this stricter behaviour of the EA dtypes. For example also numpy doesn't allow such dtype changes on setitem. |
_can_hold_element would be accessed from Block, so would need to be part of the official interface. validate_fill_value is more a pattern that would be useful to provide default implementations for other methods which include can_hold_element |
I also like the dont-silently-copy/cast behavior, but right now we are getting it because of a _can_hold_element implementation that is just wrong. Moreover we are getting it inconsistently. If you wanted to write an EA to get the raise-instead-of-cast Series behavior you could do:
|
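@jreback @jorisvandenbossche @TomAugspurger any objection to adding a `_can_hold_element` to the EA interface? This is a) a blocker for merging DTBlock into ExtensionBlock and b) likely the source of some other bugs. Spitballing the default implementation, we could either start with the existing (wrong) `return True` in ExtensionBlock, or could do something like the `_from_sequence`-based check quoted in the reply below.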
|
Adding this seems reasonable.
Do we know if this is going to be a scalar (as defined by the type?). I'm
wondering if the default implementation could be something like
`isinstance(element, self.dtype.type)`. Wouldn't work for all EAs
(categorical would want something like `element in self.categories`) but it
seems OK for a default.
…On Fri, Oct 16, 2020 at 4:20 PM jbrockmendel ***@***.***> wrote:
@jreback <https://github.com/jreback> @jorisvandenbossche
<https://github.com/jorisvandenbossche> @TomAugspurger
<https://github.com/TomAugspurger> any objection to adding a
_can_hold_element to the EA interface? This is a) a blocker for merging
DTBlock into ExtensionBlock and b) likely the source of some other bugs
Spitballing the default implementation, we could either start with the
existing (wrong) return True in ExtensionBlock, or could do something like
def _can_hold_element(self, element: Any) -> bool:
    """
    Check if the given element can be set (via setitem) into an array with this type and dtype.

    Parameters
    ----------
    element : Any

    Returns
    -------
    bool

    Notes
    -----
    This check ignores all length considerations, e.g. may return True even if len(self) == 0
    """
    if not is_list_like(element):  # <-- may need something more configurable than this
        element = [element]
    try:
        type(self)._from_sequence(element, dtype=self.dtype)
    except (ValueError, TypeError):
        return False
    return True
|
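For illustration, the two defaults floated in this comment (an `isinstance` check against the scalar type, and a membership check for categorical-like arrays) might look roughly like the sketch below. The classes `ToyArray`, `ToyStringArray` and `ToyCategorical` and the `scalar_type` attribute are hypothetical stand-ins written in plain Python; they are not pandas classes and do not reflect how pandas implements this.

```python
from typing import Any, Iterable


class ToyArray:
    """Hypothetical stand-in for an ExtensionArray (not a pandas class)."""

    # Plays the role of ``self.dtype.type`` in the comment above (assumption).
    scalar_type: type = object

    def __init__(self, values: Iterable[Any]) -> None:
        self._data = list(values)

    def _can_hold_element(self, element: Any) -> bool:
        # Default: accept any scalar that is an instance of the scalar type.
        return isinstance(element, self.scalar_type)


class ToyStringArray(ToyArray):
    """Array whose scalars are ``str``; inherits the isinstance default."""

    scalar_type = str


class ToyCategorical(ToyArray):
    """Categorical-like array that overrides the default with a membership check."""

    def __init__(self, values: Iterable[Any], categories: Iterable[Any]) -> None:
        super().__init__(values)
        self.categories = list(categories)

    def _can_hold_element(self, element: Any) -> bool:
        # A type check is not enough here: only known categories can be set.
        return element in self.categories
```

With these toy classes, `ToyStringArray(["a"])._can_hold_element(1)` returns `False`, and a `ToyCategorical` with categories `["a", "b"]` accepts `"b"` but rejects `"z"`.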
Block._can_hold_element is also used for arraylike |
I want to repeat my question from above #36226 (comment) related to the precise use case of this. Assume we would decide to have EAs to be strict regarding dtype (and not let it upcast on setitem). Do we then actually need this method in the interface? I know that the datetime-like EAs don't have this strict behaviour, and that's not something we can simply change. But if it's only for them, we could also leave that as a special case to deal with in ExtensionBlock, instead of adding it to the general EA interface. |
I think we need to distinguish between where casting can occur. DTA/TDA/PA will upcast
Are you suggesting that what In the status quo we have EABlock._can_hold_element always returning True, and we don't learn that it is wrong until we try to do the setitem and it raises. Working in the core.indexing code, particularly in Another frequently-discussed method/capability is deciding what dtype we need to promote to for certain operations. It might make sense to have a can_hold_element-like method that returns the upcast dtype needed when can_hold is Another option is adding _can_hold_element to ExtensionArray but not make it part of the interface (at least not yet). |
Yes, or at least that is what I think we should discuss / decide. |
We're in the process of moving away from the _can_hold_element pattern, closing. |
TL;DR: we should add `_validate_fill_value` to the EA interface and define `_can_hold_element` in terms of it.

ATM `ExtensionBlock._can_hold_element` incorrectly always returns True. This needs to be an EA method that returns True if and only if `self[0] = value` is allowed on a non-empty array.

This method can in turn be defined in terms of a more broadly useful `_validate_fill_value` that we use in DTA/TDA/PA/PandasArray/Categorical.
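The snippet that originally accompanied this description is not preserved here. As a rough sketch of the pattern only, assuming a validator that raises `TypeError` or `ValueError` for values it cannot store, `_can_hold_element` could wrap `_validate_fill_value` as shown below. `ToyIntArray` is a hypothetical plain-Python class, not a pandas array, and the validation rules in it are illustrative assumptions.

```python
from typing import Any, Iterable


class ToyIntArray:
    """Hypothetical integer-only array (a stand-in, not a pandas class)."""

    def __init__(self, values: Iterable[Any]) -> None:
        self._data = [self._validate_fill_value(v) for v in values]

    def _validate_fill_value(self, value: Any) -> int:
        # Return the value in its stored form, or raise if it cannot be
        # stored without changing the array's type (illustrative rules).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"cannot store {type(value).__name__} in ToyIntArray")
        if isinstance(value, float):
            if not value.is_integer():
                raise ValueError(f"{value!r} cannot be stored losslessly")
            return int(value)
        return value

    def _can_hold_element(self, element: Any) -> bool:
        # True if and only if ``self[0] = element`` would succeed.
        try:
            self._validate_fill_value(element)
        except (TypeError, ValueError):
            return False
        return True

    def __setitem__(self, key: int, value: Any) -> None:
        # setitem goes through the same validator, so the two stay in sync.
        self._data[key] = self._validate_fill_value(value)

    def __getitem__(self, key: int) -> int:
        return self._data[key]
```

Because `__setitem__` and `_can_hold_element` share one validator, the check cannot drift from the actual setitem behaviour, which is the inconsistency described above.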