DISC: remove Index and EA subclasses #43002
Comments
I think having index methods in the array and treating the Index class as a container seems logical and has potential to simplify things and allow using user-created ExtensionArrays as indexes (or as inputs to a generic index class), which would be a win. Having immutable arrays could be even more powerful. In that case the index would just be a container that requires an immutable array as its input. Having arrays be immutable would also make some operations on DataFrame values faster, e.g.
@TomAugspurger @jorisvandenbossche we discussed this briefly the other day. can i get you to post any relevant thoughts while they're still fresh-ish
Personally I don't care much about the index subclasses as such (I agree thinking about the Index class as a simple container of an array is nice implementation-wise), but I think the main discussion point is the user API and not so much the implementation. Currently the Index subclasses provide additional dtype-specific methods. And as you mention in the third point in the top post, this would mean that those methods will have to be accessed via accessors instead. This is a huge breaking change, though (I suppose mostly for DatetimeIndex, as probably the most used index subclass with custom functionality), that will require a long deprecation cycle. So I think we should first discuss this user API question (although in theory we could also still provide this in a backwards-compatible manner from a single Index class ...).
The other aspect that I mentioned in our chat is that I question a bit the need for NumericIndex in our public interface (since this class doesn't even add additional public methods/attributes).
Getting rid of EA subclasses would cause trouble for any EAs that want to cache things.
There has been discussion in e.g. #39133 about having EA-backed Indexes held in base-class Index objects, and similar discussion regarding the new NumericIndex.
I do not like the idea of having some non-object dtypes return the base class Index, but would be on board for making all dtypes use the base class, i.e. only have Index and deprecate/remove all the subclasses. I'm envisioning roughly:
dti.is_year_start becomes idx.foo.is_year_start
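To make the accessor idea concrete, here is a minimal pure-Python sketch. It is not pandas code: the `Index` and `DatetimeMethods` classes below are hypothetical toy stand-ins, and `foo` is just the placeholder accessor name from the line above. The only point is that dtype-specific methods can live on an accessor object while a single container class holds the values.

```python
from datetime import date


class DatetimeMethods:
    """Hypothetical accessor that holds the datetime-specific methods."""

    def __init__(self, values):
        self._values = values

    @property
    def is_year_start(self):
        # True where the date falls on January 1st.
        return [d.month == 1 and d.day == 1 for d in self._values]


class Index:
    """Toy stand-in for a single, subclass-free Index (not pandas.Index)."""

    def __init__(self, values):
        self._values = tuple(values)

    @property
    def foo(self):
        # "foo" is a placeholder name; only date-like values get the accessor.
        if not all(isinstance(v, date) for v in self._values):
            raise AttributeError("foo accessor requires date-like values")
        return DatetimeMethods(self._values)


idx = Index([date(2021, 1, 1), date(2021, 6, 15)])
result = idx.foo.is_year_start
```

Under this sketch an index of non-date values still constructs fine, but accessing `.foo` on it raises `AttributeError`, which is roughly how dtype-gated accessors tend to behave.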
Related but independent, we could do the same with ExtensionArray subclasses. Instead of having authors implement methods on their EA subclass, they implement it on the ExtensionDtype subclass, which the EA object then dispatches to.
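A rough sketch of what dtype-based dispatch could look like, again in plain Python rather than against the real pandas classes. `ToyDtype`, `MoneyDtype`, `ToyArray`, and the `total` method are all made-up names for illustration; the assumption is that dtype methods receive the array's raw values as their first argument.

```python
from functools import partial


class ToyDtype:
    """Hypothetical dtype base class (not pandas' ExtensionDtype)."""

    name = "toy"


class MoneyDtype(ToyDtype):
    """Author-defined behaviour lives on the dtype, not on an array subclass."""

    name = "money"

    def total(self, values):
        return sum(values)


class ToyArray:
    """Single array class that forwards unknown methods to its dtype."""

    def __init__(self, values, dtype):
        self._values = list(values)
        self.dtype = dtype

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        # Guard private names and "dtype" itself to avoid infinite recursion.
        if name.startswith("_") or name == "dtype":
            raise AttributeError(name)
        method = getattr(self.dtype, name, None)
        if not callable(method):
            raise AttributeError(name)
        # Bind the array's values as the first argument of the dtype method.
        return partial(method, self._values)


arr = ToyArray([1, 2, 3], MoneyDtype())
total = arr.total()
```

One consequence visible even in this toy version: the array object has nowhere of its own to stash per-array state, so anything an author wants to cache has to be stored on the generic array or keyed by it.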