API: hide NumericIndex from public top-level namespace in favor of pd.Index #44819
statsmodels often needs an easy way to detect index types. This is heavily relied upon when working with time series data, where we are strict about the types of indices that are allowed, and if an end user passes a disallowed index, it is replaced by an |
IIRC one aspect that made it harder was that some attributes were not always defined for all possible index types. Maybe this would simplify things so that indices like |
Can you try to explain why you think it is not that simple (I think you can check the dtype of the index object?)
This specific discussion is about numeric Index classes, and the Int64Index and Float64Index have no additional attributes compared to the base
All index classes (including PeriodIndex) should have a |
I'm not sure why I was complaining so much. At the time it felt painful, but it seems it wasn't that bad |
@jorisvandenbossche if you can rebase this, I will look, as it is needed for the RC |
@bashtage, I looked at your PR a bit. In general, it's not necessary to use the more specific index constructors if you just instead supply the dtype. So, instead of Same goes for instance checks. Instead of I think this is part of what @jorisvandenbossche is trying to do: The |
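As an illustration of the point above (the concrete snippets in that comment were lost, so this is only a sketch of the general idea, not the commenter's exact code): an index can be built through the generic `pd.Index` constructor with an explicit dtype, and classified by inspecting that dtype with the `pandas.api.types` helpers. The variable names are chosen for this example.

```python
import numpy as np
import pandas as pd

# Build indexes through the generic constructor, passing the dtype explicitly
# instead of calling a dtype-specific Index subclass.
int_idx = pd.Index([1, 2, 3], dtype="int64")
float_idx = pd.Index([1.0, 2.5], dtype="float64")

# Classify by dtype rather than by the concrete class of the index.
int_is_integer = pd.api.types.is_integer_dtype(int_idx.dtype)
float_is_float = pd.api.types.is_float_dtype(float_idx.dtype)

# The concrete class name printed here depends on the installed pandas version.
print(type(int_idx).__name__, int_idx.dtype, int_is_integer)
print(type(float_idx).__name__, float_idx.dtype, float_is_float)
```

The dtype checks do not depend on which concrete class the installed pandas version returns.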
(I'll comment on the PR later today) |
@bashtage as a specific example, I think you could replace the whole

    def is_int_index(index: pd.Index):
        return (
            isinstance(index, pd.Index)
            and isinstance(index.dtype, np.dtype)
            and np.issubdtype(index.dtype, np.integer)
        )

which doesn't check for specific subclasses and should also already work for older and newer pandas versions alike (or use the |
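To see how a dtype-based helper like the one above behaves, here is a small self-contained sketch that repeats the function and applies it to a few index types. The sample indexes and the `results` dictionary are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd


def is_int_index(index) -> bool:
    # True only for a pandas Index backed by a NumPy integer dtype;
    # no check against specific Index subclasses is involved.
    return (
        isinstance(index, pd.Index)
        and isinstance(index.dtype, np.dtype)
        and np.issubdtype(index.dtype, np.integer)
    )


# Hypothetical sample inputs covering several kinds of index.
results = {
    "int64": is_int_index(pd.Index([1, 2, 3], dtype="int64")),
    "range": is_int_index(pd.RangeIndex(3)),
    "float64": is_int_index(pd.Index([1.0, 2.0], dtype="float64")),
    "datetime": is_int_index(pd.date_range("2021-01-01", periods=3)),
    "not_an_index": is_int_index([1, 2, 3]),
}
print(results)
```

Note that a `RangeIndex` also counts as an integer index under this check, because its dtype is `int64`.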
For this PR, as mentioned in the top post (and being discussed a bit in #41272), the main issue we need to discuss is how to enable using different dtypes (the non 64-bit dtypes):
|
My preferred option is to keep this as it is. There are already something like 12 Index types in the main namespace and I don't see the big deal with having this one also. |
@topper-123 could you also say something about what your preference would be if we decided not to expose it publicly (i.e. related to my last comment above)? |
IMO, it seems strange to not publicly expose the concrete type of the most common This is different from saying that it should be in the top level of pandas. I am arguing that it should be in a location that is considered public. |
The idea is that the concrete type will be |
looks like some build errors in the docs: https://github.com/pandas-dev/pandas/runs/4610430149?check_suite_focus=true |
@jorisvandenbossche if you do have a chance to rebase as this is blocking the RC |
# Conflicts: # pandas/tests/indexes/test_numpy_compat.py
I've rebased, was trivial fortunately. |
this is the current way and so that is fine. I agree that this is a fairly big change and likely better to wait until 2.0 |
@phofl https://github.com/pandas-dev/pandas/runs/4676142657?check_suite_focus=true some failures here :-<> |
Hm yeah this was missing. Had to remove the whatsnew regarding NumericIndex |
Should also remove from the user guide |
thanks @jorisvandenbossche and @phofl |
I just noticed #43930, so it seems there is excellent progress towards making Index accept all array types. In that case I think this PR goes in the same direction, which is great. Shouldn't CategoricalIndex et al. likewise be pulled into Index, so all the index subclasses in the main namespace can be avoided? |
yep that's the end goal here |
👍 |
Starting a PR for this to bump the discussion in #41272 (comment)
This PR already removes NumericIndex from the top-level namespace, but does not yet enable getting a NumericIndex otherwise. If we want to go forward with this proposal, we will have to choose between either a keyword in pd.Index to preserve the passed dtype (pd.Index([..], dtype="int32") currently gives an int64 index), or accept that this will only be possible in pandas 2.0. Let's have that high-level discussion at #41272.
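The dtype question raised here can be checked directly on any installed pandas. The sketch below only reports what the constructor returns; it asserts no particular outcome, since the description says the pandas version of the time upcasts to int64, while later versions may keep the requested dtype. The names `idx` and `dtype_preserved` are chosen for this example.

```python
import numpy as np
import pandas as pd

# Request a 32-bit integer index through the generic constructor.
idx = pd.Index([1, 2, 3], dtype="int32")

# Whether the requested dtype survives depends on the installed pandas version.
dtype_preserved = idx.dtype == np.dtype("int32")
print(pd.__version__, idx.dtype, dtype_preserved)
```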