Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COMPAT: Adjust CFTimeIndex.get_loc for pandas 2.0 deprecation enforcement #7361

Merged
merged 4 commits into from
Dec 14, 2022

Conversation

mroeschke
Copy link
Contributor

With the upcoming pandas 2.0 release, we are enforcing a deprecation in pd.Index.get_loc to remove method and tolerance. Just in case get_loc isn't 100% compatible with get_indexer, I structured the elif to use get_loc as much as possible. Let me know if any other adjustments are needed

@mroeschke mroeschke changed the title COMPAT: Adjust CFTimeIndex.get_loc for pandas 2.0 deprecation enforce… …ment COMPAT: Adjust CFTimeIndex.get_loc for pandas 2.0 deprecation enforcement Dec 6, 2022
Copy link
Member

@spencerkclark spencerkclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking out for this @mroeschke!

I wonder if it's appropriate to simply remove these arguments altogether following pandas? Since we were calling Index.get_loc under the hood, the deprecation warning would have already been raised for anyone who used them (I suppose except in the case that a string key was provided, where the arguments would be ignored):

>>> import cftime; import xarray as xr
>>> times = xr.cftime_range("2000", periods=3)
>>> times.get_loc(cftime.DatetimeGregorian(2000, 1, 1), method="nearest")
/Users/spencer/software/xarray/xarray/coding/cftimeindex.py:468: FutureWarning: Passing method to CFTimeIndex.get_loc is deprecated and will raise in a future version. Use index.get_indexer([item], method=...) instead.
  return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)
0

I don't think this would break anything internally (I believe we already use get_indexer in Dataset.sel for instance).

@mroeschke
Copy link
Contributor Author

I wonder if it's appropriate to simply remove these arguments altogether following pandas?

Sure, happy to remove this method and tolerance from CFTimeIndex.get_loc. Would that require a deprecation in xarray or can they simply be removed if CFTimeIndex.get_loc is "private"?

@spencerkclark
Copy link
Member

Technically I think we would consider get_loc to be public (both due to how it is named and in that it is at least primitively documented). That said, I don't know how many people use the lower level methods of CFTimeIndexes like get_loc outside of xarray data structures.

My inclination is that it would not require a deprecation cycle, given the warning had already been being raised as in the example above. It's true though that this warning was not being emitted in the case that key was a string, so perhaps it's a little borderline. Maybe @dcherian or @mathause, do you have any thoughts?

@dcherian
Copy link
Contributor

dcherian commented Dec 12, 2022

I don't know how many people use the lower level methods of CFTimeIndexes like get_loc outside of xarray data structures.

I sometimes do this with a general pd.Index but it's pretty rare. In any case, I think any user using get_loc would be fairly advanced and can figure out the workaround pretty easily.

@aulemahal does xclim or its related packages use CFTimeIndex.get_loc at all to convert a label to an integer index?(EDIT: a search didn't turn up anything)

@aulemahal
Copy link
Contributor

Hmm, I don't think so! Thanks for the heads up.

@mroeschke
Copy link
Contributor Author

I guess this could be a "Breaking Change" in the next release's What's New. Let me know if that's the direction you're still leaning

@spencerkclark
Copy link
Member

Yes, I think that's what we'll end up going with @mroeschke. Thanks for your patience as we worked it out.

@mroeschke
Copy link
Contributor Author

Great added the whatsnew note and just removed the arguments

Copy link
Member

@spencerkclark spencerkclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@mathause mathause merged commit 80c3e8e into pydata:main Dec 14, 2022
@mathause
Copy link
Collaborator

Thanks @mroeschke

@mroeschke mroeschke deleted the pandas/get_loc_depr branch December 14, 2022 21:11
dcherian added a commit to dcherian/xarray that referenced this pull request Jan 18, 2023
* main: (41 commits)
  v2023.01.0 whats-new (pydata#7440)
  explain keep_attrs in docstring of apply_ufunc (pydata#7445)
  Add sentence to open_dataset docstring (pydata#7438)
  pin scipy version in doc environment (pydata#7436)
  Improve performance for backend datetime handling (pydata#7374)
  fix typo (pydata#7433)
  Add lazy backend ASV test (pydata#7426)
  Pull Request Labeler - Workaround sync-labels bug (pydata#7431)
  see also : groupby in resample doc and vice-versa (pydata#7425)
  Some alignment optimizations (pydata#7382)
  Make `broadcast` and `concat` work with the Array API (pydata#7387)
  remove `numbagg` and `numba` from the upstream-dev CI (pydata#7416)
  [pre-commit.ci] pre-commit autoupdate (pydata#7402)
  Preserve original dtype when accessing MultiIndex levels (pydata#7393)
  [pre-commit.ci] pre-commit autoupdate (pydata#7389)
  [pre-commit.ci] pre-commit autoupdate (pydata#7360)
  COMPAT: Adjust CFTimeIndex.get_loc for pandas 2.0 deprecation enforcement (pydata#7361)
  Avoid loading entire dataset by getting the nbytes in an array (pydata#7356)
  `keep_attrs` for pad (pydata#7267)
  Bump pypa/gh-action-pypi-publish from 1.5.1 to 1.6.4 (pydata#7375)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants