Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port fix from pandas-dev/pandas#55283 to cftime resample #8393

Merged
merged 4 commits into from
Nov 2, 2023

Conversation

spencerkclark
Copy link
Member

The remaining failing cftime resample tests in #8091 happen to be related to a bug that was fixed in the pandas implementation, pandas-dev/pandas#55283, leading answers to change in some circumstances. This PR ports that bug fix to xarray's implementation of resample for data indexed by a CFTimeIndex.

A simple example where answers change in pandas is the following:

Previously

>>> import numpy as np; import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    1.0
2000-01-07 00:00:01    NaN
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    3.0
2000-01-17 00:00:01    NaN
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64

Currently

>>> import numpy as np; import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    NaN
2000-01-07 00:00:01    1.0
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    NaN
2000-01-17 00:00:01    3.0
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64

This PR allows us to reproduce this change in xarray for data indexed by a CFTimeIndex. The bin edges were incorrect in the previous case; see pandas-dev/pandas#52064 (comment) for @MarcoGorelli's nice explanation as to why.

@dcherian
Copy link
Contributor

Thanks Spencer, the RTD failure looks like bad rst formatting.

/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/8393/doc/whats-new.rst:52: ERROR: Unexpected indentation.
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/8393/doc/whats-new.rst:46: WARNING: Inline interpreted text or phrase reference start-string without end-string.

@keewis keewis added the run-upstream Run upstream CI label Oct 31, 2023
doc/whats-new.rst Outdated Show resolved Hide resolved
Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
@dcherian dcherian added the plan to merge Final call for comments label Oct 31, 2023
@dcherian dcherian merged commit d933578 into pydata:main Nov 2, 2023
25 of 30 checks passed
@spencerkclark spencerkclark deleted the port-pandas-resample-fix branch November 2, 2023 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments run-upstream Run upstream CI topic-pandas-like
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants