New feature request: skip_window(s) for df.rolling #25510

Closed
Turanga1 opened this issue Mar 1, 2019 · 1 comment
Labels
Enhancement Window rolling, ewma, expanding

Turanga1 commented Mar 1, 2019

With a DatetimeIndex I can specify the rolling window using an offset alias, but if I want to skip the first (incomplete) window, I have to calculate the number of periods in the window myself. Rolling therefore uses different units for the window (a time offset) and for min_periods (an integer count). I would like to be able to skip all periods in the first n windows.

Possible implementations could be:
min_periods='7D'
min_periods=window
skip_windows=n
skip_window=True

n=1 is probably good enough for most use cases.
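
As a rough illustration of what skip_windows=n could mean, here is a minimal sketch that masks the result after the fact. The helper name rolling_mean_skip_windows is hypothetical and not part of pandas, and it assumes a sorted DatetimeIndex. It treats a row as complete once its timestamp is at least n window lengths after the first timestamp, which for regularly spaced data starts one row later than the min_periods workaround in the code sample.

```python
import numpy as np
import pandas as pd


def rolling_mean_skip_windows(obj, window, n=1):
    """Rolling mean over a time-based window, with the first n windows set to NaN.

    Hypothetical helper, not part of pandas. Assumes obj has a sorted
    DatetimeIndex and window is an offset alias such as '7D'.
    """
    span = pd.Timedelta(window)
    result = obj.rolling(window).mean()
    # Rows earlier than n full window lengths after the first timestamp
    # are treated as incomplete and masked out.
    cutoff = obj.index[0] + n * span
    result.loc[obj.index < cutoff] = np.nan
    return result


idx = pd.date_range("2019-03-01", periods=10000, freq="5min")
df = pd.DataFrame(np.sin(np.arange(10000) * 0.01), index=idx)
skipped = rolling_mean_skip_windows(df, "7D", n=1)
```

Because the cutoff is computed from timestamps rather than a period count, the same helper also applies to irregularly spaced indexes, where no single number of periods per window exists.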

Code Sample, a copy-pastable example if possible

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Generate example dataframe: a sine wave sampled every 5 minutes
idx = pd.date_range("2019-03-01", periods=10000, freq='5min')
df = pd.DataFrame(np.sin(np.arange(0,100,0.01)), index=idx)

# Plot data
plt.plot(df)

# Plot rolling mean with vertical offset for visual separation
plt.plot(df.rolling('7D').mean() + 0.2)

# Plot rolling mean with the first window skipped:
# min_periods must be given as the number of periods in one window
periods = pd.to_timedelta('7D')//df.index.freq
plt.plot(df.rolling('7D', min_periods=periods).mean())

plt.show()

Problem description

'min_periods' accepts only integer values. A result computed from fewer periods than a full window holds is not representative, because it is based on too few observations. The documentation is very confusing with respect to time series, since the "offset" it mentions apparently does not refer to an offset alias: "For a window that is specified by an offset, min_periods will default to 1. Otherwise, min_periods will default to the size of the window."
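
The quoted default can be checked with a small series. This is only a sketch on made-up daily data; the variable names are illustrative. It compares a window given as an offset alias with a window given as an integer, and then passes min_periods explicitly to the offset version.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2019-03-01", periods=10, freq="1D")
s = pd.Series(np.arange(10.0), index=idx)

# Window given as an offset alias: min_periods defaults to 1,
# so the first (incomplete) windows already produce values.
by_offset = s.rolling("3D").mean()

# Window given as an integer: min_periods defaults to the window size,
# so the first two rows are NaN.
by_count = s.rolling(3).mean()

# Offset window with an explicit integer min_periods.
by_offset_strict = s.rolling("3D", min_periods=3).mean()

n_nan_offset = int(by_offset.isna().sum())         # 0
n_nan_count = int(by_count.isna().sum())           # 2
n_nan_strict = int(by_offset_strict.isna().sum())  # 2
```

On this regularly spaced data the integer window and the offset window with min_periods=3 agree, which is the conversion the code sample does by hand.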

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.1
pytest: 3.5.1
pip: 18.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

mroeschke (Member) commented

Thanks for the suggestion, but I believe this can be done by defining a custom BaseIndexer subclass that skips a particular window, so I don't think this is likely to be added directly. Closing.
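
For readers looking for a starting point, the following is one possible sketch of such a subclass, not an official recipe. The class name SkipFirstWindowIndexer and the times and span keyword arguments are hypothetical. The get_window_bounds signature shown assumes a pandas version that passes a step argument (1.5 or later, as far as I know); older versions expect the same signature without step. Rows that fall inside the first window are given empty bounds, and min_periods=1 is passed so that empty windows evaluate to NaN.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import BaseIndexer


class SkipFirstWindowIndexer(BaseIndexer):
    """Trailing time window (t - span, t]; rows inside the first window get
    an empty window. Hypothetical example class, not part of pandas.

    Expects two keyword arguments, stored as attributes by BaseIndexer:
    times (sorted datetime64 array) and span (timedelta64 scalar).
    """

    def get_window_bounds(self, num_values=0, min_periods=None,
                          center=None, closed=None, step=None):
        times, span = self.times, self.span
        # First position whose timestamp is strictly greater than t - span.
        start = np.searchsorted(times, times - span, side="right")
        end = np.arange(1, num_values + 1)
        # Rows less than one span after the first timestamp: empty window.
        incomplete = times < times[0] + span
        start[incomplete] = end[incomplete]
        return start.astype(np.int64), end.astype(np.int64)


idx = pd.date_range("2019-03-01", periods=720, freq="1h")
s = pd.Series(np.sin(np.arange(720) * 0.1), index=idx)

indexer = SkipFirstWindowIndexer(
    times=s.index.to_numpy(),
    span=pd.Timedelta("7D").to_timedelta64(),
)
# min_periods=1 makes the empty windows come out as NaN.
skipped = s.rolling(indexer, min_periods=1).mean()
```

The indexer has to be built from the same index as the object it is applied to, since pandas does not hand the index to a custom BaseIndexer.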
