rolling( window='10D') does not work for df with MultiIndex #15584

Closed
fjanoos opened this issue Mar 5, 2017 · 4 comments · Fixed by #28297
Labels
API Design Groupby Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Comments

fjanoos commented Mar 5, 2017

# Your code here

tdf.rolling( '10D' ).mean()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-49-707b0ff61efc> in <module>()
     10 tdf = df.groupby(level=1).get_group( 64413 )
     11 
---> 12 tdf.rolling( '10D' ).mean()
     13 
     14 tdf.reset_index().set_index('time').rolling( '10D' ).mean()

/home/firdaus/.conda/envs/tsconda/lib/python3.5/site-packages/pandas/core/generic.py in rolling(self, window, min_periods, freq, center, win_type, on, axis)
   5502                                    min_periods=min_periods, freq=freq,
   5503                                    center=center, win_type=win_type,
-> 5504                                    on=on, axis=axis)
   5505 
   5506         cls.rolling = rolling

/home/firdaus/.conda/envs/tsconda/lib/python3.5/site-packages/pandas/core/window.py in rolling(obj, win_type, **kwds)
   1797         return Window(obj, win_type=win_type, **kwds)
   1798 
-> 1799     return Rolling(obj, **kwds)
   1800 
   1801 

/home/firdaus/.conda/envs/tsconda/lib/python3.5/site-packages/pandas/core/window.py in __init__(self, obj, window, min_periods, freq, center, win_type, axis, on, **kwargs)
     76         self.win_type = win_type
     77         self.axis = obj._get_axis_number(axis) if axis is not None else None
---> 78         self.validate()
     79 
     80     @property

/home/firdaus/.conda/envs/tsconda/lib/python3.5/site-packages/pandas/core/window.py in validate(self)
   1055 
   1056         elif not is_integer(self.window):
-> 1057             raise ValueError("window must be an integer")
   1058         elif self.window < 0:
   1059             raise ValueError("window must be non-negative")

ValueError: window must be an integer

Problem description

Specifying a time-based (offset) window in 'rolling' does not work when the DataFrame has a MultiIndex with level_0 = 'time' and level_1 = something else.

Expected Output

tdf.reset_index().set_index('time').rolling( '10D' ).mean()

Works correctly.
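
A minimal sketch of this workaround, assuming a small hypothetical frame: the values and the three-day date range below are made up for illustration, and only the level names 'time' and 'int_index' come from this thread. Replacing the MultiIndex with a plain DatetimeIndex lets rolling() accept the offset window.

```python
import pandas as pd

# Hypothetical data: three daily observations sharing one int_index value.
times = pd.date_range("2017-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_arrays([times, [0, 0, 0]], names=["time", "int_index"])
tdf = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=index)

# Move both levels into columns, then index by 'time' alone so that
# rolling() sees a DatetimeIndex and accepts the '10D' offset window.
flat = tdf.reset_index().set_index("time")
rolled = flat[["value"]].rolling("10D").mean()
print(rolled)
```

All three rows fall inside a single 10-day window here, so each output row is the running mean of the values seen so far.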

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-0.bpo.4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2.post+ts3
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None

Contributor

jreback commented Mar 5, 2017

Can you show how you construct the frame itself?

Contributor

jreback commented Mar 5, 2017

I suspect you are doing something like this:

In [7]: df = pd.DataFrame({'value': range(60)},index=pd.MultiIndex.from_product([range(6),pd.date_range('20160101',periods=10)],names=['one','two']))

In [4]: df.rolling(window='10d').sum()
ValueError: window must be an integer

In [5]: df.rolling(window='10d',level='two').sum()

I thought we already had an issue about this, but I guess not. We need to:

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 5, 2017
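
One possible per-group workaround for a frame like the one in the example above, sketched under the assumption that the window should restart for each value of level 'one': move the index levels into columns and pass the datetime column through the existing on= argument of groupby(...).rolling(...). The names flat and rolled are illustrative and do not come from the thread.

```python
import pandas as pd

# Same shape as the example frame: 6 groups x 10 consecutive days.
df = pd.DataFrame(
    {"value": range(60)},
    index=pd.MultiIndex.from_product(
        [range(6), pd.date_range("20160101", periods=10)],
        names=["one", "two"],
    ),
)

# Turn both index levels into ordinary columns.
flat = df.reset_index()

# Roll over the datetime column 'two' separately within each 'one' group.
rolled = flat.groupby("one").rolling("10D", on="two")["value"].sum()
print(rolled.head(10))
```

Because each group spans exactly ten consecutive days, the 10-day window never drops a row, and the result within each group is a cumulative sum.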
Author

fjanoos commented Mar 5, 2017

import numpy as np
import pandas as pd

# start and end are defined elsewhere (not shown here)
time_index = pd.date_range(start, end, freq='1D')
mi = pd.MultiIndex(levels=[time_index, [0]],
                   labels=[np.arange(len(time_index)), np.zeros(len(time_index))],
                   names=['time', 'int_index'])
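
The snippet above uses the labels= keyword, which matches the pandas 0.19.2 reported in this issue. That keyword was renamed to codes= in pandas 0.24. A sketch of the same construction for newer versions follows. The start and end dates are assumptions chosen for illustration, since the original values are not shown.

```python
import numpy as np
import pandas as pd

# Assumed date range; the original start and end are not given in the thread.
start, end = "2017-01-01", "2017-01-05"

time_index = pd.date_range(start, end, freq="1D")
mi = pd.MultiIndex(
    levels=[time_index, [0]],
    # One integer code per row: position in time_index, and 0 for the second level.
    codes=[np.arange(len(time_index)), np.zeros(len(time_index), dtype=int)],
    names=["time", "int_index"],
)
print(mi)
```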

@hipoglucido

I am still getting this issue in version 1.5.2. Are there any plans to fix this? Thanks in advance.
