Skip to content

add rolling_apply method or function #641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jhamman opened this issue Oct 27, 2015 · 13 comments · Fixed by #668
Closed

add rolling_apply method or function #641

jhamman opened this issue Oct 27, 2015 · 13 comments · Fixed by #668

Comments

@jhamman
Copy link
Member

jhamman commented Oct 27, 2015

Pandas has a generic rolling_apply function. It would be nice to support a similar api on xray objects. The api I have in mind is something like:

# DataArray.rolling_apply(window, func, min_periods=None, freq=None,
#                         center=False, args=(), kwargs={})
da.rolling_apply(7, np.mean)
@shoyer
Copy link
Member

shoyer commented Oct 27, 2015

Yeah, I've been thinking about this one for a while (see also #130)

One possibly improved API over pandas is to make a rolling a method that produces Rolling object (similar to a GroupBy). You could write something like array.rolling(time=7).mean() to do aggregation and could also iterate like for label, windowed in array.rolling(time=7).

@jhamman
Copy link
Member Author

jhamman commented Oct 27, 2015

I like the idea of a Rolling object. It would great if we could leverage that option from the Pandas api when it comes around.

@shoyer
Copy link
Member

shoyer commented Oct 27, 2015

Don't hold your breath on the pandas API changes :). This will take some dedicated effort to make it happen in pandas. Frankly, it's probably easier to do it from scratch in xray where we don't have an old API with which to worry about retaining compatibility.

@jhamman
Copy link
Member Author

jhamman commented Nov 11, 2015

@shoyer - I'm going to give creating a Rolling object a go. I have a paper I want to write using this functionality so there is a carrot motivating me.

@jhamman jhamman self-assigned this Nov 11, 2015
@shoyer
Copy link
Member

shoyer commented Nov 11, 2015

@jhamman Great, looking forward to it!

@jreback
Copy link

jreback commented Nov 25, 2015

@shoyer breath holding :) pandas-dev/pandas#11603

@jreback
Copy link

jreback commented Nov 25, 2015

ohh, @shoyer you are thinking about defining __iter__ on the Rolling, for a custom aggregation? or other reason

@shoyer
Copy link
Member

shoyer commented Nov 25, 2015

@jreback yes, for custom iteration like how you can use __iter__ on groupby. It's not hard to do and it seems like it makes sense for consistency.

@jreback
Copy link

jreback commented Nov 25, 2015

right, I think I will open a new issue for that. its actually a bit tricky as the iteration is done in cython itself, and its a marginal calculation anyhow (e.g. you just keep adding the new value, subtracting values that fall off the window).

@shoyer
Copy link
Member

shoyer commented Nov 25, 2015

@jreback probably should move discussion here back to the pandas issue :). I don't see any reason why the iteration for moving windows (with __iter__) should be to be done in Cython. Basically it is just repeated slicing in a loop, e.g., something like

def __iter__(self):
   for n in range(...):
       yield self.obj[n : n + self.window] 

@jreback
Copy link

jreback commented Nov 25, 2015

it's not how it's implemented

that is MUCH slower that marginal calculations

@shoyer
Copy link
Member

shoyer commented Nov 25, 2015

Yes, of course :). Sometimes still a useful way to think about things,
though maybe it's better not to encourage it. I think it's a similar
situation for an explicit Python loop vs Cython for groupby aggregations,
but the Python loop actually works OK much of the time (e.g., we use it in
xray because we haven't written nd grouped aggregated in Cython or Numba
yet).

On Wed, Nov 25, 2015 at 3:43 PM, Jeff Reback notifications@github.com
wrote:

it's not how it's implemented

that is MUCH slower that marginal calculations


Reply to this email directly or view it on GitHub
#641 (comment).

@jreback
Copy link

jreback commented Nov 25, 2015

yep, agreed. anyhow I created a new issue for it pandas-dev/pandas#11704

@jhamman jhamman mentioned this issue Dec 2, 2015
4 tasks
@jhamman jhamman removed their assignment Jan 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants