ENH: recurrence relation calculations #4567

Closed
jreback opened this issue Aug 14, 2013 · 11 comments
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff API Design Enhancement Numeric Operations Arithmetic, Comparison, and Logical operations

Comments

@jreback
Contributor

jreback commented Aug 14, 2013

I think an implementation of a general recurrence relation might be interesting. These
cannot be evaluated in numpy/numexpr directly.

see here:
http://stackoverflow.com/questions/9984373/calculate-two-coupled-equation-with-numpy
http://stackoverflow.com/questions/21336794/python-recursive-vectorization-with-timeseries/21338665
http://stackoverflow.com/questions/23996260/python-pandas-using-the-previous-value-in-dataframe/23997030#23997030

R[i<=k] = start(i) 
R[i] = R[i-k]*a(i) + b(i) 

When a(i) and b(i) are constants, this can be completely cythonized.
If a and/or b are passed as functions, the shell can still be cythonized (with the function called at each iteration).
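
As a rough illustration of the relation above, here is a minimal pure-Python sketch. It is not a pandas API: the name `recurrence` and its signature are hypothetical, and it assumes zero-based indexing, so the first k results (k = number of start values) are seeded from `start` and every later one follows R[i] = R[i-k]*a(i) + b(i). Constants and callables are both accepted for `a` and `b`.

```python
from typing import Callable, List, Sequence, Union

Coef = Union[float, Callable[[int], float]]


def recurrence(start: Sequence[float], n: int, a: Coef, b: Coef = 0.0) -> List[float]:
    """Hypothetical helper: R[i] = start[i] for i < k, else R[i-k]*a(i) + b(i)."""
    k = len(start)
    if k == 0:
        raise ValueError("need at least one start value")
    # Wrap constants in functions so the loop body is identical in both cases.
    a_fn = a if callable(a) else (lambda i: a)
    b_fn = b if callable(b) else (lambda i: b)
    # Seed with the start values (truncated if n < k).
    result = [float(v) for v in start[:n]]
    for i in range(k, n):
        result.append(result[i - k] * a_fn(i) + b_fn(i))
    return result


# Constant coefficients: geometric decay starting at 5.
decay = recurrence([5.0], 6, a=0.8)
# Function coefficient: a(i) = i yields running factorials.
factorials = recurrence([1.0], 5, a=lambda i: i)
```

The constant case is the one that could be compiled end to end; in the callable case only the loop shell could be, since `a_fn` and `b_fn` are still called at every step.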

Example

values = df['input']

In [62]: result = np.empty(len(values))

In [65]: result[0] = values[0]

In [66]: for i in range(1, len(result)):
   ....:     result[i] = result[i-1]*.8
   ....:         

In [67]: result
Out[67]: array([ 5.    ,  4.    ,  3.2   ,  2.56  ,  2.048 ,  1.6384])

I propose that pandas have a recurrence wrapper that dispatches to either a pure-constant cython version or a cython version that takes functions.
Probably just for a 1-dim array atm.

In theory this can also be done by a rolling_apply that has memory (or I guess that is sort of the same thing)

related #4546
https://groups.google.com/forum/#!topic/pydata/0MCWhwurOWs

@timmie
Contributor

timmie commented Aug 15, 2013

Here is a Python function that partly solves this:

import numpy as np

def apply_sequencial(series, func, dtype_internal=float, dtype_res=None, precis=None):
    # Work on a copy cast to the internal dtype, so the input stays untouched
    # and intermediate results are not truncated (e.g. floats in an int Series).
    result_ser = series.copy().astype(dtype_internal)
    # Every element after the first is func applied to the previous result.
    for i in range(1, len(result_ser)):
        result_ser.iloc[i] = func(result_ser.iloc[i - 1])
    # Cast to the requested result dtype, or back to the input dtype by default.
    result_ser = result_ser.astype(dtype_res if dtype_res else series.dtype)
    # Optionally round to the given number of decimals.
    if precis is not None:
        result_ser = np.round(result_ser, precis)
    return result_ser

and can be called like:

import pandas as pd
import numpy as np
ts = pd.Series(range(1, 7), index=pd.date_range('1/1/2000', periods=6))
# 12 values reshaped to 6 rows x 2 columns so they match the index and column labels
data = np.arange(1, 13).reshape(6, 2)
df = pd.DataFrame(data, index=ts.index, columns=['A', 'B'])
df['C'] = apply_sequencial(df.A, func=lambda x: x * 0.8, dtype_res=float, precis=4)

Maybe a hack from a performance point of view, but it works ;-)

@dirkbike

I think it's important that a recurrence relation can be applied to an entire data frame and not just a single series; otherwise this enhancement becomes equivalent to the rolling_apply function mentioned earlier.

@jreback
Contributor Author

jreback commented Aug 19, 2013

of course, that is just an example of the impl; should for sure handle 2-d

@dirkbike

Or even n-dimensions. The recurrence relation could take (n-1)-dimensional slices of the data frame along the remaining dimension and provide the designated apply function with the previous slice. That apply function would then return a new (n-1)-dimensional data frame representing the new current slice.

@jreback
Contributor Author

jreback commented Aug 19, 2013

you could do that with a custom function; when I say 2-d I mean it basically respects the axis parameter, like apply
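
One way to read "respects the axis parameter" is sketched below with NumPy. This is only an illustration under stated assumptions: `recurrence_along_axis` is a hypothetical name, the coefficients are constants, the step is k = 1, and the first slice along the chosen axis is used as the start value. Each iteration updates a whole (n-1)-dimensional slice from the previous one, which is also the n-dimensional case described earlier in the thread.

```python
import numpy as np


def recurrence_along_axis(values, a, b=0.0, axis=0):
    """Hypothetical helper: R[i] = R[i-1]*a + b along `axis`, seeded by the first slice."""
    arr = np.asarray(values, dtype=float)
    # Bring the recurrence axis to the front so each step handles one full slice.
    moved = np.moveaxis(arr, axis, 0)
    out = np.empty_like(moved)
    if moved.shape[0] == 0:
        return np.moveaxis(out, 0, axis)
    out[0] = moved[0]
    for i in range(1, moved.shape[0]):
        out[i] = out[i - 1] * a + b
    # Restore the original axis order.
    return np.moveaxis(out, 0, axis)


frame = np.array([[5.0, 10.0], [0.0, 0.0], [0.0, 0.0]])
down_rows = recurrence_along_axis(frame, a=0.8, axis=0)
across_cols = recurrence_along_axis(frame.T, a=0.8, axis=1)
```

The loop still runs once per step along the axis, but the work inside each step is vectorized across the remaining dimensions.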

@cpcloud
Member

cpcloud commented Aug 19, 2013

Should get a working 1d version before we start generalizing

@MichaelWS
Contributor

This would be a great piece. I use talib to do tons of financial calculations that are recurrence relations

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Feb 18, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@datapythonista datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018
@jbrockmendel
Member

this could be neat, but probably should live outside of pandas

@jreback
Contributor Author

jreback commented Sep 19, 2020

these are actually trivially implemented with numba :->

@jreback jreback closed this as completed Sep 19, 2020
@glopesdev

@jreback would you elaborate how these are trivially implemented with numba?

@jreback
Contributor Author

jreback commented Jul 14, 2021

look in the window functions; we already have ewm implemented this way
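
For readers who want something concrete, the following is a minimal sketch of what a numba-compiled recurrence could look like. It is not the pandas ewm kernel: `linear_recurrence` is a hypothetical name, and the fallback decorator is only there so the snippet still runs as plain Python when numba is not installed. It computes out[i] = out[i-1]*a + b*x[i], which with a = 1 - alpha, b = alpha and start = x[0] matches the usual non-adjusted exponentially weighted mean recursion.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without numba, fall back to an identity decorator.
    def njit(func):
        return func


@njit
def linear_recurrence(x, a, b, start):
    """Hypothetical kernel: out[0] = start, out[i] = out[i-1]*a + b*x[i]."""
    n = len(x)
    out = np.empty(n, dtype=np.float64)
    if n == 0:
        return out
    out[0] = start
    for i in range(1, n):
        out[i] = out[i - 1] * a + b * x[i]
    return out


# Geometric decay from the original example: the inputs are ignored because b = 0.
decay = linear_recurrence(np.zeros(6), 0.8, 0.0, 5.0)

# Non-adjusted exponentially weighted mean with alpha = 0.5.
x = np.array([1.0, 2.0, 3.0])
alpha = 0.5
ewm_like = linear_recurrence(x, 1.0 - alpha, alpha, x[0])
```

The explicit loop is exactly what numpy/numexpr cannot express, and it is the part numba compiles to machine code.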

8 participants