Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stranbo integration #11

Open
gvdr opened this issue Oct 2, 2023 · 12 comments
Open

Stranbo integration #11

gvdr opened this issue Oct 2, 2023 · 12 comments

Comments

@gvdr
Copy link

gvdr commented Oct 2, 2023

Leaving an issue here to consider whether we might integrate the functionalities I have in Stranbo and arfima.

Stranbo is designed for generating time series with "complicated" sarimax (integer $i$) processes. Complicated means many seasonal effects (think, daily + weekly + monthly + yearly, ..., each one with its own d, ar, and ma coefficients) and/or many auxiliary components (again, each one with its own d, ar, and ma coefficients).

It is slower than ARFIMA, a little bit due to the generality, and a tad due to the fact that I don't use any @fastmath tricks (I got scared because of this kind of things, but maybe here it's safe?).

The high-picture view of how Stranbo work is:

  1. define all processes components as classes
  2. detine a process as an vector of components
  3. gather all coefficients, and expand the polynomials for x, z, and the auxiliary components using Polynomials
  4. For each x[t]
    4.1 use a modified getindex to look in the past (modified so to handle negative indices without padding)
    4.2 dot product coefficients and backward shifted (and truncated) series
    4.3 sum up
  5. (eventually add anomalies...)

I believe that 4 is in mostly shared between the two packages (but I can learn a bit to improve even more). I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).

@Datseris
Copy link
Member

Datseris commented Oct 6, 2023

That all seems fine to me. They way I have done it so far does not require the existence of classes.

The default value for everything is nothing. If it is not nothing, it is a vector of coefficients. Multiple dispatch can differentiate between the two. Is there a particular reason to introduce new Types (Julia does not have classes, so I am guessing you mean Types in point 1). For example, to enable the "moving average" component of the process, you simply assign the "moving average" keyword the value of a Static Vector instead of nothing.

I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).

I am not sure I understand what you mean by "hardcoded" (because I don't know what polynomial stuff means in this context). This is the main loop:

ARFIMA.jl/src/ARFIMA.jl

Lines 97 to 102 in 1972247

# hotloop1: so that dispach doesn't happen dynamically on SVector{d}:
function _hotloop1!(X, noise, differencing, d, N)
for i in d+1:N+d
@inbounds X[i] = bdp(differencing, X, i) + noise[d+i]
end
end

does it count as "hardcoded"? How much I loop depends on the user input, so I am not sure whether this qualifies as hardcoded or not.

@gvdr
Copy link
Author

gvdr commented Oct 7, 2023

The need for types (sorry) comes from trying to have multiple seasonality effects. Each seasonal component may have its own ar and ma coefficients. One $(2,0,3)s$ component, for example, will bring to the equation a
$1-\phi_1B^s - \phi_2B^{2s}$ for the ar and a $1+\psi_1B^s + \psi_2B^{2s} + \psi_3B^{3*s}$ for the ma component (see here for example). If we want a sarima with weekly, monthly, and yearly seasonal components, we might in general need 6 new array of coefficients. An arima without seasonal component corresponds to a sarima with seasonality 1 in this fashion.

The most expressive way I could find to allow user to define potentially complicated sarimax was to define a building block type, and allowing for composition of these blocks. So, a sarima with seasonalities equal to 1, 4, 6, 9, would be written as:

this_sarima = [
sarima(ar = [...coefficients...], ma = [...coefficients...]),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 4),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 6),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 9),
]

where all the coefficients can be different (and in different number, even empty).

@Datseris
Copy link
Member

Datseris commented Oct 7, 2023

Oh okay. I see. Then this means that the s component (for seasonal) should have dedicated supertype and subtypes, to handle all of this simpler. However, I stand that the AR or MA components don't need dedicated types, and one can use static vectors as I have been doing so far. Right?

@gvdr
Copy link
Author

gvdr commented Oct 7, 2023

Absolutely right. I'm using static arrays to store the coefficients as well ☺️ (with a convenience constructor function that takes a regular vector and convert it to a SVector before calling the type constructor, so users don't need to deal directly with static vectors, unless they want to).

@Datseris
Copy link
Member

Datseris commented Oct 8, 2023

Okay, so then what stops us from merging the two packages?

COuldn't we have something like a "main" function sarfimax that takes in s, ar, d, ma, x (with d for the integration), and just does everuthing?

@gvdr
Copy link
Author

gvdr commented Oct 8, 2023

Something similar seems doable.

With the minor cavaet that, for the moment, in Stranbo the functions sarima / sarimax construct the components, they don't run the simulation (that is left to sample(VectorOfComponents)). It would be easy to a sarfimax constructor. I would need to define a sample method for the sarfimax type, which will rely on your code.

Does it make sense to have "component" which is fractionally integrated? Do fractional integration admits seasonality? That is, would the user want to write something like:

this_sarfimax = [
sarima(d = 1, ar = ..., ma = ...),
sarima(d = 1, ar = ..., ma = ..., s = 4),
sarfima(d = 0.3, ar = ..., ma = ..., s = 6)
]

sample(this_sarfimax, 100_000)

or would this be nonsensical?

@Datseris
Copy link
Member

Datseris commented Oct 9, 2023

What's the advantage of not running the simulation and making the user go the extra step of calling sample? Doesn't this unecessarily complicate things? Is there anything else that is possible to do with this VectorOfComponents besides just getting the timeseries?


I don't think it makes sense to have d::Real with seasonal components. If d is not an integer and s is not nothing, we should throw an error.

@gvdr
Copy link
Author

gvdr commented Oct 9, 2023

What's the advantage of not running the simulation and making the user go the extra step of calling sample? Doesn't this unecessarily complicate things?

I'm no sure. With an example, let's define a (1,1,1)x(1,1,0)5x(1,0,1)10.
For me the choice seems to be between:

# my syntax
a_sarima_proc = [
sarima(d = 1, ar = [.8,.3,.2], ma = [.3,.1]), # non seasonal component
sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

sample(a_sarima_proc,100_000)

and

# standard syntax
sarima(
    d = [1,1,0],
    ar = [ [.8,.3,.2],  [.4], [.5,.3]],
    ma = [[.3,.1], nothing,  [.4,.3,.1]],
    s = [1,5,10]
)

For me the first is way more readable.

Is there anything else that is possible to do with this VectorOfComponents besides just getting the timeseries?

For one, it is easier to manipulate. For example, if I want to add a new seasonality effect I can just do:

new_effect = sarima(d = 2, ma = [0.4], s = 20)
new_sarima_proc = vcat(a_sarima_proc, new_effect)
sample(new_sarima_proc,100_000)

And there is some experimental feature I'm working out. So, it should be it is possible to do something like the following, to obtain 3 correlated series (eheh, I was not expecting it to just work).

using Distributions, LinearAlgebra

# we build a var-covar matrix
Σ = rand(3,3)
Σ = Σ .* Σ' 

# sample n points
WNoise = MvNormal([.0,.0,.0], I(3)+Σ)
z = rand(WNoise,100_000)

# define equivalent seasonality effects
seasonalities = [
    sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
    sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

# define base processes

b1 = sarima(ar = [.5], ma = [.5], dₙ = z[1,:])
b2 = sarima(ar = [.5], ma = [.5], dₙ = z[2,:])
b3 = sarima(ar = [.5], ma = [.5], dₙ = z[3,:])

three_trajectories = sample.(
    [vcat(b1,seasonalities),
     vcat(b2,seasonalities),
     vcat(b3,seasonalities)],
     100_000
)

# try different seasonal processes

new_seasonalities = [
    sarima(ar = [.2], ma = [.1,.01,.001], s = 7), # first seasonal effect
    sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

other_three_trajectories = sample.(
    [vcat(b1,new_seasonalities),
     vcat(b2,new_seasonalities),
     vcat(b3,new_seasonalities)],
     100_000
)

I'm sure all of this can be done in other syntaxes, but I think this one is rather clear and easy to use.

@Datseris
Copy link
Member

Okay, the vector syntax is fine.

But then, the only thing you can utilize for ARFIMA.jl is the fractional integration, right? But the fractional integration doesn't play well with seasonal. So the fractional integration will be used with vectors that always have only one component.

Can we define the convenience sample(::Component) dispatch as well?

And what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that? Do we add the seasonal stuff here?

@gvdr
Copy link
Author

gvdr commented Oct 11, 2023

Can we define the convenience sample(::Component) dispatch as well?

Sure, that's already there:

julia> using Stranbo
julia> test = sarima(ar = [.8], ma = [2.])
SARIMA{Float64}(1, 0, [0.8], [2.0], Distributions.Normal{Float64}=0.0, σ=1.0))
julia> sample(test,1000)
1000-element Vector{Float64}:
 -0.3901653040936471
 -2.545535084952462
 -3.169123738677272
  1.3247596769491126
  1.1210945293432513
  1.7294814531680687
  4.1878387343420105
  
 -1.6600430819823186
 -4.6308613372374134
 -8.743526206444377
 -8.697082865177684
 -2.354005619615946
  0.013866600868192025

what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that?

Open to any solution. Calling it sarfimax might mean we need to change it again if we include something else? But I'm terrible with names! :-D

the only thing you can utilize for ARFIMA.jl is the fractional integration, right?

That, and I should take a look closely at how you handle the bdp(). I've a similar operator, but I fear that is the performance bottleneck for me.

@Datseris
Copy link
Member

How much of the source code here can you use? Can you add your seasonal functionality to this package? And we rename this package into AutoregressiveSimulations.jl and release it a-new? If you think using the source code here is too hard, we can start a new package AutoregressiveSimulations.jl from scratch.

package name is up to debate of course.

@gvdr
Copy link
Author

gvdr commented Oct 18, 2023

Sorry, had very busy days, I'll get to this asap :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants