Stranbo integration #11

gvdr · 2023-10-02T01:56:15Z

Leaving an issue here to consider whether we might integrate the functionalities I have in Stranbo and arfima.

Stranbo is designed for generating time series with "complicated" sarimax (integer $i$) processes. Complicated means many seasonal effects (think, daily + weekly + monthly + yearly, ..., each one with its own d, ar, and ma coefficients) and/or many auxiliary components (again, each one with its own d, ar, and ma coefficients).

It is slower than ARFIMA, a little bit due to the generality, and a tad due to the fact that I don't use any @fastmath tricks (I got scared because of this kind of things, but maybe here it's safe?).

The high-picture view of how Stranbo work is:

define all processes components as classes
detine a process as an vector of components
gather all coefficients, and expand the polynomials for x, z, and the auxiliary components using Polynomials
For each x[t]
4.1 use a modified getindex to look in the past (modified so to handle negative indices without padding)
4.2 dot product coefficients and backward shifted (and truncated) series
4.3 sum up
(eventually add anomalies...)

I believe that 4 is in mostly shared between the two packages (but I can learn a bit to improve even more). I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).

The text was updated successfully, but these errors were encountered:

Datseris · 2023-10-06T08:36:39Z

That all seems fine to me. They way I have done it so far does not require the existence of classes.

The default value for everything is nothing. If it is not nothing, it is a vector of coefficients. Multiple dispatch can differentiate between the two. Is there a particular reason to introduce new Types (Julia does not have classes, so I am guessing you mean Types in point 1). For example, to enable the "moving average" component of the process, you simply assign the "moving average" keyword the value of a Static Vector instead of nothing.

I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).

I am not sure I understand what you mean by "hardcoded" (because I don't know what polynomial stuff means in this context). This is the main loop:

ARFIMA.jl/src/ARFIMA.jl

Lines 97 to 102 in 1972247

    
           # hotloop1: so that dispach doesn't happen dynamically on SVector{d}: 
        
           function _hotloop1!(X, noise, differencing, d, N) 
        
               for i in d+1:N+d 
        
                   @inbounds X[i] = bdp(differencing, X, i) + noise[d+i] 
        
               end 
        
           end

does it count as "hardcoded"? How much I loop depends on the user input, so I am not sure whether this qualifies as hardcoded or not.

gvdr · 2023-10-07T05:49:22Z

The need for types (sorry) comes from trying to have multiple seasonality effects. Each seasonal component may have its own ar and ma coefficients. One $(2,0,3)s$ component, for example, will bring to the equation a
$1-\phi_1B^s - \phi_2B^{2s}$ for the ar and a $1+\psi_1B^s + \psi_2B^{2s} + \psi_3B^{3*s}$ for the ma component (see here for example). If we want a sarima with weekly, monthly, and yearly seasonal components, we might in general need 6 new array of coefficients. An arima without seasonal component corresponds to a sarima with seasonality 1 in this fashion.

The most expressive way I could find to allow user to define potentially complicated sarimax was to define a building block type, and allowing for composition of these blocks. So, a sarima with seasonalities equal to 1, 4, 6, 9, would be written as:

this_sarima = [
sarima(ar = [...coefficients...], ma = [...coefficients...]),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 4),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 6),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 9),
]

where all the coefficients can be different (and in different number, even empty).

Datseris · 2023-10-07T07:13:44Z

Oh okay. I see. Then this means that the s component (for seasonal) should have dedicated supertype and subtypes, to handle all of this simpler. However, I stand that the AR or MA components don't need dedicated types, and one can use static vectors as I have been doing so far. Right?

gvdr · 2023-10-07T21:53:31Z

Absolutely right. I'm using static arrays to store the coefficients as well ☺️ (with a convenience constructor function that takes a regular vector and convert it to a SVector before calling the type constructor, so users don't need to deal directly with static vectors, unless they want to).

Datseris · 2023-10-08T07:18:19Z

Okay, so then what stops us from merging the two packages?

COuldn't we have something like a "main" function sarfimax that takes in s, ar, d, ma, x (with d for the integration), and just does everuthing?

gvdr · 2023-10-08T22:49:32Z

Something similar seems doable.

With the minor cavaet that, for the moment, in Stranbo the functions sarima / sarimax construct the components, they don't run the simulation (that is left to sample(VectorOfComponents)). It would be easy to a sarfimax constructor. I would need to define a sample method for the sarfimax type, which will rely on your code.

Does it make sense to have "component" which is fractionally integrated? Do fractional integration admits seasonality? That is, would the user want to write something like:

this_sarfimax = [
sarima(d = 1, ar = ..., ma = ...),
sarima(d = 1, ar = ..., ma = ..., s = 4),
sarfima(d = 0.3, ar = ..., ma = ..., s = 6)
]

sample(this_sarfimax, 100_000)

or would this be nonsensical?

Datseris · 2023-10-09T07:07:29Z

What's the advantage of not running the simulation and making the user go the extra step of calling sample? Doesn't this unecessarily complicate things? Is there anything else that is possible to do with this VectorOfComponents besides just getting the timeseries?

I don't think it makes sense to have d::Real with seasonal components. If d is not an integer and s is not nothing, we should throw an error.

gvdr · 2023-10-09T21:45:57Z

What's the advantage of not running the simulation and making the user go the extra step of calling sample? Doesn't this unecessarily complicate things?

I'm no sure. With an example, let's define a (1,1,1)x(1,1,0)5x(1,0,1)10.
For me the choice seems to be between:

# my syntax
a_sarima_proc = [
sarima(d = 1, ar = [.8,.3,.2], ma = [.3,.1]), # non seasonal component
sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

sample(a_sarima_proc,100_000)

and

# standard syntax
sarima(
    d = [1,1,0],
    ar = [ [.8,.3,.2],  [.4], [.5,.3]],
    ma = [[.3,.1], nothing,  [.4,.3,.1]],
    s = [1,5,10]
)

For me the first is way more readable.

Is there anything else that is possible to do with this VectorOfComponents besides just getting the timeseries?

For one, it is easier to manipulate. For example, if I want to add a new seasonality effect I can just do:

new_effect = sarima(d = 2, ma = [0.4], s = 20)
new_sarima_proc = vcat(a_sarima_proc, new_effect)
sample(new_sarima_proc,100_000)

And there is some experimental feature I'm working out. So, ~~it should be~~ it is possible to do something like the following, to obtain 3 correlated series (eheh, I was not expecting it to just work).

using Distributions, LinearAlgebra

# we build a var-covar matrix
Σ = rand(3,3)
Σ = Σ .* Σ' 

# sample n points
WNoise = MvNormal([.0,.0,.0], I(3)+Σ)
z = rand(WNoise,100_000)

# define equivalent seasonality effects
seasonalities = [
    sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
    sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

# define base processes

b1 = sarima(ar = [.5], ma = [.5], dₙ = z[1,:])
b2 = sarima(ar = [.5], ma = [.5], dₙ = z[2,:])
b3 = sarima(ar = [.5], ma = [.5], dₙ = z[3,:])

three_trajectories = sample.(
    [vcat(b1,seasonalities),
     vcat(b2,seasonalities),
     vcat(b3,seasonalities)],
     100_000
)

# try different seasonal processes

new_seasonalities = [
    sarima(ar = [.2], ma = [.1,.01,.001], s = 7), # first seasonal effect
    sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]

other_three_trajectories = sample.(
    [vcat(b1,new_seasonalities),
     vcat(b2,new_seasonalities),
     vcat(b3,new_seasonalities)],
     100_000
)

I'm sure all of this can be done in other syntaxes, but I think this one is rather clear and easy to use.

Datseris · 2023-10-11T07:49:40Z

Okay, the vector syntax is fine.

But then, the only thing you can utilize for ARFIMA.jl is the fractional integration, right? But the fractional integration doesn't play well with seasonal. So the fractional integration will be used with vectors that always have only one component.

Can we define the convenience sample(::Component) dispatch as well?

And what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that? Do we add the seasonal stuff here?

gvdr · 2023-10-11T08:33:39Z

Can we define the convenience sample(::Component) dispatch as well?

Sure, that's already there:

julia> using Stranbo
julia> test = sarima(ar = [.8], ma = [2.])
SARIMA{Float64}(1, 0, [0.8], [2.0], Distributions.Normal{Float64}(μ=0.0, σ=1.0))
julia> sample(test,1000)
1000-element Vector{Float64}:
 -0.3901653040936471
 -2.545535084952462
 -3.169123738677272
  1.3247596769491126
  1.1210945293432513
  1.7294814531680687
  4.1878387343420105
  ⋮
 -1.6600430819823186
 -4.6308613372374134
 -8.743526206444377
 -8.697082865177684
 -2.354005619615946
  0.013866600868192025

what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that?

Open to any solution. Calling it sarfimax might mean we need to change it again if we include something else? But I'm terrible with names! :-D

the only thing you can utilize for ARFIMA.jl is the fractional integration, right?

That, and I should take a look closely at how you handle the bdp(). I've a similar operator, but I fear that is the performance bottleneck for me.

Datseris · 2023-10-11T09:12:03Z

How much of the source code here can you use? Can you add your seasonal functionality to this package? And we rename this package into AutoregressiveSimulations.jl and release it a-new? If you think using the source code here is too hard, we can start a new package AutoregressiveSimulations.jl from scratch.

package name is up to debate of course.

gvdr · 2023-10-18T18:44:47Z

Sorry, had very busy days, I'll get to this asap :-)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stranbo integration #11

Stranbo integration #11

gvdr commented Oct 2, 2023

Datseris commented Oct 6, 2023

gvdr commented Oct 7, 2023 •

edited

Loading

Datseris commented Oct 7, 2023

gvdr commented Oct 7, 2023 •

edited

Loading

Datseris commented Oct 8, 2023 •

edited

Loading

gvdr commented Oct 8, 2023 •

edited

Loading

Datseris commented Oct 9, 2023

gvdr commented Oct 9, 2023 •

edited

Loading

Datseris commented Oct 11, 2023

gvdr commented Oct 11, 2023

Datseris commented Oct 11, 2023

gvdr commented Oct 18, 2023

Stranbo integration #11

Stranbo integration #11

Comments

gvdr commented Oct 2, 2023

Datseris commented Oct 6, 2023

gvdr commented Oct 7, 2023 • edited Loading

Datseris commented Oct 7, 2023

gvdr commented Oct 7, 2023 • edited Loading

Datseris commented Oct 8, 2023 • edited Loading

gvdr commented Oct 8, 2023 • edited Loading

Datseris commented Oct 9, 2023

gvdr commented Oct 9, 2023 • edited Loading

Datseris commented Oct 11, 2023

gvdr commented Oct 11, 2023

Datseris commented Oct 11, 2023

gvdr commented Oct 18, 2023

gvdr commented Oct 7, 2023 •

edited

Loading

gvdr commented Oct 7, 2023 •

edited

Loading

Datseris commented Oct 8, 2023 •

edited

Loading

gvdr commented Oct 8, 2023 •

edited

Loading

gvdr commented Oct 9, 2023 •

edited

Loading