Feature Request: Automatic K optimization for Fourier Terms #207

Open
AshwinPuri13 opened this issue Oct 18, 2019 · 4 comments

@AshwinPuri13

If I wish to fit a regression with Fourier terms then to find the optimal K I need to do something like this:

library(fable)
library(dplyr)
library(tidyr)

mbl = tsibbledata::ansett %>%
  tsibble::fill_gaps() %>%
  model(arima1 = ARIMA(Passengers ~ fourier(K = 1) + PDQ(0,0,0)),
        arima2 = ARIMA(Passengers ~ fourier(K = 2) + PDQ(0,0,0)),
        arima3 = ARIMA(Passengers ~ fourier(K = 3) + PDQ(0,0,0)))

metrics = mbl %>%
  glance()

mbl_best = metrics %>%
  select(Airports, Class, .model, AICc) %>%
  group_by(Airports, Class) %>%
  slice(which.min(AICc)) %>%
  left_join(mbl %>%
              gather('.model', 'model', -Airports, -Class),
            by = c('.model', 'Airports', 'Class')) %>%
  as_mable(key = c('Airports', 'Class'), models = 'model')

It would be more convenient for K to be automatically determined through something like this:

model(arima = ARIMA(Passengers ~ fourier(K = 1:3) + PDQ(0,0,0)))

On that note, looking at the source code for ARIMA, it appears that when fitting a regression with ARIMA errors the order of differencing is determined after the regression. If so, the arima1, arima2 and arima3 models above could end up with different orders of differencing, in which case their AICc values would not be comparable. If this is indeed the case, would determining K through cross-validation be better?

Thanks!

@robjhyndman
Member

Automating the choice of K could be a feature we look at in a future release. It is very unlikely to affect the order of differencing, so I think using AICc for selection is safe enough.

@mitchelloharawild
Member

This will need to be added on a model-by-model basis, as each model has its own method of model selection.

@JaySumners

Could we iteratively select the best K based on whatever criterion the base model uses? My idea is to fit Fourier series with different K linearly to the response and select the one with the best value of that criterion. Is there a case where we wouldn't want to fit it linearly? I'll admit that re-estimating after fitting the rest of the model would be good, but this might provide some direction for a user who doesn't know which K to select.
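
To make the idea concrete, here is a rough base-R sketch of picking K by fitting Fourier terms linearly and comparing AICc. It is only an illustration: it uses plain lm() rather than any fable model, the series is simulated, and the helpers fourier_terms() and aicc() are made-up names, not part of fable.

```r
# Simulated series with a seasonal period of 52 (illustrative data only)
set.seed(1)
n <- 208
period <- 52
t <- seq_len(n)
y <- 10 + 3 * sin(2 * pi * t / period) + 1.5 * cos(4 * pi * t / period) + rnorm(n)

# Hypothetical helper: matrix of K sine/cosine pairs for the given period
fourier_terms <- function(t, period, K) {
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / period), cos(2 * pi * k * t / period))
  })
  X <- do.call(cbind, cols)
  colnames(X) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  X
}

# Hypothetical helper: small-sample corrected AIC for a fitted lm
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")  # estimated parameters, including the variance
  n_obs <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n_obs - k - 1)
}

# Fit one linear model per candidate K and keep the lowest AICc
candidates <- 1:5
scores <- sapply(candidates, function(K) {
  aicc(lm(y ~ fourier_terms(t, period, K)))
})
best_K <- candidates[which.min(scores)]
```

The selected K could then be passed to the full model, which would still be re-estimated with the ARIMA errors included.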

@juan-g-p

As an interim solution, I am trying to fit multiple models in a loop so that I do not have to repeat the formula so many times.

Yet I am struggling (I do not have that much of a background in tidy R).

Could you help?
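
One possible way to do this, as a sketch rather than an official recipe: build a named list of model definitions with lapply() and splice it into model(). This assumes model() accepts a list spliced with !!!, and that the k inside each formula is looked up in the environment where the definition was created; the range 1:3 and the names arima1 to arima3 simply mirror the example at the top of the thread.

```r
library(fable)
library(dplyr)

# Candidate numbers of Fourier harmonics (assumed range; adjust as needed)
ks <- 1:3

# One model definition per K. Each formula is created inside the function,
# so `k` refers to the value for that particular call.
specs <- lapply(ks, function(k) {
  ARIMA(Passengers ~ fourier(K = k) + PDQ(0, 0, 0))
})
names(specs) <- paste0("arima", ks)

# Splice the named list into model(); each element becomes a mable column
mbl <- tsibbledata::ansett %>%
  tsibble::fill_gaps() %>%
  model(!!!specs)

# Lowest-AICc model for each series
best <- mbl %>%
  glance() %>%
  group_by(Airports, Class) %>%
  slice_min(AICc, n = 1) %>%
  ungroup() %>%
  select(Airports, Class, .model, AICc)
```

Changing ks is then the only edit needed to try a different set of K values.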
