-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#1101 Implemented min_train_series_length for Theta and FourTheta #1111
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Concerning the adaptation when the seasonality is not known beforehand, I agree with your observation that we should leave "data" (time series) and models decoupled. So I think we can live with that for now.
@@ -76,3 +76,51 @@ def test_best_model(self): | |||
self.assertTrue( | |||
mape(val_series, forecast_best) <= mape(val_series, forecast_random) | |||
) | |||
|
|||
def test_min_train_series_length_with_seasonality(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding unit tests :)
Codecov Report
@@ Coverage Diff @@
## master #1111 +/- ##
==========================================
- Coverage 93.76% 93.75% -0.01%
==========================================
Files 80 80
Lines 8207 8204 -3
==========================================
- Hits 7695 7692 -3
Misses 512 512
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
CHANGELOG.md
Outdated
@@ -34,6 +34,7 @@ Darts is still in an early development phase and we cannot always guarantee back | |||
- An issue with arguments being reverted for the `metric` function of gridsearch and backtest [#989](https://github.com/unit8co/darts/pull/989) by [Clara Grotehans](https://github.com/ClaraGrthns). | |||
- An error checking whether `fit()` has been called in global models [#944](https://github.com/unit8co/darts/pull/944) by [Julien Herzen](https://github.com/hrzn). | |||
- An error in Gaussian Process filter happening with newer versions of sklearn [#963](https://github.com/unit8co/darts/pull/963) by [Julien Herzen](https://github.com/hrzn). | |||
- Implemented the min_train_series_length method for the FourTheta and Theta models that overwrites the minimum default of 3 training samples by 2*seasonal_period when appropriate [#1101](https://github.com/unit8co/darts/pull/1101) by [Rijk van der Meulen](https://github.com/rijkvandermeulen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rijkvandermeulen here too, could you move this to the Unreleased
section? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jup - done!
This PR implements the min_train_series_length() method for the FourTheta and Theta models that overwrites the minimum default of 3 training samples by 2*seasonal_period when appropriate.
Summary
When seasonality is used in the Theta and FourTheta models the minimum number of training samples should be 2*seasonal_period. Since the min_train_series_length() method wasn't implemented for the theta models, however, it always defaulted to 3 (for seasonal as well as non-seasonal models).
Other Information
Note that there is one case that the newly implemented min_train_series_length() method does not address: namely when season_mode != SeasonalityMode.NONE and seasonality_period==None. In this case, for us to be able to determine the minimum number of training samples, we need to first retrieve the seasonality_period from the time series. We can actually do this (for example by using the check_seasonality() function in utils.statistics). However, I wasn't entirely sure whether this would be desirable. The issue that I saw was that you lose your strict decoupling of models and time series. I.e., you cannot retrieve the min_train_series_length property from the model alone anymore - you'd have to pass the time series itself as well. Another possibility would be to make the seasonal_period a mandatory argument (probably not desired though I guess).
Looking forward to hearing your thoughts and feedback
Best,
Rijk