Extend backtesting methods to retrain after configurable number of time steps #623
Comments
Hi, I would like to contribute to this issue. The solution I have in mind is very straightforward with minimal impact I think. The main method impacted will be |
Hi @chefPony, thanks for your suggestion. That looks very good to me! Feel free to go ahead and open a PR (please check the guidelines before). Many thanks! |
Hi @hrzn, I would like to propose a more general approach, which I find particularly useful for retraining. The idea is to pass a callable that determines under which conditions the model should be retrained, leaving the user full flexibility, i.e. something like:

    if (not self._fit_called
        or (retrain
            and retrain_on_condition(pred_time, train_series, past_covariates, future_covariates))
    ):
        # retrain the model here

Where the function retrain_on_condition could be, for example:

    import pandas as pd
    from darts import TimeSeries

    def retrain_on_condition(
        pred_time: pd.Timestamp = None,
        train_series: TimeSeries = None,
        past_covariates: TimeSeries = None,
        future_covariates: TimeSeries = None
    ):
        """
        Examples:
            return pred_time.hour == 0  # retrain every midnight (for freq="H")
            return pred_time.day == 1  # retrain every month start (for freq="D")
            return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
            return (pred_time.hour == 0) & (pred_time.dayofweek == 0)  # retrain every week (for freq="H")
        """
        return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")

This adds a layer of complexity for the user, yet it is a very flexible way to simulate and backtest what would happen in a production environment, since it leaves open the possibility of implementing logic on the data series. As an example, we retrain models on the first of every month OR when we get some kind of alert on data quality/drift. |
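To make the proposal above concrete, here is a minimal sketch of how such a condition could drive a backtesting loop. It does not use Darts: `CountingModel`, `backtest`, and `retrain_every_midnight` are hypothetical names introduced only for illustration, and a plain `datetime` stands in for `pd.Timestamp`.

```python
from datetime import datetime, timedelta
from typing import Callable, List


class CountingModel:
    """Hypothetical stand-in for a forecasting model; it only counts fits."""

    def __init__(self) -> None:
        self.fit_called = False
        self.n_fits = 0

    def fit(self, train: List[float]) -> None:
        self.fit_called = True
        self.n_fits += 1

    def predict(self) -> float:
        return 0.0  # dummy forecast, the value is irrelevant here


def retrain_every_midnight(pred_time: datetime) -> bool:
    # Same idea as `pred_time.hour == 0` in the comment above.
    return pred_time.hour == 0


def backtest(
    model: CountingModel,
    times: List[datetime],
    values: List[float],
    condition: Callable[[datetime], bool],
    start: int = 1,
) -> List[float]:
    forecasts = []
    for i in range(start, len(times)):
        pred_time = times[i]
        # Always fit once; afterwards refit only when the condition holds.
        if not model.fit_called or condition(pred_time):
            model.fit(values[:i])
        forecasts.append(model.predict())
    return forecasts


# 72 hourly steps starting at 01:00 on an arbitrary date.
times = [datetime(2022, 1, 1, 1) + timedelta(hours=h) for h in range(72)]
values = [float(h) for h in range(72)]
model = CountingModel()
forecasts = backtest(model, times, values, retrain_every_midnight)
print(model.n_fits, len(forecasts))
```

With this data the model is fitted once at the first prediction step and then again at each of the three midnights in the range, while a forecast is still produced at every step.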
Hi @FBruzzesi, thanks for the suggestion. That sounds like a good idea too! A paramount objective in Darts is to try to always keep it simple for users by default. So how about for instance mixing several of those ideas as follows:
Would this sound good? We'd be happy to receive a PR around that :) |
Thanks for the feedback @hrzn, I completely agree that allowing for both implementations ( I can work on this issue, but before assigning it I would like to correctly understand a few small implementation details, namely:
|
Thanks, overall what you propose for filling in default arguments definitely makes sense (and I think that should be possible). A couple of small remarks:
Concerning testing, I think you can write a new test function e.g., in |
Yes yes! You are absolutely right, it's already covered in such case, I was just trying to put everything together in the same functionality.
Maybe I am missing something but, while
Yes of course, but let's say I want to retrain on some condition for

    def retrain(past_covariates: TimeSeries) -> bool:
        return past_covariates["some_col_name"].values()[-1] < some_threshold_value

where the

Addressing test(s): I may need some help; actually all we need to test for is that the

Finally, I started working on the retrain function decorator:
|
Am I missing something, or would you still need to pass all the arguments to

    if _retrain_wrapper(counter, pred_time, train_series, past_covariates, future_covariates) or not self._fit_called:
        self._fit_wrapper(
            series=train,
            past_covariates=past_covariates,
            future_covariates=future_covariates,
        )

If yes, we could just extend the signature of the user-provided callable with

    from inspect import signature

    import pandas as pd

    def extend_signature(fun):
        # Parameter names the user's function actually accepts.
        accepted = signature(fun).parameters

        def retrain_wrapper(**kwargs):
            # Drop every keyword argument the user's function does not declare.
            fun_param = {param: value for param, value in kwargs.items() if param in accepted}
            return fun(**fun_param)

        return retrain_wrapper

    @extend_signature
    def retrain_every_5(counter):
        return (counter % 5 == 0)

    retrain_every_5(counter=10)  #=> True
    retrain_every_5(counter=5, pred_time=pd.Timestamp.today())  #=> True
    retrain_every_5(pred_time=pd.Timestamp.today())  #=> Error (TypeError): missing parameter counter

|
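Building on the decorator above, the following is a rough sketch of how a boolean flag, an integer "retrain every n steps" setting, and a user callable might all be reduced to a single callable that the backtesting loop can invoke with every argument. `as_retrain_condition` is a hypothetical helper written for this illustration, not a Darts function, and the argument names simply follow the ones used in this thread.

```python
from inspect import signature
from typing import Callable, Union


def extend_signature(fun: Callable[..., bool]) -> Callable[..., bool]:
    """Let `fun` be called with extra keyword arguments, which are dropped."""
    accepted = signature(fun).parameters

    def retrain_wrapper(**kwargs) -> bool:
        return fun(**{k: v for k, v in kwargs.items() if k in accepted})

    return retrain_wrapper


def as_retrain_condition(
    retrain: Union[bool, int, Callable[..., bool]]
) -> Callable[..., bool]:
    """Hypothetical helper: turn a bool, an int, or a callable into one callable."""
    if isinstance(retrain, bool):  # check bool first: bool is a subclass of int
        return lambda **kwargs: retrain
    if isinstance(retrain, int):
        if retrain <= 0:
            raise ValueError("retrain must be a positive number of steps")
        return lambda counter, **kwargs: counter % retrain == 0
    return extend_signature(retrain)


# Retrain every 3 steps; the unused pred_time argument is ignored.
every_3 = as_retrain_condition(3)
flags = [every_3(counter=c, pred_time=None) for c in range(7)]
print(flags)

# A user callable that only looks at the training data (a plain list here).
on_drift = as_retrain_condition(lambda train_series: train_series[-1] > 10)
print(on_drift(counter=0, pred_time=None, train_series=[1, 20]))
```

The loop can then always call the condition with the full set of keyword arguments, and each variant picks out only what it needs.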
Yes correct, that's exactly what I was thinking, passing all the agreed upon arguments and ignoring the ones not in the function original signature. That's the reasoning behind this
Please let me know if you are working on this or whether I can proceed with the pull request. What remains is the test implementation and the other minor details asked about above. |
Go ahead ;) |
|
Released in v0.22.0 🚀 |
See: #135 (comment)