Feat/encoder improvement #1338

Merged
merged 29 commits into from
Dec 22, 2022
dennisbader (Collaborator) commented Nov 6, 2022

Summary

  • extend encoder support to all forecasting models that support covariates
  • add covariate lags to encoders
  • encoders now generate the correct minimum required covariate time spans for all models (assuming that the user did not supply any covariates). The spans are derived from the parameters input_chunk_length, output_chunk_length, and covariates_lags (past_covariates_lags, future_covariates_lags). This required some unit test adaptations to capture the new logic.
  • improve the encoder documentation, including its presentation
  • make one-shot regression models work properly with encoders
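
As a rough illustration of the span logic in the summary above, the sketch below derives the relative covariate offsets a model would need from input_chunk_length, output_chunk_length and the covariate lags. It is not the darts implementation: the function names, the convention that offset 0 is the first forecast step, and the span formulas are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def chunk_span(
    input_chunk_length: int, output_chunk_length: int, future: bool
) -> Tuple[int, int]:
    """Relative (first, last) covariate offsets for a chunk-based model.

    Assumption: offset 0 is the first forecast step, so the input window
    covers offsets -input_chunk_length .. -1 and the output window covers
    offsets 0 .. output_chunk_length - 1.
    """
    first = -input_chunk_length
    # Past covariates stop at the end of the input window; future
    # covariates additionally cover the output window.
    last = output_chunk_length - 1 if future else -1
    return first, last


def lag_span(lags: Sequence[int], output_chunk_length: int = 1) -> Tuple[int, int]:
    """Relative (first, last) covariate offsets for a lag-based model.

    Assumption: each lag is relative to the step being predicted, so the
    largest lag is shifted by output_chunk_length - 1 for the last
    predicted step.
    """
    if not lags:
        raise ValueError("at least one lag is required")
    return min(lags), max(lags) + output_chunk_length - 1


if __name__ == "__main__":
    print(chunk_span(24, 12, future=False))
    print(chunk_span(24, 12, future=True))
    print(lag_span([-3, -1, 0, 2], output_chunk_length=1))
```

Working with relative offsets keeps the sketch independent of the index type; the offsets can be turned into timestamps or integer positions once the end of the target series is known.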

Additional improvements:

  • KalmanForecaster is now a TransferableFutureCovariatesLocalForecastingModel, meaning that predict() accepts a different target series than the one used when fitting the model
  • simplify the way Explainer handles encoders thanks to the encoder improvements.

Additional fixes:

  • fix LocalForecastingModel slicing of integer-indexed future covariates

TODO (in future PR)

  • make Encoders compatible with BaseDataTransformer for pipelines
  • generate encodings at beginning of historical_forecasts to increase efficiency

@dennisbader dennisbader requested a review from hrzn as a code owner November 6, 2022 13:04
codecov-commenter commented Nov 6, 2022

Codecov Report

Base: 93.68% // Head: 93.67% // Decreases project coverage by -0.00% ⚠️

Coverage data is based on head (3fb2e43) compared to base (4e25889).
Patch coverage: 95.62% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1338      +/-   ##
==========================================
- Coverage   93.68%   93.67%   -0.01%     
==========================================
  Files          94       94              
  Lines        9400     9485      +85     
==========================================
+ Hits         8806     8885      +79     
- Misses        594      600       +6     
Impacted Files Coverage Δ
darts/explainability/explainability.py 96.42% <ø> (ø)
darts/explainability/shap_explainer.py 87.92% <ø> (-0.18%) ⬇️
darts/models/filtering/kalman_filter.py 98.71% <ø> (-0.02%) ⬇️
darts/timeseries.py 91.70% <ø> (-0.08%) ⬇️
darts/models/forecasting/forecasting_model.py 96.70% <92.45%> (-0.35%) ⬇️
darts/dataprocessing/encoders/encoder_base.py 96.01% <95.00%> (-0.68%) ⬇️
darts/dataprocessing/encoders/encoders.py 98.86% <100.00%> (+0.04%) ⬆️
darts/models/filtering/filtering_model.py 100.00% <100.00%> (ø)
darts/models/forecasting/arima.py 95.34% <100.00%> (ø)
darts/models/forecasting/auto_arima.py 100.00% <100.00%> (ø)
... and 13 more


hrzn (Contributor) left a comment

Looks good, thanks for these changes @dennisbader ! Got a few small comments

scenarios described below. With user `covariates`, it simply copies and returns the `covariates` time index.

It can be used:
A in combination with :class:`LocalForecastingModel`, or in a model agnostic scenario:
Contributor

[Nitpicking - no need to do it if annoying] Instead of supporting all these different cases based on conventions (e.g. "if calling from a RegressionModel, set the parameters in this way"), you could consider using factory methods (a bit like we have in TimeSeries), e.g. CovariatesIndexGenerator.for_regression_model(lags, lags_past_cov, lags_fut_cov, out_len) and CovariatesIndexGenerator.for_torch_model(in_len, out_len).
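
A minimal sketch of what such factory methods could look like. The class name CovariatesIndexSpan, its fields, and the span formulas are hypothetical and only mirror the signatures suggested in the comment; this is not the actual CovariatesIndexGenerator API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CovariatesIndexSpan:
    """Hypothetical stand-in holding the relative span of covariate steps."""

    min_offset: int  # earliest required step, relative to the first forecast step
    max_offset: int  # latest required step, relative to the first forecast step

    @classmethod
    def for_torch_model(cls, in_len: int, out_len: int) -> "CovariatesIndexSpan":
        # Assumption: the input chunk ends right before the first forecast step.
        return cls(min_offset=-in_len, max_offset=out_len - 1)

    @classmethod
    def for_regression_model(
        cls,
        lags: Optional[Sequence[int]],
        lags_past_cov: Optional[Sequence[int]],
        lags_fut_cov: Optional[Sequence[int]],
        out_len: int,
    ) -> "CovariatesIndexSpan":
        # Assumption: the span is the earliest and latest offset over all
        # supplied lags, with the latest shifted to the last predicted step.
        groups = (lags, lags_past_cov, lags_fut_cov)
        all_lags = [lag for group in groups if group for lag in group]
        if not all_lags:
            raise ValueError("at least one lag sequence must be given")
        return cls(min_offset=min(all_lags), max_offset=max(all_lags) + out_len - 1)


if __name__ == "__main__":
    print(CovariatesIndexSpan.for_torch_model(in_len=24, out_len=12))
    print(CovariatesIndexSpan.for_regression_model([-3, -2, -1], None, [0, 1], out_len=2))
```

With named constructors, each model family states its own parameters explicitly, so the caller no longer has to know which generic arguments to set by convention.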

future_covariates = future_covariates[
start : start + offset * self.training_series.freq
]
future_covariates = future_covariates.slice(
Contributor

Don't forget that slice() is inclusive on the right for DatetimeIndex

Collaborator Author

I think it is correct, but let me know if I'm missing something.
start is one step after end of target series, and we end at:

  • n - 1 steps ahead of start for DatetimeIndex (as inclusive with slice)
  • n steps ahead of start for RangeIndex (as non-inclusive with slice)
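
The off-by-one reasoning above can be sketched in plain Python. The helper names are hypothetical and the loop only mimics the two slicing conventions described in the reply (right-inclusive for a datetime index, right-exclusive for an integer index); it does not call TimeSeries.slice().

```python
from datetime import datetime, timedelta


def slice_end(start, n, step, right_inclusive):
    """End bound that yields exactly n points starting at `start`."""
    # An inclusive right bound already counts the end point itself,
    # so only n - 1 steps are added in that case.
    steps = n - 1 if right_inclusive else n
    return start + steps * step


def points_in_slice(start, end, step, right_inclusive):
    """Enumerate the points covered by [start, end] or [start, end)."""
    points = []
    current = start
    while current < end or (right_inclusive and current == end):
        points.append(current)
        current = current + step
    return points


if __name__ == "__main__":
    n = 3

    # Datetime index: right-inclusive, end is n - 1 steps after start.
    dt_start, dt_step = datetime(2022, 1, 1), timedelta(days=1)
    dt_end = slice_end(dt_start, n, dt_step, right_inclusive=True)
    print(len(points_in_slice(dt_start, dt_end, dt_step, right_inclusive=True)))

    # Integer index: right-exclusive, end is n steps after start.
    int_end = slice_end(10, n, 1, right_inclusive=False)
    print(points_in_slice(10, int_end, 1, right_inclusive=False))
```

Both conventions return the same number of points, which is why the end bound differs by one step between the two index types.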

@@ -2549,10 +2549,6 @@ def append(self, other: "TimeSeries") -> "TimeSeries":
attrs=self._xa.attrs,
)

# new_xa = xr.concat(objs=[self._xa, other_xa], dim=str(self._time_dim))
if not self._has_datetime_index:
Contributor

This was not needed?

Collaborator Author

This was related to

  • fixed TimeSeries.append() that dropped "time" index for integer indexed series

hrzn (Contributor) commented Nov 15, 2022

Also, don't forget to shift the encodings when using one-shot models.

hrzn (Contributor) left a comment

Thanks @dennisbader !

@dennisbader dennisbader merged commit 919f214 into master Dec 22, 2022
@dennisbader dennisbader deleted the feat/encoder_improvement branch December 22, 2022 12:21