Fix NLinear normalization to support past covariates #1873
Conversation
…covariates needing to end in the future
I added a test case that demonstrates why the change is needed. The new test (`test_dlinear_nlinear.py`, line 277):

```python
e1, e2 = _eval_model(
    train1,
    train2,
    val1,
    val2,
    None,
    None,
    past_cov1=past_cov1,
    past_cov2=past_cov2,
    val_past_cov1=val_past_cov1,
    val_past_cov2=val_past_cov2,
    cls=NLinearModel,
    lkl=None,
    normalize=True,
)
```

fails because of this line in `inference_dataset.py` (line 66):

```python
if main_covariate_type is CovariateType.PAST:
    future_end = past_end + max(0, n - output_chunk_length) * target_series.freq
```

This line of code seems to require past_covariates to extend into the future. Why is that? If a maintainer could help me understand, I can finish the PR!
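For context, a small sketch of what that arithmetic implies (the dates and numbers below are my own example values, not from the PR): when the requested horizon `n` exceeds `output_chunk_length`, prediction proceeds autoregressively, and each extra step consumes past covariates beyond the end of the target series, which is why `future_end` is pushed into the future.

```python
import pandas as pd

output_chunk_length = 1
n = 3  # forecast horizon requested at predict() time
freq = pd.Timedelta(days=1)
past_end = pd.Timestamp("2023-01-31")  # hypothetical end of the past covariates slice

# The quoted line: past covariates must cover (n - output_chunk_length) extra steps.
future_end = past_end + max(0, n - output_chunk_length) * freq
print(future_end)  # 2023-02-02 -> two additional past-covariate steps are required
```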
Co-authored-by: Felix Divo <4403130+felixdivo@users.noreply.github.com>
…egressive predictions
Should be ready for review! Thanks @madtoinou for the explanation, that makes sense. I made the same minor change as the original PR, but added test case coverage to demonstrate why it's necessary :) Thank you for the patience. I had some issues running tests locally through gradle (originating from lightning's support for float64 on the MacBook platform; I didn't think it was worth changing all the tests to include explicit casting to float32). Linting should be good. I will make changes / fix if the tests fail once the pipeline runs.
Nice!
Side note: This PR also extends the tests to cover DLinear and ensure it works too! 🚀
@madtoinou can this be merged?
Hey @Eliotdoesprogramming and @felixdivo, and thanks for this 🚀 The changes look good to me. I agree with the normalization of the target series to account for the distribution shift.

I'm less sure about normalizing the past covariates the same way. Let's say we have an input_chunk_length of 3, month numbers as past covariates, and by chance all our batch samples start between March (month=3) and December (month=12). The input chunk for all past covariates batch samples would then be normalized to the same relative values (subtracting the last value turns both [3, 4, 5] and [10, 11, 12] into [-2, -1, 0]), so the calendar information is lost.

Also, we normalize the historic part of the future covariates (the values in the input chunk) but leave the values in the output chunk untouched. In my opinion they should be normalized as well, using the last value of the input chunk sequence.

What's your take on these points? Should we give users the choice to enable target/covariates normalization separately? And also normalize the future covariates on the output chunk?
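A tiny numeric sketch of that concern (the tensor values are my own example, not from the PR): last-value subtraction maps every monotonically increasing month sequence onto the same relative sequence.

```python
import torch

# Month-encoded past covariates for two samples starting in March and October.
batch = torch.tensor([[[3.0], [4.0], [5.0]],
                      [[10.0], [11.0], [12.0]]])  # (batch, input_chunk_length, 1)

seq_last = batch[:, -1:, :]    # last value of each input chunk
normalized = batch - seq_last  # NLinear-style normalization
print(normalized.squeeze(-1))
# tensor([[-2., -1., 0.],
#         [-2., -1., 0.]])  -> both samples look identical to the model
```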
@dennisbader I agree, it's probably best to allow options. I wanted to note that the changes here specifically affect the `normalize=True` path:

```python
if self.normalize:
    # discard covariates, to ensure that in_dim == out_dim
    x = x[:, :, : self.output_dim]
    x = x.permute(0, 2, 1)  # (batch, out_dim, in_len)
```

so covariates are similarly discarded from the output tensor. I think, at least for my use cases, having the behavior be consistent for all inputs should be the default. The current behavior will error with the following use case rather than being able to train:

```python
import darts
from darts.datasets import ETTh1Dataset
from darts.models.forecasting.nlinear import NLinearModel

series = ETTh1Dataset().load()
series = series.astype('float32')
target = series['HUFL']
past_cov = series['MULL']
model = NLinearModel(10, 1, shared_weights=False, normalize=True)
model.fit(series=target, past_covariates=past_cov)
```
I was also very surprised to find that you could not train with covariates, and would suggest enabling it by default.
I worked through your comment and see your concerns. I would like to comment on these to move this conversation forward, since the discussion has gone rather stale given that it should be a simple issue to solve.
@felixdivo and @Eliotdoesprogramming, this is how I interpret it at the moment (feel free to correct me). My concern was regarding these two lines: line 141 and line 151.
So for my concern regarding future covariates: here we normalize only the historic part of the future covariates (time steps in the input chunk) and leave the future part of the future covariates (time steps in the output chunk) unchanged. Now when inverse transforming, the shape of the stored last values no longer matches the shape of the prediction.

Proposed solution

If we only want to normalize the target, then we need to change this. For the normalization, we should only subtract the last values of the target components. Then, for the inverse transformation from here, we should only add the last values of the target components back.

Let me know what you think.
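If it helps the discussion, here is a minimal sketch of that proposal (a simplified module of my own, not the darts source): only the target components are normalized on the way in, and only their last values are added back on the way out.

```python
import torch
import torch.nn as nn

class TargetOnlyNLinear(nn.Module):
    """Simplified NLinear-style block that normalizes only the target columns."""

    def __init__(self, input_chunk_length, output_chunk_length, input_dim, output_dim):
        super().__init__()
        self.output_dim = output_dim
        self.output_chunk_length = output_chunk_length
        self.linear = nn.Linear(input_chunk_length * input_dim,
                                output_chunk_length * output_dim)

    def forward(self, x):  # x: (batch, input_chunk_length, input_dim)
        # Subtract the last value of the *target* components only;
        # covariate columns pass through unchanged.
        seq_last = x[:, -1:, : self.output_dim].detach()
        x = torch.cat([x[:, :, : self.output_dim] - seq_last,
                       x[:, :, self.output_dim:]], dim=2)
        y = self.linear(x.flatten(start_dim=1))
        y = y.view(-1, self.output_chunk_length, self.output_dim)
        # Inverse transform: add back only the target's last values.
        return y + seq_last

model = TargetOnlyNLinear(3, 2, input_dim=2, output_dim=1)
out = model(torch.randn(4, 3, 2))
print(out.shape)  # torch.Size([4, 2, 1])
```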
Thank you, @dennisbader, for elaborating. This helped a lot. I think there are also use cases where we want to normalize the covariates (past, historic future, future) that go into the model. This is, for example, the case when we have multiple similar sensor readings and want to use some of them to forecast the one we are interested in. Then we would have similar dynamics in all the sensors and would therefore like to normalize the past covariates too. So we should probably replace the single flag with separate options.

If only the target is to be normalized, your solution is probably the way to go. If both are normalized, the original PR would be fine. Any comments on this?
Sorry for the slow reply, @dennisbader. Those changes sound good to me and I have implemented them. Thanks for all the help on the PR!
Thanks for the update @Eliotdoesprogramming. Can you remove the previous inverse transformation, as now we perform it twice?
@felixdivo, even if we want to normalize both target and covariates, the initial PR would not work properly when using a likelihood with more than one parameter.

Let's say we use output_chunk_length=1, 1 target component/column, 1 past covariates component, and we use a gaussian likelihood with 2 params (mean, std). At the inverse transformation step, the prediction then has 2 columns (the two distribution parameters of the single target), while the stored last values also have 2 columns (the target and the past covariate). We do not want to add the past covariates value to a distribution parameter of the target.
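A concrete sketch of that mismatch (the shapes below are example values of mine, and the "fixed" lines show the direction of the fix rather than darts' exact code): with one target, one past covariate, and two likelihood parameters, naive broadcasting adds the covariate's last value to the std parameter.

```python
import torch

batch, output_chunk_length, target_dim, n_params = 4, 1, 1, 2

# Network output: 2 likelihood params (mean, std) for the single target.
prediction = torch.randn(batch, output_chunk_length, target_dim * n_params)

# Last input values: column 0 is the target, column 1 the past covariate.
seq_last = torch.randn(batch, 1, 2)

wrong = prediction + seq_last  # broadcasts, but adds the covariate to the std!

# Give the likelihood params their own axis, then add only the target's last value.
pred = prediction.view(batch, output_chunk_length, target_dim, n_params)
fixed = pred + seq_last[:, :, :target_dim].unsqueeze(-1)
```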
The duplicated normalization step has now been removed 👍
extra normalization step removed!
Thanks a lot for fixing @Eliotdoesprogramming, looks great now 🚀
Also, in the new release users can try out another normalization technique: Reversible Instance Normalization, which can be used with any torch model (except RNNModel) by setting use_reversible_instance_norm=True at model creation.
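For reference, a minimal usage sketch of that option (chunk lengths and epoch count are placeholder values of mine):

```python
from darts.datasets import ETTh1Dataset
from darts.models import NLinearModel

series = ETTh1Dataset().load().astype("float32")

# Reversible Instance Normalization, enabled via a single constructor flag.
model = NLinearModel(
    input_chunk_length=24,
    output_chunk_length=12,
    use_reversible_instance_norm=True,
)
model.fit(series["HUFL"], past_covariates=series["MULL"], epochs=1)
```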
I think something went wrong here. I opened a new issue: #2035.
Re-opening #1583
(sorry, I made a bit of a mistake when working on that branch; decided a fresh fork & PR might be best)
Normalization description from original paper
NLinear: To boost the performance of Linear when there is a distribution shift in the dataset, NLinear first subtracts the input by the last value of the sequence. Then, the input goes through a linear layer, and the subtracted part is added back before making the final prediction. The subtraction and addition in NLinear are a simple normalization for the input sequence.
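That description maps to just a few lines of code; a minimal sketch of the paper's scheme for a target-only input (variable names are my own):

```python
import torch
import torch.nn as nn

class PaperNLinear(nn.Module):
    """NLinear as described in the paper: subtract last value, linear layer, add back."""

    def __init__(self, input_len, output_len):
        super().__init__()
        self.linear = nn.Linear(input_len, output_len)

    def forward(self, x):  # x: (batch, input_len, n_components)
        seq_last = x[:, -1:, :].detach()  # last value of the input sequence
        x = x - seq_last                  # simple normalization
        y = self.linear(x.permute(0, 2, 1)).permute(0, 2, 1)
        return y + seq_last               # add the subtracted part back
```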
Summary
The current implementation of normalization follows the implementation here. It works when the number of covariates being predicted as our target variable is the same as the number of covariates in our input (the previous comment in the implementation is incorrect; it will work when n_params > 1 as long as n_params equals the number of target covariates).
Since `self.n_params` equals the number of covariates we are predicting for, and we know that they are ordered first in our tensor:

input_tensor = [batch, timesteps, input_dim (number of covariates)]

we can slice the tensor to only include the covariates in our target tensor, like so: `last_seq[:, :, :output_dim]`.
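A quick shape check of that slicing (the dimensions below are illustrative values of mine): with target components ordered first, `[:, :, :output_dim]` keeps the target columns and `[:, :, output_dim:]` keeps the remaining covariate columns.

```python
import torch

batch, timesteps, output_dim = 4, 10, 2
input_tensor = torch.randn(batch, timesteps, 5)  # 2 targets + 3 covariates

targets = input_tensor[:, :, :output_dim]     # (4, 10, 2) -> target columns
covariates = input_tensor[:, :, output_dim:]  # (4, 10, 3) -> covariate columns
print(targets.shape, covariates.shape)
```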
Other Information
I'm new to doing open source work, so please let me know if there's more that I need to do! This was something I found when working on one of my own projects.