Fix NLinear normalization to support past covariates #1873

Eliotdoesprogramming · 2023-06-30T21:38:27Z

Re-opening #1583
(sorry made a bit of a mistake when working on that branch, decided a fresh fork & pr might be best)

Normalization description from original paper

NLinear: To boost the performance of Linear when there is a distribution shift in the dataset, NLinear first subtracts the input by the last value of the sequence. Then, the input goes through a linear layer, and the subtracted part is added back before making the final prediction. The subtraction and addition in NLinear are a simple normalization for the input sequence.

Summary

current implementation of normalization follows the implementation here

this implementation works when the number of covariates being predicted as our target variable is the same as the number of covariates in our input (prev comment in implementation is incorrect, will work when n_params > 1 if n_params = target covariates)

since self.n_params == the amount of covariates we are predicting for AND we know that they are ordered first in our tensor
input_tensor = [batch,timesteps, input_dim/number of covariates] we can slice the tensor to only include the covariates in our target tensor like so last_seq[:,:,output_dim:]

Other Information

New to doing open source work, please let me know if theres more that I need to do!! This was something I found when working on one of my own projects.

…covariates needing to end in the future

Eliotdoesprogramming · 2023-06-30T21:44:18Z

I added a test case that demonstrates why the change to seq_last is necessary. However I've been running into some confusion when it comes the InferenceDataset preparation in predict in _eval_model()

# test_dlinear_nlinear.py line 277
            e1, e2 = _eval_model(
                train1,
                train2,
                val1,
                val2,
                None,
                None,
                past_cov1=past_cov1,
                past_cov2=past_cov2,
                val_past_cov1=val_past_cov1,
                val_past_cov2=val_past_cov2,
                cls=NLinearModel,
                lkl=None,
                normalize=True
            )

in

# inference_dataset.py line 66:
        if main_covariate_type is CovariateType.PAST:
            future_end = past_end + max(0, n - output_chunk_length) * target_series.freq

this line of code causes .predict() to fail when using past covariates. .fit() runs great!

It seems to require past_covariates to extend into the future. Why is that? if a maintainer could help me understand I can finish the PR!

@dennisbader @felixdivo

darts/models/forecasting/nlinear.py

madtoinou · 2023-07-04T07:26:05Z

It seems to require past_covariates to extend into the future. Why is that? if a maintainer could help me understand I can finish the PR!

If the n argument of predict() is greater than the output_chunk_length, the model will perform auto-regression (consume its own predictions in the target ts). However, for the 2nd "prediction round", it expect these "future" values of past covariates and will complain if there are not provided. To avoid this situation, you need to make sure that n<= output_chunk_length or that the past_covariates actually extend a bit in the future.

Co-authored-by: Felix Divo <4403130+felixdivo@users.noreply.github.com>

…ormalization

…egressive predictions

Eliotdoesprogramming · 2023-07-04T17:55:41Z

should be ready for review! thanks @madtoinou for the explanation, that makes sense.

made the same minor change from original PR, but added test case coverage to demonstrate why its necessary :) thank you for the patience.

Had some issues with locally running tests through gradle (originating from lightning, support for float64 on macbook platform. Didn't think it was worth changing all the tests to include explicit casting to float32). Linting should be good

Will make changes / fix if the tests are failing on pipeline once it runs.

…ormalization

codecov-commenter · 2023-07-05T20:56:46Z

Codecov Report

Patch coverage is 33.33% of modified lines.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Files Changed	Coverage
darts/models/forecasting/nlinear.py	`33.33%`

📢 Thoughts on this report? Let us know!.

felixdivo

Nice!

felixdivo · 2023-07-06T08:16:21Z

Side note: This PR also extends the tests to cover DLinear and ensure it works too! 🚀

…ormalization

felixdivo · 2023-07-12T15:38:45Z

@madtoinou can this be merged?

…ormalization

dennisbader · 2023-07-17T14:20:06Z

Hey @Eliotdoesprogramming and @felixdivo, and thanks for this. 🚀 The changes look good to me.

I agree with the normalization of the target series to account for the distribution shift.
But I'm a bit worried about also enforcing normalization of the past covariates at the same time (and/or the historic part of future covariates in the input chunk).
If I think about covariates such as datetime attributes (e.g. value of the month of the year) then normalization could result in constant x values. These covariates can help a lot e.g. for capturing seasonalities (and trends, specific events, ...)

Let's say we have an input_chunk_length of 3, and by chance all our batch samples start between March (month=3) and December (month=12). The input chunk for all past covariates batch samples in x will have values [-2, -1, 0] -> we don't get any useful information out of this. Only For January and February we get different values ([10, 11, 0] and [10, -1, 0]).

Also, we normalize the historic part of the future covariates (the values in the input chunk), but leave the values in the output chunk untouched. In my opinion they should be normalized as well using last value of the input chunk sequence.

What's your take on these points? Should give users the choice to enable target/covariates normalization separately? And also normalize the future covariates on the output chunk?

Eliotdoesprogramming · 2023-07-19T17:06:52Z

@dennisbader I agree its probably best to allow options

I wanted to note that the changes here specifically affect when shared_weights is false. When shared weights is True on line 121 - 123, the behavior is:

if self.normalize:
            # discard covariates, to ensure that in_dim == out_dim
            x = x[:, :, : self.output_dim]
            x = x.permute(0, 2, 1)  # (batch, out_dim, in_len)

so covariates are similarly discarded from the output tensor.

I think at least for my use cases, having the behavior be consistent for all inputs should be default. Current behavior will error with the following use case rather than being able to train:

  import darts
  from darts.datasets import ETTh1Dataset
  from darts.models.forecasting.nlinear import NLinearModel
  series = ETTh1Dataset().load()
  series = series.astype('float32')
  target = series['HUFL']
  past_cov = series['MULL']
  model = NLinearModel(10, 1, shared_weights=False, normalize=True)
  model.fit(series=target, past_covariates=past_cov)

…ormalization

felixdivo · 2023-08-14T11:52:36Z

Current behavior will error with the following use case rather than being able to train

I was also very surprised to find that you could not train with covariates and would suggest to enable it by default.

felixdivo · 2023-08-30T15:43:25Z

@dennisbader

I worked through your comment and see your concerns. I would like to comment on these to move this conversation forward. It is a very stale discussion, given that it should be a simple issue to solve.

Regarding the limited usefulness of the normalization if we are dealing with, for example, dates. I agree that in your example, normalization is not that useful. However, just because the model is not perfect for some input data doesn't mean it should not be supported. For other types of data, where the absolute date is irrelevant, eliminating some drift over time can be very useful. And that is why it was proposed in the original publication.
About the historic part of the future covariates. I'm unsure whether I understood your point, but generally, the model returns the target predictions only, right? I looked here. So it does not really matter what we do to the covariates.

dennisbader · 2023-09-01T07:39:06Z

@felixdivo and @Eliotdoesprogramming , this is how I interpret it at the moment (feel free to correct me). My concern was regarding these two lines: line 141 and line 151.

# x has shape (batch, input_chunk_length, n targets + n past covs + n future covs)
seq_last = x[:, -1:, :].detach()  # (batch, 1, in_dim)

x here is actually the entire input chunk (past/lookback window of the model). This includes the past target, past covariates and historic part of the future covariates. So when we take x = x - seq_last we are actually also performing normalization on past and future covariates.

So for my concern regarding future covariates: Here we normalize only the historic part of the future covariates (time steps in the input chunk) and leave the future part of future covariates (time steps in the output chunk) unchanged.

Now when inverse transforming, the shape of x changed to (batch, output_chunk_length, n targets * n likelhood parameters). When we apply x = x + seq_last[:, :, : x.shape[-1]], we are not inverse transforming with only target values, but also past and historic future covariates.

Proposed solution

If we only want to normalize only the target, then we need to change this.

For the seq_last from here, x has shape (batch, input_chunk_length, n targets + n past covs + n future covs). To get only the last values of the target features we would have to change it to something like below:

# get last values only for target features
# x has shape (batch, input_chunk_length, n targets + n past covs + n future covs)
seq_last = x[:, -1:, :self.output_dim].detach()

Then for the inverse transformation from here, x has a different shape: (batch, output_chunk_length, n targets * n likelhood parameters)

We should only add seq_last to the matching features. I propose that we do this at the very end after having changed the view of x here.

x = x.view(batch, self.output_chunk_length, self.output_dim, self.nr_params)
if self.normalize:
    x = x + seq_last.view(seq_last.shape + (1,))

Let me know what you think.

felixdivo · 2023-09-04T16:00:44Z

Thank you, @dennisbader, for elaborating. This helped a lot.

I think there are also use cases where we want to normalize the covariates (past, historical future, future) that go into the model. This is, for example, the case when we have multiple similar senor readings and want to use others to forecast the one we are interested in. Then, we would have similar dynamics in all the sensors and would, therefore, like to forecast the past covariates too.

So we should probably replace the flag normalize: bool = True with normalize_targets: bool = True & normalize_covariates: bool = True, right? However, this would add the new possibility of only normalizing the covariates. Do we want that?

If only the target is to be normalized, your solution is probably the way to go. If both are normalized, the original PR would be fine. Any comments on this?

…ormalization

Eliotdoesprogramming · 2023-09-06T02:53:16Z

sorry for the slow reply @dennisbader those changes sound good to me and I have implemented them. Thanks for all the help on the PR!

dennisbader

Thanks for the update @Eliotdoesprogramming. Can you remove the previous inverse transformation, as now we perform it twice?

darts/models/forecasting/nlinear.py

dennisbader · 2023-09-06T07:00:34Z

@felixdivo, even if we want to normalize both target and covariates, the initial PR would not work properly when using a likelihood with nr_params > 1.

Let's say we use output_chunk_length=1, 1 target component/column, 1 past covariates component, and we use a gaussian likelihood with 2 params (mean, std).

At the inverse transformation step:

# x has shape (batch, output_chunk_length, out_dim * nr_params) = (batch, 1, 1 * 2) = (batch, 1, 2)
# seq_last has shape (batch, 1, 1 target comp + 1 past cov comp) = (batch, 1, 2)
x = x + seq_last[:, :, : x.shape[-1]]

So seq_last here in the last dimension are the last values of the target and the past covariates from the input chunk.

We do not want to add the past covariates value to x at the inverse transformation.

…ormalization

Eliotdoesprogramming · 2023-09-06T15:05:30Z

the duplicated normalization step has now been removed 👍

Eliotdoesprogramming

extra normalization step removed!

dennisbader

Thanks a lot for fixing @Eliotdoesprogramming, looks great now 🚀

Also in the new release, users can try out another normalizaton technique: The Reversible Instance Normalization can be used with any torch model (except RNNModel) with use_reversible_instance_norm=True at model creation.

felixdivo · 2023-10-25T12:43:02Z

I think something went wrong here. I opened a new issue: #2035.

adding test for past covariates, need to check with darts about past …

b0c3e47

…covariates needing to end in the future

Eliotdoesprogramming requested a review from dennisbader as a code owner June 30, 2023 21:38

eliot added 2 commits June 30, 2023 16:50

slicing typo

4d80b11

fix slicing

bb1b332

felixdivo reviewed Jul 1, 2023

View reviewed changes

darts/models/forecasting/nlinear.py Outdated Show resolved Hide resolved

Eliotdoesprogramming and others added 4 commits July 4, 2023 10:03

add explicit comment from @felixdivo

9ef9088

Co-authored-by: Felix Divo <4403130+felixdivo@users.noreply.github.com>

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

ba914dc

…ormalization

update test to use future covariates in predict function due to autor…

24ea32c

…egressive predictions

linting

f880a46

eliot and others added 2 commits July 5, 2023 09:12

fix tests

bdefef9

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

1a080ff

…ormalization

felixdivo approved these changes Jul 6, 2023

View reviewed changes

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

43fab55

…ormalization

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

84195fe

…ormalization

dennisbader mentioned this pull request Jul 19, 2023

Tide Implementation #1727

Merged

Eliotdoesprogramming added 2 commits July 22, 2023 14:42

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

a8d30e9

…ormalization

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

a7e207f

…ormalization

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

d9c70d9

…ormalization

eliot added 2 commits September 5, 2023 21:47

making changes proposed by @dennisbader

0bba0e8

update comment

36cfc98

dennisbader requested changes Sep 6, 2023

View reviewed changes

darts/models/forecasting/nlinear.py Outdated Show resolved Hide resolved

eliot and others added 2 commits September 6, 2023 08:08

remove double denormalization

cc79366

Merge branch 'master' into fix/nlinear_support_past_covariates_with_n…

db4a36f

…ormalization

Eliotdoesprogramming commented Sep 6, 2023

View reviewed changes

Eliotdoesprogramming requested a review from dennisbader September 6, 2023 17:29

update CHANGELOG.md

cd8aa64

dennisbader approved these changes Sep 7, 2023

View reviewed changes

dennisbader merged commit 74ed2bb into unit8co:master Sep 7, 2023

felixdivo mentioned this pull request Oct 25, 2023

[BUG] NLinear fails to predict >1 variates #2035

Closed

felixdivo mentioned this pull request Oct 25, 2023

[BUG] DLinear fails to predict >1 variates in presence of static covariates #2036

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix NLinear normalization to support past covariates #1873

Fix NLinear normalization to support past covariates #1873

Eliotdoesprogramming commented Jun 30, 2023 •

edited

Loading

Eliotdoesprogramming commented Jun 30, 2023 •

edited

Loading

madtoinou commented Jul 4, 2023

Eliotdoesprogramming commented Jul 4, 2023 •

edited

Loading

codecov-commenter commented Jul 5, 2023 •

edited

Loading

felixdivo left a comment

felixdivo commented Jul 6, 2023

felixdivo commented Jul 12, 2023

dennisbader commented Jul 17, 2023

Eliotdoesprogramming commented Jul 19, 2023 •

edited

Loading

felixdivo commented Aug 14, 2023

felixdivo commented Aug 30, 2023

dennisbader commented Sep 1, 2023

felixdivo commented Sep 4, 2023

Eliotdoesprogramming commented Sep 6, 2023

dennisbader left a comment

dennisbader commented Sep 6, 2023 •

edited

Loading

Eliotdoesprogramming commented Sep 6, 2023

Eliotdoesprogramming left a comment

dennisbader left a comment

felixdivo commented Oct 25, 2023

Fix NLinear normalization to support past covariates #1873

Fix NLinear normalization to support past covariates #1873

Conversation

Eliotdoesprogramming commented Jun 30, 2023 • edited Loading

Normalization description from original paper

Summary

Other Information

Eliotdoesprogramming commented Jun 30, 2023 • edited Loading

madtoinou commented Jul 4, 2023

Eliotdoesprogramming commented Jul 4, 2023 • edited Loading

codecov-commenter commented Jul 5, 2023 • edited Loading

Codecov Report

felixdivo left a comment

Choose a reason for hiding this comment

felixdivo commented Jul 6, 2023

felixdivo commented Jul 12, 2023

dennisbader commented Jul 17, 2023

Eliotdoesprogramming commented Jul 19, 2023 • edited Loading

felixdivo commented Aug 14, 2023

felixdivo commented Aug 30, 2023

dennisbader commented Sep 1, 2023

Proposed solution

felixdivo commented Sep 4, 2023

Eliotdoesprogramming commented Sep 6, 2023

dennisbader left a comment

Choose a reason for hiding this comment

dennisbader commented Sep 6, 2023 • edited Loading

Eliotdoesprogramming commented Sep 6, 2023

Eliotdoesprogramming left a comment

Choose a reason for hiding this comment

dennisbader left a comment

Choose a reason for hiding this comment

felixdivo commented Oct 25, 2023

Eliotdoesprogramming commented Jun 30, 2023 •

edited

Loading

Eliotdoesprogramming commented Jun 30, 2023 •

edited

Loading

Eliotdoesprogramming commented Jul 4, 2023 •

edited

Loading

codecov-commenter commented Jul 5, 2023 •

edited

Loading

Eliotdoesprogramming commented Jul 19, 2023 •

edited

Loading

dennisbader commented Sep 6, 2023 •

edited

Loading