Why is use_feat_static_cat = False, NOT recommended? #1341

bramDelft · 2021-02-25T10:00:37Z

Hi All,

I am trying to use DeepStateEstimator. I do not have any categorical variables to use as static features. I want to set use_feat_static_cat to False, but the package tells me this is NOT recommended. Why? What happens if I do not use any static categorical feature?

I am working with a dataset of 100k time series, each with 33 points.

Thanks in advance,
Bram

Answered by lostella

lostella · 2021-03-03T13:46:51Z

Maintainer

The DeepState model takes features (both dynamic and static) as input and outputs a state-space model (a linear dynamical system) for the given series. No previous values ("lags") of the time series are used as inputs.

If the user provides no features, only date-time features (day-of-week, hour-of-day, etc.) are used. Now suppose you have two time series spanning the same time range but with significantly different target values, for example TS1 = 1000 * TS2. Training a good model on such data will be hard, if possible at all: the inputs (date-time features) are the same for both series, and so are the output linear dynamical systems. In this case the model will likely predict something "in the middle" of the two series.
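To make the "in the middle" effect concrete, here is a toy sketch. It is not the DeepState model: it assumes a much simpler predictor that sees only a day-of-week feature and minimizes squared error, and the series values are made up for illustration.

```python
# Toy illustration (NOT DeepState itself): a predictor that only sees
# date-time features cannot tell two series apart.
from collections import defaultdict

# Two made-up series over the same 14 days, with TS1 = 1000 * TS2.
ts2 = [float(1 + (t % 7)) for t in range(14)]  # weekly pattern, values 1..7
ts1 = [1000.0 * v for v in ts2]

# The only input feature is day-of-week, identical for both series.
day_of_week = [t % 7 for t in range(14)]

# Given only this feature, the squared-error-optimal prediction is the
# mean of all targets that share the same feature value.
targets_by_feature = defaultdict(list)
for series in (ts1, ts2):
    for dow, value in zip(day_of_week, series):
        targets_by_feature[dow].append(value)

prediction = {dow: sum(v) / len(v) for dow, v in targets_by_feature.items()}

# On day-of-week 0, TS2 is 1.0 and TS1 is 1000.0; the shared prediction
# lands between them and fits neither series.
print(ts2[0], ts1[0], prediction[0])
```

Because both series present exactly the same inputs, any such predictor has to return one output for both, which is why a feature that distinguishes the series helps.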

Attaching categorical features to your data helps the model identify which time series it is given. If you cannot "categorize" your series, you can simply add an ID: each time series in your dataset gets a feature value in [0, N-1], where N is the number of series in the dataset. The only drawback of this approach is that you need to provide the same feature at prediction time, so you can only make predictions for series you trained on. This is OK in many cases, but it is something to take into account: if you train on time series with IDs 0 to 99 and then encounter a time series with ID 100, the trained model will not work with it.
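A minimal sketch of the ID approach follows, using plain dictionaries with the `start`, `target`, and `feat_static_cat` field names. The series values, the start date, and the helper name `with_series_ids` are assumptions made for illustration; the final step of handing the entries to the estimator is only described in a comment, since it depends on your GluonTS version.

```python
# Sketch: attach a per-series ID as a static categorical feature.
# The series values and the start date are made up for illustration.
raw_series = [
    [10.0, 12.0, 11.0],
    [10000.0, 12000.0, 11000.0],
    [3.0, 2.0, 4.0],
]


def with_series_ids(series_list, start="2021-01-01"):
    """Return dataset entries whose static categorical feature is the series index."""
    return [
        {"start": start, "target": target, "feat_static_cat": [series_id]}
        for series_id, target in enumerate(series_list)
    ]


entries = with_series_ids(raw_series)

# One categorical feature with N possible values (IDs 0 .. N-1).
cardinality = [len(entries)]

# Assumed next step, not run here: wrap `entries` in a GluonTS dataset and
# create the estimator with use_feat_static_cat=True and this cardinality.
for entry in entries:
    print(entry["feat_static_cat"], entry["target"])
```

At prediction time each entry needs the same ID it had during training, which is the restriction described above.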

1 reply

bramDelft Mar 5, 2021
Author

Thank you! Very clear.
