fix: removed data input restriction during cross validation finetune #426

Merged 4 commits on Jul 19, 2024

Yibei990826
Contributor

Addresses issue #424.

The current implementation of cross_validation trims the historical data sent for each series even when finetune_steps is nonzero. Fine-tuning requires a minimum number of samples per series, so the trimmed input triggers an UnprocessableEntityError (status 422) when fine-tuning is used in cross-validation.

For instance, the following code using web traffic sample data generates this error:

# Assumes `nixtla_client` is an initialized NixtlaClient and
# `df` holds the web traffic sample data.
cv_df = nixtla_client.cross_validation(
    df,
    h=7,
    step_size=7,
    n_windows=1,
    time_col='date',
    target_col='users',
    id_col='unique_id',
    freq='D',
    finetune_steps=10,
)

UnprocessableEntityError: status_code: 422, body: detail=None requestID='C2T6854CVQ' details='request had an error' support='If you have questions or need support, please email ops@nixtla.io' data={'detail': 'Minimum number of samples by id required for finetuning is 29, got 28.'}

To resolve this issue, I added a condition to deactivate the _restrict_input_samples function when finetune_steps is not zero. This ensures that sufficient historical data is available during fine-tuning in cross-validation.
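
The change described above amounts to a guard around the trimming step. The following is a minimal sketch of that idea, not the library's actual code: `restrict_input_samples`, `prepare_cv_input`, the `max_samples` parameter, and the dict-of-lists data layout are all hypothetical stand-ins chosen for illustration, and the real `_restrict_input_samples` may take different arguments.

```python
from typing import Dict, List

# Hypothetical layout: unique_id -> observations, oldest first.
Series = Dict[str, List[float]]


def restrict_input_samples(series: Series, max_samples: int) -> Series:
    """Keep only the most recent `max_samples` observations of each series."""
    return {
        uid: values[max(len(values) - max_samples, 0):]
        for uid, values in series.items()
    }


def prepare_cv_input(
    series: Series, max_samples: int, finetune_steps: int = 0
) -> Series:
    """Trim the history only when no fine-tuning is requested."""
    if finetune_steps > 0:
        # Fine-tuning needs enough history per series, so skip the trimming.
        return series
    return restrict_input_samples(series, max_samples)
```

With an illustrative cap of 28 observations (the count reported in the error above), a 100-point series is trimmed to 28 points when `finetune_steps=0` and passed through whole when `finetune_steps=10`.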

@Yibei990826 requested review from AzulGarza and cchallu on July 16, 2024 20:11
Contributor

Experiment Results

Experiment 1: air-passengers

Description:

variable experiment
h 12
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 12.6793 11.0623 47.8333 76
mape 0.027 0.0232 0.0999 0.1425
mse 213.935 199.132 2571.33 10604.2
total_time 2.1262 2.7237 0.0084 0.0044

Experiment 2: air-passengers

Description:

variable experiment
h 24
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 58.1031 58.4587 71.25 115.25
mape 0.1257 0.1267 0.1552 0.2358
mse 4040.22 4110.79 5928.17 18859.2
total_time 1.3951 0.9664 0.0051 0.0044

Experiment 3: electricity-multiple-series

Description:

variable experiment
h 24
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 178.293 268.121 269.23 1331.02
mape 0.0234 0.0311 0.0304 0.1692
mse 121588 219457 213677 4.68961e+06
total_time 1.3754 2.0212 0.0073 0.0062

Experiment 4: electricity-multiple-series

Description:

variable experiment
h 168
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 465.532 346.984 398.956 1119.26
mape 0.062 0.0437 0.0512 0.1583
mse 835120 403787 656723 3.17316e+06
total_time 3.4165 1.3548 0.0069 0.0063

Experiment 5: electricity-multiple-series

Description:

variable experiment
h 336
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 558.649 459.769 602.926 1340.95
mape 0.0697 0.0566 0.0787 0.17
mse 1.22721e+06 739135 1.61572e+06 6.04619e+06
total_time 5.4129 1.8737 0.007 0.0065

Contributor

Experiment Results

Experiment 1: air-passengers

Description:

variable experiment
h 12
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 12.6793 11.0623 47.8333 76
mape 0.027 0.0232 0.0999 0.1425
mse 213.936 199.132 2571.33 10604.2
total_time 1.514 0.7975 0.0081 0.0042

Experiment 2: air-passengers

Description:

variable experiment
h 24
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 58.1031 58.4587 71.25 115.25
mape 0.1257 0.1267 0.1552 0.2358
mse 4040.21 4110.79 5928.17 18859.2
total_time 1.0215 0.8546 0.0069 0.0062

Experiment 3: electricity-multiple-series

Description:

variable experiment
h 24
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 178.293 268.121 269.23 1331.02
mape 0.0234 0.0311 0.0304 0.1692
mse 121588 219457 213677 4.68961e+06
total_time 0.8368 1.3185 0.0074 0.0062

Experiment 4: electricity-multiple-series

Description:

variable experiment
h 168
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 465.532 346.984 398.956 1119.26
mape 0.062 0.0437 0.0512 0.1583
mse 835120 403787 656723 3.17316e+06
total_time 1.6995 4.6912 0.0069 0.0063

Experiment 5: electricity-multiple-series

Description:

variable experiment
h 336
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 558.649 459.769 602.926 1340.95
mape 0.0697 0.0566 0.0787 0.17
mse 1.22721e+06 739135 1.61572e+06 6.04619e+06
total_time 2.8673 1.2394 0.0068 0.0063

@Yibei990826 changed the title from "[fix] Remove data input restriction in CV if finetune is enabled" to "fix: removed data input restriction during cross validation finetune" on Jul 17, 2024
Contributor

Experiment Results

Experiment 1: air-passengers

Description:

variable experiment
h 12
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 12.6793 11.0623 47.8333 76
mape 0.027 0.0232 0.0999 0.1425
mse 213.935 199.132 2571.33 10604.2
total_time 2.5865 3.1422 0.0103 0.005

Experiment 2: air-passengers

Description:

variable experiment
h 24
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 58.1031 58.4587 71.25 115.25
mape 0.1257 0.1267 0.1552 0.2358
mse 4040.22 4110.79 5928.17 18859.2
total_time 2.2535 1.8014 0.0065 0.0045

Experiment 3: electricity-multiple-series

Description:

variable experiment
h 24
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 178.293 268.121 269.23 1331.02
mape 0.0234 0.0311 0.0304 0.1692
mse 121588 219457 213677 4.68961e+06
total_time 3.593 3.7862 0.0081 0.0061

Experiment 4: electricity-multiple-series

Description:

variable experiment
h 168
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 465.532 346.984 398.956 1119.26
mape 0.062 0.0437 0.0512 0.1583
mse 835120 403787 656723 3.17316e+06
total_time 3.9599 2.1826 0.007 0.0065

Experiment 5: electricity-multiple-series

Description:

variable experiment
h 336
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 558.649 459.769 602.926 1340.95
mape 0.0697 0.0566 0.0787 0.17
mse 1.22721e+06 739135 1.61572e+06 6.04619e+06
total_time 5.8095 2.9093 0.0068 0.0064

Member

@AzulGarza left a comment

thank you @Yibei990826! could we add a test for this fix? thank you so much🫶
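
A regression test for this fix could take roughly the following shape. This is only a sketch under stated assumptions, not the test that was added to the repository: `fake_cross_validation` is a hypothetical stand-in that combines the client-side trimming with a server-side minimum-sample check, `max_samples` is an invented parameter, and the threshold of 29 is simply the number quoted in the 422 error from the PR description.

```python
import unittest

# Assumption: threshold copied from the error message in the PR description.
MIN_FINETUNE_SAMPLES = 29


def fake_cross_validation(values, max_samples, finetune_steps=0):
    """Hypothetical stand-in for the client call plus the server-side check.

    Returns the number of observations that would be sent to the model.
    """
    # Behaviour under test: trim the history only when fine-tuning is off.
    if finetune_steps == 0:
        values = values[max(len(values) - max_samples, 0):]
    if finetune_steps > 0 and len(values) < MIN_FINETUNE_SAMPLES:
        raise ValueError(
            "Minimum number of samples by id required for finetuning is "
            f"{MIN_FINETUNE_SAMPLES}, got {len(values)}."
        )
    return len(values)


class TestFinetuneInCrossValidation(unittest.TestCase):
    def test_history_is_trimmed_without_finetuning(self):
        sent = fake_cross_validation(list(range(100)), max_samples=28)
        self.assertEqual(sent, 28)

    def test_full_history_is_kept_with_finetuning(self):
        sent = fake_cross_validation(
            list(range(100)), max_samples=28, finetune_steps=10
        )
        self.assertEqual(sent, 100)

    def test_short_series_still_fails_with_finetuning(self):
        # A series that is genuinely too short should keep raising.
        with self.assertRaises(ValueError):
            fake_cross_validation(
                list(range(20)), max_samples=28, finetune_steps=10
            )
```

The second case is the one that guards against the reported bug: before the fix, the equivalent call would have sent only the trimmed history and failed the minimum-sample check.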

Contributor

Experiment Results

Experiment 1: air-passengers

Description:

variable experiment
h 12
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 12.6793 11.0623 47.8333 76
mape 0.027 0.0232 0.0999 0.1425
mse 213.936 199.132 2571.33 10604.2
total_time 2.158 0.9086 0.0084 0.0046

Experiment 2: air-passengers

Description:

variable experiment
h 24
season_length 12
freq MS
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 58.1031 58.4587 71.25 115.25
mape 0.1257 0.1267 0.1552 0.2358
mse 4040.21 4110.79 5928.17 18859.2
total_time 1.2421 2.3983 0.0053 0.0043

Experiment 3: electricity-multiple-series

Description:

variable experiment
h 24
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 178.293 268.121 269.23 1331.02
mape 0.0234 0.0311 0.0304 0.1692
mse 121588 219457 213677 4.68961e+06
total_time 0.9052 1.7992 0.0072 0.0061

Experiment 4: electricity-multiple-series

Description:

variable experiment
h 168
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 465.532 346.984 398.956 1119.26
mape 0.062 0.0437 0.0512 0.1583
mse 835120 403787 656723 3.17316e+06
total_time 2.4803 4.2324 0.0069 0.0066

Experiment 5: electricity-multiple-series

Description:

variable experiment
h 336
season_length 24
freq H
level None
n_windows 1

Results:

metric timegpt-1 timegpt-1-long-horizon SeasonalNaive Naive
mae 558.649 459.769 602.926 1340.95
mape 0.0697 0.0566 0.0787 0.17
mse 1.22721e+06 739135 1.61572e+06 6.04619e+06
total_time 3.0144 1.5173 0.007 0.0067

@AzulGarza self-requested a review on July 19, 2024 21:02
@AzulGarza merged commit 55de0b3 into main on Jul 19, 2024
14 checks passed
@AzulGarza deleted the fix/finetune-in-CV branch on July 19, 2024 21:03