Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Added cross-validation tutorial #897

Merged
merged 10 commits into from
Mar 1, 2024
Merged

Conversation

MMenchero
Copy link
Contributor

This is part of the new docs for neuralforecast.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@CLAassistant
Copy link

CLAassistant commented Feb 22, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link

review-notebook-app bot commented Feb 22, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-22T15:52:38Z
----------------------------------------------------------------

typo in category (says catefory). also please use the term timestamp for ds, I think some errors refer to that as timestamps


Copy link

review-notebook-app bot commented Feb 22, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-22T15:52:39Z
----------------------------------------------------------------

Line #2.    StatsForecast.plot(Y_df)

This method calls utilsforecast.plotting.plot_series under the hood. Also statsforecast is not a dependency of neural (utils is). Can you please replace this with the plot_series function?


Copy link

review-notebook-app bot commented Feb 22, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-22T15:52:40Z
----------------------------------------------------------------

I think we could explain with more detail here how the process works, since it's a bit different with respect to the other libs. The models are trained only once and are used to generate predictions over several windows. The refit argument (which is on the main branch but hasn't been released) can control the retraining behavior (it currently defaults to False) but can be an integer or True


Copy link

review-notebook-app bot commented Feb 22, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-22T15:52:41Z
----------------------------------------------------------------

The 80 and 90% intervals are because those are the defaults of the MQLoss, I think we should clarify that.


Copy link

review-notebook-app bot commented Feb 22, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-22T15:52:42Z
----------------------------------------------------------------

Please use the following here:

from utilsforecast.evaluation import evaluate
from utilsforecast.losses import rmse

The evaluate function is very similar to accuracy and we're going to remove the metrics module from datasetsforecast in the next release.


MMenchero commented on 2024-02-28T02:47:09Z
----------------------------------------------------------------

All suggested changes have been implemented.

Copy link
Contributor Author

All suggested changes have been implemented.


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Feb 28, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-28T16:58:28Z
----------------------------------------------------------------

Should be utilsforecast at the end here


MMenchero commented on 2024-02-29T03:33:22Z
----------------------------------------------------------------

thanks, missed that one

Copy link

review-notebook-app bot commented Feb 28, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-28T16:58:29Z
----------------------------------------------------------------

Line #1.    cv_df = cv_df.reset_index()

I think it'd be better to set os.environ['NIXTLA_ID_AS_COL'] = '1' at the top of the notebook to avoid having to do this


MMenchero commented on 2024-02-29T03:33:50Z
----------------------------------------------------------------

TIL, that's pretty useful

Copy link
Contributor Author

thanks, missed that one


View entire conversation on ReviewNB

Copy link
Contributor Author

TIL, that's pretty useful


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-29T15:27:21Z
----------------------------------------------------------------

Can you please set step_size=horizon here so that there's only one prediction per timestamp? Otherwise the plot at the bottom looks weird because there are several predictions for each timestamp. Also feel free to open an issue in utilsforecast to support plotting the results from CV if you think it would be useful.


MMenchero commented on 2024-03-01T01:58:45Z
----------------------------------------------------------------

Got it, and I'll open the issue in utilsforecast since I think it'll be very useful.

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

jmoralez commented on 2024-02-29T15:27:22Z
----------------------------------------------------------------

I missed the last part here, we should remove it (We also need to reset its index to include the column with the unique ids)


MMenchero commented on 2024-03-01T01:59:26Z
----------------------------------------------------------------

I also missed it, but is now deleted.

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

cchallu commented on 2024-02-29T19:54:12Z
----------------------------------------------------------------

Can we remove this equation? to keep the simpler. Maybe do a callout box (see other tutorials) with this information and the link to more details


MMenchero commented on 2024-03-01T04:31:53Z
----------------------------------------------------------------

sure, I added a callout box and kept it simple in the main text.

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

cchallu commented on 2024-02-29T19:54:13Z
----------------------------------------------------------------

Sorry for not saying it earlier, but I strongly suggest to switch to NHITS and LSTM. The Auto models add a layer of complexity, because we are not explaining the validation set. We can use the NHTIS and LSTM with their default parameters as well


MMenchero commented on 2024-03-01T04:36:51Z
----------------------------------------------------------------

I dropped the auto models, but in order to keep Capi's suggestion of setting step_size=horizon so that we only have one prediction per timestamp, I had to remove the LSTM since recurrent models don't support step_size>1. Instead, I'm using the MLP and the NBEATS models.

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

cchallu commented on 2024-02-29T19:54:14Z
----------------------------------------------------------------

Can we add the plot with sliding windows on the predict insample tutorial? And the explanation above taking it from the predict insample tutorial as well:

With the step_size parameter you can specify the step size between consecutive windows to produce the forecasts. In this example we will set step_size=horizon to produce non-overlapping forecasts.

The following diagram shows how the forecasts are produced based on the step_size parameter and h (horizon) of the model. In the diagram we set step_size=2 and h=4.


Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

cchallu commented on 2024-02-29T19:54:15Z
----------------------------------------------------------------

I think we need an additional paragraph or even a plot. The main goal of the tutorial is explaining cross-validation, so we need to provide all the possible details and explanation. For example:

With the current parameters, cross validation will look like this: (and add a plot, with an example of a time series with the three windows at the end. And mention that the model will be trained with info prior to x timestamp, and so on.


MMenchero commented on 2024-03-01T06:44:17Z
----------------------------------------------------------------

I added an additional paragraph + plots at the end to further clarify the concept. It's at the end because I need to rename a column to make the plots and that is done in section 5.

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

cchallu commented on 2024-02-29T19:54:15Z
----------------------------------------------------------------

same as before, remove equation


Copy link
Contributor Author

got it.


View entire conversation on ReviewNB

Copy link
Contributor Author

I also missed it, but is now deleted.


View entire conversation on ReviewNB

Copy link
Contributor Author

sure, I added a callout box and kept it simple in the main text.


View entire conversation on ReviewNB

Copy link
Contributor Author

I'm now using the default values, but in order to keep Capi's suggestion of setting step_size=horizon, I had to remove the LSTM model since recurrent models don't support step_size>1. Instead, I'm now using the MLP and the NBEATS models.


View entire conversation on ReviewNB

Copy link
Contributor Author

I added an additional paragraph + plots at the end to further clarify the concept. It's at the end because I need to rename a column to make the plots and that is done in section 5.


View entire conversation on ReviewNB

@cchallu cchallu self-requested a review March 1, 2024 20:16
@cchallu cchallu merged commit 7f69d78 into Nixtla:main Mar 1, 2024
15 checks passed
marcopeix pushed a commit that referenced this pull request Mar 8, 2024
* feat: Added cross-validation tutorial

* fix: Made suggested changes to cross-validation tutorial

* fix: Removed fix for MPS issue

* fix: Removed metadata

* fix: Fixed typo and added environ var

* fix: Fixed typo and added environ var

* fix: Added new suggestions

---------

Co-authored-by: Cristian Challu <cristiani.challu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants