-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rm data leakage in example notebooks #2008
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Codecov ReportAll modified lines are covered by tests ✅ ❗ Your organization needs to install the Codecov GitHub app to enable full functionality. see 6 files with indirect coverage changes 📢 Thoughts on this report? Let us know!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this @SimTheGreat this is a nice cleaning up of the notebooks 🚀
I added some suggestions in the ReviewNB. The main suggestions were:
- try to revert changes to files where no code changed -> we rude the git diffs
- try to upgrade ipywidgets and rerun all the notebooks with changes (I explain in one of the comments how to do it): this will make the progress bar look better and improve the notebook appearance
- we can actually avoid scaling/transforming for many of the covariates as they were one hot encoded.
@@ -2970,7 +2970,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.9.16" | |||
"version": "3.10.11" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed, let's revert this change
"version": "3.10.11" | |
"version": "3.9.16" |
View / edit / reply to this conversation on ReviewNB dennisbader commented on 2023-09-28T08:13:36Z There is an issue with ipywidgets not showing the progess bar properly (the "Predicting: 0it [00:00, ?it/s]")
I had this as well. We can take this opportunity to improve the progess bar appearance :)
Could you try |
View / edit / reply to this conversation on ReviewNB dennisbader commented on 2023-09-28T08:13:37Z Can you rename
Just have a clear distinction between the covariates and the target series ( |
@@ -4450,7 +4450,7 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there were no code changes in this notebook. Can you try to revert the changes to this file? We usually try to keep the git history/diffs as minimal as possible :)
Reply via ReviewNB
examples/05-TCN-examples.ipynb
Outdated
@@ -54,19 +54,22 @@ | |||
"source": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the following lines in this cell at the bottom?
We want to avoid some info from PyTorch lightning (mostly the device information that get spammed when calling historical forecasts)
import warningswarnings.filterwarnings("ignore")
import logginglogging.disable(logging.CRITICAL)
Reply via ReviewNB
examples/05-TCN-examples.ipynb
Outdated
@@ -54,19 +54,22 @@ | |||
"source": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #16. scaler_month = Scaler()
actually we can avoid scaling the month series, as it's one hot encoded anyways (encodes months as n months - 1
binary columns).
Can you remove the scaling here?
Reply via ReviewNB
examples/05-TCN-examples.ipynb
Outdated
@@ -54,19 +54,22 @@ | |||
"source": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this here will definitely look better after upgrading ipywidgets (my comments in notebook 01)
Reply via ReviewNB
examples/05-TCN-examples.ipynb
Outdated
@@ -54,19 +54,22 @@ | |||
"source": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #17. series_en_transformed = scaler_en.fit_transform(series_en)
this line should be somewhere below fitting the transformer on train_en and using transform()
Reply via ReviewNB
examples/05-TCN-examples.ipynb
Outdated
@@ -54,19 +54,22 @@ | |||
"source": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #26. scaler_day = Scaler()
here we can avoid scaling again as we use a one hot encoding for the days
Reply via ReviewNB
examples/07-NBEATS-examples.ipynb
Outdated
@@ -37,7 +37,8 @@ | |||
"from darts.models import NBEATSModel\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #10. from darts.timeseries import concatenate
change to
from darts import concatenate
Reply via ReviewNB
examples/08-DeepAR-examples.ipynb
Outdated
@@ -11,7 +11,7 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rerun this notebook? seems like there were some issues with too short series for validation
Reply via ReviewNB
examples/08-DeepAR-examples.ipynb
Outdated
@@ -11,7 +11,7 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #27. train, val = day_series.split_after(pd.Timestamp("20170901"))
we can avoid scaling the day series as it's one-hot encoded
Reply via ReviewNB
examples/08-DeepAR-examples.ipynb
Outdated
@@ -11,7 +11,7 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
examples/09-DeepTCN-examples.ipynb
Outdated
@@ -44,7 +44,7 @@ | |||
"from darts import TimeSeries\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
examples/09-DeepTCN-examples.ipynb
Outdated
@@ -44,7 +44,7 @@ | |||
"from darts import TimeSeries\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
examples/09-DeepTCN-examples.ipynb
Outdated
@@ -44,7 +44,7 @@ | |||
"from darts import TimeSeries\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #2. series=concatenate([train_en_transformed, val_en_transformed]),
Let's define series_en_transformed
again before (where we apply the scaler), and then reuse it here and in the next cell. Like this we can avoid having to compute series_en_transformed twice, which can reduce time for longer series.
Reply via ReviewNB
View / edit / reply to this conversation on ReviewNB dennisbader commented on 2023-09-28T08:13:47Z It looks like there weren't any code changes in this file. Can you try to revert the changes? |
fixes #644
Other Information
fix scaling in example notebook