Added support for prediction intervals for VARMAX regressor #4267
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4267 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 355 355
Lines 38915 38956 +41
=======================================
+ Hits 38794 38835 +41
Misses 121 121
Force-pushed from a6b738c to c435cd1
LGTM
Looks solid! Just a few questions and some testing suggestions
# anchor represents where the simulations should start from (forecasting is done from the "end")
y_pred = self._component_obj._fitted_forecaster.simulate(
    nsimulations=X.shape[0],
    repetitions=400,
Why is this fixed at 400?
This implementation is based on the one we have for exponential smoothing, and 400 is the value set there. Do you think we should pass it in as a parameter?
Hmm, poking around in our exponential smoother and statsmodels' docs on the subject, it's unclear to me why this was set at 400. I think at least setting it as a constant would be good, since the number seems arbitrary.
Sounds good, will update to include _N_REPETITIONS=400
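For context, the idea behind the fixed repetition count is plain Monte Carlo: simulate `_N_REPETITIONS` sample paths and take empirical quantiles per step. Here is a minimal numpy sketch of that technique, independent of statsmodels; the random draws stand in for the forecaster's `simulate()` output, and the function name is hypothetical.

```python
import numpy as np

# Number of simulated sample paths; 400 mirrors the exponential smoothing
# implementation, though the value itself is essentially arbitrary.
_N_REPETITIONS = 400


def interval_from_simulations(simulations, coverage=0.95):
    """Empirical prediction interval from an (n_steps, n_repetitions) array."""
    alpha = 1 - coverage
    lower = np.quantile(simulations, alpha / 2, axis=1)
    upper = np.quantile(simulations, 1 - alpha / 2, axis=1)
    return lower, upper


rng = np.random.default_rng(0)
# Stand-in for forecaster.simulate(nsimulations=5, repetitions=_N_REPETITIONS):
# 400 simulated paths over a 5-step horizon, noise around a level of 10.
sims = rng.normal(loc=10.0, scale=1.0, size=(5, _N_REPETITIONS))
lower, upper = interval_from_simulations(sims, coverage=0.95)
```

With 400 repetitions the 2.5% and 97.5% empirical quantiles land close to the true ±1.96σ bounds, which is why a few hundred paths is usually enough in practice.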
@@ -217,9 +217,43 @@ def get_prediction_intervals(
    Returns:
        dict: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper.
I think this needs to be updated since the return here will be a nested, per series dictionary - do I have that correct?
Yep, updated the doc string!
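To illustrate the shape being discussed, here is a hedged sketch of a nested, per-series result: one entry per target series, each holding `{coverage}_lower`/`{coverage}_upper` keys. The helper and series names are hypothetical, not the PR's actual code.

```python
def build_interval_dict(series_names, coverages, bounds):
    """Assemble a nested per-series dict from per-(series, coverage) bounds.

    bounds maps (series, coverage) -> (lower, upper) sequences.
    """
    result = {}
    for series in series_names:
        result[series] = {}
        for coverage in coverages:
            lower, upper = bounds[(series, coverage)]
            result[series][f"{coverage}_lower"] = lower
            result[series][f"{coverage}_upper"] = upper
    return result


intervals = build_interval_dict(
    ["target_a", "target_b"],
    [0.95],
    {
        ("target_a", 0.95): ([1.0, 2.0], [3.0, 4.0]),
        ("target_b", 0.95): ([0.5, 1.5], [2.5, 3.5]),
    },
)
# intervals["target_a"] holds the "0.95_lower" and "0.95_upper" keys.
```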
)
prediction_interval_result = {}
for series in self._component_obj._fitted_forecaster.model.endog_names:
What are endog_names, and where do those come from? Is that the columns of y in unstacked/dataframe format?
Yes, they are, and they are set internally in statsmodels during the fit process. I can add a comment describing this. I don't think there's a better way to access this info other than storing it as a class variable during the fit process?
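A small sketch of the relationship being described, assuming (per this thread) that statsmodels derives the names from the columns of the endogenous DataFrame passed at fit time; the DataFrame here is illustrative.

```python
import pandas as pd

# Unstacked multiseries target: one column per series.
y = pd.DataFrame({"series_a": [1.0, 2.0], "series_b": [3.0, 4.0]})

# Assumption from the thread: this is what ends up in
# _fitted_forecaster.model.endog_names after fitting on y.
endog_names = list(y.columns)
```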
@pytest.mark.parametrize("use_covariates", [True, False])
def test_varmax_regressor_prediction_intervals(use_covariates, ts_multiseries_data):
    X_train, X_test, y_train = ts_multiseries_data(no_features=not use_covariates)
I think an interesting test here would be to check the cases where X is None and use_covariates is True, and where X is not None and use_covariates is False. We have lots of checks for those cases, so it'd be nice to ensure we handle them smoothly.
To clarify, you mean the cases where X is passed in fit() but not in get_prediction_intervals(), right? I can add that case in!
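As a rough illustration of the mismatch cases under discussion, the guard below shows one way such a component might reconcile X with the use_covariates flag; the function and its behavior are hypothetical, not the PR's actual logic.

```python
def select_exog(X, use_covariates):
    """Return the exogenous data the forecaster should actually see.

    Hypothetical guard: X supplied with covariates disabled is ignored,
    and X=None with covariates enabled falls through as None so the
    caller can simulate without exogenous regressors instead of raising.
    """
    if not use_covariates:
        return None
    return X


# X given but covariates disabled: the extra features are dropped.
exog = select_exog([[1.0, 2.0]], use_covariates=False)

# X missing but covariates enabled: degrades to a no-exog forecast.
exog_missing = select_exog(None, use_covariates=True)
```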
Force-pushed from dd72b1b to 962d3e7
Force-pushed from 962d3e7 to 2890f04
Resolves #4262