Factor out time series-related functionality into a time series Task object #989

Merged 171 commits on Jun 19, 2023
Changes from 117 commits
Commits
99f1f2d
Refactor into automl subpackage
markharley Nov 9, 2022
eb7aac9
Fix doc building post automl subpackage refactor
markharley Nov 9, 2022
0eda959
Fix broken links in website post automl subpackage refactor
markharley Nov 9, 2022
c3a567c
Fix broken links in website post automl subpackage refactor
markharley Nov 9, 2022
f148cac
Remove vw from test deps as this is breaking the build
markharley Nov 9, 2022
7ce03a9
Move default back to the top-level
markharley Nov 10, 2022
739b256
Re-add top level modules with deprecation warnings
markharley Nov 13, 2022
4c008e8
Merge branch 'main' into subpackage-refactor-for-automl
qingyun-wu Nov 14, 2022
e7c8f91
Merge branch 'main' into subpackage-refactor-for-automl
qingyun-wu Nov 15, 2022
f845df8
Merge branch 'main' into subpackage-refactor-for-automl
qingyun-wu Nov 19, 2022
36ffb1e
Merge remote-tracking branch 'upstream/main' into subpackage-refactor…
markharley Nov 24, 2022
2386989
Merge microsoft/main into here
EgorKraevTransferwise Dec 2, 2022
3c53eca
Merge remote-tracking branch 'upstream/main' into subpackage-refactor…
markharley Dec 3, 2022
d747851
Fix model.py line-endings
markharley Dec 3, 2022
3a6b95b
WIP
markharley Nov 10, 2022
1e51966
WIP - Notes below
markharley Nov 14, 2022
938e3c9
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Dec 11, 2022
1cc1f1d
Re-add generic_task
markharley Dec 11, 2022
9ca5f18
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Dec 19, 2022
8e6a73f
Most of the merge done, test_forecast_automl fit succeeds, fails at p…
EgorKraevTransferwise Dec 21, 2022
413dbe1
Remaining fixes - test_forecast.py passes
EgorKraevTransferwise Dec 22, 2022
4108eb1
Comment out holidays-related code as it's not currently used
EgorKraevTransferwise Dec 23, 2022
9eb62cb
Further holidays cleanup
EgorKraevTransferwise Dec 23, 2022
1472a55
Fix imports in a test
EgorKraevTransferwise Dec 23, 2022
dde26d5
tidy up validate_data in time series task
EgorKraevTransferwise Dec 23, 2022
3aff2c3
Test fixes
EgorKraevTransferwise Dec 24, 2022
5a0694b
Fix tests: add Task.__str__
EgorKraevTransferwise Jan 9, 2023
b5b6cc8
Fix tests: test for ray.ObjectRef
EgorKraevTransferwise Jan 9, 2023
dfcca3b
Hotwire TS_Sklearn wrapper to fix test fail
EgorKraevTransferwise Jan 9, 2023
143205c
Merge remote-tracking branch 'origin/extract-task-class-from-automl' …
EgorKraevTransferwise Jan 9, 2023
9a14188
Attempt at test fix
EgorKraevTransferwise Jan 9, 2023
93f531f
Fix test where val_pred_y is a list
EgorKraevTransferwise Jan 9, 2023
0926f62
Attempt to fix remaining tests
EgorKraevTransferwise Jan 10, 2023
c3464a3
Push to retrigger tests
EgorKraevTransferwise Jan 10, 2023
3d2213a
Push to retrigger tests
EgorKraevTransferwise Jan 10, 2023
e6bf73d
Push to retrigger tests
EgorKraevTransferwise Jan 10, 2023
e97e827
Push to retrigger tests
EgorKraevTransferwise Jan 11, 2023
11614d9
Remove plots from automl/test_forecast
EgorKraevTransferwise Jan 11, 2023
84114ff
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Jan 11, 2023
0e2877a
Merge remote-tracking branch 'origin/extract-task-class-from-automl' …
markharley Jan 11, 2023
d03b5d5
Remove unused data size field from Task
markharley Jan 11, 2023
8c7e35b
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Jan 11, 2023
d5466e1
Fix import for CLASSIFICATION in notebook
markharley Jan 11, 2023
d557bab
Monkey patch TFT to avoid plotting, to fix tests on MacOS
EgorKraevTransferwise Jan 12, 2023
f2e2110
Monkey patch TFT to avoid plotting v2, to fix tests on MacOS
EgorKraevTransferwise Jan 12, 2023
9954837
Monkey patch TFT to avoid plotting v2, to fix tests on MacOS
EgorKraevTransferwise Jan 12, 2023
ecefd8c
Fix circular import
EgorKraevTransferwise Jan 12, 2023
09e1ad8
Merge remote-tracking branch 'origin/extract-task-class-from-automl' …
EgorKraevTransferwise Jan 12, 2023
2a7f2dc
remove redundant code in task.py post-merge
EgorKraevTransferwise Jan 12, 2023
3f71ead
Fix test: set svd_solver="full" in PCA
EgorKraevTransferwise Jan 12, 2023
6c42942
Update flaml/automl/data.py
markharley Jan 22, 2023
554e740
Fix review comments
markharley Jan 22, 2023
4469176
Fix task -> str in custom learner constructor
markharley Jan 22, 2023
a0913ab
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Jan 22, 2023
0a5a135
Remove unused CLASSIFICATION imports
markharley Jan 22, 2023
9df5200
Merge branch 'main' into extract-task-class-from-automl
markharley Feb 4, 2023
0e60ae0
Hotwire TS_Sklearn wrapper to fix test fail by setting
EgorKraevTransferwise Feb 7, 2023
5ec5649
Revert changes to the automl_classification and pin FLAML version
markharley Feb 12, 2023
f473179
Merge branch 'extract-task-class-from-automl' of github.com:markharle…
markharley Feb 12, 2023
af78d65
Fix imports in reverted notebook
markharley Feb 12, 2023
9fc53f9
Fix FLAML version in automl notebooks
markharley Feb 12, 2023
8a0e948
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Feb 12, 2023
9db1379
Fix ml.py line endings
markharley Feb 12, 2023
9a18f73
Merge branch 'main' into extract-task-class-from-automl
sonichi Feb 17, 2023
e7c85b1
Merge remote-tracking branch 'upstream/main' into extract-task-class-…
markharley Feb 18, 2023
13c6115
Fix CLASSIFICATION task import in automl_classification notebook
markharley Feb 18, 2023
9a452fa
Merge branch 'extract-task-class-from-automl' of github.com:markharle…
markharley Feb 18, 2023
33b1a18
Uncomment pip install in notebook and revert import
markharley Feb 18, 2023
ed250fa
Merge branch 'main' into extract-task-class-from-automl
markharley Feb 18, 2023
e55e35a
Revert c6a5dd1a0
markharley Feb 19, 2023
a44184c
Merge branch 'extract-task-class-from-automl' of github.com:markharle…
markharley Feb 19, 2023
46aeb0d
Merge branch 'extract-task-class-from-automl' into time-series-task
markharley Feb 19, 2023
ae81d18
Fix get_classification_objective import in suggest.py
markharley Feb 19, 2023
48c56d0
Remove hcrystallball docs reference in TS_Sklearn
markharley Feb 19, 2023
507b3d0
Merge branch 'extract-task-class-from-automl' into time-series-task
EgorKraevTransferwise Feb 27, 2023
dbf8728
Merge markharley:extract-task-class-from-automl into this
EgorKraevTransferwise Feb 27, 2023
c74282e
Merge branch 'time-series-task' of https://github.com/markharley/FLAM…
EgorKraevTransferwise Feb 27, 2023
71c86b7
Fix import, remove smooth.py
EgorKraevTransferwise Feb 27, 2023
16b045a
Fix dependencies to fix TFT fail on Windows Python 3.8 and 3.9
EgorKraevTransferwise Mar 1, 2023
74a49d7
Add tensorboardX dependency to fix TFT fail on Windows Python 3.8 and…
EgorKraevTransferwise Mar 1, 2023
d4d87be
Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 a…
EgorKraevTransferwise Mar 1, 2023
9331c81
Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 a…
EgorKraevTransferwise Mar 1, 2023
2a33903
Disable PCA reduction of lagged features for now, to fix svd converve…
EgorKraevTransferwise Mar 2, 2023
18f5029
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise Mar 13, 2023
f759305
Merge flaml/main into time_series_task
EgorKraevTransferwise Mar 20, 2023
297606e
Attempt to fix formatting
EgorKraevTransferwise Mar 20, 2023
1865558
Attempt to fix formatting
EgorKraevTransferwise Mar 20, 2023
1c37c24
tentatively implement holt-winters-no covariates
andreaw-ag Mar 24, 2023
a97512b
fix forecast method, clean class
andreaw-ag Mar 25, 2023
4342274
checking external regressors too
andreaw-ag Mar 25, 2023
79cafcd
update test forecast
andreaw-ag Mar 25, 2023
d03f320
remove duplicated test file, re-add sarimax, search space cleanup
andreaw-ag Mar 25, 2023
db946f3
Update flaml/automl/model.py
coffepowered Mar 25, 2023
0a04771
Merge branch 'main' into exp_smoothing
coffepowered Mar 27, 2023
43e182d
prevent short series
andreaw-ag Mar 27, 2023
89da8f7
add docs
andreaw-ag Mar 27, 2023
fac9031
Merge remote-tracking branch 'coffee/exp_smoothing' into time-series-…
EgorKraevTransferwise Mar 30, 2023
285fa0b
First attempt at merging Holt-Winters
EgorKraevTransferwise Mar 30, 2023
9f6cc18
Linter fix
EgorKraevTransferwise Mar 30, 2023
7fc2a17
Add holt-winters to TimeSeriesTask.estimators
EgorKraevTransferwise Mar 30, 2023
438bce5
Fix spark test fail
EgorKraevTransferwise Mar 31, 2023
af7e0f8
Attempt to fix another spark test fail
EgorKraevTransferwise Mar 31, 2023
180bace
Attempt to fix another spark test fail
EgorKraevTransferwise Mar 31, 2023
b57813a
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise Apr 9, 2023
f9dc1c6
Merge branch 'microsoft:main' into time-series-task
EgorKraevTransferwise Apr 10, 2023
cd1b7ee
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise Apr 11, 2023
8749cdf
Merge remote-tracking branch 'mark_fork/time-series-task' into time-s…
EgorKraevTransferwise Apr 11, 2023
ac47676
Change Black max line length to 127
EgorKraevTransferwise Apr 11, 2023
66b6cb5
Change Black max line length to 120
EgorKraevTransferwise Apr 11, 2023
5a2540f
Merge branch 'main' of https://github.com/microsoft/FLAML into time-s…
Apr 18, 2023
c6fe3d0
Add logging for ARIMA params, clean up time series models inheritance
EgorKraevTransferwise Apr 19, 2023
aef9e4d
Add more logging for missing ARIMA params
EgorKraevTransferwise Apr 19, 2023
db12932
Remove a meaningless test causing a fail, add stricter check on ARIMA…
EgorKraevTransferwise Apr 19, 2023
a230f3d
Fix a bug in HoltWinters
EgorKraevTransferwise Apr 19, 2023
2b16901
A pointless change to hopefully trigger the on and off KeyError in AR…
EgorKraevTransferwise Apr 19, 2023
2cad889
Fix formatting
markharley Apr 24, 2023
a196975
Merge branch 'main' into time-series-task
EgorKraevTransferwise Apr 24, 2023
2b9a404
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise May 2, 2023
15fea88
Attempt to fix formatting
EgorKraevTransferwise May 2, 2023
597fb1e
Attempt to fix formatting
EgorKraevTransferwise May 2, 2023
0b51496
Attempt to fix formatting
EgorKraevTransferwise May 2, 2023
ef051b3
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise May 2, 2023
ab2f6ee
Merge branch 'main' into time-series-task
thinkall May 3, 2023
9c13220
Merge branch 'main' into time-series-task
thinkall May 3, 2023
02b30b9
Merge branch 'main' into time-series-task
EgorKraevTransferwise May 3, 2023
688b1a2
Merge branch 'main' into time-series-task
thinkall May 3, 2023
1759d57
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise May 17, 2023
f50b5dd
Merge branch 'time-series-task' of https://github.com/markharley/FLAM…
EgorKraevTransferwise May 17, 2023
f4f4d6a
Attempt to fix formatting
EgorKraevTransferwise May 17, 2023
8ea71f4
Add type annotations to _train_with_config() in state.py
EgorKraevTransferwise May 17, 2023
25cc7d8
Add type annotations to prepare_sample_train_data() in state.py
EgorKraevTransferwise May 18, 2023
379acc0
Add docstring for time_col argument of AutoML.fit()
EgorKraevTransferwise May 18, 2023
3ed148f
Merge branch 'main' into time-series-task
EgorKraevTransferwise May 22, 2023
19bfb5b
Address @sonichi's comments on PR
EgorKraevTransferwise May 23, 2023
b2ac0cf
Merge branch 'time-series-task' of https://github.com/markharley/FLAM…
EgorKraevTransferwise May 23, 2023
01ff63a
Fix formatting
EgorKraevTransferwise May 23, 2023
1f8aced
Fix formatting
EgorKraevTransferwise May 23, 2023
f350dc0
Merge branch 'main' into time-series-task
EgorKraevTransferwise May 23, 2023
1d599a7
Reduce test time budget
EgorKraevTransferwise May 24, 2023
9da88a0
Reduce test time budget
EgorKraevTransferwise May 24, 2023
0895c3c
Merge branch 'time-series-task' of https://github.com/markharley/FLAM…
EgorKraevTransferwise May 24, 2023
bb621f1
Increase time budget for the test to pass
EgorKraevTransferwise May 24, 2023
e8bac43
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise May 25, 2023
50de28c
Remove redundant imports
EgorKraevTransferwise May 25, 2023
932d01c
Remove more redundant imports
EgorKraevTransferwise May 25, 2023
a629c54
Minor fixes of points raised by Qingyun
EgorKraevTransferwise May 26, 2023
b41fe40
Try to fix pandas import fail
EgorKraevTransferwise May 27, 2023
f69ec0e
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
b1b5404
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
f744227
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
736ed40
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
731757d
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
8fd1235
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
27c80fb
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
2213a18
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
fd427ec
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
b9a7e28
Try to fix pandas import fail, again
EgorKraevTransferwise May 28, 2023
6fd290b
Formatting fixes
EgorKraevTransferwise May 29, 2023
a288317
More formatting fixes
EgorKraevTransferwise May 29, 2023
fa62ac6
Added test that loops over TS models to ensure coverage
EgorKraevTransferwise May 30, 2023
edd0198
Fix formatting issues
EgorKraevTransferwise May 30, 2023
5d3c109
Fix more formatting issues
EgorKraevTransferwise May 30, 2023
21368d1
Merge remote-tracking branch 'origin/main' into time-series-task
EgorKraevTransferwise May 30, 2023
d1c543e
Fix random fail in check
EgorKraevTransferwise May 30, 2023
10b851d
Put back in tests for ARIMA predict without fit
EgorKraevTransferwise Jun 5, 2023
1c5d6ad
Merge branch 'main' into time-series-task
EgorKraevTransferwise Jun 6, 2023
f5dc200
Put back in tests for lgbm
EgorKraevTransferwise Jun 13, 2023
ff50f14
Merge branch 'time-series-task' of https://github.com/markharley/FLAM…
EgorKraevTransferwise Jun 13, 2023
b7c3ce8
Merge branch 'main' into time-series-task
EgorKraevTransferwise Jun 13, 2023
39cf3e6
Update test/test_model.py
sonichi Jun 13, 2023
f6fdbbf
Match target length to X length in missing test
EgorKraevTransferwise Jun 19, 2023
2 changes: 1 addition & 1 deletion .flake8
@@ -1,5 +1,5 @@
 [flake8]
-ignore = E203, E266, E501, W503, F403, F401, C901
+ignore = E203, E266, E402, E501, W503, F403, F401, C901
 max-line-length = 127
 max-complexity = 10
 select = B,C,E,F,W,T4,B9
20 changes: 11 additions & 9 deletions flaml/automl/automl.py
@@ -15,10 +15,9 @@
 import json
 
 from flaml.automl.state import SearchState, AutoMLState
-from flaml.automl.ml import (
-    train_estimator,
-    get_estimator_class,
-)
+from flaml.automl.ml import train_estimator
+
+from flaml.automl.time_series import TimeSeriesDataset
 from flaml.config import (
     MIN_SAMPLE_TRAIN,
     MEM_THRES,
@@ -31,7 +30,7 @@
 )
 
 # TODO check to see when we can remove these
-from flaml.automl.task.task import CLASSIFICATION, TS_FORECAST, Task
+from flaml.automl.task.task import CLASSIFICATION, Task
 from flaml.automl.task.factory import task_factory
 from flaml import tune
 from flaml.automl.logger import logger, logger_formatter
@@ -914,7 +913,7 @@ def _decide_eval_method(self, eval_method, time_budget):
             ], "eval_method must be 'auto' or 'cv' for custom data splitter."
             assert self._state.X_val is None, "custom splitter and custom validation data can't be used together."
             return "cv"
-        if self._state.X_val is not None:
+        if self._state.X_val is not None and not isinstance(self._state.X_val, TimeSeriesDataset):
             assert eval_method in [
                 "auto",
                 "holdout",
@@ -1159,7 +1158,7 @@ def _prepare_data(self, eval_method, split_ratio, n_splits):
             self._df,
             self._sample_weight_full,
         )
-        self.data_size_full = len(self._state.y_train_all)
+        self.data_size_full = self._state.data_size_full
 
     def fit(
         self,
@@ -1211,6 +1210,7 @@ def fit(
         free_mem_ratio=0,
         metric_constraints=None,
         custom_hp=None,
+        time_col=None,
         cv_score_agg_func=None,
         skip_transform=None,
         fit_kwargs_by_estimator=None,
@@ -1531,6 +1531,7 @@ def cv_score_agg_func(val_loss_folds, log_metrics_folds):
         if isinstance(task, str):
             task = task_factory(task, X_train, y_train)
         self._state.task = task
+        self._state.task.time_col = time_col
         self._estimator_type = "classifier" if task.is_classification() else "regressor"
         time_budget = time_budget or self._settings.get("time_budget")
         n_jobs = n_jobs or self._settings.get("n_jobs")
@@ -1832,7 +1833,7 @@ def is_to_reverse_metric(metric, task):
             if estimator_name not in self._state.learner_classes:
                 self.add_learner(
                     estimator_name,
-                    get_estimator_class(self._state.task, estimator_name),
+                    self._state.task.estimator_class_from_str(estimator_name),
                 )
             # set up learner search space
             if isinstance(starting_points, str) and starting_points.startswith("data"):
@@ -1888,6 +1889,7 @@ def is_to_reverse_metric(metric, task):
             self._search_states[estimator_name] = SearchState(
                 learner_class=estimator_class,
                 data_size=self._state.data_size,
+                data=self._state.X_val,
                 task=self._state.task,
                 starting_point=starting_points.get(estimator_name),
                 period=self._state.fit_kwargs.get(
@@ -2597,7 +2599,7 @@ def _search(self):
                 if self._max_iter > 1:
                     self._state.time_budget = -1
                 if (
-                    self._state.task in TS_FORECAST
+                    self._state.task.is_ts_forecast()
                     or self._trained_estimator is None
                     or self._trained_estimator.model is None
                     or (
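The hunks in flaml/automl/automl.py share one pattern: checks that used to be free functions or membership tests on a task string (`get_estimator_class(task, name)`, `task in TS_FORECAST`) become methods on the task object (`task.estimator_class_from_str(name)`, `task.is_ts_forecast()`), and `time_col` is attached to the task after it is built. A minimal, self-contained sketch of that pattern follows. It is not FLAML's implementation: the class names ending in `Sketch`, the `*Stub` estimator classes, `task_factory_sketch`, and the task-name strings are all assumptions made for illustration. Only the method names `is_ts_forecast`, `is_classification`, `estimator_class_from_str`, and the `time_col` attribute are taken from the diff.

```python
from typing import Dict, Optional


# Hypothetical stand-ins for learner classes; not FLAML's real estimators.
class LGBMStub:
    pass


class ArimaStub:
    pass


class TaskSketch:
    """Illustrative base class: wraps a task name and answers questions about it."""

    # Maps estimator names to classes; subclasses override this (assumed layout).
    estimators: Dict[str, type] = {}

    def __init__(self, name: str, time_col: Optional[str] = None):
        self.name = name
        self.time_col = time_col

    def __str__(self) -> str:
        # Lets code that still expects a task string keep working.
        return self.name

    def is_ts_forecast(self) -> bool:
        return False

    def is_classification(self) -> bool:
        # The set of names here is an assumption for the sketch.
        return self.name in ("binary", "multiclass", "classification")

    def estimator_class_from_str(self, estimator_name: str) -> type:
        try:
            return self.estimators[estimator_name]
        except KeyError:
            raise ValueError(
                f"{estimator_name} is not a known learner for task {self.name}"
            ) from None


class GenericTaskSketch(TaskSketch):
    estimators = {"lgbm": LGBMStub}


class TimeSeriesTaskSketch(TaskSketch):
    estimators = {"arima": ArimaStub}

    def is_ts_forecast(self) -> bool:
        return True


def task_factory_sketch(name: str) -> TaskSketch:
    """Pick the task subclass from the task name (assumed naming rule)."""
    if name.startswith("ts_forecast"):
        return TimeSeriesTaskSketch(name)
    return GenericTaskSketch(name)


# Mirrors the call order in the diff: build the task, then attach time_col.
task = task_factory_sketch("ts_forecast")
task.time_col = "ds"
estimator_cls = task.estimator_class_from_str("arima")
```

With this layout the caller never needs to import a `TS_FORECAST` constant or a lookup helper. Each task subclass owns its own estimator table, and that ownership is what allows the `TS_FORECAST` and `get_estimator_class` imports to be dropped in the first two automl.py hunks.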