Improve model selection #848

kbattocchi · 2024-02-02T17:38:01Z

Fixes the two-stage model selection used by IV models, so that we first select and train the nuisance models for y, t, etc., and then select and train the covariance model. Adds central documentation on how to specify model selection, and simplifies the documentation for each estimator by pointing there.
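To illustrate why the ordering matters, here is a minimal, dependency-free sketch of two-stage selection. It is not EconML's implementation: `MeanModel`, `LineModel`, `cv_mse`, and `select_and_fit` are hypothetical names invented for this example, the instrument `zs` and the product-of-residuals target are assumptions about what a covariance model might be fit to, and residuals are computed in-sample for brevity rather than by cross-fitting.

```python
from statistics import fmean


class MeanModel:
    """Hypothetical candidate: predicts the training mean regardless of x."""
    def fit(self, xs, ys):
        self.mean_ = fmean(ys)
        return self

    def predict(self, xs):
        return [self.mean_ for _ in xs]


class LineModel:
    """Hypothetical candidate: one-feature least-squares line."""
    def fit(self, xs, ys):
        mx, my = fmean(xs), fmean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = my - self.slope_ * mx
        return self

    def predict(self, xs):
        return [self.intercept_ + self.slope_ * x for x in xs]


def cv_mse(make_model, xs, ys, n_folds=2):
    """Cross-validated mean squared error using interleaved folds."""
    errs = []
    for k in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != k]
        test = [i for i in range(len(xs)) if i % n_folds == k]
        model = make_model().fit([xs[i] for i in train], [ys[i] for i in train])
        preds = model.predict([xs[i] for i in test])
        errs.extend((p - ys[i]) ** 2 for p, i in zip(preds, test))
    return fmean(errs)


def select_and_fit(candidates, xs, ys):
    """Pick the candidate with the lowest CV error, then refit it on all data."""
    best = min(candidates, key=lambda c: cv_mse(c, xs, ys))
    return best().fit(xs, ys)


candidates = [MeanModel, LineModel]

# Toy data: treatment t and instrument z both depend on x, plus fixed noise.
xs = [float(i) for i in range(12)]
noise_t = [0.3, -0.2, 0.1, -0.4, 0.5, -0.1, -0.3, 0.2, 0.4, -0.5, 0.0, 0.1]
noise_z = [-0.1, 0.4, -0.3, 0.2, 0.0, -0.5, 0.3, -0.2, 0.1, 0.5, -0.4, 0.2]
ts = [2.0 * x + 1.0 + e for x, e in zip(xs, noise_t)]
zs = [0.5 * x + u for x, u in zip(xs, noise_z)]

# Stage 1: select and train the first-stage nuisance models.
model_t = select_and_fit(candidates, xs, ts)
model_z = select_and_fit(candidates, xs, zs)

# The stage-2 target only exists once stage 1 has been selected and fitted.
t_res = [t - p for t, p in zip(ts, model_t.predict(xs))]
z_res = [z - p for z, p in zip(zs, model_z.predict(xs))]
cov_target = [a * b for a, b in zip(t_res, z_res)]

# Stage 2: select and train the covariance model on the residual products.
cov_model = select_and_fit(candidates, xs, cov_target)
```

The point of the sketch is the data dependency: the second-stage target is built from first-stage predictions, so selecting both stages at once would score the covariance candidates against targets that do not yet exist.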

econml/sklearn_extensions/model_selection.py

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

fverac · 2024-02-07T14:37:01Z

Should we add a test that sklearn pipelines work correctly?

kbattocchi · 2024-02-07T16:35:20Z

> Should we add a test that sklearn pipelines work correctly?

Good point, I'll add several tests covering this functionality.

fverac · 2024-02-14T16:34:38Z

econml/sklearn_extensions/model_selection.py

                scores.append(model.best_score)
            self._all_scores = scores
            self._best_score = np.max(scores)
            self._best_model = self.models[np.argmax(scores)]

        else:
-            self._best_model.train(is_selecting, *args, **kwargs)
+            self._best_model.train(is_selecting, folds, *args, **kwargs)


is_selecting will be False here. Do we need to pass folds in this case?

kbattocchi · 2024-02-14

It's academic: if is_selecting is False then folds will be None anyway, so we could pass folds or, equivalently, pass None.

But I think passing folds is slightly more robust to changes in the upstream logic: if for some reason we ever change that invariant, the logic here might not need to change.
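The exchange above can be made concrete with a small, self-contained sketch. `Candidate` and `ListSelector` are hypothetical stand-ins written for this illustration, not the classes in `econml/sklearn_extensions/model_selection.py`. The pretend scores are hard-coded, and built-in `max` and `list.index` replace the `np.max` and `np.argmax` calls from the diff to avoid a NumPy dependency.

```python
class Candidate:
    """Hypothetical candidate model; records how train() was called."""
    def __init__(self, name, score):
        self.name = name
        self._score = score      # stand-in for a cross-validated score
        self.best_score = None
        self.calls = []          # (is_selecting, folds) for each train() call

    def train(self, is_selecting, folds, *args, **kwargs):
        self.calls.append((is_selecting, folds))
        if is_selecting:
            self.best_score = self._score
        return self


class ListSelector:
    """Hypothetical selector that picks the best-scoring candidate."""
    def __init__(self, models):
        self.models = models
        self._best_model = None

    def train(self, is_selecting, folds, *args, **kwargs):
        if is_selecting:
            scores = []
            for model in self.models:
                model.train(is_selecting, folds, *args, **kwargs)
                scores.append(model.best_score)
            self._all_scores = scores
            self._best_score = max(scores)
            self._best_model = self.models[scores.index(self._best_score)]
        else:
            # folds is expected to be None here, but forwarding it means this
            # branch keeps working if that upstream invariant ever changes.
            self._best_model.train(is_selecting, folds, *args, **kwargs)
        return self


folds = [([0, 1], [2, 3]), ([2, 3], [0, 1])]
selector = ListSelector([Candidate("a", 0.2), Candidate("b", 0.7)])
selector.train(True, folds)    # selection pass: every candidate sees the folds
selector.train(False, None)    # final fit: only the winner is trained
```

After both passes, the losing candidate has been called once (during selection) and the winner twice, with the second call receiving whatever `folds` value the caller supplied.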

fverac

Looks good!

kbattocchi requested a review from fverac February 2, 2024 17:38

fverac reviewed Feb 2, 2024

econml/sklearn_extensions/model_selection.py (outdated)

kbattocchi force-pushed the kebatt/improveModelSelection branch from 3720600 to 9f517fd Compare February 6, 2024 20:09

kbattocchi added 2 commits February 6, 2024 15:44

Restore _rlearner.py line endings

152a609

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Pass discrete outcome to nested model

c48fd74

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi force-pushed the kebatt/improveModelSelection branch from 9f517fd to d6b0353 Compare February 6, 2024 20:45

fverac reviewed Feb 7, 2024

econml/_ortho_learner.py (outdated)

kbattocchi added 10 commits February 12, 2024 09:54

Add needs_fit property

1430fe5

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Enable pipeline handling during model selection

188a33b

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Enable multiple nuisance fitting passes

7ee2cdc

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Use R2 score for ElasticNet model selection

6c1499a

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Extend string options for model selection

d3c0ed7

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Add model selection documentation

496bbb2

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Fix misc. errors flagged by flake8

9fb3701

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Always use generated folds for model selection

435ef0f

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Fix RidgeCV model selection

4fd28e8

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Add model selection tests

d911f1b

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi force-pushed the kebatt/improveModelSelection branch from d6b0353 to 4c180f7 Compare February 14, 2024 05:35

Update docstrings

d42cb2f

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi force-pushed the kebatt/improveModelSelection branch from 4c180f7 to d42cb2f Compare February 14, 2024 14:52

kbattocchi requested a review from fverac February 14, 2024 14:53

fverac reviewed Feb 14, 2024

econml/sklearn_extensions/model_selection.py

fverac reviewed Feb 14, 2024

econml/tests/test_model_selection.py

fverac reviewed Feb 14, 2024

fverac approved these changes Feb 14, 2024

kbattocchi merged commit 329effa into main Feb 14, 2024
77 checks passed

kbattocchi deleted the kebatt/improveModelSelection branch February 14, 2024 17:56
