Improve model selection #848

Merged: 13 commits into main from kebatt/improveModelSelection, Feb 14, 2024

kbattocchi
Collaborator

Fixes the two-stage model selection used by IV models, so that we first select and train the nuisance models for y, t, etc., and only then select and train the covariance model. Also adds central documentation on how to specify model selection, and simplifies each estimator's documentation by pointing there.
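The ordering described above can be illustrated with a toy, standard-library-only sketch. It is not EconML's implementation: the helpers `fit_mean`, `fit_line`, and `select_and_train`, the holdout-MSE selection rule, and the choice to fit the covariance model to the product of first-stage residuals are all assumptions made for illustration.

```python
import random

# Hypothetical candidate learners (not EconML classes): each takes training
# data and returns a prediction function.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

CANDIDATES = {"mean": fit_mean, "line": fit_line}

def select_and_train(xs, ys):
    """Pick the candidate with the lowest holdout MSE, then refit it on all data."""
    half = len(xs) // 2

    def holdout_mse(fit):
        f = fit(xs[:half], ys[:half])
        errs = [(f(x) - y) ** 2 for x, y in zip(xs[half:], ys[half:])]
        return sum(errs) / len(errs)

    best = min(CANDIDATES, key=lambda name: holdout_mse(CANDIDATES[name]))
    return best, CANDIDATES[best](xs, ys)

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(200)]
z = [xi + random.gauss(0, 1) for xi in x]                          # instrument
t = [2 * xi + zi + random.gauss(0, 0.1) for xi, zi in zip(x, z)]   # treatment

# Stage 1: select and train the first-stage nuisance models.
t_name, t_model = select_and_train(x, t)
z_name, z_model = select_and_train(x, z)

# Stage 2: the covariance target depends on stage-1 residuals, so the
# covariance model can only be selected and trained after stage 1 is done.
t_res = [ti - t_model(xi) for xi, ti in zip(x, t)]
z_res = [zi - z_model(xi) for xi, zi in zip(x, z)]
cov_target = [a * b for a, b in zip(t_res, z_res)]
cov_name, cov_model = select_and_train(x, cov_target)
```

The point of the sketch is the data dependency: the second selection step consumes the outputs of the already-selected first-stage models, so the two selections cannot be done in a single pass.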

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
@fverac
Collaborator

fverac commented Feb 7, 2024

Should we add a test that sklearn pipelines work correctly?

@kbattocchi
Collaborator Author

> Should we add a test that sklearn pipelines work correctly?

Good point, I'll add several tests covering this functionality.

econml/_ortho_learner.py (review thread: outdated, resolved)
         scores.append(model.best_score)
     self._all_scores = scores
     self._best_score = np.max(scores)
     self._best_model = self.models[np.argmax(scores)]

 else:
-    self._best_model.train(is_selecting, *args, **kwargs)
+    self._best_model.train(is_selecting, folds, *args, **kwargs)
Collaborator

is_selecting will be False here. Do we need to pass folds in this case?

Collaborator Author

It's academic: if is_selecting is False then folds will be None anyway, so we could pass folds or, equivalently, pass None.

But I think this is slightly more robust to changes in the upstream logic, since if for some reason we ever change that invariant then the logic here might not need to be changed.
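A minimal sketch of the pattern under discussion, using only the standard library. `CandidateSketch` and `ListSelectorSketch` are hypothetical stand-ins rather than EconML classes; only the `train(is_selecting, folds, ...)` call shape and the `best_score` / `_best_model` names are taken from the diff, and `max` / `index` stand in for `np.max` / `np.argmax`.

```python
class CandidateSketch:
    """Hypothetical candidate model that records how it was trained."""

    def __init__(self, score):
        self.best_score = score   # pretend cross-validated score
        self.calls = []           # (is_selecting, folds) for each train call

    def train(self, is_selecting, folds, *args, **kwargs):
        self.calls.append((is_selecting, folds))


class ListSelectorSketch:
    """Hypothetical selector that keeps the best-scoring candidate."""

    def __init__(self, models):
        self.models = models
        self._best_model = None

    def train(self, is_selecting, folds, *args, **kwargs):
        if is_selecting:
            scores = []
            for model in self.models:
                model.train(is_selecting, folds, *args, **kwargs)
                scores.append(model.best_score)
            self._all_scores = scores
            self._best_score = max(scores)
            # index() returns the first maximum, like np.argmax
            self._best_model = self.models[scores.index(self._best_score)]
        else:
            # Forward folds unchanged instead of hard-coding None, so this
            # branch keeps working if the caller's invariant ever changes.
            self._best_model.train(is_selecting, folds, *args, **kwargs)


candidates = [CandidateSketch(0.2), CandidateSketch(0.7)]
selector = ListSelectorSketch(candidates)
selector.train(True, [([0, 1], [2]), ([2], [0, 1])])  # selection pass with folds
selector.train(False, None)                           # final fit; folds is None
```

In the final-fit call only the selected candidate is retrained, and it receives whatever `folds` value the caller supplied, which under the current invariant is `None`.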

Collaborator

@fverac fverac left a comment

Looks good!

@kbattocchi kbattocchi merged commit 329effa into main Feb 14, 2024
77 checks passed
@kbattocchi kbattocchi deleted the kebatt/improveModelSelection branch February 14, 2024 17:56