Auto choose most appropriate explainable model #355
gaugup
commented
Dec 17, 2020
- This PR chooses the best surrogate model by training multiple surrogate models and comparing them on accuracy or r2_score.
- If training the multiple surrogate models fails for some reason, we train the explainable model passed in by the user.
- We compute a replication metric (accuracy for classification, r2_score for regression) that shows which of the surrogate models is the better fit.
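
The selection flow described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in this PR: `select_surrogate`, `accuracy`, `r2`, and the toy model classes are hypothetical names, and the sketch assumes each candidate exposes scikit-learn-style `fit` and `predict` methods and that the replication metric is computed against the original model's predictions.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of matching labels (replication metric for classification)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (replication metric for regression)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:  # constant target: perfect fit scores 1.0, anything else 0.0
        return 1.0 if ss_res == 0 else 0.0
    return 1.0 - ss_res / ss_tot


def select_surrogate(teacher_predict, X, candidates, fallback, is_classifier):
    """Hypothetical helper: fit each candidate on the teacher's predictions
    and return (best_model, best_score). Candidates that raise are skipped;
    if every candidate fails, the user-supplied fallback is trained instead."""
    y_teacher = teacher_predict(X)
    metric = accuracy if is_classifier else r2
    best_model, best_score = None, float("-inf")
    for make_model in candidates:
        try:
            model = make_model()
            model.fit(X, y_teacher)
            score = metric(y_teacher, model.predict(X))
        except Exception:
            continue  # this candidate failed to train; try the next one
        if score > best_score:
            best_model, best_score = model, score
    if best_model is None:
        fallback.fit(X, y_teacher)
        best_model = fallback
        best_score = metric(y_teacher, fallback.predict(X))
    return best_model, best_score


# Toy stand-ins for surrogate models (assumptions, for illustration only).
class MajorityClass:
    def fit(self, X, y):
        self.label_ = max(sorted(set(y)), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdStump:
    """Splits on the first feature at its mean value."""

    def fit(self, X, y):
        self.cut_ = mean(row[0] for row in X)
        return self

    def predict(self, X):
        return [int(row[0] > self.cut_) for row in X]


class Broken:
    def fit(self, X, y):
        raise RuntimeError("training failed")


X = [[0.0], [1.0], [2.0], [3.0]]
teacher = lambda rows: [int(r[0] > 1.5) for r in rows]  # stands in for the black-box model

best, score = select_surrogate(
    teacher, X, [MajorityClass, ThresholdStump, Broken], MajorityClass(), True
)
print(type(best).__name__, score)
```

Because the helper only relies on `fit` and `predict`, real estimators could be substituted for the toy classes without changing the selection logic.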
Signed-off-by: Gaurav Gupta <gaugup@microsoft.com>
…eModel Signed-off-by: Gaurav Gupta <gaugup@microsoft.com>
I think the code itself looks good, but I'm concerned about structure and complexity. Maybe we can discuss these changes more before moving forward with this PR.
@@ -133,14 +134,19 @@ class MimicExplainer(BlackBoxExplainer):
:param reset_index: Uses the pandas DataFrame index column as part of the features when training
the surrogate model.
:type reset_index: str
:param auto_select_explainable_model: Set this to 'True' if you want to use the MimicExplainer with an
I wonder if this should be a separate explainer or function: mimic explainer takes a specific surrogate model, not a list. This also seems like something that complicates mimic explainer logic. Maybe we can discuss more.
Thinking of other libraries, there is usually a distinction between hyperparameter tuning and training (e.g., both v1 studio and designer have a Train Model module and a Tune Hyperparameters or Cross validate module; in Spark ML the hyperparameter tuner is a separate estimator; similarly, in scikit-learn grid search CV is a separate function). I feel that for users who want to do this, we should have a separate function/class instead of complicating the current mimic explainer.
@@ -304,14 +313,86 @@ def __init__(self, model, initialization_examples, explainable_model, explainabl
if isinstance(training_data, DenseData):
    training_data = training_data.data

self._original_eval_examples = None
This is quite a bit of logic to put inside mimic explainer. I'm really wondering how we could simplify this, as mimic explainer is already quite complicated.