
Auto choose most appropriate explainable model #355

Open · wants to merge 5 commits into main

Conversation

@gaugup (Collaborator) commented Dec 17, 2020

  • This PR chooses the best possible surrogate model by training multiple surrogate models and comparing them on accuracy or r2_score.
  • If training the multiple surrogate models fails for some reason, we fall back to training the explainable model passed in by the user.
  • We compute a replication metric (accuracy for classification, r2_score for regression) to determine which of the surrogate models was the better fit.
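A minimal sketch of the selection idea described in the bullets above: fit several candidate surrogates against the teacher model's predictions and keep the one with the best replication metric (r2_score for regression). The candidate models and data here are hypothetical stand-ins, not the PR's actual implementation.

```python
# Sketch: pick the surrogate that best replicates the teacher's predictions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

# Stand-in for the opaque teacher model's predictions on the training data.
teacher_predictions = X @ np.array([2.0, -1.0, 0.5, 0.0])

candidates = [LinearRegression(), DecisionTreeRegressor(max_depth=3, random_state=0)]
best_model, best_score = None, -np.inf
for model in candidates:
    model.fit(X, teacher_predictions)
    # Replication metric: how well the surrogate reproduces the teacher.
    score = r2_score(teacher_predictions, model.predict(X))
    if score > best_score:
        best_model, best_score = model, score
```

On this linear teacher, the linear surrogate replicates almost perfectly, so it wins the comparison.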

Signed-off-by: Gaurav Gupta <gaugup@microsoft.com>
@imatiach-msft (Collaborator) left a comment

I think the code itself looks good, but I'm concerned about structure and complexity. Maybe we can discuss these changes more before moving forward with this PR.

@@ -133,14 +134,19 @@ class MimicExplainer(BlackBoxExplainer):
:param reset_index: Uses the pandas DataFrame index column as part of the features when training
the surrogate model.
:type reset_index: str
:param auto_select_explainable_model: Set this to 'True' if you want to use the MimicExplainer with an

I wonder if this should be a separate explainer or function; mimic explainer takes a specific surrogate model, not a list. This also seems like something that complicates mimic explainer's logic. Maybe we can discuss more.

Thinking of other libraries, there is usually a distinction between hyperparameter tuning and training (e.g., in both v1 studio and designer there is a Train Model module and a Tune Hyperparameters or Cross Validate module; in Spark ML the hyperparameter tuner is a separate estimator; in scikit-learn, similarly, grid search CV is a separate class). I feel that for users who want to do this we should have a separate function/class instead of complicating the current mimic explainer.
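The separation the reviewer suggests could look like a small standalone helper that owns the selection logic, leaving the explainer untouched, in the spirit of scikit-learn keeping GridSearchCV separate from the estimators it tunes. Everything here (the function name, its signature) is hypothetical, not part of the PR.

```python
# Hypothetical standalone surrogate-selection helper, kept outside the explainer.
from sklearn.metrics import accuracy_score, r2_score

def select_surrogate(candidates, X, teacher_predictions, is_classification):
    """Fit each candidate on the teacher's predictions and return the best fit,
    judged by the replication metric (accuracy or r2_score)."""
    metric = accuracy_score if is_classification else r2_score
    scored = []
    for model in candidates:
        model.fit(X, teacher_predictions)
        scored.append((metric(teacher_predictions, model.predict(X)), model))
    # Compare on the score only; the model object itself is not orderable.
    return max(scored, key=lambda pair: pair[0])[1]
```

The chosen surrogate would then be passed to the mimic explainer as usual, so the explainer's own logic stays unchanged.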

@@ -304,14 +313,86 @@ def __init__(self, model, initialization_examples, explainable_model, explainabl
if isinstance(training_data, DenseData):
training_data = training_data.data

self._original_eval_examples = None

This is quite a bit of logic to put inside mimic explainer; I'm really wondering how we could simplify this, as mimic explainer is already quite complicated.
