Can Autosklearn be used with SHAP? #1272

roger-yu-ds · 2021-10-28T10:11:01Z

Can the model be used with SHAP?

Currently

import shap
explainer = shap.Explainer(model)

results in

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-361-5c713ec694d6> in <module>
----> 1 explainer = shap.Explainer(model)

~/anaconda3/envs/python3/lib/python3.6/site-packages/shap/explainers/_explainer.py in __init__(self, model, masker, link, algorithm, output_names, feature_names, **kwargs)
    145                 # if we get here then we don't know how to handle what was given to us
    146                 else:
--> 147                     raise Exception("The passed model is not callable and cannot be analyzed directly with the given masker! Model: " + str(model))
    148 
    149             # build the right subclass

Exception: The passed model is not callable and cannot be analyzed directly with the given masker! Model: AutoSklearn2Classifier(delete_output_folder_after_terminate=False,
                       ensemble_size=1, memory_limit=7000, metric=f1, n_jobs=8,
                       output_folder='automl4_preds', per_run_time_limit=480,
                       time_left_for_this_task=600)

System Details (if relevant)

auto-sklearn 0.12.7
shap 0.38.1
Running on Linux?


eddiebergman · 2021-10-28T11:31:50Z

Hi @roger-yu-ds,

Looking at the shap source code for Explainer.__init__(), it seems that passing the model directly, as in your snippet, will not work out of the box.

I would suggest using their model-agnostic KernelExplainer instead.

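import shap
from autosklearn.classification import AutoSklearnClassifier
estimator = AutoSklearnClassifier(...)  # fit on X_train, y_train before explaining
explainer = shap.KernelExplainer(estimator.predict_proba, shap.sample(X_train, 128))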
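
A small illustration of why the original call is rejected while this one is accepted: the traceback says the model "is not callable", whereas a bound method such as predict_proba is. The sketch below uses a hypothetical stand-in class (StandInClassifier is not part of auto-sklearn or shap) purely to show that distinction. It does not reproduce shap's internal checks.

```python
class StandInClassifier:
    """Hypothetical stand-in for a fitted estimator; not auto-sklearn's class."""

    def predict_proba(self, rows):
        # Return a fixed two-class probability for every input row.
        return [[0.5, 0.5] for _ in rows]


model = StandInClassifier()

# The estimator object defines no __call__, so it is not callable ...
model_is_callable = callable(model)
# ... but its bound predict_proba method is an ordinary callable.
method_is_callable = callable(model.predict_proba)

print(model_is_callable, method_is_callable)
```

This is why handing estimator.predict_proba to a model-agnostic explainer sidesteps the error: the explainer only needs a function from inputs to outputs.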

I would also advise updating auto-sklearn to 0.14.0, as we fixed some issues with probability outputs in some scenarios. Hopefully it won't matter, but if you face issues with incorrect probability sizes, this should fix it.

mfeurer · 2021-10-29T07:18:46Z

You can also find an example in this notebook: https://github.com/automl/auto-sklearn-talks/blob/main/2021_07_28_EuroPython/Tutorial-Regression.ipynb

mfeurer · 2021-11-11T08:12:15Z

We should give an example, also inspired by this AG example that demonstrates usage with categorical data.

Unn20 · 2022-05-02T14:05:08Z

Hi @eddiebergman

Unfortunately, your method won't work with categorical data.

Here I described my problem:
shap/shap#2530

eddiebergman · 2022-05-02T14:29:45Z

Hey @Unn20, thanks for sharing the issue with us and raising it with them. It would be good to have test suites for external tool usage with auto-sklearn at some point so we can catch them and point to them too.

eliwoods · 2023-01-13T21:55:52Z

I ran into the same issue when trying to use a model trained on a pandas DataFrame with categorical data in shap. The workaround I landed on amounts to encoding the categorical columns as floats before passing the data to autosklearn. One thing that could simplify this is placing everything in an sklearn Pipeline (i.e. ColumnTransformer -> FunctionTransformer for remapping columns -> autosklearn model), though I'm not sure whether that would put you back at square one with the shap issues.

from autosklearn.classification import AutoSklearnClassifier
from sklearn.datasets import fetch_openml
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from category_encoders import OrdinalEncoder
import numpy as np
import shap


def main():
    # Note that this dataset has categorical features that are already numerically encoded,
    # so we technically don't need to do any of this. However it still works as an illustrative example
    bunch = fetch_openml(data_id=40981, as_frame=True)
    y = bunch["target"]
    X = bunch["data"]

    # Fit and transform our data using category_encoders.OrdinalEncoder. This is a more convenient implementation than
    # sklearn's, especially given the outdated version of sklearn required for auto-sklearn
    cat_cols = [c for c in X.select_dtypes(['category']).columns]
    enc = OrdinalEncoder(cols=cat_cols)
    X_trans = enc.fit_transform(X)

    # Now we can transform our dataframes to numpy arrays to pass to autosklearn
    Xnp = X_trans.to_numpy(dtype=np.float64)
    ynp = y.to_numpy(dtype=np.float64)

    X_train, X_test, y_train, y_test = train_test_split(Xnp, ynp, random_state=1)

    # List to tell autosklearn which columns are categorical features. This may change depending on
    # if you reorder columns post encoding
    feat_type = ["Categorical" if x.name == "category" else "Numerical" for x in X.dtypes]
    cls = AutoSklearnClassifier(
        time_left_for_this_task=120,
        per_run_time_limit=30,
        # Required on OSX otherwise autosklearn crashes
        memory_limit=None,
    )
    cls.fit(X_train, y_train, X_test, y_test, feat_type=feat_type)

    yhat = cls.predict(X_test)
    print('Model accuracy: ', accuracy_score(y_test, yhat))

    # Now to show that it works in shap. This is not the optimal way to explain this dataset
    # with shap as it took ~10 minutes to run. You'll want to adjust based on your use case
    # and dataset size
    explainer = shap.KernelExplainer(
        cls.predict_proba,
        shap.kmeans(X_train, k=10),
        feature_names=X.columns.values,
    )
    shap_values = explainer.shap_values(X_test[:50])
    shap.summary_plot(shap_values, X_test[:50], feature_names=X.columns.values)


if __name__ == '__main__':
    main()

The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
Model accuracy:  0.8728323699421965
100%|██████████| 50/50 [10:05<00:00, 12.11s/it]

Process finished with exit code 0

This was run with the following versions on macOS 12.2.1 on the M1 chipset:

auto-sklearn==0.14.7
shap==0.41.0
category-encoders==2.5.1.post0

Edit: Updated code to use category_encoders.OrdinalEncoder which simplifies the transformation step.
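
For readers who want to see what the encoding step in the workaround boils down to, here is a minimal standard-library sketch. It assumes hypothetical toy columns (colour, size) and a hypothetical helper, ordinal_encode; it is not the category_encoders implementation, and the numbering scheme (codes starting at 0.0 in order of first appearance) is an arbitrary choice for illustration.

```python
def ordinal_encode(values):
    """Map each distinct value to a float code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = float(len(codes))
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical toy data: one categorical column and one numerical column.
columns = {
    "colour": ["red", "blue", "red"],
    "size": [1.5, 2.0, 0.5],
}
categorical = {"colour"}

# One label per column, in column order, analogous to the feat_type list above.
feat_type = ["Categorical" if name in categorical else "Numerical" for name in columns]

encoded_colour, mapping = ordinal_encode(columns["colour"])
print(feat_type)
print(encoded_colour, mapping)
```

After this step every column holds plain floats, which is the form the explainer in the workaround receives, while feat_type still records which columns were categorical.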

roger-yu-ds mentioned this issue Oct 28, 2021

Pipeline Export #952

Closed

eddiebergman changed the title ~~[Question] My Question?~~ Can Autosklearn be used with SHAP? Oct 28, 2021

eddiebergman added the Feedback-Required label Oct 28, 2021

mfeurer added documentation Something to be documented and removed Feedback-Required labels Nov 9, 2021

eddiebergman added the bug label Jun 10, 2022

eddiebergman mentioned this issue Jul 21, 2023

What's in store for Auto-Sklearn? -- From the Developers #1677

Open
