[REVIEW] Kernel shap improvements #5187
Conversation
I'll take a look at the CI failures.
Codecov Report

Additional details and impacted files:

```
@@            Coverage Diff             @@
##    branch-23.04     #5187      +/-   ##
==========================================
  Coverage           ?     67.12%
==========================================
  Files              ?        192
  Lines              ?      12396
  Branches           ?          0
==========================================
  Hits               ?       8321
  Misses             ?       4075
  Partials           ?          0
```

☔ View full report at Codecov.
CI is successful, ready to merge.
Changes look great! Just merging branch-23.04 into the PR; I will merge after CI runs. There's a new CI check that fails if a PR is 5+ commits behind the target branch; we might increase that threshold.
/merge
Removed the slow modulo operator via a minor change in index arithmetic. This gave me the following performance improvement for a test case:

| Kernel              | branch-23.02 | kernel-shap-improvments | Gain |
|---------------------|--------------|-------------------------|------|
| sampled_rows_kernel | 663          | 193                     | 3.4x |
| exact_rows_kernel   | 363          | 236                     | 1.5x |

All times in microseconds.

Code used for benchmarking:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor as rf
from cuml.explainer import KernelExplainer
import numpy as np

data, labels = make_classification(n_samples=1000, n_features=20,
                                   n_informative=20, random_state=42,
                                   n_redundant=0, n_repeated=0)

X_train, X_test, y_train, y_test = train_test_split(
    data, labels, train_size=998, random_state=42)  # sklearn train_test_split

y_train = np.ravel(y_train)
y_test = np.ravel(y_test)

model = rf(random_state=42).fit(X_train, y_train)

cu_explainer = KernelExplainer(model=model.predict, data=X_train,
                               is_gpu_model=False, random_state=42,
                               nsamples=100)

cu_shap_values = cu_explainer.shap_values(X_test)
print('cu_shap:', cu_shap_values)
```

Authors:
- Vinay Deshpande (https://github.com/vinaydes)
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#5187
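The actual change is in the CUDA kernels, but the index-arithmetic idea can be sketched in plain Python (function names here are hypothetical, not the kernel code): instead of computing a column index with a modulo on every step, keep a running counter and reset it when it reaches the row width. Integer modulo is comparatively expensive on GPUs, while the increment-and-reset form compiles to cheap add/compare operations.

```python
def column_indices_modulo(n, ncols):
    # Baseline: derive each column index from the flattened index
    # with the (relatively slow on GPUs) modulo operator.
    return [i % ncols for i in range(n)]

def column_indices_incremental(n, ncols):
    # Index-arithmetic variant: maintain a running column counter
    # and reset it instead of taking a modulo each iteration.
    out = []
    col = 0
    for _ in range(n):
        out.append(col)
        col += 1
        if col == ncols:
            col = 0
    return out
```

Both produce identical index sequences; the second form just trades the per-element modulo for an add and a branch, which is the kind of strength reduction the kernels in this PR apply.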