A data converter for shap4j that converts tree ensemble models
trained with XGBoost, LightGBM, CatBoost, scikit-learn, and PySpark into .shap4j
data files.
$ python setup.py install
Here, we borrow the example from the shap documentation:
import xgboost
import shap
import pickle
# Train an XGBoost model on the Boston housing dataset
X, y = shap.datasets.boston()
model = xgboost.train({"learning_rate": 0.01}, xgboost.DMatrix(X, label=y), 100)
# Pickle the trained model so that shap4jconv can read it
with open("boston.pkl", "wb") as f:
    pickle.dump(model, f)
We can use the shap4jconv program to convert the boston.pkl pickle file generated by the previous example to a .shap4j data file:
$ shap4jconv boston.pkl
The command above writes its output to boston.shap4j.
Alternatively, you can specify the output file via the --output parameter:
$ shap4jconv --output model.shap4j --overwrite boston.pkl
where the --overwrite parameter allows shap4jconv to overwrite an existing file.
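The output-path rules described above can be summarised in a few lines of Python. This is only a sketch of the behaviour as stated here, not shap4jconv's actual implementation: the helper name resolve_output_path is hypothetical, and it assumes that the default output name is the input name with its extension replaced by .shap4j and that an existing file is rejected unless overwriting is enabled.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(input_file: str,
                        output_file: Optional[str] = None,
                        overwrite: bool = False) -> Path:
    """Hypothetical helper mirroring the documented command-line behaviour."""
    if output_file is None:
        # Assumed default: boston.pkl -> boston.shap4j
        target = Path(input_file).with_suffix(".shap4j")
    else:
        target = Path(output_file)
    # Assumed guard: refuse to replace an existing file unless asked to.
    if target.exists() and not overwrite:
        raise FileExistsError(
            f"{target} already exists; enable overwrite to replace it"
        )
    return target
```

Under these assumptions, resolve_output_path("boston.pkl") returns the path boston.shap4j, and raises FileExistsError if that file is already present and overwrite is left at False.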
You can also use shap4j-data-converter in your Python program by simply importing the Shap4jDataConverter class from the shap4jconv package, for example:
from shap4jconv import Shap4jDataConverter
converter = Shap4jDataConverter()
# Convert the pickled model and write the result as a .shap4j data file
converter.convert("boston.pkl", output_file="boston.shap4j", overwrite=True)
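When several pickled models need converting, the same call can be wrapped in a small loop. The following is a minimal sketch, not part of shap4jconv: convert_all is a hypothetical helper, it relies only on the convert(input, output_file=..., overwrite=...) call shown above, and it assumes each output should sit next to its input with a .shap4j extension.

```python
from pathlib import Path
from typing import Iterable, List


def convert_all(converter, pickle_files: Iterable[str]) -> List[str]:
    """Convert each pickle file and return the output paths (hypothetical helper)."""
    outputs = []
    for pickle_file in pickle_files:
        # Assumed naming: same directory and base name, with a .shap4j extension.
        output_file = str(Path(pickle_file).with_suffix(".shap4j"))
        # Uses the same keyword arguments as the example above.
        converter.convert(pickle_file, output_file=output_file, overwrite=True)
        outputs.append(output_file)
    return outputs
```

It would be called as convert_all(Shap4jDataConverter(), ["boston.pkl"]). Passing the converter in as an argument keeps the helper independent of how the converter is constructed.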
See the Java example of shap4j for how to integrate SHAP into your JVM projects using the data files generated above.