
Application to GradientBoostingTree class #6

Open
roeldobbe opened this issue Jul 8, 2016 · 1 comment

@roeldobbe

Hi Ando,

Thanks for this wonderful package; it makes my life a lot easier!

treeinterpreter does not seem to work for gradient boosted trees (the GradientBoostingRegressor class). It fails because that class does not expose n_output, which your code checks to ensure the model has a univariate output.
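
To illustrate what I mean, here is a rough sketch of a more defensive version of that check (the function and attribute names are just my assumptions, not your actual code):

----code----

def check_univariate_output(model):
    # Hypothetical sketch only, not treeinterpreter's actual source.
    # GradientBoostingRegressor may not expose the attribute, so read it
    # defensively and assume a single output when it is missing.
    n_outputs = getattr(model, "n_outputs_", 1)
    if n_outputs != 1:
        raise ValueError("Multi-output models are not supported")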

This might be a quick fix. Would it be possible to do this?

I used the code below to test it.

Thanks,
Roel

----code----

import numpy as np

from sklearn.metrics import mean_squared_error
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from treeinterpreter import treeinterpreter as ti

# Friedman #1 regression data: first 200 samples for training, the rest for testing
X, y = make_friedman1(n_samples=1200, random_state=0, noise=1.0)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

gbt = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                max_depth=1, random_state=0,
                                loss='ls').fit(X_train, y_train)
print(mean_squared_error(y_test, gbt.predict(X_test)))

print(X.shape)

instances = X[300:309, :]
print("Instance 0 prediction:", gbt.predict(instances[[0]]))
print("Instance 1 prediction:", gbt.predict(instances[[1]]))

# This is the call that fails for GradientBoostingRegressor
prediction, bias, contributions = ti.predict(gbt, instances)

@marcbllv

Hi Ando!

Would you still be interested in a scikit-learn GBT version of your package?
I needed one, so I adapted your code, and it now runs for random forests as well as GBT, here: https://github.com/marcbllv/treeinterpreter

I also changed one thing in _predict_forest: predictions, biases and contributions are now preallocated (line 110) and accumulated in place, instead of computing everything per tree and then averaging. That keeps memory use more reasonable.
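
Roughly, the change follows this pattern (a simplified sketch of the idea only; the helper names are placeholders, not the exact code from my fork):

----code----

import numpy as np

def predict_forest_sketch(trees, X, predict_tree):
    """Average per-tree results using preallocated accumulators.

    `predict_tree` stands in for the per-tree routine and should return
    (prediction, bias, contributions) arrays for the given instances.
    """
    predictions = biases = contributions = None
    for tree in trees:
        pred, bias, contrib = predict_tree(tree, X)
        if predictions is None:
            # Allocate the accumulators once, using the first tree's output shapes
            predictions = np.zeros_like(pred, dtype=float)
            biases = np.zeros_like(bias, dtype=float)
            contributions = np.zeros_like(contrib, dtype=float)
        predictions += pred
        biases += bias
        contributions += contrib
    n_trees = len(trees)
    return predictions / n_trees, biases / n_trees, contributions / n_trees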

And like roeldobbe said, thank you for that really nice package!
Cheers
Marc
