Decision function for LinearSVC #51
Comments
The decision_function method is not implemented yet. BTW it's pretty straightforward:

    class LinearClassifierMixin(ClassifierMixin):
        """Mixin for linear classifiers.

        Handles prediction for sparse and dense X.
        """

        def decision_function(self, X):
            """Predict confidence scores for samples.

            The confidence score for a sample is the signed distance of that
            sample to the hyperplane.

            Parameters
            ----------
            X : {array-like, sparse matrix}, shape = (n_samples, n_features)
                Samples.

            Returns
            -------
            array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes)
                Confidence scores per (sample, class) combination. In the binary
                case, confidence score for self.classes_[1] where >0 means this
                class would be predicted.
            """
            if not hasattr(self, 'coef_') or self.coef_ is None:
                raise NotFittedError("This %(name)s instance is not fitted "
                                     "yet" % {'name': type(self).__name__})

            X = check_array(X, accept_sparse='csr')

            n_features = self.coef_.shape[1]
            if X.shape[1] != n_features:
                raise ValueError("X has %d features per sample; expecting %d"
                                 % (X.shape[1], n_features))

            scores = safe_sparse_dot(X, self.coef_.T,
                                     dense_output=True) + self.intercept_
            return scores.ravel() if scores.shape[1] == 1 else scores

We need to create a Spark version of LinearClassifierMixin that simply maps sklearn's decision_function method over the RDD, something like this:

    class SparkLinearClassifierMixin(LinearClassifierMixin, SparkBroadcasterMixin):
        """Mixin for linear classifiers.

        Handles prediction for sparse and dense X.
        """

        # broadcastable variables, possibly large arrays
        __transient__ = ['coef_', 'intercept_']

        def decision_function(self, X):
            check_rdd(X, (sp.spmatrix, np.ndarray))
            # super() is called on this class so that it resolves to
            # LinearClassifierMixin.decision_function
            mapper = self.broadcast(
                super(SparkLinearClassifierMixin, self).decision_function,
                X.context)
            return X.map(mapper)

Finally, extend SparkLinearSVC to support the functionality above:

    class SparkLinearSVC(LinearSVC, SparkLinearClassifierMixin, SparkLinearModelMixin):

We plan to implement it in the next few weeks, but as always, contributions are appreciated :)
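To make the idea above concrete, here is a minimal, dependency-free sketch of the same computation (scores = X · coefᵀ + intercept, flattened in the binary case) applied block by block. It is only an illustration, not the sparkit-learn implementation: `ListRDD`, `decision_scores` and `score_block` are hypothetical names, plain lists stand in for NumPy arrays, and the list-backed `ListRDD` stands in for a real Spark RDD.

```python
def decision_scores(X, coef, intercept):
    """Return X . coef^T + intercept for a block of samples.

    X is a list of rows, coef has one weight row per class (a single row in
    the binary case) and intercept has one entry per weight row.
    """
    scores = [
        [sum(x_j * w_j for x_j, w_j in zip(row, w)) + b
         for w, b in zip(coef, intercept)]
        for row in X
    ]
    # Binary case: a single weight row, so return one score per sample.
    if len(coef) == 1:
        return [s[0] for s in scores]
    return scores


class ListRDD:
    """Hypothetical stand-in for an RDD of blocks: only map() and collect()."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def map(self, func):
        return ListRDD(func(block) for block in self._blocks)

    def collect(self):
        return list(self._blocks)


# Assumed toy model: two features, binary problem.
coef = [[2.0, -1.0]]
intercept = [0.5]

# Two blocks of samples, as a block-wise RDD would hold them.
blocks = ListRDD([
    [[1.0, 1.0], [0.0, 2.0]],
    [[-1.0, 0.0]],
])


def score_block(block):
    # In Spark, coef and intercept would be shipped to workers as broadcast
    # variables; here they are simply captured from the enclosing scope.
    return decision_scores(block, coef, intercept)


scored = blocks.map(score_block).collect()
print(scored)
```

The result keeps the block structure of the input: one list of scores per block, which is what mapping decision_function over the RDD would give.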
@mrshanth I saw you've implemented decision function support. Would you open a pull request, please? :)
Hi,
Can we get the confidence score, as we do in scikit-learn with the decision_function method?
I get the following error when I run the code:
error:
Thanks
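For reference, this is roughly how such confidence scores relate to predicted labels: in the binary case a score above 0 means the second class, and in the multiclass case the class with the largest score wins. The sketch below is only illustrative; `predict_from_scores` is a hypothetical helper, not a function of scikit-learn or sparkit-learn, and the scores and class names are made up.

```python
def predict_from_scores(scores, classes):
    """Turn decision scores into class labels."""
    if len(classes) == 2:
        # Binary: one score per sample; > 0 means classes[1].
        return [classes[1] if s > 0 else classes[0] for s in scores]
    # Multiclass: one score per class; pick the class with the largest score.
    return [classes[max(range(len(row)), key=row.__getitem__)]
            for row in scores]


binary_labels = predict_from_scores([1.5, -1.5, 0.0], ["neg", "pos"])
multi_labels = predict_from_scores(
    [[0.1, 2.0, -1.0], [3.0, 0.5, 0.2]], ["a", "b", "c"])
print(binary_labels, multi_labels)
```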