[SPARK-19336][ML][Pyspark]: LinearSVC Python API #16694

Closed

wangmiao1981 wants to merge 7 commits into apache:master from wangmiao1981:ser

Contributor

wangmiao1981 commented Jan 24, 2017

What changes were proposed in this pull request?

Add Python API for the newly added LinearSVC algorithm.

How was this patch tested?

Add new doc string test.
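For readers unfamiliar with doc string tests, here is a minimal sketch of the mechanism using only Python's standard `doctest` module. The `hinge_loss` helper and `run_doc_tests` function are hypothetical illustrations and are not part of this patch; the actual test lives in the docstring of the new Python class.

```python
import doctest


def hinge_loss(label, margin):
    """Toy helper (hypothetical, not from the patch): hinge loss for a 0/1 label.

    >>> hinge_loss(1, 2.0)
    0.0
    >>> hinge_loss(0, 0.5)
    1.5
    """
    sign = 2.0 * label - 1.0  # map label {0, 1} to {-1, +1}
    return max(0.0, 1.0 - sign * margin)


def run_doc_tests(func):
    """Run the examples embedded in func's docstring and return the results."""
    parser = doctest.DocTestParser()
    # Build a DocTest from the docstring; the function is exposed under its own name.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = run_doc_tests(hinge_loss)
```

Each `>>>` line is executed and its printed result is compared with the line that follows, so the docstring serves as both documentation and a regression test.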

Contributor Author

wangmiao1981 commented Jan 24, 2017

cc @hhbyyh Thanks!

SparkQA commented Jan 24, 2017

Test build #71945 has finished for PR 16694 at commit abafaeb.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA commented Jan 24, 2017

Test build #71946 has finished for PR 16694 at commit 98bd7e7.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA commented Jan 24, 2017

Test build #71948 has finished for PR 16694 at commit 2980e67.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

wangmiao1981 added 6 commits

January 25, 2017 12:02

linearsvm python initial checkin (afd45f5)
check in doc test (fb989d1)
add shared param (5e1a28a)
add a negative test (b3b0497)
fix python style (898dbcb)
fix missing , (fb6d96f)

jkbradley reviewed

Member

jkbradley left a comment

Thanks for the PR! Small comments only.

mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala Outdated

Member

jkbradley Jan 27, 2017

There's no need to change this. Most other algorithms use "set" not "sets"

python/pyspark/ml/classification.py Outdated

Member

jkbradley Jan 27, 2017

Have you tried generating the docs? Check out other examples to see how to do links.

Contributor Author

wangmiao1981 Jan 27, 2017

OK. I will fix it. Thanks!

python/pyspark/ml/classification.py Outdated

Member

jkbradley Jan 27, 2017

Rename bdf -> df

python/pyspark/ml/classification.py Outdated

Member

jkbradley Jan 27, 2017

I'd simplify this example since it is going to be part of the documentation:

- Remove "weight"
- Just use dense vectors to make the doc clearer. Sparse vectors are tested elsewhere for Python and should be tested in Scala for LinearSVC (for which I'll make a JIRA).
- Make the feature vectors be length 2 or 3
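A simplified example along these lines might look like the sketch below. It assumes a running `SparkSession` passed in as `spark`; the helper name `fit_linear_svc`, the row values, and the `maxIter`/`regParam` settings are illustrative assumptions, not the text that was merged.

```python
# Two labeled rows with dense, length-3 feature vectors and no "weight" column.
# The values are illustrative assumptions, not taken from the patch.
EXAMPLE_ROWS = [
    (1.0, [1.0, 1.0, 1.0]),
    (0.0, [1.0, 2.0, 3.0]),
]


def fit_linear_svc(spark, rows=EXAMPLE_ROWS):
    """Fit LinearSVC on a tiny dense DataFrame (needs a running SparkSession)."""
    # Imported here so this module can be loaded without PySpark installed.
    from pyspark.ml.classification import LinearSVC
    from pyspark.ml.linalg import Vectors

    # Named df (not bdf), as requested in the review.
    df = spark.createDataFrame(
        [(label, Vectors.dense(features)) for label, features in rows],
        ["label", "features"],
    )
    svm = LinearSVC(maxIter=5, regParam=0.01)  # illustrative hyperparameters
    return svm.fit(df)
```

With a session available, `model = fit_linear_svc(spark)` returns a fitted model whose `coefficients` and `intercept` can then be inspected.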

Contributor Author

wangmiao1981 Jan 27, 2017

OK. I will modify it.

python/pyspark/ml/classification.py Outdated

Member

jkbradley Jan 27, 2017

No need to test sparse vectors here

python/pyspark/ml/classification.py Outdated

Member

jkbradley Jan 27, 2017

Put this in a unit test (tests.py), not here in the doc tests (though I also don't think you really need this test)

Contributor Author

wangmiao1981 Jan 27, 2017

I followed the LogisticRegression example when creating this test. I will remove it. Thanks!

Member

jkbradley Jan 27, 2017

I know, there are some not great examples to follow. It'd be nice to clean those out sometime...

wangmiao1981 added a commit

address review comments (e2e9943)

wangmiao1981 force-pushed the ser branch from 2980e67 to e2e9943 Compare

January 27, 2017 18:04

SparkQA commented Jan 27, 2017

Test build #72080 has finished for PR 16694 at commit e2e9943.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

Member

jkbradley commented Jan 28, 2017

LGTM, thank you!
Merging with master

asfgit closed this in

bb1a1fe

cmonkey pushed a commit to cmonkey/spark that referenced this pull request

[SPARK-19336][ML][PYSPARK] LinearSVC Python API (251864b)

## What changes were proposed in this pull request?

Add Python API for the newly added LinearSVC algorithm.

## How was this patch tested?

Add new doc string test.

Author: wm624@hotmail.com <wm624@hotmail.com>

Closes apache#16694 from wangmiao1981/ser.
