[SPARK-9773] [ML] [PySpark] Add Python API for MultilayerPerceptronClassifier #8067
|
Test build #40288 has finished for PR 8067 at commit
|
|
It looks like we hit SPARK-7379. I'll try to find some clues. |
|
Test build #40433 has finished for PR 8067 at commit
|
|
@yanboliang Could you describe the bug in the PR description? |
|
Jenkins, test this please. |
|
Test build #40953 has finished for PR 8067 at commit
|
|
Test build #40954 has finished for PR 8067 at commit
|
|
Jenkins, test this please. |
|
Test build #40958 has finished for PR 8067 at commit
|
|
@yanboliang Shall we split this PR into two? One makes |
Here we also need to expose layers as a property, but if we do that the same way as weights in the following lines, we will hit SPARK-7379. It works in Python 2 but raises errors in Python 3.
I propose changing the layers of MultilayerPerceptronClassificationModel from Array[Int] to Vector on the Scala side, because PySpark handles Vector cleanly, and I found that all other ML interfaces use Vector rather than Array. @mengxr
Shall we add a package private method to Scala's MPCM that returns a Java list of integers?
Agree, done.
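For readers following the thread, here is a minimal sketch of the approach agreed on above: the JVM-side model exposes a helper that returns a list of integers, and the Python wrapper surfaces it as a read-only `layers` property. Everything in it is an assumption for illustration. `FakeJavaModel`, `ModelWrapper`, and the helper name `javaLayers` are hypothetical stand-ins, not the code merged in this PR, and no real JVM or Py4J gateway is involved.

```python
class FakeJavaModel:
    """Hypothetical stand-in for the JVM-side model object (no real JVM here)."""

    def __init__(self, layers):
        self._layers = list(layers)

    def javaLayers(self):
        # Assumed name for the package-private helper that returns the
        # layer sizes as a list of integers instead of an Array[Int].
        return list(self._layers)


class ModelWrapper:
    """Hypothetical Python-side wrapper around the JVM model."""

    def __init__(self, java_model):
        self._java_obj = java_model

    @property
    def layers(self):
        # Convert each element explicitly so callers always receive a
        # plain Python list of ints, whatever sequence type came back.
        return [int(n) for n in self._java_obj.javaLayers()]


model = ModelWrapper(FakeJavaModel([4, 5, 2]))
print(model.layers)
```

The point of the design is that a list of integers crosses the language boundary without needing a Vector on the Scala side, so the public Scala type of layers can stay as it is.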
|
Test build #42198 has finished for PR 8067 at commit
|
|
Test build #42261 has finished for PR 8067 at commit
|
python/pyspark/ml/classification.py
Outdated
sqlContext.createDataFrame([
|
Test build #42315 has finished for PR 8067 at commit
|
At L882, we still keep layers=[1, 1] in the doc to tell users the default value.
|
LGTM. Merged into master. Thanks! |
Add Python API for MultilayerPerceptronClassifier.