-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-6256] [MLlib] MLlib Python API parity check for regression #4997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #28509 has started for PR 4997 at commit
|
|
Test build #28509 has finished for PR 4997 at commit
|
|
Test PASSed. |
|
@jkbradley @mengxr Can you review this patch? |
|
@yanboliang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use builder pattern.
|
Let's also add setIntercept. Also, in addition to setFeatureScaling being private, we do not need to expose optimizer.setUpdater for the 2 algorithms you listed because they have fixed updaters they should use (corresponding to the regularization they use). |
|
Test build #28978 has started for PR 4997 at commit
|
|
Test build #28978 has finished for PR 4997 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, "addIntercept" should be more clear and consistent.
|
Test build #29090 has started for PR 4997 at commit
|
|
Test build #29090 has finished for PR 4997 at commit
|
|
Test PASSed. |
python/pyspark/mllib/regression.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry! I got confused about "intercept," thinking it was being added to this class. We should stick with the original name ("intercept") since it's a public API change otherwise.
|
@yanboliang It looks fine to me, except for the intercept issue (sorry!) and for doc tests. Could you please add doc tests for LassoWithSGD, RidgeRegressionWithSGD using setIntercept, setValidateData? Thanks! |
|
Test build #29160 has started for PR 4997 at commit
|
|
Test build #29160 has finished for PR 4997 at commit
|
|
Test PASSed. |
|
@yanboliang LGTM merging into master |
MLlib Python API parity check for Regression, major disparities need to be added for Python list following:
setFeatureScaling is mllib private function which is not needed to expose in pyspark.