[SPARK-10595] [ML] [MLLIB] [DOCS] Various ML guide cleanups #8752
LDA user guide:
* EM often begins with useless topics, but running longer generally improves them dramatically. E.g., 10 iterations on a Wikipedia dataset produces useless topics, but 50 iterations produces very meaningful topics.

mllib-feature-extraction.html#elementwiseproduct
* “w” parameter should be “scalingVec”

Clean up Binarizer user guide a little.

Document in Pipeline that users should not put an instance into the Pipeline in more than 1 place.

spark.ml Word2Vec user guide:
* clean up grammar/writing

Chi Sq Feature Selector docs
* Improve text in doc.
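For context on the ElementwiseProduct item: the transformer scales each input vector by a fixed vector, component by component (a Hadamard product). The snippet below is a minimal plain-Python sketch of that operation, not Spark code. The function name `elementwise_product` and the argument name `scaling_vec` are illustrative assumptions chosen to mirror the “scalingVec” parameter name mentioned above.

```python
def elementwise_product(vector, scaling_vec):
    """Multiply each component of `vector` by the matching component of `scaling_vec`.

    Plain-Python illustration only; `scaling_vec` is a hypothetical name
    mirroring the "scalingVec" parameter discussed in this PR.
    """
    if len(vector) != len(scaling_vec):
        raise ValueError("vector and scaling_vec must have the same length")
    return [v * s for v, s in zip(vector, scaling_vec)]


# Scale a 3-dimensional vector: the first component is zeroed out,
# the second is unchanged, and the third is doubled.
scaled = elementwise_product([1.0, 2.0, 3.0], [0.0, 1.0, 2.0])
print(scaled)
```

Naming the parameter after its role (a scaling vector) instead of a bare “w” makes it clear that it is applied per component rather than as a single scalar weight.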
Test build #42436 has finished for PR 8752 at commit
Not sure if we should include model summaries in this description; I had a mailing list question about where that feature is documented.
Yeah, there's not a great place. I'll try sticking a note here.
@feynmanliang Thanks for reviewing. Just updated per your comments.
The class name is backticked in ChiSqSelector but not here or in Binarizer; we should choose one style and be consistent. I would vote for backticking everything, since that's what I've been doing.
LGTM after changes
Test build #42500 has finished for PR 8752 at commit
Merged into master. Thanks!
Various ML guide cleanups.
CC: @mengxr @feynmanliang