[LogisticRegressionMG] Support sparse vectors #5632
…ently calculated reg_loss
…oducible when running test_dask_base.py with 2 GPUs
We've discussed this previously and I think the impl mostly looks good. Just two very small but important things.
@@ -63,6 +63,22 @@ void qnFit(raft::handle_t& handle,
           float* f,
           int* num_iters);

/**
 * TODO: add docstring
This is part of the public API so we should add a docstring here.
Thanks for helping review the PR. Just added the docstring.
@pytest.mark.parametrize("dtype", [np.float32])
def test_sparse_nlp20news(dtype, nlp_20news, client):
    # sklearn score with max_iter = 10000
    sklearn_score = 0.878
This could become a maintenance nightmare should sklearn ever get updated. Any reason we can't just compute this in the test?
Sounds good! Just revised the test to compute the sklearn accuracy in the test.
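For readers following the thread, here is a minimal sketch of the pattern being suggested: fit scikit-learn inside the test and compare against that score rather than a hard-coded constant. The helper names `sklearn_reference_accuracy` and `assert_accuracy_close`, the `candidate_accuracy` argument, and the 0.01 tolerance are all hypothetical and chosen for illustration; they are not taken from the PR.

```python
def sklearn_reference_accuracy(X_train, y_train, X_test, y_test, max_iter=10000):
    """Fit scikit-learn's LogisticRegression and return its test accuracy.

    The baseline is recomputed on every run, so it tracks whatever
    scikit-learn version is installed instead of a frozen constant.
    """
    # Imported inside the helper so the baseline always comes from the
    # scikit-learn installed in the test environment.
    from sklearn.linear_model import LogisticRegression

    model = LogisticRegression(max_iter=max_iter)
    model.fit(X_train, y_train)  # accepts dense arrays and scipy sparse matrices
    return model.score(X_test, y_test)  # mean accuracy on the held-out split


def assert_accuracy_close(candidate, reference, tolerance=0.01):
    """Fail if the candidate accuracy trails the reference by more than tolerance.

    The default tolerance is an assumption for this sketch, not a value
    from the PR.
    """
    assert candidate >= reference - tolerance, (
        f"candidate accuracy {candidate:.4f} is more than {tolerance} "
        f"below the sklearn reference {reference:.4f}"
    )


def check_against_sklearn(candidate_accuracy, X_train, y_train, X_test, y_test):
    """Compare a hypothetical candidate model's accuracy with the sklearn baseline."""
    reference = sklearn_reference_accuracy(X_train, y_train, X_test, y_test)
    assert_accuracy_close(candidate_accuracy, reference)
```

In a test such as `test_sparse_nlp20news`, `candidate_accuracy` would be the score of the model under test on the same split, so the comparison stays meaningful if a scikit-learn upgrade shifts the baseline.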
LGTM!
/merge
…ly one class (#5655)

This pull request introduces functionality for C++ training on datasets with a single label. It helps Spark Rapids ML match Spark's behavior. Additionally, it updates the Dask class to generate an error message, consistent with Scikit-learn's behavior. This PR depends on #5632

Authors:
- Jinfeng Li (https://github.com/lijinf2)

Approvers:
- Simon Adorf (https://github.com/csadorf)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #5655