[LogisticRegressionMG] Support sparse vectors #5632

lijinf2 · 2023-10-26T22:33:54Z

No description provided.

copy-pr-bot · 2023-10-26T22:33:56Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cjnolet

We've discussed this previously and I think the impl mostly looks good. Just two very small but important things.

cjnolet · 2023-11-16T18:58:08Z

cpp/include/cuml/linear_model/qn_mg.hpp

@@ -63,6 +63,22 @@ void qnFit(raft::handle_t& handle,
           float* f,
           int* num_iters);

+/**
+ * TODO: add docstring


This is part of the public API so we should add a docstring here.

Thanks for helping review the PR. Just added the docstring.

cjnolet · 2023-11-27T22:56:51Z

python/cuml/tests/dask/test_dask_logistic_regression.py

+@pytest.mark.parametrize("dtype", [np.float32])
+def test_sparse_nlp20news(dtype, nlp_20news, client):
+    # sklearn score with max_iter = 10000
+    sklearn_score = 0.878


This could become a maintenance nightmare should sklearn ever get updated. Any reason we can't just compute this in the test?

Sounds good! Just revised the test to compute the sklearn accuracy in the test itself.
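A minimal sketch of the pattern discussed above: fit scikit-learn on the same data inside the test and compare against that score, rather than hard-coding a value such as 0.878. The helper names (`sklearn_reference_score`, `assert_close_to_reference`), the tolerance, and the synthetic sparse data are assumptions for illustration; they are not taken from the PR, and the synthetic matrix only stands in for the `nlp_20news` fixture.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression as SkLogisticRegression


def sklearn_reference_score(X, y, max_iter=10000):
    """Fit scikit-learn on the given data and return its accuracy on that data.

    Hypothetical helper: computes the CPU baseline at test time so the
    expected value tracks whatever scikit-learn version is installed.
    """
    cpu_model = SkLogisticRegression(max_iter=max_iter)
    cpu_model.fit(X, y)
    return cpu_model.score(X, y)


def assert_close_to_reference(candidate_score, reference_score, tolerance=0.01):
    """Fail if the candidate score trails the reference by more than `tolerance`.

    The tolerance value is an assumption chosen for this sketch.
    """
    assert candidate_score >= reference_score - tolerance, (
        f"candidate {candidate_score:.4f} is below "
        f"reference {reference_score:.4f} - {tolerance}"
    )


# Small synthetic sparse problem standing in for the real dataset.
rng = np.random.default_rng(0)
X_dense = rng.random((200, 20))
X_dense[X_dense < 0.8] = 0.0  # zero out roughly 80% of the entries
y = (X_dense[:, 0] + X_dense[:, 1] > 0.5).astype(np.int32)
X = sparse.csr_matrix(X_dense)

# Baseline computed in the test instead of a hard-coded constant.
reference = sklearn_reference_score(X, y)
```

In the actual test, the score of the multi-GPU model fitted on the same data would be passed as `candidate_score`.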

cjnolet

LGTM!

cjnolet · 2023-11-28T21:01:43Z

/merge

…ly one class (#5655)

This pull request introduces functionality for C++ training on datasets with a single label. It helps Spark Rapids ML match Spark's behavior. Additionally, it updates the Dask class to generate an error message, consistent with Scikit-learn's behavior. This PR depends on #5632

Authors:
- Jinfeng Li (https://github.com/lijinf2)

Approvers:
- Simon Adorf (https://github.com/csadorf)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #5655

lijinf2 requested review from a team as code owners October 26, 2023 22:33

github-actions bot added Cython / Python Cython or Python issue CUDA/C++ labels Oct 26, 2023

lijinf2 force-pushed the fea_lrmg_sparse branch from a838ac8 to 92d4084 Compare October 26, 2023 22:41

lijinf2 added improvement Improvement / enhancement to an existing function breaking Breaking change labels Oct 26, 2023

lijinf2 force-pushed the fea_lrmg_sparse branch 2 times, most recently from f033c56 to 5d29871 Compare October 26, 2023 22:43

lijinf2 added 4 commits October 26, 2023 15:44

support sparse vector, reveal a potential issue

7cda477

resolved hanging issue by using averaged reg_loss instead of independ…

1739e9f

…ently calculated reg_loss

broadcast GPU 0 coefficients to all other GPUs to avoid divergence

04b41c7

remove a test case that reveals NCCl error relates to empty GPU. Repr…

22e90fe

…oducible when running test_dask_base.py with 2 GPUs

lijinf2 force-pushed the fea_lrmg_sparse branch from 5d29871 to 22e90fe Compare October 26, 2023 22:45

lijinf2 added the 3 - Ready for Review Ready for review by team label Oct 27, 2023

lijinf2 mentioned this pull request Nov 14, 2023

[LogisticRegressionMG][FEA] Support training when dataset contains only one class #5655

Merged

Merge branch 'branch-23.12' into fea_lrmg_sparse

2e2d688

cjnolet requested changes Nov 27, 2023

revise the PR to add API docstring and compare with CPU in test_spars…

866742e

…e_nlp20news

cjnolet approved these changes Nov 28, 2023

cjnolet assigned lijinf2 Nov 28, 2023

rapids-bot bot merged commit 197d4f3 into rapidsai:branch-23.12 Nov 28, 2023
52 checks passed

lijinf2 deleted the fea_lrmg_sparse branch November 28, 2023 21:02
