[REVIEW] Estimator Pickling Demo & Adding to Docs #3154

cjnolet · 2020-11-18T17:26:51Z

We have had a few inquiries recently about model serialization, and I think it would be useful to have a very simple notebook demonstrating the basic save and load workflow.
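
For context, the save and load round trip the notebook is meant to demonstrate can be sketched with nothing but the standard library. This is only an illustration under stated assumptions: the `model` object below is a hypothetical stand-in built from `types.SimpleNamespace`, not a cuML estimator, and the attribute names `coef_` and `intercept_` are chosen for illustration only.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for a fitted estimator (NOT a cuML class).
# The attribute names are assumptions made for this sketch.
model = SimpleNamespace(coef_=[0.5, -1.0], intercept_=2.0)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"

    # Save: serialize the fitted object to disk.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Load: deserialize it into a new, independent object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored object carries the same fitted state as the original.
assert restored == model
assert restored is not model
```

The notebook in this PR applies the same idea to cuML models; the distributed case needs extra handling that this sketch does not cover.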

review-notebook-app · 2020-11-18T17:26:55Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

GPUtester · 2020-11-18T17:27:23Z

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

docs/source/pickling_cuml_models.ipynb

wphicks

This looks good, @cjnolet! Good, straightforward walkthrough. My only general feedback is that it might be helpful to have a little bit more context for some of what you're demonstrating in the middle there.

codecov-io · 2020-11-18T23:43:39Z

Codecov Report

Merging #3154 (b822dfc) into branch-0.17 (d8b4765) will increase coverage by 0.18%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.17    #3154      +/-   ##
===============================================
+ Coverage        70.50%   70.68%   +0.18%     
===============================================
  Files              196      197       +1     
  Lines            15442    15564     +122     
===============================================
+ Hits             10887    11002     +115     
- Misses            4555     4562       +7

Impacted Files	Coverage Δ
python/cuml/manifold/umap.pyx	`92.40% <0.00%> (-0.32%)`	⬇️
python/cuml/__init__.py	`100.00% <0.00%> (ø)`
python/cuml/metrics/__init__.py	`100.00% <0.00%> (ø)`
python/cuml/common/array_sparse.py	`94.44% <0.00%> (ø)`
python/cuml/dask/common/__init__.py	`100.00% <0.00%> (ø)`
python/cuml/dask/solvers/__init__.py	`80.00% <0.00%> (ø)`
python/cuml/dask/ensemble/__init__.py	`83.33% <0.00%> (ø)`
python/cuml/dask/manifold/__init__.py	`80.00% <0.00%> (ø)`
python/cuml/dask/neighbors/__init__.py	`85.71% <0.00%> (ø)`
python/cuml/metrics/cluster/__init__.py	`100.00% <0.00%> (ø)`
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8b4765...b822dfc. Read the comment docs.

JohnZed

I think the pickling examples look great, though we might want to make it even shorter just so we're purely emphasizing persistence. I'd take out the first visualization and maybe slim one or two other comments as noted. The single-gpu example here is very barebones, and I think that works really well. MG is obviously more complex, but minimizing complexity is really helpful there too.

docs/source/pickling_cuml_models.ipynb

wphicks

Looks great @cjnolet! Just one missing word, but besides that it looks good.

I noticed that you dropped the actual example of ParallelPostFit, which I thought was helpful before. @JohnZed is that what you meant in your comment about "minimizing complexity" with MG? If not, I'd advocate for bringing it back, since I thought it was pretty straightforward and useful. It's not absolutely necessary for this demo, though, so let's not hold up this PR if there's disagreement there.

cjnolet · 2020-11-19T16:47:49Z

@wphicks, as far as content is concerned, I fully agree that it would be useful to bring back the ParallelPostFit cell.

> It's not absolutely necessary for this demo, though, so let's not hold up this PR if there's disagreement there.

I could go either way with it. While it's not directly related to pickling models, I do think it provides users with a useful example that they may otherwise not know about. The other challenge is that, for CI to build and run the notebook with the current configuration, it looks like either cuml or the RAPIDS doc-building umbrella recipe would need to add dask-ml as a dependency. I'm okay with merging either before or after we figure out which direction to go with the dependency. I don't think there's a huge rush to get it in, but two users in the past couple of weeks have been confused about pickling distributed models.

wphicks

Sounds good! In that case, I'd advocate for getting it merged as is and doing a follow-on once we have time to figure out the dependency issues, etc. I did see the user questions you mentioned, so it would be good to make sure we put something up for that eventually, but this PR is useful and clear in its current state. Let's merge it!

cjnolet · 2020-11-30T15:42:25Z

Created #3199, which we should be able to get in early 0.18. Merging this for now to get pickling docs in for 0.17.

Adding simple dask estimator notebook to demonstrate saving/loading

0643650

cjnolet added 2 commits November 18, 2020 13:45

Renaming and updating cells

581323c

Updating source.rst

862a91d

cjnolet changed the title ~~[WIP] Adding simple dask estimator demo notebook~~ [REVIEW] Estimator Pickling Demo & Adding to Docs Nov 18, 2020

Updating changelog

4820bf3

cjnolet requested review from JohnZed, Salonijain27 and dantegd November 18, 2020 18:56

cjnolet added 2 commits November 18, 2020 13:58

Merge branch 'branch-0.17' into fea-017-dist_estimators_nb

2a8f74e

Updating pickling notebook

e3b1676

cjnolet added the 3 - Ready for Review, doc, and Notebook labels Nov 18, 2020

wphicks reviewed Nov 18, 2020

wphicks reviewed Nov 18, 2020

JohnZed reviewed Nov 19, 2020

Review updates

78bb4f7

wphicks reviewed Nov 19, 2020

wphicks requested changes Nov 19, 2020

More review feedback

b822dfc

wphicks approved these changes Nov 19, 2020

Merge branch 'branch-0.17' into fea-017-dist_estimators_nb

59a1750

cjnolet removed the 3 - Ready for Review label Nov 30, 2020

cjnolet added the 5 - Ready to Merge label Nov 30, 2020

cjnolet merged commit 63c8a44 into rapidsai:branch-0.17 Nov 30, 2020
