-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Estimator Pickling Demo & Adding to Docs #3154
[REVIEW] Estimator Pickling Demo & Adding to Docs #3154
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, @cjnolet! Good, straightforward walkthrough. My only general feedback is that it might be helpful to have a little bit more context for some of what you're demonstrating in the middle there.
Codecov Report
@@ Coverage Diff @@
## branch-0.17 #3154 +/- ##
===============================================
+ Coverage 70.50% 70.68% +0.18%
===============================================
Files 196 197 +1
Lines 15442 15564 +122
===============================================
+ Hits 10887 11002 +115
- Misses 4555 4562 +7
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pickling examples look great, though we might want to make it even shorter just so we're purely emphasizing persistence. I'd take out the first visualization and maybe slim one or two other comments as noted. The single-gpu example here is very barebones, and I think that works really well. MG is obviously more complex, but minimizing complexity is really helpful there too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @cjnolet! Just one missing word, but besides that it looks good.
I noticed that you dropped the actual example of ParallelPostFit
, which I thought was helpful before. @JohnZed is that what you meant in your comment about "minimizing complexity" with MG? If not, I'd advocate for bringing it back, since I thought it was pretty straightforward and useful. It's not absolutely necessary for this demo, though, so let's not hold up this PR if there's disagreement there.
@wphicks, as far as content is concerned, I fully agree that it would be useful to bring back the
I could go either way with it. While it's not directly related to pickling models, I do think it provides a users a useful example that they may otherwise not know about. The other challenge is that in order for CI to build and run the notebook with the current configuration, it looks like either cuml would need to add dask-ml as a dependency or the RAPIDS doc-building umbrella recipe will need to depend on it. I'm okay with either merging before or after we figure out which direction to go with the dependency. I don't think there's a huge rush to get it in, but there have been 2 users in the past couple of weeks that have been confused about pickling of distributed models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! In that case, I'd advocate for getting it merged as is and doing a follow-on once we have time to figure out the dependency issues, etc. I did see the user questions you mentioned, so it would be good to make sure we put something up for that eventually, but this PR is useful and clear in its current state. Let's merge it!
Created #3199, which we should be able to get in early 0.18. Merging this for now to get pickling docs in for 0.17. |
We have had a few inquiries recently about model serialization and I think it would be useful to have a very simple notebook demonstrating very basic operation.