Avoid hard import of sklearn in base module. #5663

Merged (2 commits) on Nov 21, 2023
13 changes: 9 additions & 4 deletions python/cuml/internals/base.pyx
@@ -28,7 +28,11 @@ from cuml.internals.safe_imports import (
 np = cpu_only_import('numpy')
 nvtx_annotate = gpu_only_import_from("nvtx", "annotate", alt=null_decorator)

-from sklearn.utils import estimator_html_repr
+try:
+    from sklearn.utils import estimator_html_repr
+except ImportError:
+    estimator_html_repr = None
+

 import cuml
 import cuml.common
@@ -447,9 +451,10 @@ class Base(TagsMixin,

     def _repr_mimebundle_(self, **kwargs):
Contributor

Can you verify if IPython handles this _repr_mimebundle_ function returning None in the same way as it handles this function not existing? In other words, is the notebook output without sklearn worse than the output before #5630?
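One way to check this empirically is to ask IPython's formatter machinery which MIME types it produces for each case. The sketch below is only an illustration and not part of this PR: `WithoutHook`, `HookReturnsNone`, and `rendered_mime_types` are hypothetical names, and it assumes `IPython.core.formatters.DisplayFormatter` can be instantiated outside a running kernel. If IPython is not installed, the helper returns None.

```python
class WithoutHook:
    """Object that defines no _repr_mimebundle_ at all."""

    def __repr__(self):
        return "Estimator()"


class HookReturnsNone:
    """Object whose _repr_mimebundle_ exists but returns None."""

    def __repr__(self):
        return "Estimator()"

    def _repr_mimebundle_(self, **kwargs):
        # Models the case where scikit-learn is unavailable.
        return None


def rendered_mime_types(obj):
    """Return the sorted MIME types IPython renders for obj.

    Returns None when IPython is not installed.
    """
    try:
        from IPython.core.formatters import DisplayFormatter
    except ImportError:
        return None
    data, _metadata = DisplayFormatter().format(obj)
    return sorted(data)


if __name__ == "__main__":
    print(rendered_mime_types(WithoutHook()))
    print(rendered_mime_types(HookReturnsNone()))
```

If both calls print the same list, a hook that returns None is displayed the same way as a missing hook.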

Member

I don't know the answer, but I think you could return {"text/plain": repr(self)} in the case of scikit-learn not being installed and that would lead to the same thing being displayed as now. The way the mimebundles work is that you can return several (text, html, png, video, ...) and the UI then chooses the one it thinks is best. So if we only provide the text-based one, it should always choose that?
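A minimal sketch of this suggestion, for illustration only. `DemoEstimator` is a hypothetical class, and the injectable `html_renderer` argument stands in for `estimator_html_repr` (passing None models scikit-learn not being installed):

```python
class DemoEstimator:
    """Hypothetical stand-in for an estimator class; not cuml's Base."""

    def __init__(self, html_renderer=None):
        # Assumption: html_renderer plays the role of estimator_html_repr.
        self._html_renderer = html_renderer

    def __repr__(self):
        return "DemoEstimator()"

    def _repr_mimebundle_(self, **kwargs):
        # Always provide the plain-text representation.
        bundle = {"text/plain": repr(self)}
        # Add HTML only when a renderer is available.
        if self._html_renderer is not None:
            bundle["text/html"] = self._html_renderer(self)
        return bundle


if __name__ == "__main__":
    print(DemoEstimator()._repr_mimebundle_())
```

With this shape the method never returns None, so the frontend always has at least the text representation to choose from.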

Contributor Author

I read the documentation on this and interpreted the phrase "[i]f this returns something, other _repr_*_ methods are ignored" to mean that if we return None from this function, we are not returning "something", so the other functions should be used. I also ran some tests in a Jupyter notebook, which seemed to confirm that. However, it would be good to have that confirmed. I don't think that returning {"text/plain": repr(self)} is a good idea, because it seems that this function would take precedence over other potentially applicable functions, which could lead to surprising behavior, especially when child classes try to specialize the representation in a meaningful way.

@betatim Was there a specific reason that you chose to use _repr_mimebundle_ as opposed to overriding _repr_html_?

Contributor Author

@csadorf csadorf Nov 20, 2023

I'm slightly revising my comment. I think returning {"text/plain": repr(self)} is a good idea as long as we use the _repr_mimebundle_ function; however, I'm wondering whether it wouldn't be better to use _repr_html_ in the first place.

Contributor

@bdice bdice Nov 20, 2023

I agree that it’s better to use _repr_html_ for supporting only one special repr format.
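A minimal sketch of the `_repr_html_` alternative discussed here, not the code that was merged. `HtmlOnlyEstimator` is a hypothetical class, and the `html_renderer` argument again stands in for `estimator_html_repr` (None models scikit-learn being absent):

```python
class HtmlOnlyEstimator:
    """Hypothetical class that specializes only the HTML representation."""

    def __init__(self, html_renderer=None):
        # Assumption: html_renderer plays the role of estimator_html_repr.
        self._html_renderer = html_renderer

    def __repr__(self):
        return "HtmlOnlyEstimator()"

    def _repr_html_(self):
        # Returning None signals that no HTML representation is available,
        # leaving the plain-text repr as the fallback.
        if self._html_renderer is None:
            return None
        return self._html_renderer(self)


if __name__ == "__main__":
    print(HtmlOnlyEstimator()._repr_html_())
    print(HtmlOnlyEstimator(lambda est: "<b>demo</b>")._repr_html_())
```

Because only one format is customized, child classes remain free to define other `_repr_*_` methods without a mimebundle overriding them.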

"""Prepare representations used by jupyter kernels to display estimator"""
-        output = {"text/plain": repr(self)}
-        output["text/html"] = estimator_html_repr(self)
-        return output
+        if estimator_html_repr is not None:
+            output = {"text/plain": repr(self)}
+            output["text/html"] = estimator_html_repr(self)
+            return output

     def set_nvtx_annotations(self):
         for func_name in ['fit', 'transform', 'predict', 'fit_transform',