-
Notifications
You must be signed in to change notification settings - Fork 534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cython-lint configuration. #5439
Conversation
@@ -9,3 +9,7 @@ ignore-words-list = "inout,numer,startd,couldn,referr" | |||
builtin = "clear" | |||
# disable warnings about binary files and wrong encoding | |||
quiet-level = 3 | |||
|
|||
[tool.cython-lint] | |||
max-line-length = 999 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to re-enable E501 with a reasonable line length limit (following our flake8/black conventions) in a follow-up PR. Wrapping lines takes a little more manual effort than I wanted to put in for this first pass.
@@ -347,7 +345,7 @@ class BaseRandomForestModel(Base): | |||
fil_sparse_format, threshold=0.5, | |||
output_class=False, | |||
predict_proba=False) -> CumlArray: | |||
_, n_rows, n_cols, dtype = \ | |||
_, _, _, _ = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this call to input_to_cuml_array
doing anything? Its results are not used but it might be doing some error checking / validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will set the classe's input_dtype. Getting rid of unexpected side-effects like that is one of my refactoring goals. Raelized it's not involving the class, might not do anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is doing nothing. The CumlArray conversion happens in the FIL call. I suspect this is an artifact of a time before FIL was used for inference here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wphicks Can you push changes or file an issue if you'd like to see a change here? I don't have enough familiarity with this code, unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit, removing the affected line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it. ❤️
Can you add a note to https://github.com/rapidsai/cuml/blob/branch-23.06/CONTRIBUTING.md#summary-of-pre-commit-hooks ?
Style checks are failing because of incorrect dependency versions. I opened #5440 to resolve it. |
@csadorf I resolved merge conflicts on this PR. Can you re-approve it now that you have Python code owner privileges? |
/merge |
from cuml.manifold.simpl_set import fuzzy_simplicial_set, \ | ||
simplicial_set_embedding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting the unused import of simplicial_set_embedding
broke test_simpl_set.py
. Is this meant to be a public API exposed from cuml.manifold.umap
? Do we need to worry about other unused imports removed in this PR that are meant to be exposed as public APIs? If so, the proper mechanism is to define __all__ = ["simplicial_set_embedding", ...]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will have to perform a more thorough check on which symbols should be included in the public API (and either excluded from linting or explicitly added to __all__
). I will take care of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @csadorf!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to merge this PR sooner, we could disable the linter checks for unused Python imports (unused Cython cimports should all be safe to remove).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please go ahead with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed an issue about this: MarcoGorelli/cython-lint#79
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cython-lint
also doesn't detect usage in __all__
, so that isn't a viable solution here. I filed a second issue: MarcoGorelli/cython-lint#80
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I added these imports back with # no-cython-lint
in d773305. That should resolve the problem. I reviewed the changes in this PR for other possible public API breaks but I didn't see any candidates except for #5439 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment that explains the need for the markers there.
@@ -26,10 +25,8 @@ from cuml.internals.input_utils import input_to_cupy_array | |||
from cuml.explainer.base import SHAPBase | |||
from cuml.explainer.common import get_cai_ptr | |||
from cuml.explainer.common import model_func_call | |||
from cuml.explainer.common import output_list_shap_values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a public API? If so we need to add it back with # no-cython-lint
since it is an unused import. Otherwise no action is needed.
This is the only questionable case I saw, all the other unused imports are clearly not meant to be public APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is safe to assume that it is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just reviewed this change set once more with special attention to the removed imports. That looks all good to me.
I've also pushed three commits with some changes, but once tests pass, this should be ready for merge.
from cuml.manifold.simpl_set import fuzzy_simplicial_set, \ | ||
simplicial_set_embedding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment that explains the need for the markers there.
python/cuml/manifold/umap_utils.pyx
Outdated
@@ -128,5 +128,5 @@ def find_ab_params(spread, min_dist): | |||
yv = np.zeros(xv.shape) | |||
yv[xv < min_dist] = 1.0 | |||
yv[xv >= min_dist] = np.exp(-(xv[xv >= min_dist] - min_dist) / spread) | |||
params, covar = curve_fit(curve, xv, yv) | |||
params, _covar = curve_fit(curve, xv, yv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
params, _covar = curve_fit(curve, xv, yv) | |
params, _ = curve_fit(curve, xv, yv) |
Should do it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and applied that change.
@@ -347,7 +345,7 @@ class BaseRandomForestModel(Base): | |||
fil_sparse_format, threshold=0.5, | |||
output_class=False, | |||
predict_proba=False) -> CumlArray: | |||
_, n_rows, n_cols, dtype = \ | |||
_, _, _, _ = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit, removing the affected line.
@csadorf Thanks for seeing this through! Your final changes LGTM. |
Note that the wheeltest failures are a known and unrelated issue. They should be resolved by Monday. |
This PR adds a configuration for
cython-lint
. It eliminates a lot of unused imports and unused variables from the Cython code.