Add ruff to ci setup #82
Thanks for working on this @adonath! I did a quick browse of the results, and it looks like a few error codes probably should simply be ignored
With those two ignored, hopefully it'll be a useful setup to catch other issues that may otherwise slip in.
Yes, that sounds like the right choice to me. CI is nice, and a lot less invasive than pre-commit.
I think the convention in that case is to do explicit re-exports: from numpy import abs as abs
Oh, good point - if that makes the checker happy, then sure seems good to fix up that way.
Probably this PR, since it's needed to get CI green here? It's still not that much code, so fine to review in one go.
I don't think I've ever seen the The correct way to be explicit about exports is to define
And yes, make whatever improvements you need to make in this same PR. The point of this is also to make sure ruff isn't warning about any false positives, and there's no way to tell that unless we see the changes.
The import * error codes also need to be ignored.
Some references: Also in the typing PEP, considering stub files in this case:
This is actually a good example of an issue I have with a lot of linters, especially "code quality" linters. In a sane world, a linter would flag
We can do that here if you want, ruff has it. https://docs.astral.sh/ruff/rules/useless-import-alias/#useless-import-alias-plc0414
The "problem" with That said, it
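To make the two re-export styles from this thread concrete, here is a minimal sketch. It builds a throwaway module called demo_reexport (a hypothetical name, not part of array-api-compat) from source text, and it uses math in place of numpy so that it runs with the standard library alone. It puts the redundant-alias form next to an __all__ declaration and checks which names a star import actually picks up.

```python
import sys
import types

# Source of a hypothetical module; `math` stands in for numpy here.
SOURCE = '''
from math import sqrt as sqrt   # redundant alias: signals an intentional re-export
from math import floor          # plain import: bound in the module, not declared public
__all__ = ["sqrt"]              # explicit list of public names
'''

demo = types.ModuleType("demo_reexport")
exec(SOURCE, demo.__dict__)
sys.modules["demo_reexport"] = demo

# A star import only copies the names listed in __all__.
namespace = {}
exec("from demo_reexport import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))
print(star_imported)
```

At runtime both import lines bind a name in the module, so the alias form only matters to static tools such as linters and type checkers, while __all__ also changes what a star import does.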
tests/test_isdtype.py
Outdated
    return res

@pytest.mark.parametrize("library", ["cupy", "numpy", "torch"])
def test_isdtype_spec_dtypes(library):
-    xp = import_('array_api_compat.' + library)
+    xp = pytest.importorskip('array_api_compat.' + library)
Why did you change this? This is not equivalent to the import_ helper.
And the same elsewhere. The only tests that should be skipped are cupy tests, because cupy cannot be installed everywhere (and in particular, not on CI).
OK, I see. I looked at the dependency definitions and could not find torch listed as a required or extra dependency. So I assumed it was optional and that the tests could be skipped.
By default those tests failed for me, because torch was not installed.
All runtime dependencies are optional. But for the tests, we want to require all optional dependencies so that we make sure the tests are actually run. The exception is cupy, which simply cannot be installed everywhere.
I'm not sure if we can express this in a better way in the package metadata.
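The policy described here (skip only when cupy is missing, fail for every other missing library) can be sketched as a small helper. This is an illustration rather than the repository's helper: the name import_or_skip and the SKIPPABLE tuple are hypothetical, and it raises unittest.SkipTest from the standard library instead of calling pytest.importorskip so that the sketch has no third-party dependency.

```python
import importlib
import unittest

# Libraries that may legitimately be missing; "cupy" follows the discussion above.
SKIPPABLE = ("cupy",)


def import_or_skip(library, skippable=SKIPPABLE):
    """Import `library`; skip only if it is missing and on the skippable list."""
    try:
        return importlib.import_module(library)
    except ImportError:
        if any(name in library for name in skippable):
            raise unittest.SkipTest(f"{library} is not installed")
        # Any other missing dependency should fail loudly instead of being skipped.
        raise
```

With this shape, a missing torch surfaces as an ImportError in the test run, so an incomplete test environment cannot silently pass.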
One could add another entry in the extras_require dict, like {"test-no-cupy": ["numpy", "torch"], "test": ["numpy", "torch", "cupy"]}. But I would suggest not doing this in this PR. If the package grows further, it is probably worth using tox or similar to handle the different environments.
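As a rough sketch of that suggestion, the two extras could be written so that the cupy variant is derived from the cupy-free one. The extra names come from the comment above; the variable names are hypothetical, and the dict is only meant to be passed as the extras_require argument of setuptools.setup in a setup.py.

```python
# Hypothetical fragment of a setup.py; nothing here is taken from the actual package metadata.
TEST_NO_CUPY = ["numpy", "torch"]

extras_require = {
    "test-no-cupy": TEST_NO_CUPY,
    # The full test extra is the cupy-free list plus cupy, so the two cannot drift apart.
    "test": TEST_NO_CUPY + ["cupy"],
}

print(extras_require)
```

A contributor without a GPU could then install with pip install ".[test-no-cupy]", while an environment that supports cupy would use the test extra.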
tests/test_isdtype.py
Outdated
@@ -61,12 +59,12 @@ def isdtype_(dtype_, kind):
         res = dtype_categories[kind](dtype_)
     else:
         res = dtype_ == kind
-    assert type(res) is bool
+    assert isinstance(res, bool)
This should not be changed. This is a test. We want the type to be exactly bool.
I see. This is highlighted as an error in ruff, but in this case it is the desired behavior. I will add an ignore comment instead.
Done.
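For readers unfamiliar with the distinction, the following sketch shows how an exact type check differs from isinstance. It uses int and bool from the standard library; the helper name is_exact_type is hypothetical.

```python
def is_exact_type(value, cls):
    """Return True only if `value` is an instance of exactly `cls`, not of a subclass."""
    return type(value) is cls


flag = True

# bool is a subclass of int, so isinstance accepts it as an int.
print(isinstance(flag, int))
# The exact check does not: the type of True is bool, not int.
print(is_exact_type(flag, int))
print(is_exact_type(flag, bool))
```

When the exact check is intentional, as in the test above, the linter warning can be silenced on that one line with a noqa comment naming the rule code (ruff's type-comparison rule is E721, assuming that is the rule that fired here).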
# Note: torch.linalg.cross does not default to axis=-1 (it defaults to the
# first axis with size 3), see https://github.com/pytorch/pytorch/issues/58743
def cross(x1: array, x2: array, /, *, axis: int = -1) -> array:
Why did you move these to this file? These were in linalg.py because they're linalg only functions.
Yes, I understand. In the end it seemed conceptually simpler to use linalg.py the same way as __init__.py, so that it only contains the explicit imports and the __all__ declarations. This avoids the need for del statements, and if something is missing from __all__ it errors as an unused import during the style check. I hope this makes sense. Otherwise I could introduce _aliases_linalg.py or similar.
exclude : callable, optional
    A callable that takes a name and returns True if the name should be
    excluded from the list of members.
extend_all : bool, optional
This is monkeypatching torch.__all__ etc.? We don't want to do that.
Yes, but this just keeps the current behavior. Take a look at https://github.com/data-apis/array-api-compat/blob/main/array_api_compat/torch/__init__.py#L3
I have not checked whether this is still necessary, but probably we have to keep it this way?
I checked with the following code:
import torch
torch_all = set(torch.__all__)
public = set([name for name in dir(torch) if not name.startswith("_")])
print(torch_all.difference(public))
print(public.difference(torch_all))
And this gives:
set()
{'complex64', 'eig', 'special', ... , 'QInt8Storage', 'segment_reduce', 'ComplexDoubleStorage'}
So indeed __all__ is missing multiple members and, most importantly, it does not contain the dtypes.
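The same comparison can be wrapped in a small function and tried on a stand-in module when torch is not installed. This is a sketch: compare_all_with_public and the demo_lib module are hypothetical names, and the stand-in only mimics the situation reported above (a dtype-like attribute missing from __all__).

```python
import types


def compare_all_with_public(module):
    """Return (names in __all__ but not public, public names missing from __all__)."""
    declared = set(getattr(module, "__all__", ()))
    public = {name for name in dir(module) if not name.startswith("_")}
    return declared - public, public - declared


# Stand-in for a library whose __all__ forgets a public attribute.
demo = types.ModuleType("demo_lib")
demo.add = lambda a, b: a + b
demo.float32 = "float32"
demo.__all__ = ["add"]

stale, undeclared = compare_all_with_public(demo)
print(stale)
print(undeclared)
```

Passing the real torch module instead of the stand-in should reproduce the two sets printed in the comment above.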
> Yes, but this just keeps the current behavior. Take a look at main/array_api_compat/torch/__init__.py#L3
That code does not modify the torch.__all__ list:
>>> import torch
>>> torch_all = list(torch.__all__)
>>> import array_api_compat.torch
>>> torch_all2 = list(torch.__all__)
>>> torch_all == torch_all2
True
Generally speaking, this package should not monkeypatch the underlying libraries.
> So indeed __all__ does not contain multiple members and most importantly it does not contain the dtypes.
Yes, that's a known issue. pytorch/pytorch#91908
Oh, now I understand. I did not mean to actually modify torch.__all__ in place, but to copy and extend it instead. I'll fix that behavior.
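The difference between extending in place and copy-then-extend can be sketched as follows. The function extended_all and the fake_torch module are hypothetical; they only illustrate that the wrapper namespace gets its own list while the wrapped library's __all__ stays untouched.

```python
import types


def extended_all(module, extra_names):
    """Return a new list: the module's __all__ plus extra names, without mutating it."""
    names = list(getattr(module, "__all__", []))  # list(...) makes a copy
    for name in extra_names:
        if name not in names:
            names.append(name)
    return names


# Stand-in for the wrapped library.
fake_torch = types.ModuleType("fake_torch")
fake_torch.__all__ = ["add", "mul"]

wrapper_all = extended_all(fake_torch, ["float32", "add"])
print(wrapper_all)
print(fake_torch.__all__)
```

Writing fake_torch.__all__.extend(...) or fake_torch.__all__ += [...] instead would change the list object owned by the library, which is the monkeypatching the review asks to avoid.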
array_api_compat/__init__.py
Outdated
@@ -19,4 +19,18 @@
"""
__version__ = '1.4.1'

-from .common import *
+from .common import (
Can ruff help keep these two files in sync?
This part I was not happy with either, because the import path is two-level: functions are imported from array_api_compat/common/_helpers.py into array_api_compat.common, and from there they are imported into array_api_compat. So if anything gets added to _helpers.py it has to be updated in two places. Maybe one keeps the * import here and ignores...
I guess it is not important that users can do both from array_api_compat import get_namespace and from array_api_compat.common import get_namespace. So my proposal would be to decide on one instead? However, this would be a backwards-incompatible change...
Alternatively we could just keep the * import and ignore the warning.
If ruff can't automatically keep __all__ synced between a module and submodule with explicit imports, then using import * and just setting __all__ = common.__all__ is the next best thing.
This would be a useful feature for some linter. We do this sort of thing in SymPy, for instance (e.g., at https://github.com/sympy/sympy/blob/master/sympy/core/__init__.py and https://github.com/sympy/sympy/blob/master/sympy/__init__.py), and we have to manually keep them in sync.
OTOH, having common as a submodule that people can import from isn't that important, unlike torch, numpy, etc., which are important since those are the actual array API namespaces.
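The "next best thing" mentioned in this comment can be sketched with two throwaway modules. The names demo_common and demo_package are hypothetical stand-ins for the submodule and the top-level package; get_namespace is borrowed from the discussion only as an example name.

```python
import sys
import types

# Stand-in for the submodule that owns the helpers and declares __all__.
common = types.ModuleType("demo_common")
exec(
    "def get_namespace():\n"
    "    return 'namespace'\n"
    "def _internal_helper():\n"
    "    return None\n"
    "__all__ = ['get_namespace']\n",
    common.__dict__,
)
sys.modules["demo_common"] = common

# Stand-in for the package __init__: star import plus a forwarded __all__.
package = types.ModuleType("demo_package")
exec(
    "import demo_common\n"
    "from demo_common import *\n"
    "__all__ = demo_common.__all__\n",
    package.__dict__,
)

print(package.__all__)
print(hasattr(package, "_internal_helper"))
```

With this layout a new helper only has to be added to the submodule's __all__. Note that the assignment shares one list object between the two modules, so a package that wants to add names of its own should copy the list first.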
> If ruff can't automatically keep __all__ synced between a module and submodule with explicit imports then using import * and just setting __all__ = common.__all__ is the next best thing.
I don't think any linter would do this, because developers can choose to only re-expose a subset. The correct behavior would depend on their intention.
I'll change to the * import now to keep backwards compatibility.
Re-exports are the whole point of this package. Anyway, the duplication of |
Thanks everyone for the comments. I ended up doing more work than initially anticipated, but it's all related. Let me quickly summarize the related changes:
I don't really have a preference of |
For me it's a little easier to remember to update
As I noted in a line comment, it would be great if a linter can help us keep them consistent.
These editors are using a linter under the hood. If ruff doesn't do this, we should use a linter that does. |
Yes, but you have to change the import into the public namespace anyway. And if you forget to update
In vscode I use |
Concerning the type annotations and checking I think there are still two open issues:
Thanks @lucascolley! I enabled the corresponding option in ruff. For now I only added it as a command line option in the CI; long term it might be better to have a ruff config file.
Above it sounded like Aaron wanted |
There is a test failure in the CI, but it seems unrelated to this PR: https://github.com/data-apis/array-api-compat/actions/runs/7702062263/job/20989506932?pr=82#step:6:5798
So just to be clear, is ruff now checking for I think probably my number 1 development mistake when working on this library is forgetting to add a new wrapper function to I don't know if there's some way to tell a linter, "warn me if I have a name in this file that isn't exported". Most linters do that for |
You can ignore the test failures. Looks like the numpy 1.21 failure needs to be added to this list array-api-compat/numpy-1-21-xfails.txt Line 99 in 916a84b
I see your concern here. The "source of truth" is now the
Co-authored-by: Aaron Meurer <asmeurer@gmail.com>
I'm not aware of any linter that would do this. Playing around a bit, I think this is roughly what you are aiming for:
import importlib
from array_api_compat._internal import _get_all_public_members

# Names that are imported into the alias modules but are not meant to be exported.
EXCLUDE = {
    "numpy": {"get_xp", "annotations", "partial", "np"},
    "torch": {"get_xp", "wraps", "annotations", "builtin_all", "builtin_any", "vecdot_linalg"},
}

def check_aliases_consistency(library):
    """Check that the public API of the library is consistent with the aliases."""
    submodule = importlib.import_module(f"array_api_compat.{library}")
    all_public = set(submodule.__all__)
    if hasattr(submodule, "linalg"):
        all_public_linalg = set(submodule.linalg.__all__)
    else:
        all_public_linalg = set()
    aliases = set(_get_all_public_members(submodule._aliases))
    # Anything printed here is defined in _aliases but exported nowhere.
    print(aliases.difference(all_public | all_public_linalg | EXCLUDE[library]))
Which shows for example when calling
So these are forgotten exports. However those should definitely show up in the tests later. |
That looks like it would be a useful script. That already doesn't show up in the tests because the UniqueAllResult name is not part of the standard (the standard only requires the function to return a namedtuple, but the type of that namedtuple is not in the spec namespace). So indeed UniqueAllResult should be either not exported at all or exported in both files. In fact, I would be cautious in general about relying on the test suite to catch these mistakes. There are a dozen reasons the test suite might not catch something. Maybe a function is wrapped to fix some behavior that is missed by the test suite. Or (more likely), maybe the test for a function is xfailed for some other reason.
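One way to turn the script above into something a test could assert on is to return the set of forgotten names instead of printing it. The sketch below is independent of array-api-compat: find_unexported and the demo modules are hypothetical, and it collects public names with vars() rather than the package's _get_all_public_members helper.

```python
import types


def find_unexported(impl_module, public_modules, exclude=frozenset()):
    """Return public names defined in impl_module that no public module lists in __all__."""
    implemented = {name for name in vars(impl_module) if not name.startswith("_")}
    exported = set()
    for module in public_modules:
        exported |= set(getattr(module, "__all__", ()))
    return implemented - exported - set(exclude)


# Stand-in for an implementation module with one wrapper, one result type, one stray import.
aliases = types.ModuleType("demo_aliases")
aliases.unique_all = lambda x: x
aliases.UniqueAllResult = tuple
aliases.partial = object()

# Stand-in for the public namespace, which forgot to export the result type.
public = types.ModuleType("demo_public")
public.__all__ = ["unique_all"]

missing = find_unexported(aliases, [public], exclude={"partial"})
print(missing)
```

A test that asserts the returned set is empty would catch a forgotten export even when the function's own tests are xfailed or the name is not part of the standard.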
def import_(library):
    if 'cupy' in library:

def import_or_skip_cupy(library):
FYI, I'm probably going to rename this back, because I need to add some additional skipping logic for jax as well at #84
Sure. I think it would be good to keep the "or skip" in the name, because from looking at the places where it was used, it was not clear what it does.
Thanks @asmeurer, @vnmabus, @rgommers and @lucascolley for the comments and review!
Revert __all__ related changes from #82
As proposed in #73 I'm adding a CI check that runs ruff as a linter. It currently runs as an "allowed failure". I can adapt the configuration as needed and optionally add ruff as a pre-commit hook. But I think so far array-api-compat does not use pre-commit hooks, so I opted for the CI.