
reportGeneralTypeIssues in pytorch #4374

Closed
lkct opened this issue May 16, 2023 · 11 comments
@lkct

lkct commented May 16, 2023

Environment data

  • Language Server version: 2023.5.20
  • OS and version: linux x64
  • Python version (and distribution if applicable, e.g. Anaconda): 3.8.16 (miniconda)
  • python.analysis.indexing: true
  • python.analysis.typeCheckingMode: basic

Code Snippet

Install pytorch: torch==2.0.1+cu117

import torch
torch.backends.cudnn.benchmark = False  # reportGeneralTypeIssues

Additional Context

In torch/__init__.py there's a line:

import torch.backends.cudnn

Expected behavior

No error.

Actual behavior

"backends" is not a known member of module "torch" Pylance(reportGeneralTypeIssues)

@erictraut
Contributor

The torch library is a "py.typed" library, which means that it provides inlined type annotations. It also means that it follows certain rules about specifying which imported symbols it intends to re-export from its various modules. If you're interested, you can read about these rules here.

The top-level torch module doesn't re-export the backends submodule. It's worth noting that it does re-export several dozen other submodules including fft, futures, etc. So it appears that it was an explicit decision by the pytorch maintainers not to re-export backends. That means you need to import it explicitly in your code if you want to access it.

import torch.backends
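
To make the rules concrete, here's a minimal sketch using a hypothetical package mylib (not torch's actual sources): a submodule or name counts as re-exported only if the import uses a redundant alias or the name appears in __all__.

# mylib/__init__.py  (hypothetical, for illustration only)
from . import fft as fft   # redundant "as" alias: re-exported for type checkers
from . import backends     # plain import: usable at runtime, but not re-exported

__all__ = ["fft"]          # names listed in __all__ are also treated as re-exported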

@lkct
Author

lkct commented May 16, 2023

@erictraut Thanks very much for the explanation! I'll ask the pytorch community.

By the way, I'm curious whether it's possible to re-export only torch.backends.cudnn at the top level of torch without re-exporting torch.backends. Could you please explain a little more?

@erictraut
Contributor

I'm curious whether it's possible to re-export only torch.backends.cudnn at the top level of torch

It's possible to export the torch.backends.cudnn module and then re-export it as cudnn from the top-level module. Is that what you have in mind? To do so, the top-level module could use the statement from torch.backends import cudnn as cudnn or import torch.backends.cudnn as cudnn. Then your user code could simply import torch and use torch.cudnn to access that submodule.

It's also possible for the top-level module to import and re-export the backends submodule (from torch import backends as backends), and then the backends submodule could import and re-export the cudnn submodule from its namespace (from . import cudnn as cudnn). If done this way, your user code could import torch and then use torch.backends.cudnn to access that submodule.
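
A rough sketch of that second approach (illustrative file contents, not torch's actual sources):

# torch/__init__.py  (illustrative)
from torch import backends as backends   # re-export the backends submodule

# torch/backends/__init__.py  (illustrative)
from . import cudnn as cudnn             # re-export the cudnn submodule

# user code
import torch
torch.backends.cudnn.benchmark = False   # torch.backends.cudnn now resolves for the type checker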

It all depends on how the torch maintainers want to define the library's public interface.

@lkct
Author

lkct commented May 16, 2023

The first approach changes the public interface, so I'm afraid they would not adopt it.

So is the second approach the only way to expose torch.backends.cudnn as-is to users?

@erictraut
Contributor

Or you can import the submodule explicitly in your code: import torch.backends.cudnn. That's the only option that will work with the library as it is currently authored.

@lkct
Author

lkct commented May 16, 2023

Thanks!

@malfet

malfet commented May 23, 2023

The torch library is a "py.typed" library, which means that it provides inlined type annotations. It also means that it follows certain rules about specifying which imported symbols it intends to re-export from its various modules. If you're interested, you can read about these rules here.

The top-level torch module doesn't re-export the backends submodule. It's worth noting that it does re-export several dozen other submodules including fft, futures, etc.

@erictraut is there any runtime check one can use to determine whether something is re-exported or not, from pyright's point of view? At runtime, both fft and backends are module members of torch, per:

% python -c "import inspect; import torch;print('backends' in sorted(x[0] for x in inspect.getmembers(torch, inspect.ismodule)))"
True

@erictraut
Contributor

is there any runtime check one can use...?

Not that I can think of. You could perhaps walk the AST for the parent module to look for nodes that use redundant import forms. That would work for some cases, but it wouldn't account for cases where a type stub is present.
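
For what it's worth, here's a rough sketch of that AST idea. It's only my approximation of a subset of the re-export rules: it ignores stub files, __all__ mutations (+=, .append, etc.), and non-exact alias forms.

import ast
import inspect

def reexported_names(module):
    """Approximate the redundant-alias and literal __all__ rules by walking the AST."""
    tree = ast.parse(inspect.getsource(module))
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import x as x" / "from . import y as y" mark a re-export
                if alias.asname is not None and alias.asname == alias.name:
                    names.add(alias.asname)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                # a literal "__all__ = [...]" also declares re-exports
                if isinstance(target, ast.Name) and target.id == "__all__":
                    try:
                        names.update(ast.literal_eval(node.value))
                    except ValueError:
                        pass
    return names

import torch
print("backends" in reexported_names(torch))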

Out of curiosity, why do you want to verify this at runtime? Are you a library maintainer and want to validate this in a test? Or are you a library consumer?

@malfet

malfet commented May 23, 2023

Not that I can think of. You could perhaps walk the AST for the parent module to look for nodes that use redundant import forms. That would work for some cases, but it wouldn't account for cases where a type stub is present.

Is there a PEP describing why it's proper to expose it one way vs. the other?

Out of curiosity, why do you want to verify this at runtime?

Because anything that is not covered by CI has a high probability of regressing again.

Are you a library maintainer and want to validate this in a test? Or are you a library consumer?

The former.

@erictraut
Contributor

Is there a PEP describing why it's proper to expose it one way vs. the other?

There isn't a single PEP that covers this topic completely. PEP 484 introduced the concept of type stub files and described how imported symbols should be interpreted by type checkers within stub files.

PEP 561 introduced the notion of "py.typed" libraries and indicated that it was up to library authors to decide whether to include type information inline or in stub files that ship with the library. (Most library authors have found that inlined type information is much easier to maintain, so that is by far the most typical approach.) PEP 561 unfortunately doesn't provide guidance about how imported symbols are meant to be interpreted by a type checker for a "py.typed" library.

This topic has been discussed at length within the typing community, and we have converged on a set of rules that are now implemented in pyright. These rules can be found in the pyright documentation and also on the python/typing community site. Other type checkers, including mypy, have implemented parts of these rules, but there are gaps. For example, mypy doesn't support all of the documented ways to manipulate the __all__ symbol.
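
For reference, the __all__ idioms covered by those documented rules are, to the best of my recollection (check the pyright docs or typing spec for the authoritative list), roughly the following; this is where support across type checkers diverges:

__all__ = ["foo", "bar"]        # assignment of a list of string literals
__all__ += ["baz"]              # augmented assignment with a literal list
__all__.append("qux")           # appending a single name
__all__.extend(["quux"])        # extending with a literal list
__all__.remove("bar")           # removing a name
# __all__ += other_module.__all__   # pulling in another module's __all__ is also recognized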

In an effort to help library maintainers, I've added a mode to pyright that allows it to verify conformance and type completeness for a "py.typed" library. To use this mode, run pyright --verifytypes <package-name> from the command line. You can easily incorporate this tool into a CI pipeline to catch regressions. You can use this command in conjunction with the --outputjson option if you want to process the output programmatically in a script (e.g. to baseline the output and detect regressions).
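
If it helps, here's a rough sketch of the baseline idea as a CI script. The JSON field names ("typeCompleteness", "completenessScore") are my recollection of the current --outputjson schema, so verify them against the pyright version you pin:

import json
import subprocess
import sys

BASELINE = 0.27  # last known-good completeness score for the package

result = subprocess.run(
    ["pyright", "--verifytypes", "torch", "--outputjson"],
    capture_output=True, text=True,
)
report = json.loads(result.stdout)
score = report["typeCompleteness"]["completenessScore"]

print(f"type completeness: {score:.2%}")
if score < BASELINE:
    sys.exit(f"regression: {score:.2%} is below the baseline of {BASELINE:.2%}")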

I just tried running pyright --verifytypes torch, and I noticed that it hangs. This appears to be due to an expression for the CUDA_TYPE_NAME_MAP symbol in the utils/hipify/cuda_to_hip_mappings.py module. I've filed a pyright bug, and I'll work on addressing that shortly. I'll try to get that fixed in this week's pyright release (version 1.1.310, which is due out later today). If I comment out this problematic expression, I can then run verifytypes on torch, and it gives me the following summary:

Symbols exported by "torch": 13171
  With known type: 3558
  With ambiguous type: 674
  With unknown type: 8939
  Functions without docstring: 3258
  Functions without default param: 562
  Classes without docstring: 359

Other symbols referenced but not exported by "torch": 7389
  With known type: 4149
  With ambiguous type: 129
  With unknown type: 3111

Type completeness score: 27%

Completed in 10.113sec

Another validation approach that some library maintainers are adopting is the use of the typing.assert_type function, which was added in Python 3.11. The maintainers of pandas, for example, have made extensive use of this in their test suite. You can read the documentation for this function here. All of the major Python type checkers (pyright, mypy, pyre, pytype) support assert_type.
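
For example (a minimal illustration, not tied to torch): assert_type is a no-op at runtime, but a type checker verifies that the inferred type of the first argument matches the second:

from typing import assert_type  # Python 3.11+; available from typing_extensions on older versions

def double(x: int) -> int:
    return x * 2

assert_type(double(2), int)      # OK: inferred type is int
# assert_type(double(2), str)    # a type checker would flag this line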

Let me know if you run into any problems or have thoughts about how pyright (or tooling in general) could help you out here.

@erictraut
Contributor

I've addressed the issue that caused the hang when running pyright --verifytypes torch. This fix will be included in version 1.1.310 of pyright.
