Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make stubgen re-export correctly imported names when module has no __all__ #3981

Closed
dmoisset opened this issue Sep 22, 2017 · 15 comments
Closed

Comments

@dmoisset
Copy link
Contributor

PEP-484 describes in which situations an imported name (which comes from an import name or from module import name) in a stub file are considered part of the exported interface. When generating a stub, stubgen must decide which of the imported names are intended to be reexported (ehich then must be added to the stub as import name as name) and which aren't.

Currently stubgen handles properly most situations where __all__ has been defined, but does not attempt to force reexports otherwise (some reexports may happen anyway if a name imported with an alias is required in an annotation).

The main problem with the "no __all__ is set" is that we do not know author intent and we have to guess. There was some discussion within #3169 which was unresolved so I'm opening this issue as a followup

The 2 extreme alternatives when no __all__ are:
a) Assume every name coming from an import in the source module is intended to be exported, and add the aliases in the stubs.
b) Assume no name coming from an import in the source module is intended to be exported, and just keep the imports needed for annotations (the current status)

Other more complicated heuristics may be possible in the spectrum from a to b.

@dmoisset
Copy link
Contributor Author

Following up the discussion that came from #3169 (comment) and suggests (a), from checking a couple of simple examples (even the trivial examples in stubgen tests), I see that generated stubs will end up with no required imports. Most of the imports are normally not intended for reexport except on some high-level package modules (which are typically __init__.py files with just a lot of import statements), and the workaround for those cases will require the stubgen user to delete many import statements, and remove many as name reexports in them too (and also figuring out which is which).

My inclination towards (b) is that it's easier to add a few reexports when required.

Of course, any scenario can be fixed by adding an adequate __all__ to the source file, which makes user intent explicit and allows the tool to do the right thing in the first place instead of guessing.

@JelleZijlstra
Copy link
Member

I also favor option (b). The rule about import as in PEP 484 is about stub files, not implementation files, so I don't think it's relevant here.

@gvanrossum
Copy link
Member

gvanrossum commented Sep 23, 2017 via email

@ilevkivskyi
Copy link
Member

@JelleZijlstra
I am not sure what do you mean. Consider a (not so rare) situation:

# lib/utils.py
class PublicClass:
    ...
# lib/__init__.py
from .utils import PublicClass

# main.py
from lib import PublicClass

It works perfectly and passes mypy. But after running stubgen on lib this fails with Module 'lib' has no attribute 'PublicClass' because now stubgen uses (b).

@JelleZijlstra
Copy link
Member

Sure, lots of stubgen-generated code needs manual fixes later. I just think (b) is usually closer to being correct than (a) is. Perhaps we can make an additional exception for relative imports in files named __init__.py, but that might introduce too much complexity.

@ilevkivskyi
Copy link
Member

I just think (b) is usually closer to being correct than (a) is.

The point is that (a) just makes stubs a bit noisier (but they are not intended to be read often anyway), while (b) causes actual false positives, and what is worse, these false positives will be discovered not by a library maintainer or whoever produces stubs, but by users of mypy/typeshed.

@gvanrossum
Copy link
Member

gvanrossum commented Sep 23, 2017 via email

@ilevkivskyi
Copy link
Member

@gvanrossum

I wonder if there needs to be a special case for __init__.py when it imports things from submodules of the same package.

I was thinking about two options:
a) Only re-export imported names in __init__.py
b) Always re-export names except names imported from standard library (the module names are currently found in moduleinfo.py)

@tjdevries
Copy link

I'm not sure what the right solution for this would be, but it'd be really helpful for the situation described below to:

  1. Add a new option for no_implicit_reexport that allows the option A mentioned above in @ilevkivskyi 's post
  2. Or add a new configuration switch that makes everything imported (perhaps imported, but not used?) in an __init__.py be considered as part of an implicit __all__ (which if I understood the code well enough, should also solve this problem).
# Code base has file structure like this.
services/
-- first_service/
  -- __init__.py
  -- some_stuff.py
-- second_service/
  -- __init__.py
  -- other_stuff.py
# first_service/__init__.py

from services.first_service.some_stuff import foo
from services.first_service.some_stuff import other_cool_function

In our __init__.py's for each of the services, we basically declare our interfaces for other services. I don't want to have to define an __all__ for these, since it seems like I'd just be repeating myself in each of the __init__s just for this check (which I want to use since it's cool!).

Not sure if anyone has any other work arounds or suggestions.

Thanks!

@JukkaL
Copy link
Collaborator

JukkaL commented Nov 7, 2019

Here's an approach that I'm currently working on (these are the most important rules):

  1. If a module has __all__ (and we can determine its value), use it to decide what to re-export. Otherwise, follow the rules below.
  2. Re-export all names imported from any submodule of foo or _foo if the current module is a submodule of foo. (Submodules don't need to be direct submodules, and foo is a submodule of foo.)
  3. Re-export all imported names that aren't used in the module.

I'm also thinking of having an option to disable (2).

In particular, __init__ is not special in the above rules.

@JukkaL
Copy link
Collaborator

JukkaL commented Nov 7, 2019

The motivation for the above rules is that I'd like stubgen to generate usable stubs with minimal manual work. When unsure about whether something should be part of a public API, we'd include it in the public API.

Now we'd have extra manual work to trim unnecessary exported names from the stubs, instead of adding missing names. Arguably removing names is less tedious than adding names, and also extra names are less of a problem than missing names, I'd say.

@ilevkivskyi
Copy link
Member

The motivation for the above rules is that I'd like stubgen to generate usable stubs with minimal manual work. When unsure about whether something should be part of a public API, we'd include it in the public API.

OK, this makes sense.

@tjdevries
Copy link

Thanks for the response. That makes sense to me and sounds better than my proposal. 😄

@edaqa-uncountable
Copy link

What is wrong with making __init__.py a special case for re-exports? Is it not a common enough pattern that it makes clear what the module is exporting?

@hauntsaninja
Copy link
Collaborator

stubgen has the --export-less flag that was added at some point

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants