Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The public interface of git.exc is ambiguous #1718

Closed
EliahKagan opened this issue Oct 20, 2023 · 0 comments · Fixed by #1719
Closed

The public interface of git.exc is ambiguous #1718

EliahKagan opened this issue Oct 20, 2023 · 0 comments · Fixed by #1719

Comments

@EliahKagan
Copy link
Contributor

I believe the intent of git.exc is twofold:

  • To publish the exceptions it defines, which it unambiguously does.
  • To republish exceptions defined in gitdb.exc, which it... doesn't?

The problem is that the conceptually public names in a module are (basically) those listed in __all__ when __all__ is present, and since an original and continuing use of __all__ is to control what names a * import binds, it might seem like whatever a * import binds is considered conceptually public even without __all__... but that is not the case.

Instead, in the absence of __all__, names that both do not start with an _ and also were not introduced merely by themselves being imported are the conceptually public names. As the "Public and internal interfaces" section of PEP-8 says:

Imported names should always be considered an implementation detail. Other modules must not rely on indirect access to such imported names unless they are an explicitly documented part of the containing module’s API[...]

To make such a name public, it can be included in __all__, but git.exc currently does not define __all__. Does it otherwise document those names as public? Well...

""" Module containing all exceptions thrown throughout the git package """

This is why I say the situation is ambiguous--this might be interpreted to mean that the names are public, in that they are the names of exceptions that are thrown throughout GitPython and would not otherwise be public in GitPython. But really this is a stretch, and rather than saying the situation is ambiguous, it might be more reasonable to say that they are definitely not public.

But I am pretty sure they are intended to be public. So they should be documented as such. The best way to do this is by defining git.exc.__all__ and listing them there.

There then becomes the question of whether anything else that is currently imported by from git.exc import * belongs in __all__. I believe the answer is no. It's not reasonable for code outside git.exc to rely on those being part of git.exc (because they are present only due to imports, so they are nonpublic per the above-quoted rule, and the documentation of git.exc does not reference them even obliquely). It is especially unreasonable for code outside the git.exc module to rely on receiving them from a * import.

However, the top-level __init__.py does just that. When it began to do it depends on one's perspective. For some time, it has been including those names (such as safe_decode from git.util) in git.__all__ because of the way git.__all__ was dynamically generated with a comprehension. In #1659, the comprehension was removed and replaced with a listing of everything that was found to be in __all__.

Because git.__all__ did, and does, exist, and it included those names, it should continue to include them for backward compatibility, even in the case of some of them from typing that hopefully no one is importing from the git module. However, I think no similar reasoning applies to git.exc itself, since git.exc.__all__ has not existed. This is to say that I think it is always a bug to rely on picking up nonpublic names from * imports, including in git/__init__.py, which is currently doing that.

To keep from git import * working, as well as to keep git.__all__ a correct statement of names guaranteed accessible as attributes of git, when fixing this in git.exc it will also be necessary to modify the code of git/__init__.py to get these names from the modules that really provide them publicly. That should be no problem, though, and I'd say it would be an improvement even by itself.

I believe git.exc is the clearest place in GitPython where this kind of problem exists, but every attempt to suppress an unused import lint rule is a strong hint of a similar bug, so I'm pretty sure this is far from the only case. However, I'm opening an issue about this specifically because recent changes (in #1659) combine with it to create a situation where the bug could become entrenched if not addressed fairly soon. I say this because although git/__init__.py has, for quite some time, being wrongly relying on those names being present, in that it has been effectively guaranteeing their presence for future patch versions by including them in __all__ as it would be inspected by users, it is only now that this dependence can be easily discerned, and thus perhaps further relied on, by reading the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants