Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Rendering of typing.TypeVar #361

Open
ktbarrett opened this issue Mar 12, 2022 · 9 comments
Open

Improve Rendering of typing.TypeVar #361

ktbarrett opened this issue Mar 12, 2022 · 9 comments

Comments

@ktbarrett
Copy link

Problem Description

TypeVars seem to be simply put through repr to get their textual format. TypeVar's __repr__ is missing important information necessary to the use of the class including constraints or bounds and covariance or contravariance.

Proposal

I would like to see TypeVars printed with a more information. For example, instead of just ~T, it could be {variance} T ≤ {bound}. The variance of a type only needs to be know once per type (likely at the occurrence of the TypeVar in the class definition) as TypeVars in functions/methods cannot have variance. Bounds should likely be printed on every occurrence of the TypeVar. Constraints could be represented as a tuple of bounds.

Alternatives

  1. Stating the bound and variance in the docstring. Docstrings can get out of date, but mypy ensures that annotations remain correct.
  2. Taking sphinx's approach and adding support for documenting the TypeVar (yuck!).
@mhils
Copy link
Member

mhils commented Mar 13, 2022

Thanks for the proposal. Some thoughts on this: First off, I think there's an argument to be made that this should be fixed in CPython by providing a better __repr__. I think there's a strong argument to be made that staying on whatever is the default __repr__ is a good idea. If there's consensus among CPython folks that the current __repr__ should be kept, we should also keep it as the default in pdoc. For the moment, I think there are two viable alternatives:

  1. Follow Sphinx's approach. I've just commited a small fix that makes sure that TypeVars are included including their value if they have a docstring.
  2. Using pdoc as a library and manually monkeypatch TypeVar.__repr__. If you do so, please share your code here so that others can profit. If we find that there's a strong community of pdoc users who would prefer it that way, we can reconsider our options and maybe deviate from what CPython is doing.

FWIW variance is already covered in __repr__, although it's not very obvious..

>>> TypeVar("T")
~T
>>> TypeVar("T", covariant=True)
+T
>>> TypeVar("T", contravariant=True)
-T

@ktbarrett
Copy link
Author

Additional Context

Upon exploring potential solutions to file a BPO against TypeVar.__repr__, I think I found why that information isn't included: stringifying the name of types is done very inconsistently done across the CPython codebase.

For example:

>>> repr(Awaitable[int])
typing.Awaitable[int]
>>> repr(str)
"<class 'str'>"
>>> str.__qualname__
'str'
>>> Awaitable[int].__qualname__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/typing.py", line 706, in __getattr__
    raise AttributeError(attr)
AttributeError: __qualname__

The only way to get a clean representation of a TypeVar which includes bounds is to copy the original source code. I think this would need to be done in pdoc in the AST in either case, so you could link types that appear in bounds or constraints.

Response to Alternatives

  1. Follow Sphinx's approach.

This is not a reasonable approach, I think this can be removed from the list of possible solutions. The consequences of this decision mean that if a user wants to document TypeVar variance and bound, which are important to use the entity where those TypeVar's are used, they need to either...

  1. clutter their module documentation with TypeVar documentation which is of near 0 value outside of variance and bound information.
  2. define all their TypeVars in a common module once and reuse those same definitions throughout their package. This is not how people actually use TypeVars (see the entirety of the typeshed). People usually make TypeVars over again for each module.

It is a mistake to think that because TypeVars appear to be free variables, that is how they should be documented. The way TypeVars are done is a quirk of the implementation; there is no such thing as a free TypeVar.

Additional Proposal

Perhaps another approach is to include TypeVar's in the using-entity's documentation. If the annotations of a class/function uses TypeVar "T", you can include the definition of "T" in the class/function documentation below the signature, but before the doctsring.

@ktbarrett
Copy link
Author

Actually overriding __repr__ might not be too bad. This is how the standard library handles Unions.

@mhils mhils changed the title better support for type variables Improve Rendering of typing.TypeVar Mar 14, 2022
@ktbarrett
Copy link
Author

I filed a BPO.

Even if they agree (BPO is where dreams go to die, unless you are core dev) this will needed to be added to support Python versions < 3.11. Here is some code which does the thing.

import types
from typing import TypeVar


_old_repr = TypeVar.__repr__


def _new_repr(self):
    name_variance_repr = _old_repr(self)
    if self.__bound__ is not None:
        bound_repr = _type_repr(self.__bound__)
        return f"{name_variance_repr}<={bound_repr}"
    elif self.__constraints__:
        constraint_repr = ", ".join(_type_repr(typ) for typ in self.__constraints__)
        return f"{name_variance_repr}<=({constraint_repr})"
    else:
        return name_variance_repr


TypeVar.__repr__ = _new_repr


# From https://github.com/python/cpython/blob/9f04ee569cebb8b4c6f04bea95d91a19c5403806/Lib/typing.py#L198-L216
def _type_repr(obj):
    """Return the repr() of an object, special-casing types (internal helper).
    If obj is a type, we return a shorter version than the default
    type.__repr__, based on the module and qualified name, which is
    typically enough to uniquely identify a type.  For everything
    else, we fall back on repr(obj).
    """
    if isinstance(obj, types.GenericAlias):
        return repr(obj)
    if isinstance(obj, type):
        if obj.__module__ == 'builtins':
            return obj.__qualname__
        return f'{obj.__module__}.{obj.__qualname__}'
    if obj is ...:
        return('...')
    if isinstance(obj, types.FunctionType):
        return obj.__name__
    return repr(obj)

@ktbarrett
Copy link
Author

@mhils Well its been two weeks and nothing from the PTBs. What do you say about adding this to pdoc?

@mhils
Copy link
Member

mhils commented Apr 1, 2022

Thanks for sharing your example! For your personal use you don't need to wait for pdoc to adopt this, you can just use pdoc as a library1 after monkeypatching TypeVar.__repr__.

In terms of getting this into pdoc, I'm a bit hesistant as it's quite a bit of code to maintain. Ideally it should really be done in CPython. If we find that we end up with a significant number of upvotes for this issue I'm happy to reconsider my stance, but for now I think it's best if we just keep it documented here.

Footnotes

  1. https://pdoc.dev/docs/pdoc.html#using-pdoc-as-a-library

@ktbarrett
Copy link
Author

For your personal use

This isn't just for me. This is for everyone wishing to use pdoc at my group at work. Forcing everyone to change how they generate their docs to get a useful and necessary feature is not a reasonable approach. Is there some other approach which incorporates this feature which you would better agree with? Perhaps I could add these 30 lines of code to a package, then it's only 3 lines of code to you: dependency, import, use. Or do you simply think that listing bounds and constraints is not important?

@mhils
Copy link
Member

mhils commented Apr 1, 2022

I think I agree with you that listing bounds is probably useful. But I do think that it's not very important for the majority of pdoc users. One symptom of that is that it's not supported in Sphinx, which usually tries to do much more than pdoc and provides more configuration knobs. Additionally, it should ideally be done upstream.

If you feel super strongly about this I don't want to make your life overly difficult. Maybe we can adjust the patch above as follows:

  1. Reuse typing._type_repr instead of copying it.
  2. try the patched __repr__ but log a warning and fall back to the default one if anything raises.
  3. Patch this as part of defuse_unsafe_reprs so that there are no permanent side effects. (I will take care of renaming + deprecating the old name)
  4. Add test coverage for the fallback (for example by unsetting typing._type_repr)
  5. Add snapshot test coverage for bounds in misc.py

Alternatively, I'd suggest you post a full PR for the BPO. From my experience that greatly increases the odds that you get actual feedback. :)

@ktbarrett
Copy link
Author

ktbarrett commented Apr 5, 2022

But I do think that it's not very important for the majority of pdoc users.

I would probably tell people who don't think it's important to correct their understanding. The majority of my uses of TypeVar's (across 100k+ lines of code) have bounds, and knowledge of those bounds is necessary to correctly use that interface. The only time I really see unbounded TypeVar's is on type-generic collections, which in general people aren't writing since those are already implemented in the standard library. Not trying to make this an argument, but educational.

One symptom of that is that it's not supported in Sphinx

Bounds and covariance are documented by Sphinx, but only if the entire TypeVar is documented. That's due to design decisions in Sphinx, not because they think that info is not important. Just correcting the record.

Additionally, it should ideally be done upstream.

Of course it would, but I don't think the core devs take the position that repr or str on types or values should be sufficient for documentation purposes. My gut feeling is they will try to push this burden on to the tool doing the documentation.

I'd suggest you post a full PR for the BPO. From my experience that greatly increases the odds that you get actual feedback.

I'll try this first, then the listed option second. Thank you for considering this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants