Spec: Annotating the self argument in __init__ methods #1563

Closed
Viicos opened this issue Jan 8, 2024 · 19 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@Viicos
Contributor

Viicos commented Jan 8, 2024

I'm opening this issue to discuss a typing feature, supported by major type checkers (pyright, and partially mypy), that allows users to annotate the self argument of an __init__ method. It is currently used as a convenience feature: I haven't encountered any use case that would not be possible with the workaround described in the motivation section, but do let me know if you are aware of one. The goal is to formally specify the behavior so that users can feel confident using it.

Annotating self is already supported in some cases:

Already supported use cases

  • Annotating self as a subclass:
from __future__ import annotations

from typing import overload

class A:
    @overload
    def __init__(self: B, a: int) -> None:
        ...
    @overload
    def __init__(self: C, a: str) -> None:
        ...
    @overload
    def __init__(self, a: bool) -> None:
        ...

class B(A):
    pass

class C(A):
    pass

playgrounds: mypy / pyright.

In this case, instantiating B should only match overloads 1 and 3.
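
As a rough illustration of the sentence above, the following sketch adds a hypothetical implementation signature (not part of the original snippet) so that the example also runs, and lists which calls a type checker would be expected to accept. The comments describe expected checker behavior only; nothing here is enforced at runtime.

```python
from __future__ import annotations

from typing import overload


class A:
    @overload
    def __init__(self: B, a: int) -> None: ...
    @overload
    def __init__(self: C, a: str) -> None: ...
    @overload
    def __init__(self, a: bool) -> None: ...
    # Hypothetical implementation, added only so the sketch is runnable.
    def __init__(self, a: object) -> None:
        self.a = a


class B(A):
    pass


class C(A):
    pass


b = B(1)      # expected to match overload 1 (self: B, a: int)
c = C("x")    # expected to match overload 2 (self: C, a: str)
a = A(True)   # expected to match overload 3 (unannotated self, a: bool)
# B("x") is expected to be rejected by a type checker, because overload 2
# requires self to be a C. At runtime the call would still succeed.
```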

from typing import Self

class A:
    def __init__(self: Self):
        return


reveal_type(A())

playgrounds: mypy / pyright.

from typing import TypeVar

Self = TypeVar("Self", bound="A")

class A:
    def __init__(self: Self, other: type[Self]):
        return


reveal_type(A(A))

playgrounds: mypy / pyright.

This issue proposes adding support for a special use case of the self type annotation that would only apply when the annotation includes one or more type variables (class-scoped, method-scoped, or both), and would convey a special meaning for the type checker.

Motivation

This feature is useful because the return type of __init__ is always None, so there is no way to influence the parametrization of an instance when the type checker's solver alone is not enough. In most cases, the type variable can be inferred correctly without any explicit annotation:

from typing import Generic, TypeVar

T = TypeVar("T")

class Wrapper(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value

reveal_type(Wrapper(1))  # Revealed type is "Wrapper[int]"

However, there are some situations where more complex logic is involved, e.g. by using the dynamic features of Python (metaclasses for instance). Consider this example:

class NullableWrapper(Generic[T]):
    def __init__(self, value: T, null: bool = False) -> None:
        self.value = value
        # Some logic could make `value` as `None` depending on `null`

Ideally, we would like NullableWrapper(int, null=True) to be inferred as NullableWrapper[int | None]. A way to implement this is by making use of the __new__ method:

from __future__ import annotations
from typing import Generic, Literal, TypeVar, overload

T = TypeVar("T")
V = TypeVar("V")

class NullableWrapper(Generic[T]):
    @overload
    def __new__(cls, value: V, null: Literal[True]) -> NullableWrapper[V | None]: ...
    @overload
    def __new__(cls, value: V, null: Literal[False] = ...) -> NullableWrapper[V]: ...

reveal_type(NullableWrapper(1, null=True))  # Type of "NullableWrapper(1, null=True)" is "NullableWrapper[int | None]"

However, this __new__ method might not exist at runtime, meaning users would have to add an if TYPE_CHECKING block.
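
A minimal sketch of what such an if TYPE_CHECKING block could look like. The catch-all implementation signature inside the block is an assumption added to keep checkers happy about the overloads, and the body of __init__ is a placeholder rather than the real logic.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any, Generic, Literal, TypeVar, overload

T = TypeVar("T")
V = TypeVar("V")


class NullableWrapper(Generic[T]):
    if TYPE_CHECKING:
        # Visible to type checkers only; this block is skipped at runtime.
        @overload
        def __new__(cls, value: V, null: Literal[True]) -> NullableWrapper[V | None]: ...
        @overload
        def __new__(cls, value: V, null: Literal[False] = ...) -> NullableWrapper[V]: ...
        # Assumed implementation signature for the overloads above.
        def __new__(cls, value: Any, null: bool = False) -> Any: ...

    def __init__(self, value: T, null: bool = False) -> None:
        self.value = value


w = NullableWrapper(1, null=True)
```

At runtime the class defines no __new__ of its own, so the overloads only affect how checkers evaluate the constructor call.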


The example above shows that some constructs can't be expressed natively by type checkers. It used a null argument that should translate to None in the solved type variable; here is a list of existing examples:

  • The stub definition of dict/collections.UserDict: a lot of custom logic is applied to the provided arguments, and the type stubs definition is making use of this feature to cover the possible use cases.
  • As mypy doesn't support solving a type variable from a default value (see issue), overloads are used to explicitly specify the solved type from the default value: see the stub definition of contextlib.nullcontext.
  • A similar example to one provided in this proposal: SQLAlchemy's UUID type: see the definition.

Specification

Definitions

  • A "class-scoped" type variable refers to a TypeVar used in conjunction with Generic (or as specified with the new 3.12 syntax):
from typing import Generic, TypeVar


T = TypeVar("T")

# In this case, `T` is a class-scoped type variable
class Foo(Generic[T]): ...
  • A "function-scoped" type variable refers to a TypeVar used in a function or method:
from typing import TypeVar

T = TypeVar("T")

class Foo:
    # In this case, `T` is a function-scoped type variable
    def __init__(self, value: T) -> None: ...

Context

When instantiating a generic class, the user should generally explicitly specify the type(s) of the type variable(s), for example var: list[str] = list(). However, type checkers can solve the class-scoped type variable(s) based on the arguments passed to the __init__ method, similarly to functions where type variables are involved. For instance:

from typing import Generic, TypeVar

T = TypeVar("T")

class Foo(Generic[T]):
    def __init__(self, value: T) -> None: ...

reveal_type(Foo(1))  # Foo[int]

This proposal aims at standardizing the behavior when the self argument of a generic class' __init__ method is annotated.

Canonical examples

Whenever a type checker encounters the __init__ method of a generic class where self is explicitly annotated, it should use this type annotation as the single source of truth to solve the type variable(s) of that class. That includes the following examples:

  • Explicitly solving type variables with a "fixed" type (pyright / mypy):
from __future__ import annotations

from typing import Generic, TypeVar, overload

T = TypeVar("T")

class Foo(Generic[T]):
    @overload
    def __init__(self: Foo[int], value: int) -> None: ...
    @overload
    def __init__(self: Foo[str], value: str) -> None: ...

Currently supported by both mypy and pyright. ✅

  • Using function-scoped type variables (pyright / mypy):
from __future__ import annotations

from typing import Generic, TypeVar

T1 = TypeVar("T1")
T2 = TypeVar("T2")

V1 = TypeVar("V1")
V2 = TypeVar("V2")

class Foo(Generic[T1, T2]):
    def __init__(self: Foo[V1, V2], value1: V1, value2: V2) -> None: ...

class Bar(Generic[T1, T2]):
    def __init__(self: Bar[V2, V1], value1: V1, value2: V2) -> None: ...

reveal_type(Foo(1, "1"))  # Foo[int, str]
reveal_type(Bar(1, "1"))  # Bar[str, int]

Currently unsupported by both mypy and pyright. ❌

from __future__ import annotations

from typing import Generic, TypeVar

T1 = TypeVar("T1")
T2 = TypeVar("T2")


class Foo(Generic[T1, T2]):
    def __init__(self: Foo[T1, T2], value1: T1, value2: T2) -> None: ...

class Bar(Generic[T1, T2]):
    def __init__(self: Bar[T2, T1], value1: T1, value2: T2) -> None: ...

reveal_type(Foo(1, "1"))  # Foo[int, str]
reveal_type(Bar(1, "1"))  # Bar[str, int], Bar[int, str]?

Note

Although using class-scoped type variables to annotate self is already quite common (see examples in motivation), we can see diverging behavior between mypy and pyright in the Bar example. If the self type annotation should be the only source of truth, then type checkers should infer Bar(1, "1") as Bar[str, int], but this is open to discussion.

Behavior with subclasses

As stated in the motivation section, __new__ can be used as a workaround. However, it does not play well with subclasses (as expected):

from __future__ import annotations

from typing import Generic, TypeVar

T = TypeVar("T")
V = TypeVar("V")

class Foo(Generic[T]):
    def __new__(cls, value: V) -> Foo[V]: ...

class SubFoo(Foo[T]):
    pass

reveal_type(SubFoo(1))  # Type of "SubFoo(1)" is "Foo[int]"

The same would look like this with __init__:

from __future__ import annotations

from typing import Generic, TypeVar

T = TypeVar("T")
V = TypeVar("V")

class Foo(Generic[T]):
    def __init__(self: Foo[V], value: V) -> None: ...

class SubFoo(Foo[T]): ...

reveal_type(SubFoo(1))

As with __new__, subclasses shouldn't be supported in this case (i.e. reveal_type(SubFoo(1)) shouldn't be SubFoo[int]).

Note

I think "shouldn't be supported" should mean undefined behavior in this case, although this can be discussed. While the example above does not show why it shouldn't be supported, consider the following one:

class OtherSub(Foo[str]): ...
OtherSub(1)  # What should happen here? Is it `OtherSub[str]` or `OtherSub[int]`?
# This can also be problematic if multiple type variables
# are involved in the parent class and the subclass explicitly solves
# some of them (`Sub(Base[int, T1, T2])` for instance).

However, this is open to discussion if you think type checkers could handle these specific scenarios.

Appendix - Invalid use cases

For reference, here are some invalid use cases that are not necessarily related to the proposed feature:

Using an unrelated class as a type annotation

from __future__ import annotations

from typing import Generic, Literal, TypeVar, overload, reveal_type

T = TypeVar("T")

class Unrelated:
    pass

class A(Generic[T]):
    @overload
    def __init__(self: Unrelated, is_int: Literal[True]) -> None:
        ...

    @overload
    def __init__(self: A[str], is_int: Literal[False] = ...) -> None:
        ...

    def __init__(self, is_int: bool = False) -> None:
        ...


reveal_type(A(is_int=True))

playgrounds: mypy / pyright.

Both type checkers reject this, but in different ways: mypy explicitly disallows the annotated self in the first overload, whereas pyright does not flag the annotation itself and instead discards the first overload, meaning True can't be used for is_int.

Using a supertype as a type annotation

from __future__ import annotations

from typing import Generic, Literal, TypeVar, overload, reveal_type

T = TypeVar("T")


class Super(Generic[T]):
    pass

class A(Super[T]):
    @overload
    def __init__(self: Super[int], is_int: Literal[True]) -> None:
        ...

    @overload
    def __init__(self: Super[str], is_int: Literal[False] = ...) -> None:
        ...

    def __init__(self, is_int: bool = False) -> None:
        ...

reveal_type(A(is_int=True))

playgrounds: mypy / pyright.

Neither type checker reports an error, but both infer A[Unknown/Never]. I don't see any use case where this should be allowed; it is probably better suited for __new__.

@Viicos Viicos added the topic: feature Discussions about new features for Python's type annotations label Jan 8, 2024
@jakkdl

jakkdl commented Jan 8, 2024

I think you forgot to add is_int=True to the last line in the first code block?

@Viicos
Contributor Author

Viicos commented Jan 8, 2024

I think you forgot to add is_int=True to the last line in the first code block?

Thanks for the catch, updated.

@A5rocks

A5rocks commented Jan 8, 2024

Don't forget intersections with Self:

from typing import Self

class H:
    def __init__(self: Self):
        return


reveal_type(H())

and old-Self:

from typing import TypeVar

Self = TypeVar("Self")

class H:
    def __init__(self: Self, other: type[Self]):
        return


reveal_type(H(H))

@Gobot1234
Contributor

Old self there is missing a bound=H

@erictraut
Collaborator

Thanks for starting this discussion.

The example you posted above is fine, but it doesn't really demonstrate why __init__ needs to be treated as special by a type checker. With the examples you've provided above, the current type specification indicates the behavior that type checkers should provide, and there would be no need for an extension to the spec.

The unspecified behavior occurs when the annotation for self includes one or more type variables (class-scoped, method-scoped, or both). This is where the rules become unclear because the __init__ method acts differently from other methods.

Normally, type variables appear in input parameter types and the return type, and it's the job of a type checker to "solve" the type variables based on the arguments passed to the call and then specialize the return type based on the solved type variables.

In this example, T appears in the types for input parameters x and y, and it is solved based on the arguments corresponding to those parameters. The return type (list[T]) is then specialized based on the solved value of T.

def func(x: T, y: T) -> list[T]:
    return [x, y]

reveal_type(func(1, 2)) # list[int]
reveal_type(func("", b"")) # list[str | bytes]

The return type of an __init__ method is always None, so its return type annotation never contains a type variable. However, the type annotation for the self parameter kind of acts like a return type in the case of __init__. This is different from every other method — and the reason why this needs special-cased behavior in a type checker.

from __future__ import annotations

from typing import Generic, TypeVar

T = TypeVar("T")
S = TypeVar("S")

# A class-scoped type variable is used in this example:
class Foo(Generic[T]):
    def __init__(self: Foo[T], value: T) -> None: ...

reveal_type(Foo(1)) # Foo[int]
reveal_type(Foo("")) # Foo[str]

# A method-scoped type variable is used in this example:
class Bar(Generic[T]):
    def __init__(self: "Bar[S]", value: S) -> None:
        ...

reveal_type(Bar(1))  # Bar[int]
reveal_type(Bar(""))  # Bar[str]

The __init__ method is also special in that it interacts in a complicated (and currently unspecified) way with the __new__ method if both are present on the class. When calling a class constructor, the __new__ method is invoked first, and it can potentially provide the type arguments for class-scoped type variables. (Mypy doesn't currently honor the return type of __new__, but it really should.)

That means we need to consider the intended behavior for cases like this:

class Foo(Generic[T]):
    def __new__(cls, value: T) -> Foo[T]: ...

    def __init__(self: Foo[T], value: T) -> None: ...
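
The runtime call order described above can be observed directly. The sketch below is illustrative only: the `calls` list and the `Other` class are made up for this example, and it shows runtime behavior, not how a type checker evaluates the call.

```python
calls = []


class Foo:
    def __new__(cls, value):
        calls.append("Foo.__new__")
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("Foo.__init__")
        self.value = value


class Other:
    def __new__(cls, value):
        calls.append("Other.__new__")
        # Returning something that is not an instance of Other
        # means Other.__init__ is never called.
        return value

    def __init__(self, value):
        calls.append("Other.__init__")


foo = Foo(1)
other = Other(1)
```

Because __new__ runs first and decides what object is produced, its return type is a natural first source for the class's type arguments, which is why the interaction with an annotated self in __init__ needs to be specified.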

@henribru

henribru commented Jan 8, 2024

@erictraut Is your Bar example incorrect? It works in neither Mypy nor Pyright

@erictraut
Collaborator

erictraut commented Jan 8, 2024

Is your Bar example incorrect? It works in neither Mypy nor Pyright

The example is correct. My point in providing that example is to demonstrate that there are edge cases that are currently undefined — and therefore produce behaviors that may be undesirable with the current type checker implementations.

The behavior in this case is unspecified, so arguably any behavior is "correct" according to the current spec.

@Viicos
Contributor Author

Viicos commented Jan 8, 2024

The example you posted above is fine, but it doesn't really demonstrate why __init__ needs to be treated as special by a type checker. With the examples you've provided above, the current type specification indicates the behavior that type checkers should provide, and there would be no need for an extension to the spec.

Thanks for the detailed feedback. I made some updates and still need to continue working on the requested feature.

I'm not sure I follow you here. Is the following example (which I've removed from the first post) already valid and formalized somewhere in the spec?

from __future__ import annotations

from typing import Generic, Literal, TypeVar, overload

T = TypeVar("T")

class A(Generic[T]):
    @overload
    def __init__(self: A[int], is_int: Literal[True]) -> None:
        ...

    @overload
    def __init__(self: A[str], is_int: Literal[False] = ...) -> None:
        ...

Or is it still a valid example regarding the requested feature? According to your comment here, it doesn't seem to be specified, but it is a trivial example. In that case, I will probably have to add some more complex examples involving unions and type variables.


Regarding this example:

# A class-scoped type variable is used in this example:
class Foo(Generic[T]):
    def __init__(self: Foo[T], value: T) -> None: ...

Is there really a need to add support for this, as you could simply omit Foo[T] and type checkers would still infer T from the type of value?

Staying on class-scoped type vars, it might be interesting to see what should happen with the following:

class Foo(Generic[T]):
    def __init__(self: Foo[T | None], value: T) -> None: ...

f = Foo(1)

Currently pyright discards the self annotation (i.e. f is Foo[int], not Foo[int | None]), which imo makes sense.


@A5rocks, Thanks, examples added in the already supported use cases.

@gvanrossum
Member

The following is a canonical example explaining the wanted behavior 1:

I may be coming late to this party, but if I take this example and simply remove the annotation for self, it gives the same results. So I am confused as to why you see a need for a change to the spec?

@Viicos
Contributor Author

Viicos commented Jan 12, 2024

I may be coming late to this party, but if I take this example and simply remove the annotation for self, it gives the same results. So I am confused as to why you see a need for a change to the spec?

My previous comment raised the same question, I'm waiting for an answer on this one.

However, the second example is making use of a method-scoped type variable where an explicit annotation on self is needed.

@gvanrossum
Member

However, the second example is making use of a method-scoped type variable where an explicit annotation on self is needed.

But why would you need this in real life? Please tell the story of the actual use case that led you to this proposal.

@A5rocks

A5rocks commented Jan 12, 2024

I was the one who made the mypy issue that led to this; our motivation is proper typing for an exception group-friendly pytest.raises API.

See python/mypy#16752 which has a code sample, if that's not concrete enough for you.

We worked around this by subclassing ExceptionGroup for RaisesGroup but that's a massive hack because... it's not an exception group.

@Viicos
Contributor Author

Viicos commented Jan 13, 2024

But why would you need this in real life? Please tell the story of the actual use case that led you to this proposal.

I've updated the motivation section, providing a simplified example that I encountered when trying to correctly type hint Django fields, along with a list of already existing examples. The "introduction" was also updated to make it clear that it doesn't actually unblock any existing issue (afaik), as a workaround is already available with __new__. The goal is mainly to standardize something already in use, to avoid diverging behavior between type checkers.

@Viicos
Contributor Author

Viicos commented Jan 31, 2024

Following my previous comment, I also updated the specification with examples that hopefully cover what needs to be specified. Feel free to add any other use cases that were missed.

There are a couple of places where I wasn't sure what the wanted behavior is; these are open to discussion and are marked with this Note mark:

Note

@erictraut
Collaborator

erictraut commented Feb 1, 2024

Thanks @Viicos. I've been working to fill in missing chapters in the typing spec. One of the next that I have on my list is a chapter on type evaluation for constructor calls. This mini-spec will fit nicely within that chapter. I'll post a draft in the Python typing discourse channel when I have something ready to review. That will be a good forum for us to pin down the remaining behaviors.

@Viicos
Contributor Author

Viicos commented Feb 1, 2024

Great to hear, and thanks for all the work on the typing spec lately. Indeed, I think this "mini spec" would fit better in a more general chapter about constructor calls. As you mentioned earlier, will this include the behavior when both an __init__ and a __new__ method are present?

@erictraut
Collaborator

erictraut commented Mar 28, 2024

@Viicos, I just posted a draft typing spec chapter on constructors. It incorporates parts of your earlier spec, although it deviates from your proposal in one significant way: it completely disallows the use of class-scoped TypeVars within a self type annotation in an __init__ method. When I was writing the spec and adding examples, I realized that allowing this construct creates ambiguities and nonsensical type evaluation results, so my proposal is to disallow it completely.

Please review the draft spec and add comments to this thread.
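
The distinction drawn in this comment can be sketched as follows, reusing forms that appeared earlier in the thread. The class names `Allowed` and `Disallowed` are made up for illustration, and the comments paraphrase the comment above rather than quote the spec text.

```python
from __future__ import annotations

from typing import Generic, TypeVar

T = TypeVar("T")  # used as a class-scoped type variable below
V = TypeVar("V")  # used as a method-scoped type variable below


class Allowed(Generic[T]):
    # The self annotation only uses the method-scoped V.
    def __init__(self: Allowed[V], value: V) -> None: ...


class Disallowed(Generic[T]):
    # The self annotation uses the class-scoped T, which is the construct
    # the draft chapter is described as disallowing.
    def __init__(self: Disallowed[T | None], value: T) -> None: ...


allowed = Allowed(1)
disallowed = Disallowed(1)  # runs fine; only a type checker would object
```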

@jakkdl

jakkdl commented May 11, 2024

The draft typing spec by @erictraut is merged, does that resolve the issues presented in the OP such that this can be closed?

@Viicos
Contributor Author

Viicos commented May 11, 2024

It does, thanks!

7 participants