Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Extend instance-attribute tests #15808

Merged
merged 2 commits into from
Jan 29, 2025

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Jan 29, 2025

Summary

When we discussed the plan on how to proceed with instance attributes, we said that we should first extend our research into the behavior of existing type checkers. The result of this research is summarized in the newly added / modified tests in this PR. The TODO comments align with existing behavior of other type checkers. If we deviate from the behavior, it is described in a comment.

When we discussed the plan on how to proceed with instance attributes,
we said that we should first extend our research into the behavior of
existing type checkers. The result of this research is summarized in the
newly added / modified tests in this PR. The TODO comments align with
existing behavior of other type checkers. If we deviate from the
behavior, it is described in a comment.
@sharkdp sharkdp added the red-knot Multi-file analysis & type inference label Jan 29, 2025
c_instance.declared_and_bound = 1
```

#### Variable declared in class body and not bound anywhere
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section has only been moved.

Comment on lines +190 to +192
# TODO: This would ideally be `str | None`, but mypy/pyright don't support this either,
# so `Unknown` + a diagnostic is also fine.
reveal_type(C().declared_and_bound) # revealed: @Todo(implicit instance attribute)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh, interesting. Yeah, I can see how this could be hard. Simply checking that the this variable in __init__ has the same type as self would not be sufficient; you'd need to know that it refers to exactly the same object as self.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, pyright does actually seem to have a special treatment of self / cls:

When a type annotation for a method’s self or cls parameter is omitted, pyright will infer its type based on the class that contains the method. The inferred type is internally represented as a type variable that is bound to the class.

The type of self is represented as Self@ClassName where ClassName is the class that contains the method. Likewise, the cls parameter in a class method will have the type Type[Self@ClassName].

class C:
    def __init__(self) -> None:
        reveal_type(self)  # pyright:  Self@C

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If by "special treatment" you mean "special treatment based on the name of the parameter," I don't see evidence for that in the linked docs or in the playground? Pyright does implement a lint rule telling you to change the name to self/cls, but using a different name doesn't seem to otherwise impact its type inference. The phrase "a method's self or cls parameter" in those docs does not mean "a parameter named self or cls", it means "the first parameter of a method or classmethod."

If by "special treatment" you mean assigning a type to an un-annotated first parameter that is a typevar bound to the type of the containing class, we will also need to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply checking that the this variable in __init__ has the same type as self would not be sufficient; you'd need to know that it refers to exactly the same object as self.

Once we implement correct Self typing, then checking that it has the same type as self will actually be sufficient, because the Self type can't apply to anything other than self.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we implement correct Self typing, then checking that it has the same type as self will actually be sufficient, because the Self type can't apply to anything other than self.

Is that true?

from typing import Self

class Foo:
    def __init__(self, other: Self | None = None):
        if other is not None:
            other.bar = 42

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. I think it would also be pretty reasonable to consider other.bar = 42 the same way we would consider self.bar = 42; both are cases of a method of the very same class assigning to an attribute on its self type. The only real distinction would be if we were trying to validate full initialization of self, but I don't think we're aiming to do that? But it would also be OK not to, whatever is easier I think, since I doubt it happens often.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, that makes sense!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If by "special treatment" you mean assigning a type to an un-annotated first parameter that is a typevar bound to the type of the containing class, we will also need to do that.

Sorry, I should have been more clear. This is what I meant, yes. It's not yet clear to me why pyright sometimes infers C, sometimes *C (for "partially unknown" types?), and sometimes Self@C.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. The type of self should generally be Self@C, which is a type variable with upper bound C; this is different from C in that it accounts for the possibility of a subclass, which has implications for methods that return self (when called on a subclass, they will return an instance of the subclass, not C), and for valid-Liskov-override checking of method signatures (normally parameter types are contravariant, so a subclass method parameter could not accept a subtype of the corresponding superclass method parameter, but this is exactly what needs to happen with self). Not sure in what cases pyright will flatten that to just C, or what C* represents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of the inconsistency in pyright's behaviour here may be because of what PEP484 specified before we added a Self type to the typing system: https://peps.python.org/pep-0484/#annotating-instance-and-class-methods

Pyright and other type checkers had to change various behaviours when Self was added

self.defined_in_init: str | None = "value in base"

class Intermediate(Base):
# TODO: This should be an error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pyright emits an error here but mypy does not, which might be worth a comment.

I agree that it's unsafe to redeclare a mutable variable with a subtype of the type of the variable as it exists on the superclass Foo. Mutable variables should really be invariant rather than covariant.

Unfortunately I think a lot of typed Python code assumes that this kind of thing will be allowed, though, as mypy has always (unsafely) treated mutable instance attributes as covariant, and for a while I think pyright did too. So doing what pyright does here might cause compatibility issues for users migrating from mypy. This could possibly be a disabled-by-default error code, since I think in practice it doesn't actually cause a large number of type-safety issues for users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pyright emits an error here but mypy does not, which might be worth a comment.

Done, thanks.

This could possibly be a disabled-by-default error code, since I think in practice it doesn't actually cause a large number of type-safety issues for users.

Hm. Sure, we can still discuss whether that should be enabled by default or not, but it seems to me like this is something that could easily cause problems? All of the following is well typed if you were to allow the var: int = 1 re-declaration on Derived (and in fact, as you said, mypy sees no problem with this program):

class Base:
    var: int | None = None

    def reset(self) -> None:
        self.var = None

class Derived(Base):
    var: int = 1

derived = Derived()
derived.reset()

derived.var + 1  # Boom!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, I agree that treating mutable instance attributes as covariant is unsound! And we should definitely have a diagnostic about attempting to covariantly override declarations like that. I just worry that there's a lot of typed Python code out there that relies on mypy (unsoundly!) treating mutable attributes as covariant.

Anyway, we can figure out the details of whether the diagnostic is enabled or disabled by default at a later date.

@sharkdp sharkdp merged commit 0f1035b into main Jan 29, 2025
21 checks passed
@sharkdp sharkdp deleted the david/extend-instance-attribute-tests branch January 29, 2025 13:06
dcreager added a commit that referenced this pull request Jan 29, 2025
* main:
  [red-knot] Extend instance-attribute tests (#15808)
  Fix formatter warning message for `flake8-quotes` option (#15788)
  [`flake8-bugbear`] Exempt `NewType` calls where the original type is immutable (`B008`) (#15765)
  Add missing config docstrings (#15803)
  [`refurb`] Do not emit diagnostic when loop variables are used outside loop body (`FURB122`) (#15757)
  [`ruff`] Check for shadowed `map` before suggesting fix (`RUF058`) (#15790)
  [red-knot] Do not use explicit `knot_extensions.Unknown` declaration (#15787)
  Preserve quotes in generated byte strings (#15778)
  [minor] Simplify some `ExprStringLiteral` creation logic (#15775)
  Preserve quote style in generated code (#15726)
  Rename internal helper functions (#15771)
  [`airflow`] Extend airflow context parameter check for `BaseOperator.execute` (`AIR302`) (#15713)
  Implement tab autocomplete for `ruff config` (#15603)
# "… overrides symbol of same name in class "Base". Variable is mutable so its type is invariant"
# We should introduce a diagnostic for this. Whether or not that should be enabled by default can
# still be discussed.
can_not_be_redeclared: str = "a"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two possible rules here: "subclass can not re-declare, period", or "subclass can only re-declare with an equivalent type, because mutability implies invariance". The name can_not_be_redeclared here suggests the former, but pyright implements the latter. There's no unsoundness to redeclare with the same type on a subclass, and pyright allows it without error. Perhaps we could be a little clearer about that here? Even show an example of both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. #15826

sharkdp added a commit that referenced this pull request Jan 30, 2025
…15826)

# Summary

Clarify the behavior regarding re-declaration of attributes from base
classes following [this
discussion](#15808 (comment))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants