Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer type from in container check #10977

Open
Dreamsorcerer opened this issue Aug 13, 2021 · 17 comments
Open

Infer type from in container check #10977

Dreamsorcerer opened this issue Aug 13, 2021 · 17 comments
Labels
feature topic-type-narrowing Conditional type narrowing / binder

Comments

@Dreamsorcerer
Copy link
Contributor

Dreamsorcerer commented Aug 13, 2021

m: Mapping[str, str]
v: Union[str, bytes]
if v in m:
    m[v]

This produces:
Invalid index type "Union[str, bytes]" for "Mapping[str, str]"; expected type "str" [index]

But, we just checked that the value is in the mapping; so firstly, it most certainly is a valid index, but secondly, we can also infer that the variable is in fact str type.

@Dreamsorcerer
Copy link
Contributor Author

Dreamsorcerer commented Aug 13, 2021

More precisely, the type within an if value in container can be inferred to be the intersection of possible types that either value or container could be.

@erictraut
Copy link

From the perspective of a type checker, the above code looks like the following:

if m.__contains__(v):
    m.__getitem__(v)

Unless a type checker has intimate knowledge about the internal implementation of m, there's nothing it can infer about the types that are involved here. For most classes (including Mapping), the method __contains__ accepts any object as an input.

Generally, type checkers do not (and should not) assume knowledge about internal implementations of specific classes. They base their inferences only on the information provided in type declarations. Even if it were feasible to maintain hard-coded semantic knowledge about a subset of built-in classes, behaviors can be overridden by subclasses, and a type checker would have no way of knowing when its assumptions were invalidated.

@Dreamsorcerer
Copy link
Contributor Author

Dreamsorcerer commented Aug 13, 2021

there's nothing it can infer about the types that are involved here.

Why not? typing.Container means that a type is generic over a type which is checked for with __contains__(). If an implementation is returning True when the container does not contain the variable passed to it, then that is a broken implementation. The expected behaviour of a container is clear and mypy can infer types from that.

For most classes (including Mapping), the method __contains__ accepts any object as an input.

As does isinstance(), I don't see why that's relevant...

Generally, type checkers do not (and should not) assume knowledge about internal implementations of specific classes.

By that logic we shouldn't infer types from isinstance() either, as some idiot could do this:

class A:
    def __init__(self):
        self.foo = 1

class B:
    def __init__(self):
        self.__class__ = A

b = B()
if isinstance(b, A):
    print(b.foo)

@gwerbin
Copy link
Contributor

gwerbin commented Aug 14, 2021

Consider a more concrete example:

from collections.abc import Mapping
from typing import TypeVar, Union

A = TypeVar('A')
B = TypeVar('B')

class LookupTable(Mapping[A, B]):
    def __setitem__(self, key: A, value: B) -> None: ...
    def __getitem__(self, key: A) -> B: ...
    def __contains__(self, key: A) -> bool: ...

table: LookupTable[str, float]
key: Union[str, bytes, float]
if key in table:
    print(table[key])

I think the crux of the problem is here:

main.py:10: error: Argument 1 of "__contains__" is incompatible with supertype "Mapping"; supertype defines the argument type as "object"
main.py:10: note: This violates the Liskov substitution principle
main.py:10: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
main.py:10: error: Argument 1 of "__contains__" is incompatible with supertype "Container"; supertype defines the argument type as "object"
main.py:14: error: Unsupported operand types for in ("Union[str, bytes]" and "LookupTable[str, float]")

__contains__ must have the signature __contains__(self, key: object) -> bool. No type narrowing is possible here, becuase you could implement __contains__ to just return True in all cases.

Even in a legitimate case, how do you determine whether to narrow the type to str or float? And what happens in the case below? Can Mypy safely infer that key in table is always false because the types don't match?

table: LookupTable[str, float]
key: None
if key in table:
    print(table[key])

Perhaps Mypy could add some kind of --assume-mappings-contain-key-type option, to ignore this caveat and infer that if coll.__contains__(x), returns True, then the type of x can be narrowed to A. Or provide some way to annotate this behavior on a per-class basis.

That, or __contains__ should be changed from __contains__(self, value: object) to __contains__(self, value: Any), to support these kinds of "type-restricted" magic methods.

@A5rocks
Copy link
Contributor

A5rocks commented Aug 14, 2021

@gwerbin FWIW, Dreamsorcerer is talking about typing.Container

from typing import Container, Mapping, TypeVar

T = TypeVar('T')

def f(n: Container[T]) -> T: ...

blah: Mapping[str, float]

reveal_type(f(blah))  # N: Revealed type is "builtins.str*"

https://mypy-play.net/?mypy=latest&python=3.10&gist=96d1eb453aca921853fe80e6876af4da

... I for one would love to see the effects this narrowing would have on mypy-primer. I wonder if it would have an overall good (bunch of this type of pattern) or bad (bunch of weird situations in libraries) effect :)

@gwerbin
Copy link
Contributor

gwerbin commented Aug 14, 2021

Yes, as per the error message, this problem is inherited from typing.Container:

https://github.com/python/typeshed/blob/431c4f7fc16c1b593f22ee01ef7c686cc4ec557e/stdlib/typing.pyi#L296-L298

@runtime_checkable
class Container(Protocol[_T_co]):
    @abstractmethod
    def __contains__(self, __x: object) -> bool: ...

@Dreamsorcerer
Copy link
Contributor Author

__contains__ must have the signature __contains__(self, key: object) -> bool. No type narrowing is possible here, becuase you could implement __contains__ to just return True in all cases.

I'm still confused why we are talking about changing the type on __contains__(). The correct implementation should be def __contains__(self, key: object) -> bool:, as mypy tells you. I'm not sure how this is relevant...

The implementation should only return True if the value is in the container. The value should generally only be in the container if it's a compatible type of that container, therefore we should generally be able to narrow the type after the condition. I don't see how this is different to isinstance().

Essentially, the way it looks in my head is like this:

if isinstance(_T, _U):
    # value is now intersection of [_T, _U]
# or
if _T in Container[_U]:
    # Again, value should now be intersection of [_T, _U]

So, both of these should be able to achieve the same result:

from typing import Container, Union

a: Union[str, bytes, int]
c: Container[Union[bytes, int, bool]]
if isinstance(a, (bytes, int, bool)):
    reveal_type(a)  # Revealed type is "Union[builtins.bytes, builtins.int]"
if a in c:
    reveal_type(a)  # Revealed type is "Union[builtins.str, builtins.bytes, builtins.int]

Both could be inferred to be Union[builtins.bytes, builtins.int].

... I for one would love to see the effects this narrowing would have on mypy-primer. I wonder if it would have an overall good (bunch of this type of pattern) or bad (bunch of weird situations in libraries) effect :)

That would definitely be the best idea to see if there really are a bunch of weird edge cases used in practice.

@Dreamsorcerer
Copy link
Contributor Author

Also worth pointing out that mypy already makes this assumption about implementation, so we are not actually changing any assumptions here:

x: str
y: Container[int]
x in y  # error: Non-overlapping container check (element type: "str", container item type: "int")

If mypy is assuming that a type cannot be in the container, then why does it not carry that assumption to narrowing the type?

@gwerbin
Copy link
Contributor

gwerbin commented Aug 14, 2021

I think what I was missing my example is that Mapping[K, V] inherits from Container[K], so the correct thing to infer is that if v in m is true, then the type of v must be K. Makes sense and I agree that this would be useful.

Perhaps the inference can be restricted to objects that explicitly inherit from Container? Because you can still freely write a class that doesn't inherit from Container and doesn't share these semantics at all:

class Weird:
    def __contains__(self, key: object) -> bool:
        return True
    def __getitem__(self, value: int) -> int:
        return value + 5

x: Union[str, bytes] = 'hello'
w = Weird()
if x in w:
    # It would not make sense to infer anything about the type of x here.
    result = w[x]

@Dreamsorcerer
Copy link
Contributor Author

Dreamsorcerer commented Aug 14, 2021

Perhaps the inference can be restricted to objects that explicitly inherit from Container?

I think that is implicit, Weird in your example is not generic, so how would mypy know what type to infer from it?

You can see this already from the fact that mypy does not issue a 'Non-overlapping container check' error in your example, as it does in my previous comment. I don't expect it to attempt type narrowing in any situation that it can't already do an overlapping check.

@erictraut
Copy link

Yeah, I buy that line of reasoning. If a class explicitly subclasses from typing.Container[T] and the semantics of typing.Container are well established (and I agree that they are), then it's reasonable for a type checker to perform type narrowing in this case. It would still be possible for someone to violate the semantics of typing.Container in their subclass, but that's unlikely (and ill-advised).

@Dreamsorcerer
Copy link
Contributor Author

Also related, narrowing of the container should also work:

a: set[Optional[str]]
if None in a:
    return
reveal_type(a)

This should be able to infer the type as set[str], but mypy currently still sees it as set[Union[str, None]].

@JelleZijlstra
Copy link
Member

@Dreamsorcerer that example is unsafe because of invariance.

@Dreamsorcerer
Copy link
Contributor Author

Ah, good point. It would still be nice if it would recognise that values when iterating over the set would all be str, but maybe that's too complex for an edge case.

@Kuba314
Copy link

Kuba314 commented Mar 29, 2023

Sorry for hijacking this thread, but I'd expect that a class subclassing Container[int] can only "contain" integers, i.e. its __contains__ method accepts only integers. Why does it have to be objects?

@A5rocks
Copy link
Contributor

A5rocks commented Mar 29, 2023

"h" in x when x is Container[int] is (and should be IMO) fine.

This means __contains__ can take in any object. This issue is more about narrowing the "h" there xd.

@Dreamsorcerer
Copy link
Contributor Author

Or, more generally (because a str will never be in the container), a variable of str | int would be completely valid to check if it's in a Container[int] (and shouldn't cause any non-overlapping type warnings). But, definitely off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature topic-type-narrowing Conditional type narrowing / binder
Projects
None yet
Development

No branches or pull requests

6 participants