Skip to content

Multiple Inheritance Makes Mypy think an if branch is unreachable. #3603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ilinum opened this issue Jun 23, 2017 · 18 comments
Closed

Multiple Inheritance Makes Mypy think an if branch is unreachable. #3603

ilinum opened this issue Jun 23, 2017 · 18 comments
Labels

Comments

@ilinum
Copy link
Collaborator

ilinum commented Jun 23, 2017

In the following example, the branch of an if-statement is not being typechecked.

class Parent1: pass

class Parent2: pass

class Child(Parent1, Parent2): pass

def goo(p: Parent1) -> None:
    if isinstance(p, Parent2):
        # this branch is not typechecked
        1 + 'boom'  # no error from mypy here.


goo(Child())

Here is the output from mypy and python.

$ mypy i.py 
$ python3 i.py 
Traceback (most recent call last):
  File "i.py", line 13, in <module>
    goo(Child())
  File "i.py", line 10, in goo
    1 + 'boom'  # no error from mypy but there's an error at runtime
TypeError: unsupported operand type(s) for +: 'int' and 'str'
@gvanrossum
Copy link
Member

These bugs are pretty good finds! How'd you find them?

@ilinum
Copy link
Collaborator Author

ilinum commented Jun 26, 2017

I am working on a patch for --html-report to mark unreachable code as Any (currently, it marks the code as typed).

So I just compared the output of new --html-report and the old one.

@JukkaL
Copy link
Collaborator

JukkaL commented Jun 26, 2017

This has come up before as well. Intersection types would perhaps be the cleanest way to support it, but just faking it by internally generating dummy multiply inheriting classes would perhaps work. So in the original example the type of p after isinstance would be a newly created empty class that looks like Child (it needs to be generated since generally something like Child can't be assumed to be visible to mypy at any given scope).

@ilevkivskyi
Copy link
Member

I would vote for implementing this using an internal intersection type. It looks like Intersection appeared here and there several times, so that we might want to make it available in PEP 484 in the future.

@JukkaL
Copy link
Collaborator

JukkaL commented Jun 27, 2017

I'm still not excited about general intersection types, though I'm not quite sure if that is what you proposed. I expect that implementing them would require a large amount of work and would complicate future work on mypy features, whereas the hack I proposed would be a local change. It's hard to justify adding heavy machinery for this issue, which a relatively rare edge case.

Here's how we could implement semi-general intersection types without full-on general intersection types (similar to current UnionType). This would be a generalized version of my proposed hack -- we'd try to represent the intersection using only the types we already have at our disposal.

We'd implement something like make_fake_intersection(t1, t2) that generates a dummy TypeInfo that basically is a subclass of t1 and t2 if both are Instance types. We can also support some other cases (these are just a few examples, and I'm not sure if all of these are worth it):

  • For protocols, the intersection would again be a new TypeInfo.
  • Intersection of a callable and Instance could be a TypeInfo with __call__.
  • Intersection of any type t with Any would perhaps be Union[t, Any] -- though that sounds a little unexpected.
  • An intersection of two callables could perhaps be Overloaded.
  • What about the intersection involving a union? Maybe Intersection[Union[t, s], u] == Union[Intersection[t, u], Intersection[s, u]].

In some cases we can't represent the intersection using the current type system features. An example would be the intersection of Type[X] and Tuple[int, str]. In cases like this we could generate an error and the user would have to use a cast or # type: ignore, or they can refactor their code. I expect that these cases would be very rare, though.

The generated TypeInfos would be named descriptively, such as <subclass of X and Y>. We'd perhaps need to add the generated TypeInfo to a symbol table as well.

This sounds a little ad hoc, but on reflection it doesn't sound terrible to me and it's actually somewhat principled. If we assume that intersections are only rarely needed (for the sake of the argument, let's say once every 50 kLOC on average), then supporting the most common 95% of intersection types would be good enough to require programmer intervention only very rarely (once every 1 MLOC based on previous assumptions). This would be a trivial fraction of all the other work needed to annotate such as large codebase, and any improvements to intersection types would have minimal practical significance for most users.

Anyway, I'm not yet convinced that generalized (fake or not) intersection types are among the most important potential features in the near future.

@ilevkivskyi
Copy link
Member

Here's how we could implement semi-general intersection types without full-on general intersection types

I am happy with this. I don't want to implement a fully functional intersection as well.

What about the intersection involving a union?

Yes, we can do some basic "set algebra"

In some cases we can't represent the intersection using the current type system features. An example would be the intersection of Type[X] and Tuple[int, str]. In cases like this we could generate an error and the user would have to use a cast or # type: ignore, or they can refactor their code. I expect that these cases would be very rare, though.

Maybe we can generate UninhabitedType in such cases?

We'd perhaps need to add the generated TypeInfo to a symbol table as well.

Yes, we will need to check that serialization/deserialization works properly (I remember having some problems when I tried "fake" TypeInfos for protocol joins).

@ddfisher
Copy link
Collaborator

Maybe we can generate UninhabitedType in such cases?

This will make mypy think the branch is unreachable.

@ilevkivskyi
Copy link
Member

This will make mypy think the branch is unreachable.

So that we are back at the original problem, but in the much narrower situation of really weird combinations like Type[X] and Tuple[int, str]. Anyway, this is just a possible option, I am also fine with emitting an error like Jukka proposed.

@JukkaL
Copy link
Collaborator

JukkaL commented Jun 28, 2017

Another thing that is unclear is how to make subtyping, joins, meets and other type operations would work with fake TypeInfos. Any class that inherits from X and Y should be a subtype of Intersection[X, Y]. Looks like we may need to do some special casing for fake typeinfos after all. However, if intersection types would only be inferred internally for isinstance tests, we could probably get away with not having subtyping and such do the right thing.

@Michael0x2a
Copy link
Collaborator

The question of how to handle multiple inheritance came up again during the overloads/overlapping types refactor and was deferred -- see #5476 (comment) and #5529 for more details.

msullivan added a commit that referenced this issue Feb 25, 2019
Bug #3603 strikes again. We've got this workaround in another place or
two.
msullivan added a commit that referenced this issue Feb 25, 2019
Bug #3603 strikes again. We've got this workaround in another place or
two.
msullivan added a commit that referenced this issue Feb 25, 2019
…sues

A mypy bug (#3603) causes mypy to think that SymbolNodes cannot be
FuncBases, which causes mypy-mypyc to crash in code that tests
that.
Currently we work around this in a couple gross ways (like turning the
SymbolNode into Any). Work around it instead by testing directly
against a tuple of the subtypes instead of FuncBase itself.

This is the better fix promised in #6480.
msullivan added a commit that referenced this issue Feb 28, 2019
#6481)

A mypy bug (#3603) causes mypy to think that SymbolNodes cannot be
FuncBases, which causes mypy-mypyc to crash in code that tests
that.
Currently we work around this in a couple gross ways (like turning the
SymbolNode into Any). Work around it instead by testing directly
against a tuple of the subtypes instead of FuncBase itself.

This is the better fix promised in #6480.
msullivan added a commit that referenced this issue Apr 27, 2019
PR #6645 introduced a mypyc failure due to the irritating #3603
intersection type issue.

Fix it by testing again SYMBOL_FUNCBASE_TYPES instead of FuncBase
ilevkivskyi pushed a commit that referenced this issue Apr 27, 2019
PR #6645 introduced a mypyc failure due to the irritating #3603
intersection type issue.

Fix it by testing again SYMBOL_FUNCBASE_TYPES instead of FuncBase
@Michael0x2a
Copy link
Collaborator

So, I'm experimenting with inferring internal-only intersection types after performing isinstance checks as a part of my new quest to get mypy to fully type-check under the --warn-unreachable flag -- basically, implementing the "construct a fake TypeInfo given two Instances" approach discussed above.

So far, this has been pretty straightforward. Our test suite is mostly undisrupted; my quest is making smooth progress.

The main complication seems to be when we combine these ad-hoc intersection types with TypeVars with value restrictions. For example, the following program previously used to type check with no error, but we now get the following error on the return A():

from typing import TypeVar

class A: pass
class B: pass

T = TypeVar('T', A, B)

def f(x: T) -> T:
    if isinstance(x, A):
        return A()   # E: Incompatible return value type (got "A", expected "B")
    else:
        return B()

Mypy is more or less pointing out a legitimate issue here -- we indeed hit an inconsistency in scenarios like this:

class Combo(A, B): pass
boom: B = Combo()

# Inferred type is 'B', but actual type at runtime is 'A'!
f(boom)

I think this is a problem though -- the error message is pretty mysterious, and users at this point are used to mypy mostly ignoring the possibility of multiple inheritance when it comes to these sorts of checks. Enabling this sort of behavior is likely going to be fairly disruptive, I feel.


So, I guess I'm just curious what people think about this problem/what good workarounds might be. Some ideas:

  1. Maybe we could try inferring ad-hoc intersections only if one of the two instances is annotated with the @trait thing from mypy_extensions? This would still help me make progress with my quest without committing mypy to any drastic behaviorial changes.
  2. Maybe we should just rethink how we handle Typevars with value restrictions altogether? We do seem to hit limitations with the "just try type-checking the method N times" approach we're currently using semi-regularly.
  3. Maybe there's some clever heuristic for deciding when and when not to create ad-hoc intersections that I'm not seeing?
  4. Maybe we could add a --assume-multiple-inheritance-is-a-thing flag to toggle the ad-hoc intersecting? Seems super duper hacky and ugly though.
  5. Maybe this is actually not a big deal after all, and we're ok with changing the behavior of mypy in this regard?

I'm leaning towards idea 1 atm since it seems the most straightforward and expedient, but was curious to see if anybody has other suggestions.

@ilevkivskyi
Copy link
Member

IMO we should not show this error at all. This would be consistent with the behavior of overloads. A function generic in a type variable with value restriction is essentially a quick way to make an overload (and get its body checked). We currently ignore the existence of multiple inheritance for overloads.

A possible (although ad-hoc) solution is to just not create these fake intersection inside a generic function with value restrictions.

@ilevkivskyi
Copy link
Member

A possible (although ad-hoc) solution is to just not create these fake intersection inside a generic function with value restrictions.

Btw, this would be the same as the current workaround for --warn-unreachable.

@Michael0x2a
Copy link
Collaborator

A possible (although ad-hoc) solution is to just not create these fake intersection inside a generic function with value restrictions.

Btw, this would be the same as the current workaround for --warn-unreachable.

I was hoping to remove this workaround soon though. It's inconsistent, complicates using things like AnyStr, and so forth.

Maybe another idea might be to make mypy remember if a certain variable or type originated from a substitution for the typevar? That would make it possible to disable ad-hoc intersections/skip unreachable warnings in a more targeted and principled way, for just if isinstance(...) checks involving these variables.

@ilevkivskyi
Copy link
Member

@Michael0x2a If it is not too much code, then it is OK.

@JukkaL
Copy link
Collaborator

JukkaL commented Jul 1, 2019

I think that it's fine to disable this for functions with type variable value restrictions for now, since the implementation is trivial and these functions are relatively rare. We can then pursue a cleaner implementation separately, without having to tie these two together. My motivation is that unreachable code is a pretty significant problem, whereas not having this in a small set of generic functions is admittedly unclean but has only a small practical impact.

If the clean implementation is simple enough then I don't have a problem, but I suspect that it might be nontrivial.

Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Jan 20, 2020
This diff makes `isinstance(...)` and `issubclass(...)` try
generating ad-hoc intersections of Instances when possible.

For example, we previously concluded the if-branch is unreachable
in the following program. This PR makes mypy infer an ad-hoc
intersection instead.

```python
class A: pass
class B: pass

x: A
if isinstance(x, B):
    reveal_type(x)  # N: Revealed type is 'test.<subclass of "A" and "B">'
```

If you try doing an `isinstance(...)` that legitimately is impossible
due to conflicting method signatures or MRO issues, we continue to
declare the branch unreachable. Passing in the `--warn-unreachable`
flag will now also report an error about this:

```python
x: str

if isinstance(x, bytes):
    reveal_type(x)  # E: Statement is unreachable
```

This error message has the same limitations as the other
`--warn-unreachable` ones: we suppress them if the isinstance check
is inside a function using TypeVars with multiple values.

However, we *do* end up always inferring an intersection type when
possible -- that logic is never suppressed.

I initially thought we might have to suppress the new logic as well
(see python#3603 (comment)),
but it turns out this is a non-issue in practice once you add in
the check that disallows impossible intersections.

For example, when I tried running this PR on the larger of our two
internal codebases, I found about 25 distinct errors, all of which
were legitimate and unrelated to the problem discussed in the PR.

(And if we don't suppress the extra error message, we get about
100-120 errors, mostly due to tests repeatdly doing `result = blah()`
followed by `assert isinstance(result, X)` where X keeps changing.)
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Jan 20, 2020
This diff makes `isinstance(...)` and `issubclass(...)` try
generating ad-hoc intersections of Instances when possible.

For example, we previously concluded the if-branch is unreachable
in the following program. This PR makes mypy infer an ad-hoc
intersection instead.

    class A: pass
    class B: pass

    x: A
    if isinstance(x, B):
        reveal_type(x)  # N: Revealed type is 'test.<subclass of "A" and "B">'

If you try doing an `isinstance(...)` that legitimately is impossible
due to conflicting method signatures or MRO issues, we continue to
declare the branch unreachable. Passing in the `--warn-unreachable`
flag will now also report an error about this:

    # flags: --warn-unreachable
    x: str

    # E: Subclass of "str" and "bytes" cannot exist: would have
    #    incompatible method signatures
    if isinstance(x, bytes):
        reveal_type(x)  # E: Statement is unreachable

This error message has the same limitations as the other
`--warn-unreachable` ones: we suppress them if the isinstance check
is inside a function using TypeVars with multiple values.

However, we *do* end up always inferring an intersection type when
possible -- that logic is never suppressed.

I initially thought we might have to suppress the new logic as well
(see python#3603 (comment)),
but it turns out this is a non-issue in practice once you add in
the check that disallows impossible intersections.

For example, when I tried running this PR on the larger of our two
internal codebases, I found about 25 distinct errors, all of which
were legitimate and unrelated to the problem discussed in the PR.

(And if we don't suppress the extra error message, we get about
100-120 errors, mostly due to tests repeatdly doing `result = blah()`
followed by `assert isinstance(result, X)` where X keeps changing.)
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Jan 21, 2020
This diff makes `isinstance(...)` and `issubclass(...)` try
generating ad-hoc intersections of Instances when possible.

For example, we previously concluded the if-branch is unreachable
in the following program. This PR makes mypy infer an ad-hoc
intersection instead.

    class A: pass
    class B: pass

    x: A
    if isinstance(x, B):
        reveal_type(x)  # N: Revealed type is 'test.<subclass of "A" and "B">'

If you try doing an `isinstance(...)` that legitimately is impossible
due to conflicting method signatures or MRO issues, we continue to
declare the branch unreachable. Passing in the `--warn-unreachable`
flag will now also report an error about this:

    # flags: --warn-unreachable
    x: str

    # E: Subclass of "str" and "bytes" cannot exist: would have
    #    incompatible method signatures
    if isinstance(x, bytes):
        reveal_type(x)  # E: Statement is unreachable

This error message has the same limitations as the other
`--warn-unreachable` ones: we suppress them if the isinstance check
is inside a function using TypeVars with multiple values.

However, we *do* end up always inferring an intersection type when
possible -- that logic is never suppressed.

I initially thought we might have to suppress the new logic as well
(see python#3603 (comment)),
but it turns out this is a non-issue in practice once you add in
the check that disallows impossible intersections.

For example, when I tried running this PR on the larger of our two
internal codebases, I found about 25 distinct errors, all of which
were legitimate and unrelated to the problem discussed in the PR.

(And if we don't suppress the extra error message, we get about
100-120 errors, mostly due to tests repeatdly doing `result = blah()`
followed by `assert isinstance(result, X)` where X keeps changing.)
Michael0x2a added a commit that referenced this issue Jan 23, 2020
This diff makes `isinstance(...)` and `issubclass(...)` try
generating ad-hoc intersections of Instances when possible.

For example, we previously concluded the if-branch is unreachable
in the following program. This PR makes mypy infer an ad-hoc
intersection instead.

    class A: pass
    class B: pass

    x: A
    if isinstance(x, B):
        reveal_type(x)  # N: Revealed type is 'test.<subclass of "A" and "B">'

If you try doing an `isinstance(...)` that legitimately is impossible
due to conflicting method signatures or MRO issues, we continue to
declare the branch unreachable. Passing in the `--warn-unreachable`
flag will now also report an error about this:

    # flags: --warn-unreachable
    x: str

    # E: Subclass of "str" and "bytes" cannot exist: would have
    #    incompatible method signatures
    if isinstance(x, bytes):
        reveal_type(x)  # E: Statement is unreachable

This error message has the same limitations as the other
`--warn-unreachable` ones: we suppress them if the isinstance check
is inside a function using TypeVars with multiple values.

However, we *do* end up always inferring an intersection type when
possible -- that logic is never suppressed.

I initially thought we might have to suppress the new logic as well
(see #3603 (comment)),
but it turns out this is a non-issue in practice once you add in
the check that disallows impossible intersections.

For example, when I tried running this PR on the larger of our two
internal codebases, I found about 25 distinct errors, all of which
were legitimate and unrelated to the problem discussed in the PR.

(And if we don't suppress the extra error message, we get about
100-120 errors, mostly due to tests repeatdly doing `result = blah()`
followed by `assert isinstance(result, X)` where X keeps changing.)
@JukkaL
Copy link
Collaborator

JukkaL commented Jan 29, 2020

@Michael0x2a Do you think that we can close this issue now? If there is remaining work, maybe it would be better to create specific follow-up issues instead of keeping this open?

@Michael0x2a
Copy link
Collaborator

Yeah, I think closing this is probably fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants