[WIP] Fix crash when joining tuple and NamedTuple #3129

pkch · 2017-04-04T08:19:40Z

The first commit will intentionally fail because I'm modifying the builtins stub to make the test fail (to match what happens with production code).

Only after that will I push the commit that actually fixes the failing test.

ilevkivskyi · 2017-04-04T09:40:23Z

mypy/subtypes.py

+            if not t.args or not right.args:
+                return True
+            if len(t.args) != len(right.args):
+                return False


I think this is the wrong place to make these checks. If the problem is with TupleType, then it should be fixed there.

I can do that, but isn't it wrong for visit_instances to return True when the two types have a different number of type parameters? Tuple[int] vs Tuple[int, str] is one example (which caused the crash), but in the future (or even now, though I can't think of any) there may be more cases where two types have different number of type parameters. Shouldn't we, generally speaking, refuse to accept X[A] in place of X[B, C] or vice versa, when A, B, C are not Any?

At this point (after mapping instance to supertype) they should always have the same number of arguments, because t and right are now Instances of the same TypeInfo. You could put assert len(t.args) == len(right.args) == len(right.type.defn.type_vars) and see what happens. If the assert is triggered, then it means there is a bug somewhere else.
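To make the suggested invariant concrete, here is a minimal sketch using toy stand-ins for the type objects. ToyTypeInfo, ToyInstance and check_arg_counts are hypothetical names invented for illustration; they are not mypy's classes or API. The sketch only shows what the assert would verify once both sides refer to the same definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyTypeInfo:
    """Hypothetical stand-in for a class definition (not mypy's TypeInfo)."""
    fullname: str
    type_vars: List[str] = field(default_factory=list)


@dataclass
class ToyInstance:
    """Hypothetical stand-in for an instance type such as tuple[object]."""
    type: ToyTypeInfo
    args: List[str] = field(default_factory=list)


def check_arg_counts(t: ToyInstance, right: ToyInstance) -> bool:
    """Return True if both sides carry exactly one argument per type variable."""
    assert t.type is right.type, "both sides must already share one definition"
    return len(t.args) == len(right.args) == len(right.type.type_vars)


tuple_info = ToyTypeInfo('builtins.tuple', ['T'])

# Well-formed: both sides have one argument for the single type variable.
ok = check_arg_counts(ToyInstance(tuple_info, ['builtins.object']),
                      ToyInstance(tuple_info, ['Any']))

# Malformed: the right-hand side was created without any arguments.
bad = check_arg_counts(ToyInstance(tuple_info, ['builtins.object']),
                       ToyInstance(tuple_info, []))
```

In this toy model the second call reports a mismatch, which is the kind of situation the assert is meant to surface.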

This assert fails 298 times on mypy mypy.

I checked just one example, it's basically something like x = {1: 'a'}. In this example, in order to verify that the dict constructor has the right argument, is_subtype is asked whether Tuple[builtins.int, builtins.str] is a subtype of Tuple[builtins.int*, builtins.str*] (what does the * on the RHS mean btw?).

That in turn requires verifying that the fallback of LHS (builtins.tuple[builtins.object]) is a subtype of the fallback of RHS (builtins.tuple). visit_instance is used to verify that.

map_instance_to_supertype returns right away here because both sides are builtins.tuple. Then the code we're discussing is reached, with t.args having length 1 ([builtins.object]) and right.args having length 0.

I thought that's not a problem, since 0 arguments is treated as Any; and I thought Any is also ok. But I thought in every other situation, it's not a bug, but merely the type args don't match (so I can return False).

But I may be wrong; are you saying that map_instance_to_supertype is supposed to do that check, so if I see mismatched type args after it returns, it's a bug?

I think the bug happens even before. At this stage all the type arguments should have correct counts. For example, if you look at the third pass in typeanal.py, you will find code that puts [AnyType()]*whatever_is_needed. But maybe mypy code itself does not always create Instances cleanly. For example, builtins.tuple is not something valid; it should be builtins.tuple[Any].

(what does the * on the RHS mean btw?)

* indicates this type is a result of type variable substitution. Typically this happens as a result of type inference.

but merely the type args don't match

We don't have variadic generic classes, so logically this should never happen.

I think, ideally, you could try to find where those strange Instances are created (I suppose there are only a few such places) and fix them to return Instances with the correct number of type variables.

I'll look, but AFAIU builtins.tuple[Any] isn't the same as builtins.tuple; the latter has an arbitrary number of arguments, while the former must have exactly 1. Are you saying that mypy should never create an Instance that corresponds to Tuple (without type arguments)?

Instances that cause problems represent fallback for various tuples. I tried to force all tuples to be created with len(fallback) == len(tuple.items()) by changing stuff like

# checker.py
return TupleType(type_parameters, self.named_type('builtins.tuple'))

into

return TupleType(type_parameters,
                 self.named_generic_type('builtins.tuple', [AnyType()] * len(type_parameters)))

But (a) I'm not sure it's even correct; and (b) star args need to have variadic fallback.

So I could not get rid of assertion failure.

I pushed the code I tried to a separate branch, if you'd like to take a look.

No, builtins.tuple should always have one type argument; it represents a fallback Instance for comparisons with other Instances. For example, is Tuple[int, float] a subtype of Sequence[float], etc. I have found this code:

fallback_item = join.join_type_list(items)
return TupleType(items, self.chk.named_generic_type('builtins.tuple', [fallback_item]))

This is what should ideally happen always.
Note that in typeanal.py types are not fully analysed in the second pass, so it is better to fix the tuple fallback in the third pass.
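As a rough illustration of using the join of the item types as the fallback argument, the sketch below computes a nominal join over ordinary Python classes by walking their MRO. The helper toy_join is hypothetical; it is not mypy's join.join_type_list, and it ignores generics, protocols and Any.

```python
from typing import Sequence


def toy_join(classes: Sequence[type]) -> type:
    """Return the most specific class that every given class inherits from.

    Toy approximation of a type join: walk the MRO of the first class and
    pick the first entry that is a base of all the others.
    """
    if not classes:
        return object
    first, rest = classes[0], classes[1:]
    for candidate in first.__mro__:
        if all(issubclass(other, candidate) for other in rest):
            return candidate
    return object  # not reached in practice, since object ends every MRO


# Under this scheme Tuple[bool, int] would get a fallback like tuple[int],
# while Tuple[int, str] would fall back to tuple[object].
join_bool_int = toy_join([bool, int])
join_int_str = toy_join([int, str])
```

The point of the sketch is only that the fallback argument depends on all items of the tuple, which is why it cannot be computed before the item types themselves are fully analysed.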

pkch · 2017-04-05T10:14:31Z

I made a table that helped me understand representation of tuple-related types:

BUILTIN_TUPLE = <Symbol Node corresponding to 'builtins.tuple'>
BUILTIN_INT = <Symbol Node corresponding to 'builtins.int'>

Type            Representation

# Fixed-length Tuple represented as TupleType
Tuple[int]      TupleType([BUILTIN_INT], fallback=Instance(BUILTIN_TUPLE, [AnyType()]))
Tuple[Any]      TupleType([AnyType()], fallback=Instance(BUILTIN_TUPLE, [AnyType()]))
Tuple[()]       TupleType([], fallback=Instance(BUILTIN_TUPLE, [AnyType()]))

# Variadic Tuple represented as Instance(BUILTIN_TUPLE) with 1 argument
Tuple[int, ...] Instance(BUILTIN_TUPLE, [int])

# Tuple with all type checking disabled can be written in 3 different ways in Python,
# and is represented as Instance(BUILTIN_TUPLE) with 0 or 1 argument
Tuple[Any, ...] Instance(BUILTIN_TUPLE, [AnyType()])
Tuple           Instance(BUILTIN_TUPLE, [])
tuple           Instance(BUILTIN_TUPLE, [])

# Invalid types
tuple[int, ...] TupleType([BUILTIN_INT, AnyType()], fallback=Instance(BUILTIN_TUPLE, [AnyType()]))
tuple[int]      TupleType([BUILTIN_INT], fallback=Instance(BUILTIN_TUPLE, [AnyType()]))

The last two types are invalid (both mypy and runtime report an error when they are used), so the code that creates them can be deleted (tests pass with it deleted):

# in TypeAnalyser.visit_unbound_type
if len(t.args) > 0 and info.fullname() == 'builtins.tuple':
    return TupleType(self.anal_array(t.args),
                     Instance(info, [AnyType()], t.line),
                     t.line)

Also note that the types Tuple and Tuple[Any, ...], which seem to have the same meaning, are represented by different objects (the difference is in the fallback used). I'm not sure if it's intentional, but I can't think of any problem with this apart from inconsistency. If we choose to standardize, I'd prefer to do so on Instance(BUILTIN_TUPLE, [AnyType()]), since it's more explicit.

ilevkivskyi · 2017-04-05T10:28:15Z

@pkch

The last two types are invalid (both mypy and runtime report an error when they are used), so the code that creates them can be deleted (tests pass with it deleted):

I will be glad if this code is removed.

Concerning tuple vs Tuple vs Tuple[Any, ...], PEP 484 clearly says they all should be equivalent. I think the internal representation for all three should be Instance(BUILTIN_TUPLE, [AnyType()]).

The representation of Tuple[int] should be TupleType([BUILTIN_INT], fallback=Instance(BUILTIN_TUPLE, [BUILTIN_INT])) (note that the fallback is different from what you write). For tuples with length more than 1, the fallback should contain the join of item types, as I mentioned before.

Again, this is all what should ideally happen; I will be happy if you could implement this in code.
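For readers less familiar with the three spellings discussed here, the following small example shows them side by side in user code. It only illustrates the annotations; it says nothing about mypy's internal representation, and the function names are made up for the example.

```python
from typing import Any, Tuple


def first_a(t: tuple) -> Any:
    return t[0]


def first_b(t: Tuple) -> Any:
    return t[0]


def first_c(t: Tuple[Any, ...]) -> Any:
    return t[0]


# All three annotations describe a tuple of any length and any element types,
# so the same argument is acceptable for each function.
results = [f((1, 'a', 2.0)) for f in (first_a, first_b, first_c)]
```

A type checker that treats the three annotations as equivalent should accept or reject exactly the same calls for all three functions.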

ilevkivskyi

I think we should continue working on this. I believe this is important. Here are some comments.

ilevkivskyi · 2017-04-26T18:19:52Z

test-data/unit/check-tuples.test

@@ -91,7 +91,7 @@ from typing import Tuple
 t1 = None # type: Tuple[A, A]
 t2 = None # type: tuple

-t1 = t2 # E: Incompatible types in assignment (expression has type Tuple[Any, ...], variable has type "Tuple[A, A]")
+t1 = t2 # E: Incompatible types in assignment (expression has type tuple, variable has type "Tuple[A, A]")


Maybe put tuple in quotes like this "tuple"?

But the test would fail then -- or did you mean to change the code that generates error messages?

or did you mean to change the code that generates error messages?

Sure.

I made a change, but for consistency, also changed a few other similar cases (which involve types other than tuple). lmk if that's ok.

ilevkivskyi · 2017-04-26T18:20:30Z

mypy/types.py

@@ -833,6 +834,8 @@ def __init__(self, items: List[Type], fallback: Instance, line: int = -1,
                 column: int = -1, implicit: bool = False) -> None:
        self.items = items
        self.fallback = fallback
+        # TODO: assert not (isinstance(fallback, Instance) and fallback.type and


I would prefer to see the TODO items resolved before this is merged.

I haven't looked into this yet.

And again, not sure why assert fails. I'll commit fixes to the other problems first.

ilevkivskyi · 2017-04-26T18:22:58Z

mypy/typeanal.py

-                return instance
+            if info.fullname() == 'builtins.tuple':
+                assert not t.args
+                t.args = [AnyType()]


Is this change, which causes a big diff, really needed? Or did you just want to save an indentation level?

No, I just wanted to save the indent and a slight duplication of code. Should I revert?

Should I revert?

This would simplify reviewing (this PR is already non-trivial).

ilevkivskyi · 2017-04-26T18:24:22Z

mypy/subtypes.py

            # Map left type to corresponding right instances.
            t = map_instance_to_supertype(left, right.type)
-
+            # TODO: assert len(t.args) == len(right.args)


Why is this still a TODO? I thought the main idea was to fix this specific point.

Because it seemed to pass in the real world, but fail in tests. Let me figure out what stubs it needs... Or maybe I'll just sneak in some no-fixture tests into the test suite while nobody's watching.

Or maybe I'll just sneak in some no-fixture tests into the test suite while nobody's watching.

:-) I like the idea of running tests with real full stubs, but let us postpone this for some time.
It is better to update fixtures for now, to make them match real stubs more closely.

Actually, the fact that some stubs can crash here is a bit dangerous, maybe we are still missing something?

The problem with Awaitable is fixed; for the problem with NewType, I'll wait for you to fix it in your PR, as we discussed. There may be other problems that I haven't investigated yet.

Hmm still not sure why the assert fails, I'll look into it later.

ilevkivskyi · 2017-06-04T09:39:31Z

@pkch What is the status of this PR? Could you please fix the merge conflicts?

pkch · 2017-06-05T09:30:13Z

@ilevkivskyi I think you suggested to pause this until you do a separate PR you discussed with Jukka that involved prohibiting named_type in favour of named_generic_type.

ilevkivskyi · 2017-06-05T18:45:20Z

@pkch

I think you suggested to pause this until you do a separate PR you discussed with Jukka that involved prohibiting named_type in favour of named_generic_type.

That PR was merged a few weeks ago. named_type is still there, but it now automatically calls named_generic_type with the correct number of Anys, IIRC.

I will give a brief review now.
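The behaviour described above (a plain lookup that fills in one implicit Any per type variable) can be sketched roughly as follows. Everything here is hypothetical: TYPE_VAR_COUNTS, toy_named_type and toy_named_generic_type are illustrative names operating on strings, not mypy's actual helpers or signatures.

```python
from typing import Dict, List

ANY = 'Any'  # placeholder for an implicit Any argument

# Hypothetical table: number of type variables declared by each class.
TYPE_VAR_COUNTS: Dict[str, int] = {
    'builtins.int': 0,
    'builtins.tuple': 1,
    'builtins.dict': 2,
}


def toy_named_generic_type(name: str, args: List[str]) -> str:
    """Build a type string, insisting on one argument per type variable."""
    expected = TYPE_VAR_COUNTS[name]
    if len(args) != expected:
        raise ValueError(
            f'{name} expects {expected} type argument(s), got {len(args)}')
    return f"{name}[{', '.join(args)}]" if args else name


def toy_named_type(name: str) -> str:
    """Build the type with one implicit Any per declared type variable."""
    return toy_named_generic_type(name, [ANY] * TYPE_VAR_COUNTS[name])
```

With a helper of this shape, a caller can no longer create a bare builtins.tuple by accident, because the short form always pads the argument list to the declared length.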

ilevkivskyi · 2017-06-05T18:46:48Z

mypy/expandtype.py

@@ -7,6 +7,8 @@
    FunctionLike, TypeVarDef
 )

+import mypy


Why do you need this import here?

ilevkivskyi · 2017-06-05T18:49:09Z

mypy/messages.py

@@ -253,14 +253,17 @@ def format_simple(self, typ: Type, verbosity: int = 0) -> str:
                base_str = itype.type.fullname()
            else:
                base_str = itype.type.name()
-            if itype.args == []:
+            if itype.args == [] or len(itype.args) == 1 and type(itype.args[0]) == AnyType:


Do you really need message formatting changes in this same PR?
Note that there is another open PR #3430 that is going to update Instances formatting.

My other changes caused a large number of [Any] to be added to error messages. I don't think error messages are improved by that, and even if they are, it would require updating the expected output of a lot of test cases.

On second thought, yeah let me remove this code change, and update the error messages instead.

ilevkivskyi · 2017-06-05T18:51:51Z

mypy/typeanal.py

-                                 t.line)
+            if info.fullname() == 'builtins.tuple':
+                assert not t.args
+                return Instance(info, self.anal_array([AnyType()]), t.line, t.column)


This change looks a bit strange. Could you please add a comment explaining this?

Added comment; more details are in the last two paragraphs of the earlier comment in this PR.

ilevkivskyi · 2017-06-05T18:54:33Z

mypy/typeanal.py

+        # if it's not builtins.tuple, then its bases should have tuple[Any]
+        # TODO: put assert here if it's not too slow
+        if isinstance(t.fallback, Instance) and t.fallback.type.fullname() == 'builtins.tuple':
+            fallback_item = UnionType.make_simplified_union(t.items)


I think it is not a good idea to use unions at this stage, since not all types are analyzed (we might need a separate pass for this in a separate PR).

So for now I just removed this code.

ilevkivskyi · 2017-06-05T18:56:01Z

mypy/types.py

@@ -1490,7 +1496,7 @@ def visit_tuple_type(self, t: TupleType) -> str:
        s = self.list_str(t.items)
        if t.fallback and t.fallback.type:
            fallback_name = t.fallback.type.fullname()
-            if fallback_name != 'builtins.tuple':
+            if fallback_name not in ('builtins.tuple', 'builtins.object'):


Again, do you really need this formatting change?

I intended this change to preserve existing message format that otherwise would have changed; but I can't produce any specific examples where it's necessary, so I can get rid of this.

ilevkivskyi · 2017-06-05T18:58:44Z

mypy/types.py

@@ -1741,7 +1747,9 @@ def set_typ_args(tp: Type, new_args: List[Type], line: int = -1, column: int = -
    if isinstance(tp, Instance):
        return Instance(tp.type, new_args, line, column)
    if isinstance(tp, TupleType):
-        return tp.copy_modified(items=new_args)
+        fallback_args = [UnionType.make_simplified_union(new_args)]


Using unions is not safe here, since this function can be called during the second phase of semantic analysis. Probably just remove this change.

If you didn't tell me, I would've never noticed that. Is there any way to protect our codebase from these types of mistakes (where a construct is used too early in the analysis process)?

There is an item in the roadmap about documenting the details of different passes.
(Also there is an item about re-working semantic analysis passes.)

Ok, I removed this as well. How would we remember to add this logic and the one here (both of which I removed) to a new PR in the future?

ilevkivskyi · 2017-06-05T19:02:38Z

test-data/unit/fixtures/list.pyi

@@ -22,7 +22,8 @@ class list(Iterable[T], Generic[T]):
    def append(self, x: T) -> None: pass
    def extend(self, x: Iterable[T]) -> None: pass

-class tuple(Generic[T]): pass
+_T_co = TypeVar('_T_co', covariant=True)
+class tuple(Sequence[_T_co], Generic[_T_co]): pass


Are you interested in fixing tuple in other fixtures? Maybe you can do this even in this PR and then turn on some asserts? (It is not necessary to inherit from Sequence everywhere, just make them generic in one variable)

Yup, I'll fix it in this PR.

I replaced all tuples in fixtures to use covariant type variable.

Except in tuple-simple.pyi, where somehow it says "True is not defined."

ilevkivskyi · 2017-06-05T19:04:39Z

@pkch
(Also it looks like you still have an unnecessary typeshed commit change in this PR.)

pkch · 2017-06-24T20:05:20Z

test-data/unit/pythoneval.test

@@ -1332,8 +1332,8 @@ reveal_type(g)
 with f('') as s:
    reveal_type(s)
 [out]
-_program.py:13: error: Revealed type is 'def (x: builtins.int) -> contextlib.GeneratorContextManager[builtins.str*]'
-_program.py:14: error: Revealed type is 'def (*x: builtins.str) -> contextlib.GeneratorContextManager[builtins.int*]'
+_program.py:13: error: Revealed type is 'def (x: builtins.int) -> contextlib.ContextManager[builtins.str*]'


@ilevkivskyi this change in error message does not seem right; do you know if this is a problem I introduced with my PR?

Nm, it disappeared... I swear, it happened, but I have no explanation as to how. Ghosts?

pkch · 2017-06-25T10:38:13Z

I'm looking into a strange failure in commit f292ea4, where Travis python 3.5 build failed (while all other Travis and Appveyor builds succeeded). The error is seemingly unrelated to this PR:

FAILURE  #7 run eval-test-G
Expected:
  fooo                                          (diff)
Actual:
  (empty)
Traceback (most recent call last):
  File "/home/travis/build/python/mypy/mypy/test/data.py", line 273, in run
    self.perform(self)
  File "/home/travis/build/python/mypy/mypy/test/testpythoneval.py", line 94, in test_python_evaluation
    testcase.file, testcase.line))
  File "/home/travis/build/python/mypy/mypy/test/helpers.py", line 86, in assert_string_arrays_equal
    raise AssertionFailure(msg)
AssertionFailure: Invalid output (/home/travis/build/python/mypy/test-data/unit/pythoneval.test, line 430)
test_testpythoneval_PythonEvaluationSuite.testGenericPatterns failed

The test which failed is:

[case testGenericPatterns]
from typing import Pattern
import re
p = None  # type: Pattern[str]
p = re.compile('foo*')
b = None  # type: Pattern[bytes]
b = re.compile(b'foo*')
print(p.match('fooo').group(0))
[out]
fooo

I pushed an empty commit to see if this is reproducible.

pkch · 2017-06-25T17:18:58Z

Ok this is not good. I introduced a Heisenbug with this PR. Commit f292ea4 fails the build, and the empty commit 52b0c0f following it passes. I'll look into it.

gvanrossum · 2017-06-25T20:02:00Z

The "Heisenbug" might be a Travis-CI issue. So far I've only seen it on Python 3.4 runs, but maybe it's also occurring for others? #3543. I suspect there's a subprocess that fails and some parent isn't checking exit statuses.
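The suspected failure mode (a child process that fails while its parent never looks at the exit status) can be reproduced in miniature. This is a generic sketch with Python's subprocess module, not the mypy test runner; the command is an artificial child that exits with status 3 and prints nothing.

```python
import subprocess
import sys

# A child process that exits with a non-zero status and prints nothing.
cmd = [sys.executable, '-c', 'import sys; sys.exit(3)']

# Without a check, the failure is silent: the captured output is just empty.
unchecked = subprocess.run(cmd, stdout=subprocess.PIPE)

# With check=True, a non-zero exit status raises CalledProcessError.
try:
    subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    failed_loudly = False
except subprocess.CalledProcessError as exc:
    failed_loudly = exc.returncode == 3
```

In the unchecked case, a test that compares the captured output against expected text would see "(empty)" rather than an error about the child process, which resembles the symptom in the log above.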

ilevkivskyi · 2017-06-25T22:01:37Z

@pkch

I replaced all tuples in fixtures to use covariant type variable.

Sorry, I was not very clear. The invariance in fixtures is not a big problem (if a problem at all). The problem is that dozens of fixtures use

class tuple: pass

and/or the same for list, dict, and set, i.e. they are not generic at all.

Except in tuple-simple.pyi, where somehow it says "True is not defined."

This is because the corresponding stub does not define class bool: pass

ilevkivskyi · 2017-07-18T22:07:08Z

@pkch Are you still working on this? I would rather see this merged sooner than later (especially if you fix the fixtures in this PR).

pkch · 2017-07-18T22:24:02Z

Yes I'll work on it this weekend!

pkch · 2017-07-25T08:27:33Z

Remains to do: normalize tuple -> Tuple[Any] in ThirdPass, and add asserts about number of type args

pkch · 2017-07-26T04:26:46Z

This PR contained three components:

1. fix the crash with namedtuple
2. make tuples generic in stubs
3. normalize tuple -> Tuple[Any]

1) is already taken care of in master since this PR was opened.
2) is taken care of in "Make tuple generic in most stubs" #3767 that I just opened.
3) ended up being too difficult to do without causing more problems. I can still try to do it, but the chance I'll succeed is growing smaller, and I don't want to delay 2).

For now, I'm going to close this PR but if I or anyone else can work on it again, we can of course reopen.

Make tuples generic in stubs to reduce incompatibility between stubs and production. This is a carve-out from #3129.

pkch added 2 commits April 4, 2017 01:17

Update builtins stub to make test fail when production code fails

a72c71b

Fix join of Tuple and NamedTuple

2e90e9f

ilevkivskyi reviewed Apr 4, 2017

pkch added 4 commits April 6, 2017 03:41

Fix Tuple behavior

936bbe6

CR fixes

5810f55

Merge branch 'master' into namedtuple-join

bc631d9

Type check fix

865ec24

pkch mentioned this pull request Apr 20, 2017

Infer types from issubclass() calls #3005

Merged

ilevkivskyi reviewed Apr 26, 2017

ilevkivskyi mentioned this pull request Apr 28, 2017

Fine grained crash when class is turned into a generic class #3279

Closed

pkch added 2 commits May 1, 2017 07:18

Merge branch 'master' into namedtuple-join

a255694

CR fixes

a27613a

ilevkivskyi self-assigned this May 31, 2017

pkch added 2 commits June 5, 2017 02:21

Merge branch 'master' into namedtuple-join

84b8fae

Fix *[Any] -> * in expected error messages

ea2b1b4

ilevkivskyi reviewed Jun 5, 2017

pkch force-pushed the namedtuple-join branch 4 times, most recently from 3bc53ca to c9dba66 Compare June 24, 2017 18:57

CR fixes (partial)

a230dd9

pkch force-pushed the namedtuple-join branch from ed22724 to a230dd9 Compare June 24, 2017 19:13

pkch added 2 commits June 24, 2017 12:18

Merge branch 'master' into namedtuple-join

02961f4

Update error messages

a993ba6

pkch force-pushed the namedtuple-join branch from d83faaf to a993ba6 Compare June 24, 2017 20:04

pkch commented Jun 24, 2017

pkch added 3 commits June 24, 2017 13:14

Updated typeshed to recent master

fb4ce35

Revert weird change to test output

f292ea4

Empty commit to test Travis

52b0c0f

pkch changed the title ~~Fix crash when joining tuple and NamedTuple~~ [WIP] Fix crash when joining tuple and NamedTuple Jun 25, 2017

pkch added 5 commits July 24, 2017 18:43

Merge branch 'master' into namedtuple-join

07e7e64

Fix error message in test

f991a2b

Revert tuple normalization

c1304ba

Revert changes to pyi

959e6c9

Make tuple generic in most stubs

8edde23

pkch mentioned this pull request Jul 26, 2017

Make tuple generic in most stubs #3767

Merged

pkch closed this Jul 26, 2017

ilevkivskyi pushed a commit that referenced this pull request Jul 26, 2017

Make tuple generic in most stubs (#3767)

9104d55

Make tuples generic in stubs to reduce incompatibility between stubs and production. This is a carve-out from #3129.
