-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-43224: Draft implementation of PEP 646 #30398
Conversation
Formatted, the generated AST is as follows. `def f(*args: *Ts): pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 23), "f", ( "arguments", [], [], ( "arg", (1, 7, 1, 16), "args", ( "Starred", (1, 13, 1, 16), ("Name", (1, 14, 1, 16), "Ts", ("Load",)), ("Load",), ), None, ), [], [], None, [], ), [("Pass", (1, 19, 1, 23))], [], None, None, ) ], [], ) ``` `def f(*args: *tuple[int, ...]): pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 36), "f", ( "arguments", [], [], ( "arg", (1, 7, 1, 29), "args", ( "Starred", (1, 13, 1, 29), ( "Subscript", (1, 14, 1, 29), ("Name", (1, 14, 1, 19), "tuple", ("Load",)), ( "Tuple", (1, 20, 1, 28), [ ("Name", (1, 20, 1, 23), "int", ("Load",)), ("Constant", (1, 25, 1, 28), Ellipsis, None), ], ("Load",), ), ("Load",), ), ("Load",), ), None, ), [], [], None, [], ), [("Pass", (1, 32, 1, 36))], [], None, None, ) ], [], ) ``` `def f(*args: *tuple[int, *Ts]): pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 36), "f", ( "arguments", [], [], ( "arg", (1, 7, 1, 29), "args", ( "Starred", (1, 13, 1, 29), ( "Subscript", (1, 14, 1, 29), ("Name", (1, 14, 1, 19), "tuple", ("Load",)), ( "Tuple", (1, 20, 1, 28), [ ("Name", (1, 20, 1, 23), "int", ("Load",)), ( "Starred", (1, 25, 1, 28), ("Name", (1, 26, 1, 28), "Ts", ("Load",)), ("Load",), ), ], ("Load",), ), ("Load",), ), ("Load",), ), None, ), [], [], None, [], ), [("Pass", (1, 32, 1, 36))], [], None, None, ) ], [], ) ```
Formatted, generated AST is as follows. `def f() -> tuple[*Ts]: pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 27), "f", ("arguments", [], [], None, [], [], None, []), [("Pass", (1, 23, 1, 27))], [], ( "Subscript", (1, 11, 1, 21), ("Name", (1, 11, 1, 16), "tuple", ("Load",)), ( "Tuple", (1, 17, 1, 20), [ ( "Starred", (1, 17, 1, 20), ("Name", (1, 18, 1, 20), "Ts", ("Load",)), ("Load",), ) ], ("Load",), ), ("Load",), ), None, ) ], [], ) ``` `def f() -> tuple[int, *Ts]: pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 32), "f", ("arguments", [], [], None, [], [], None, []), [("Pass", (1, 28, 1, 32))], [], ( "Subscript", (1, 11, 1, 26), ("Name", (1, 11, 1, 16), "tuple", ("Load",)), ( "Tuple", (1, 17, 1, 25), [ ("Name", (1, 17, 1, 20), "int", ("Load",)), ( "Starred", (1, 22, 1, 25), ("Name", (1, 23, 1, 25), "Ts", ("Load",)), ("Load",), ), ], ("Load",), ), ("Load",), ), None, ) ], [], ) ``` `def f() -> tuple[int, *tuple[int, ...]]: pass`: ``` ( "Module", [ ( "FunctionDef", (1, 0, 1, 45), "f", ("arguments", [], [], None, [], [], None, []), [("Pass", (1, 41, 1, 45))], [], ( "Subscript", (1, 11, 1, 39), ("Name", (1, 11, 1, 16), "tuple", ("Load",)), ( "Tuple", (1, 17, 1, 38), [ ("Name", (1, 17, 1, 20), "int", ("Load",)), ( "Starred", (1, 22, 1, 38), ( "Subscript", (1, 23, 1, 38), ("Name", (1, 23, 1, 28), "tuple", ("Load",)), ( "Tuple", (1, 29, 1, 37), [ ("Name", (1, 29, 1, 32), "int", ("Load",)), ("Constant", (1, 34, 1, 37), Ellipsis, None), ], ("Load",), ), ("Load",), ), ("Load",), ), ], ("Load",), ), ("Load",), ), None, ) ], [], ) ```
Formatted, generated AST is as follows: `x: tuple[*Ts]`: ``` ( "Module", [ ( "AnnAssign", (1, 0, 1, 13), ("Name", (1, 0, 1, 1), "x", ("Store",)), ( "Subscript", (1, 3, 1, 13), ("Name", (1, 3, 1, 8), "tuple", ("Load",)), ( "Tuple", (1, 9, 1, 12), [ ( "Starred", (1, 9, 1, 12), ("Name", (1, 10, 1, 12), "Ts", ("Load",)), ("Load",), ) ], ("Load",), ), ("Load",), ), None, 1, ) ], [], ) ``` `x: tuple[int, *Ts]`: ``` ( "Module", [ ( "AnnAssign", (1, 0, 1, 18), ("Name", (1, 0, 1, 1), "x", ("Store",)), ( "Subscript", (1, 3, 1, 18), ("Name", (1, 3, 1, 8), "tuple", ("Load",)), ( "Tuple", (1, 9, 1, 17), [ ("Name", (1, 9, 1, 12), "int", ("Load",)), ( "Starred", (1, 14, 1, 17), ("Name", (1, 15, 1, 17), "Ts", ("Load",)), ("Load",), ), ], ("Load",), ), ("Load",), ), None, 1, ) ], [], ) ``` `x: tuple[int, *tuple[str, ...]]`: ``` ( "Module", [ ( "AnnAssign", (1, 0, 1, 31), ("Name", (1, 0, 1, 1), "x", ("Store",)), ( "Subscript", (1, 3, 1, 31), ("Name", (1, 3, 1, 8), "tuple", ("Load",)), ( "Tuple", (1, 9, 1, 30), [ ("Name", (1, 9, 1, 12), "int", ("Load",)), ( "Starred", (1, 14, 1, 30), ( "Subscript", (1, 15, 1, 30), ("Name", (1, 15, 1, 20), "tuple", ("Load",)), ( "Tuple", (1, 21, 1, 29), [ ("Name", (1, 21, 1, 24), "str", ("Load",)), ("Constant", (1, 26, 1, 29), Ellipsis, None), ], ("Load",), ), ("Load",), ), ("Load",), ), ], ("Load",), ), ("Load",), ), None, 1, ) ], [], ) ```
E.g. tuple[*tuple[int, str]].
and fix a couple of bugs the tests revealed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this is seriously impressive work and a very strong start! I don't have time to review everything right now, so far I've covered genericaliasobject.c
and compile.c
. I still have typing.py
and friends left. I'm hopeless at the parser and grammar so I'll leave those to someone else.
|
||
if (gi->exhausted) { | ||
gi->obj = NULL; | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like you could potentially be missing a decref here for gi->obj
since I see it being increfed in gi_iter
. If so,
return NULL; | |
Py_SETREF(gi->obj, NULL); | |
return NULL; |
If I'm just shooting my mouth off, then just please ignore what I said :).
One way to test for refleaks is to run the tests with the refleak hunter (requires a debug build):
python_d.exe -m test test_typing -R 3:3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, super nice about the refleak hunter - I had no idea that was a thing!
And good catch about the decref too. One thing I'm unsure about is where we should do it, though. Reading https://docs.python.org/3/c-api/gcsupport.html, am I right in thinking that the iterator object gi
counts as a container type, because it contains references to another object? If so, that same page says "Before fields which refer to other containers are invalidated, PyObject_GC_UnTrack() must be called". That would imply we should be setting gi->obj
to NULL in the deallocator, in which case we'd also call the setref there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly poke :)
@@ -780,7 +787,7 @@ primary[expr_ty]: | |||
|
|||
slices[expr_ty]: | |||
| a=slice !',' { a } | |||
| a[asdl_expr_seq*]=','.slice+ [','] { _PyAST_Tuple(a, Load, EXTRA) } | |||
| a[asdl_expr_seq*]=','.(slice | starred_expression)+ [','] { _PyAST_Tuple(a, Load, EXTRA) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to move starred_expression
as another option of slice
as we already have named_expression
there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried this, but then I get SyntaxError: can't use starred expression here
for e.g. a[*b]
.
I think what happens is, if we move starred_expression
to slice
(as | a=starred_expression { a }
), then when we do a[*b]
, it triggers the first option of slices
, where the starred_expression
doesn't get processed by _PyAST_Tuple
.
I can't see any way to make sure _PyAST_Tuple
definitely gets called for starred_expression
without doing something to slices
, so I think the original attempt with (slice | starred_expression)
is cleanest. Wdyt?
@@ -331,14 +333,19 @@ kwds[arg_ty]: '**' a=param_no_default { a } | |||
param_no_default[arg_ty]: | |||
| a=param ',' tc=TYPE_COMMENT? { _PyPegen_add_type_comment_to_arg(p, a, tc) } | |||
| a=param tc=TYPE_COMMENT? &')' { _PyPegen_add_type_comment_to_arg(p, a, tc) } | |||
param_no_default_star_annotation[arg_ty]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not join these to param_no_default
? That way we can remove the extra rule in star_etc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, wouldn't that mess with other things that use param_no_default
though? E.g. kwds
uses param_no_default
, so if we joined these to param_no_default
, wouldn't we accidentally allow def foo(***args)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly poke :)
Based on Fidget-Spinner's feedback: 1. Remove the explicit flag for 'exhausted', instead using 'gi->obj == NULL' to tell whether the iterator is exhausted. 2. Construct `starred_tuple` using the `Py_GenericAlias` constructor instead of constructing it manually. 3. Fix reference counting. 4. Make `Py_GenericAliasIterType` static.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of things I noticed while working on the typing-extensions implementation.
I'm not going to be the one to merge this, but I think it might make sense to split this PR into separate PRs for the grammar changes and the typing.py changes, since they touch very different parts of the codebase and need review from different people.
class StarredTuple(_Final, _Immutable, _root=True): | ||
"""Implementation of starred tuple for older versions of Python. | ||
|
||
A starred tuple is e.g. tuple[*tuple[int, str]]. In newer versions of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this implementation is only used in new versions of Python, so why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, thanks for the reminder on this too - I was unsure about this.
The two things I had in mind were:
- It was easier to just put everything in one PR so that when doing code review we could compare between the 'new' and the 'backwards compatible' implementations.
- We probably want some of the 'backwards compatible' implementations even in modern
typing.py
for the sake of code (or maybe even pickle?) compatibility.Unpack
is probably the most important one, so we can read backwards-compatible code written asclass C(Generic[Unpack[Ts]])
. I'm admittedly less sure aboutStarredTuple
; I don't think there's ever a reason to instantiate it explicitly (and in fact maybe it should be private?). In any case, not sure what best practice is in cases like this - @gvanrossum can you comment?
def __class_getitem__(cls, item): | ||
if isinstance(item, types.GenericAlias) and item.__origin__ is tuple: | ||
# tuple[...] | ||
return StarredTuple(item) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this return an instance of Unpack instead so we don't need to add an additional class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely an option, but I'd prefer to keep Unpack
as only implementing the operator, rather implementing both the operator and the wrapper class for unpacked items - in line with the principle of "Each component should only do one thing".
Also, I think Unpack
could get really messy if we have to merge in the guts of both StarredTuple
and UnpackedTypeVarTuple
. And there are a bunch of places where we want to check for an unpacked TypeVarTuple
but not an unpacked StarredTuple
, so they would get more complicated too.
Maybe there's context I'm missing though. Is there a big downside (e.g. maintenance burden) to adding another class?
elif isinstance(item, TypeVarTuple): | ||
return item._unpacked | ||
else: | ||
raise TypeError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to not raise an error here, so we don't hobble potential future extensions. For example, I anticipate that we may want to support Unpack[TypedDict]
to type **kwargs
in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a) If the concern is future extensions, couldn't we just amend this function when that need arises in the future?
b) Since the result of unpacking depends on what's being unpacked, what would we do instead of raising an error here?
This PR has grown pretty big, and Pradeep agrees with Jelle that it might be better to split this up into separate parts - so I'm going to close this PR, and open a few new ones instead. |
return self._name | ||
|
||
|
||
class UnpackedTypeVarTuple(_Final, _Immutable, _root=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pradeep90 In the previous review, you mentioned you'd prefer to avoid this middleman class, instead for example making it so that starring a TypeVarTuple results in an instance of Unpack
.
That example solution seems suboptimal to me for two reasons:
- In a few places in the rest of typing.py, we want to check whether an object is an unpacked TypeVarTuple, and
instance(obj, UnpackedTypeVarTuple)
seems more readable to me thanisinstance(obj, Unpacked)
. - It seems potentially confusing to have
Unpack
both be an operator and an object.
Another option would be something like:
class TypeVarTuple:
def __init__(self, name, _starred=False, _id=None):
self.name = name
self._starred = starred
self._id = id
def __iter__(self):
yield TypeVarTuple(self.name, _starred=True, _id=id(self))
def __eq__(self, other):
if self._id is not None and other._id is not None:
return self._id == other._id
else:
return object.__eq__(self, other) # Not sure whether this is right, but you get the gist
def __repr__(self):
return ('*' if self._starred else '') + self.name
I lean away from this solution too because I feel like manually managing IDs could lead to subtle bugs, and isinstance(obj, UnpackedTypeVarTuple)
is still simpler than isinstance(obj, TypeVarTuple) and obj._starred
, but WDYT?
@@ -823,6 +874,95 @@ def __init__(self, name, *constraints, bound=None, | |||
self.__module__ = def_mod | |||
|
|||
|
|||
class TypeVarTuple(_Final, _Immutable, _root=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pradeep90 You mentioned in the previous review you'd prefer to have this inherit from _TypeVarLike
. I'm not sure I understood your reasoning - could you explain what you meant by "The flags are things that the typechecker has to deal with"?
|
||
if (gi->exhausted) { | ||
gi->obj = NULL; | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, super nice about the refleak hunter - I had no idea that was a thing!
And good catch about the decref too. One thing I'm unsure about is where we should do it, though. Reading https://docs.python.org/3/c-api/gcsupport.html, am I right in thinking that the iterator object gi
counts as a container type, because it contains references to another object? If so, that same page says "Before fields which refer to other containers are invalidated, PyObject_GC_UnTrack() must be called". That would imply we should be setting gi->obj
to NULL in the deallocator, in which case we'd also call the setref there?
|
||
if (gi->exhausted) { | ||
gi->obj = NULL; | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly poke :)
@@ -331,14 +333,19 @@ kwds[arg_ty]: '**' a=param_no_default { a } | |||
param_no_default[arg_ty]: | |||
| a=param ',' tc=TYPE_COMMENT? { _PyPegen_add_type_comment_to_arg(p, a, tc) } | |||
| a=param tc=TYPE_COMMENT? &')' { _PyPegen_add_type_comment_to_arg(p, a, tc) } | |||
param_no_default_star_annotation[arg_ty]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly poke :)
@JelleZijlstra @pablogsal @Fidget-Spinner @pradeep90 Oh, gosh, I'm so sorry - I've just realised I'd misunderstood a critical part of GitHub's code review interface! I'd assumed all these comments had been posted weeks ago - I'd missed the part where you actually have to click the 'Submit code review' button to post them! Uff. I'll copy over these threads to the new PRs. |
@mrahtz I can review and help with the GC part. Feel free to add me as a reviewer in that PR |
PEP 646 (Variadic Generics) hasn't yet been completely approved by the steering council, but here's a draft of its implementation so we can get started on code review.Update 2021-01-19: the SC has approved the PEP, so full steam ahead!
TODO:
typing.get_args
andtyping.get_origin
(rather than just checking__args__
etc)A couple of notes below.
@pradeep90 @gvanrossum @Fidget-Spinner
Starred tuple types
I realised there's a tricky issue with starred tuple types.
If we have e.g.
tuple[int, *tuple[str, bool]]
, the obvious thing to do would be to unpack the innertuple
at runtime, resulting intuple[int, str, bool]
. This would imply thatiter(tuple[str, bool])
creates an iterator returningstr
thenbool
.But this would cause problems when using starred tuple types as the type of
*args
, e.g.def foo(*args: *tuple[str, bool])
:foo.__annotations__['args']
should definitely not betuple[str, bool]
. That would instead implydef foo(*args: tuple[str, bool])
*args
, its iterator should return only a single value. (Or at least, this is the behaviour specified for*args: *Ts
; @pradeep90, I've just realised, should we update that section of the PEP to make it explicit that that starred tuples can be also used for*args
?)I think the simplest solution to both of these problems is to not unpack tuple types as runtime, having
*tuple[stuff]
instead creating an iterator that returns only a single item, an instance of a new 'starred tuple'. (Note that such a type does have to exist anyway, in order to represent e.g.*tuple[int, ...]
.)For the native
tuple
type, I've tentatively implemented this by adding a newstarred
flag togaobject
ingenericaliasobject.c
rather than creating a whole new type. This makes implementation easier, but maybe it's suboptimal considering thatgaobject
is used for lots of things other thantuple
- feedback appreciated. Fortyping.Tuple
, I've implemented a new class intyping.py
calledStarredTuple
, which should behave in the same way. (This does mean that*tuple[int] != *Tuple[int]
, but given thattuple[int] != Tuple[int]
, this seems fine.)Implementation of
*tuple
I'm pretty uncertain about my implementation in
genericaliasobject.c
. In particular:typing.py
, the implementation is as simple asdef __iter__(self): yield StarredTuple(self)
. Is there a way to do something similar inga_iter
, without the need to also implementga_iternext
?https://bugs.python.org/issue43224