Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to specify a default value for a generic parameter #3737

Open
rowillia opened this issue Jul 19, 2017 · 43 comments
Open

Unable to specify a default value for a generic parameter #3737

rowillia opened this issue Jul 19, 2017 · 43 comments

Comments

@rowillia
Copy link
Contributor

Simplified Example:

from typing import TypeVar

_T = TypeVar('_T')
def foo(a: _T = 42) -> _T:  # E: Incompatible types in assignment (expression has type "int", variable has type "_T")
    return a

Real World example is something closer to:

_T = TypeVar('_T')
def noop_parser(x: str) -> str:
    return x

def foo(value: str, parser: Callable[[str], _T] = noop_parser) -> _T:
    return parser(value)
@gvanrossum
Copy link
Member

I think the problem here is that in general, e.g. if there are other parameters also using _T in their type, the default value won't work. Since this is the only parameter, you could say that this is overly restrictive, and if there are no parameters, the return type should just be determined by that default value (in your simplified example, you'd want it to return int). But I'm not so keen on doing that, since it breaks as soon as another generic parameter is used.

To make this work, you can use @overload, e.g.

@overload
def foo() -> int: ...
@overload
def foo(a: _T) -> _T: ...
def foo(a = 42):  # Implementation, assuming this isn't a stub
    return a

That's what we do in a few places in typeshed too (without the implementation though), e.g. check out Mapping.get() in typing.pyi.

@OddBloke
Copy link
Contributor

I'm running in to a similar issue when trying to iron out a bug in the configparser stubs. Reducing RawConfigParser to the relevant parts gives:

_section = Mapping[str, str]

class RawConfigParser:
    def __init__(self, dict_type: Mapping[str, str] = ...) -> None: ...

    def defaults(self) -> _section: ...

This is incorrect at the moment, because dict_type should (as the name indicates) be a type. Furthermore, defaults() returns an instance of dict_type, so this seems like a good application of generics. The following works as expected when a dict_type is passed (i.e. reveal_type(RawConfigParser(dict_type=dict).default()) gives builtins.dict*[Any, Any]):

_T1 = TypeVar('_T1', bound=Mapping)

class RawConfigParser(Generic[_T1]):
    def __init__(self, dict_type: Type[_T1] = ...) -> None: ...

    def defaults(self) -> _T1: ...

but when you don't pass a dict_type, mypy can't infer what type it should be using and asks for a type annotation where it is being used. This is less than ideal for the default instantiation of the class (which will be what the overwhelming majority of people are using). So I tried adding the appropriate default to the definition:

_T1 = TypeVar('_T1', bound=Mapping)

class RawConfigParser(_parser, Generic[_T1]):
    def __init__(self, dict_type: Type[_T1] = OrderedDict) -> None: ...

    def defaults(self) -> _T1: ...

but then I hit the error that OP was seeing:

stdlib/3/configparser.pyi:51: error: Incompatible default for argument "dict_type" (default has type Type[OrderedDict[Any, Any]], argument has type Type[_T1])

This feels like something that isn't currently supported, but I'm not sure if I'm missing a way that I could do this...

@ilevkivskyi
Copy link
Member

@OddBloke
This default is, strictly speaking, not safe. This would require a lower bound for _T1 in order to be safe.

@OddBloke
Copy link
Contributor

@ilevkivskyi I don't follow, I'm afraid; could you expand on that a little, please?

@ilevkivskyi
Copy link
Member

Consider this code:

class UserMapping(Mapping):
    ...
RawConfigParser[UserMapping]()

In this case OrderedDict will be "assigned" to dict_type, which will be Type[UserMapping], which is not a supertype of Type[OrderedDict]. In order for the default value to be safe it is necessary to have some guarantee that _T1 will be always a supertype of OrderedDict (this is called a lower bound), but this is not supported, only upper bounds are allowed for type variables.

@OddBloke
Copy link
Contributor

@ilevkivskyi Thanks, that makes sense. Do you think what I'm trying to do is unrepresentable as things stand?

@ilevkivskyi
Copy link
Member

I didn't think enough about this, but my general rule is not to be "obsessed" with precise types, sometimes Any is OK. If a certain problem appears repeatedly, then maybe we need to improve something (also now we have plugin system). In this particular case my naive guess would be to play with overloads, as Guido suggested (note that __init__ can also be overloaded).

@nematsakis
Copy link

nematsakis commented Jan 21, 2018

Allowing default values for generic parameters might also enable using type constraints to ensure that generics are only used in specified ways. This is a bit esoteric, but here is an example of how you might use the same implementation for a dict and a set while requiring that consumers of the dict type always provide a value to insert() while consumers of the set type never do.

class Never(Enum):
    """An uninhabited type"""

class Only(Enum):
    """A type with one member (kind of like NoneType)"""
    ONE = 1

NOT_PASSED = Only(1)

class MyGenericMap(Generic[K, V]):

    def insert(key, value=cast(Never, NOT_PASSED)):
        # type: (K, V) -> None
        ...
       if value is NOT_PASSED:
           ....

class MySet(MyGenericMap[K, Never]):
    pass

class MyDict(MyGenericMap[K, V]):
    pass

my_set = MySet[str]()
my_set.insert('hi')           # this is fine, default value matches concrete type
my_set.insert('hi', 'hello')  # type error, because 'hello' is not type never

my_dict = MyDict[str, int]()
my_dict.insert('hi')          # type error because value is type str, not Never
my_dict.insert('hi', 22)      # this is fine, because the default value is not used

@gvanrossum
Copy link
Member

Huh, this just came up in our code.

@ilevkivskyi
Copy link
Member

This request already appeared five times, so I am raising priority to high.

I think my preferred solution for this would be to support lower bounds for type variables. It is not too hard to implement (but still a large addition), and at the same time it may provide more expressiveness in other situations.

@hynekcer
Copy link

hynekcer commented Feb 19, 2019

A second part for the workaround with overload should be noted: The header of implementation of the overloaded function must be without a generic annotation for that variable, but it is not acceptable for more complex functions where the body should be checked. Only the binding between input and output types is checked by overloading. The most precise solution for the body seems to encapsulate it by a new function with the same types, but without a default.

@overload
def foo() -> int: ...
@overload
def foo(a: _T) -> _T: ...

def foo(a = 42):  # unchecked implementation
    return foo_internal(a)

def foo_internal(a: _T) -> _T:
    # a checked complicated body moved here
    return a

Almost everything could be checked even in more complicated cases, but the number of necessary overloaded declarations could rise exponentially by 2 ** number_of generic_vars_with_defaults.

EDIT

@hynekcer
Copy link

Another solution is to use a broad static type for that parameter and immediately assign it in the body to the original exact generic type. It is a preferable solution for a class overload.
An example is a database cursor with a generic row type:

_T = TypeVar('_T', list, dict)

class Cursor(Generic[_T]):
    @overload
    def __init__(self, connection: Connection) -> None: ...
    @overload
    def __init__(self, connection: Connection, row_type: Type[_T]) -> None: ...
    def __init__(self, connection: Connection, row_type: Type=list) -> None:
        self.row_type: Type[_T] = row_type  # this annotation is important
        ...
    def fetchone(self) -> Optional[_T]: ...
    def fetchall(self) -> List[_T]: ...   # more methods depend on _T type

cursor = Cursor(connection, dict)  # cursor.execute(...)
reveal_type(cursor.fetchone())     # dict

@andersk
Copy link
Contributor

andersk commented Nov 12, 2019

I think the problem here is that in general, e.g. if there are other parameters also using _T in their type, the default value won't work.

Wait. Why is that a problem? If default value doesn’t work for some calls, that should be a type error at the call site.

_T = TypeVar('_T')
def foo(a: List[_T] = [42], b: List[_T] = [42]) -> List[_T]:
    return a + b

foo()  # This should work.
foo(a=[17])  # This should work.
foo(b=[17])  # This should work.
foo(a=["foo"])  # This should be a type error (_T cannot be both str and int).
foo(b=["foo"])  # This should be a type error (_T cannot be both int and str).
foo(a=["foo"], b=["foo"])  # This should work.

This default is, strictly speaking, not safe. This would require a lower bound for _T1 in order to be safe.

Only if we retain the requirement that the default value always works, which is not what’s being requested here. The desired result is:

_T1 = TypeVar('_T1', bound=Mapping)
class RawConfigParser(_parser, Generic[_T1]):
    def __init__(self, dict_type: Type[_T1] = OrderedDict) -> None: ...

    def defaults(self) -> _T1: ...

class UserMapping(Mapping):
    ...

RawConfigParser[OrderedDict]()  # This should work.
RawConfigParser[UserMapping]()  # This should be a type error (_T1 cannot be both UserMapping and OrderedDict).
RawConfigParser[UserMapping](UserMapping)  # This should work.

@radeksz
Copy link

radeksz commented Dec 17, 2019

I ran into this trying to write

_T = TypeVar('_T')
def str2int(s:str, default:_T = None) -> Union[int, _T]: # error: Incompatible default for argument "default" (default has type "None", argument has type "_T")
    try:
        return int(s)
    except ValueError:
        return default

(after figuring out that I need --no-implicit-optional mypy parameter as well). @overload is not ideal because then the body of the function is no longer type-checked.

@overload
def str2int(s:str) -> Optional[int]: ...
@overload
def str2int(s:str, default:_T) -> Union[int, _T]: ... 

def str2int(s, default = None): 
    the_body_is_no_longer_type_checked

I think this raises the bar for type-checking newbies.

@couling
Copy link

couling commented Feb 5, 2024

A second part for the workaround with overload should be noted: The header of implementation of the overloaded function must be without a generic annotation for that variable, but it is not acceptable for more complex functions where the body should be checked. Only the binding between input and output types is checked by overloading.

May I ask if there's a little more detail on this design decision? Why are funciton implementations not checked for overloads?

Searches on the topic are giving me blanks. I'm curious why mypy would not check a defanition for each overload variant. It would only need to revist one function per overload so I can't see the cost would be that high, nor the implementation that complex. I wonder what I'm missing?

erlendvollset added a commit to cognitedata/cognite-sdk-python that referenced this issue Jul 2, 2024
Generic parameters and defaults don't mix well (see python/mypy#3737) so we need to use overloads instead, or InstancesResult is always InstancesResult[Any, Any]. Overloads don't work on retrieve() due to defaults and ordering of params, so we create two new methods that can be safely typed.
@bswck
Copy link

bswck commented Aug 9, 2024

Let's close this due to PEP 696?
@JelleZijlstra

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Aug 9, 2024

IIUC I don't think this should be closed. I don't think this should require specifying a default on the TypeVar explicitly and also mypy's behaviour doesn't change if you do so (default type is little different from default value)

@hauntsaninja hauntsaninja reopened this Aug 9, 2024
@bswck
Copy link

bswck commented Aug 9, 2024

@hauntsaninja In that case--sorry for causing the mess.

@septatrix
Copy link
Contributor

It seems like there are some workarounds available for class methods and normal function but does someone have a workaround for class attributes? E.g.

config_parse_string: ConfigParseCallback[str]

class ConfigSetting(Generic[T]):
    parse: ConfigParseCallback[T] = config_parse_string

@mcmanustfj
Copy link

mcmanustfj commented Dec 5, 2024

I'm looking at this and it seems like one potential solution would be updating TypeChecker.check_default_args to replace the types of generic arguments with the types of their initializers - i.e def f[T](var: T = "asdf") -> T: ... is actually checked as def f(var: str = "asdf") -> str, but I'm not sure how to infer str and not Literal["asdf"] from the context of a type checker.

@andersk
Copy link
Contributor

andersk commented Dec 5, 2024

@mcmanustfj That is not a solution because we do not want the default’s type to constrain call sites where the default is not used. In your example, we need to accept f(42) returning int. See my comment above for details of what should be accepted and rejected.

@mcmanustfj
Copy link

@andersk My thought was to only change the function type checking and not the invocation type checking, but I could definitely be misunderstanding how check_default_args works.

My test patch has the following change:

    def check_default_args(self, item: FuncItem, body_is_trivial: bool) -> None:
        if any(
            arg.initializer is not None and type(arg.type_annotation) is UnboundType
            for arg in item.arguments
        ):
            args_without_generic_defaults = [
                Argument(
                    variable=arg.variable,
                    type_annotation=arg.type_annotation,
                    initializer=None
                    if type(arg.type_annotation) is UnboundType
                    else arg.initializer,
                    kind=arg.kind,
                    pos_only=arg.pos_only,
                )
                for arg in item.arguments
            ]

            item_without_generic_defaults: FuncItem = FuncItem(
                arguments=args_without_generic_defaults,
                body=item.body,
                typ=item.type,
                type_args=item.type_args,
            )

            self.check_default_args(item_without_generic_defaults, body_is_trivial)
            return
    ... # rest of check_default_args

Right now,

def test_func(var: T = "asdf") -> T:
    return var

if __name__ == "__main__":
    x = test_func()
    reveal_type(x)
    reveal_type(test_func())
    reveal_type(test_func("asdf"))
    reveal_type(test_func(True))
    reveal_type(test_func(1))

has the opposite problem and doesn't give the default invocation a type at all,

example.py:31: note: Revealed type is "Any"
example.py:32: note: Revealed type is "Never"
example.py:33: note: Revealed type is "builtins.str"
example.py:34: note: Revealed type is "builtins.bool"
example.py:35: note: Revealed type is "builtins.int"

Not sure where to go from here, unfortunately.

@andersk
Copy link
Contributor

andersk commented Dec 10, 2024

@mcmanustfj Just conceptually, changing only the function type checking as you describe is still not a solution, because it would fail to reject this wrong definition:

def f[T](var: T = "asdf") -> T:
    # we should not be able to assume var: str here
    return var.lower()

f(42)

and fail to reject this wrong invocation:

def f[T](var: T = "asdf") -> T:
    return var

# when the default is used, we should not let T be anything but str
ret: int = f()

The correct solution needs to involve type-checking the default value at each call site if the default is used there. That is, we need to treat

def f[T](var: T = "asdf") -> T:
    return var

f()
f(arg)

as the syntactic sugar it is for something like

f_var_default = "asdf"
def f[T](var: T) -> T:
    return var

f(f_var_default)
f(arg)

(with perhaps an additional definition-site check that there exists some T that unifies the types of f_var_default and var, but that’s not required for soundness, as the consequence if there doesn’t would be that a call relying on the default is safely prohibited at the call site).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests