Skip to content

Add a possible type hint to the @requires decorator. #1504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 22, 2021

Conversation

obi1kenobi
Copy link
Contributor

Description

This PR type-hints the @requires decorator, which seems to be used in over 100 places in the repository at the moment. Since decorators are a bit tricky to type-hint correctly, and still have some unsolved edge cases with regard to type-hinting (python/mypy#3157 comes to mind), I wanted to open this PR as an example of the available options and to provide a bit more context for #1498 where we were discussing whether to use @requires or not. Also tangentially related to #1496.

The gist of the still-open mypy issue is that there's no great way to type-hint decorators that work on functions with arbitrary type signatures but change the input function's type signature in some way. The @requires decorator falls in this category because it works over functions with arbitrary inputs and outputs, and simply wraps the function's return type in an Optional.

This limitation leaves us with two options.

  • Option 1 (implemented in this PR): allow @requires to work on arbitrary functions, but make the resulting decorated function's input argument signature appear to be f(*args, **kwargs).

This is an excerpt of what that looks like:

RequiresReturnTypeT = TypeVar("RequiresReturnTypeT")

def __call__(
    self, func: Callable[..., RequiresReturnTypeT]
) -> Callable[..., Optional[RequiresReturnTypeT]]:

The Callable[..., Foo] syntax means "this function allows any positional and keyword arguments". The equivalent way to define a function with such a signature in Python would be this:

def foo(*args: Any, **kwargs: Any) -> Foo:
    ...

Pro: @requires can be used with arbitrary functions
Con: Functions decorated with @requires thus lose their type signatures, and mypy is no longer able to statically check their call sites since they appear callable with any input arguments.

  • Option 2 (viable alternative, but limits @requires functionality): only allow @requires to be used on functions with one positional argument (i.e. ones that look like def foo(self) -> Foo), then preserve the function signature through the decoration process.

I didn't want to implement this option without asking for approval first, since it does limit what @requires can do and I'm not sure if it is always used with only a single argument. However, if this option seems reasonable to all the maintainers, it is definitely preferable from a mypy perspective since it preserves the ability for mypy to type-check call sites of functions decorated with @requires.

This is what that would look like, as an excerpt:

ArgTypeT = TypeVar("ArgTypeT")
RequiresReturnTypeT = TypeVar("RequiresReturnTypeT")

def __call__(
    self, func: Callable[[ArgTypeT], RequiresReturnTypeT] 
) -> Callable[[ArgTypeT], Optional[RequiresReturnTypeT]]:
    ...

The pro/con here are essentially the reverse of Option 1's ones above. You can see that the decorated function has a specific type signature including the function's argument types, and the generics (the TypeVar values) allow us to express the "same type as in the input function" type relationships between the function being decorated and the final produced function. This allows mypy to perform type-checking of calls to the decorated function, at the cost of only allowing a single argument to the function.

In principle, similar variants of this decorator could be written for 2-argument or arbitrary k-argument versions of the same functionality. I admit that's not the most elegant option, but until the relevant portion of python/mypy#3157 is resolved, this is the best we can do.

I'd be happy to implement either of these options, based on whatever the maintainers decide is the best way forward for this project.

Checklist

  • Follows official PR format
  • Code style correct (follows pylint and black guidelines)

@codecov
Copy link

codecov bot commented Jan 19, 2021

Codecov Report

Merging #1504 (2909b30) into main (226d141) will decrease coverage by 0.01%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1504      +/-   ##
==========================================
- Coverage   90.00%   89.99%   -0.02%     
==========================================
  Files         107      107              
  Lines       11528    11532       +4     
==========================================
+ Hits        10376    10378       +2     
- Misses       1152     1154       +2     
Impacted Files Coverage Δ
arviz/data/io_pymc3.py 90.68% <50.00%> (-0.63%) ⬇️
arviz/data/base.py 97.88% <100.00%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 226d141...1005d2d. Read the comment docs.

@ColCarroll
Copy link
Member

Thanks for this! I followed the rabbithole of that issue for a while, and it seemed like there was particular interest (from hypothesis and GvR) in this particular use case, but maybe it hasn't yet landed?

Would it help to change requires to take a single argument, and just make sure it isn't None? I'd be happy with changing

@requires('posterior', 'prior')
def my_function(self):
    ...

to a tower of

@requires('posterior')
@requires('prior')
def my_function(self):
    ...

@OriolAbril
Copy link
Member

We currently have both uses of multiple arguments (acts as a logical and IIRC) and uses of cascading (acts as a logical or IIRC, could be the other way around), however, the multiple argument cases could be modified to using a list of strings as a single argument.

I think that option 2 is too restrictive though, there must be at least 1 positional argument in addition to self, otherwise the function can't know which attributes to check for.

@obi1kenobi
Copy link
Contributor Author

Thanks for this! I followed the rabbithole of that issue for a while, and it seemed like there was particular interest (from hypothesis and GvR) in this particular use case, but maybe it hasn't yet landed?

Would it help to change requires to take a single argument, and just make sure it isn't None? I'd be happy with changing

@requires('posterior', 'prior')
def my_function(self):
    ...

to a tower of

@requires('posterior')
@requires('prior')
def my_function(self):
    ...

Ah sorry for the confusion, this doesn't need to change one way or the other. Having more than one parameter would be a problem on the decorated function, not in the decorator invocation. Here's an example of something that would not be supported:

@requires('posterior')
def my_function(self, other_arg):
    ...

This is because the decorator would be expecting a function that looks like Callable[[SelfType], ReturnType] but my_function has a type signature of Callable[[SelfType, OtherArgType], ReturnType].

The @requires tower example would similarly also work so long as the decorated function has only a single argument. Let's unpack this step by step:

  • The first (innermost) @requires would rewrite the function's signature from Callable[[SelfType], ReturnType] to Callable[[SelfType], Optional[ReturnType]].
  • The subsequent @requires take that Callable[[SelfType], Optional[ReturnType]] function and turn it into Callable[[SelfType], Optional[Optional[ReturnType]]].
  • However, observe that Optional[Optional[T]] = Optional[T] for all types T. This is because Optional[T] means "T or None" (formally, Optional[T] = Union[T, NoneType]), so Optional[Optional[T]] means "T or None or None" (formally Optional[Optional[T]] = Union[T, NoneType, NoneType] = Union[T, NoneType]) and we see that the last two cases are actually identical i.e. the same case.
  • This means that the subsequent @requires actually produces a type signature of Callable[[SelfType], Optional[Optional[ReturnType]]] = Callable[[SelfType], Optional[ReturnType]] which is identical to the type signature produced by the first @requires. Therefore, these subsequent @requires applications don't actually further change the type signature and it follows that we can stack towers of @requires without issues.

I think that option 2 is too restrictive though, there must be at least 1 positional argument in addition to self, otherwise the function can't know which attributes to check for.

@OriolAbril I'm assuming this was referring to the arguments to the @requires itself and not to the function @requires is decorating? If so, I think this might have gotten caught up in the confusion above, sorry about that.

Would you also find it overly restrictive to prevent @requires from being used on functions that look like the following?

@requires('posterior')
def my_function(self, other_arg):
    ...

To be explicit, the issue here would be that my_function takes more than one argument, and not the number of arguments to @requires or the number of times @requires is stacked in a decorator tower.

@OriolAbril
Copy link
Member

OriolAbril commented Jan 20, 2021

Oh, yeah, I though the restriction was on what is passed to requires. I am not 100% sure it is possible as is, but with the decorated function being functions(self) is feasible. It may require some work but I am convinced its feasible

@obi1kenobi
Copy link
Contributor Author

There is one function that takes multiple arguments and is currently using @requires:

def _extract_log_likelihood(self, trace):

Otherwise, this refactor should be totally doable. This is what the essential portion of it would look like:

    def __call__(
        self, func: Callable[[RequiresArgTypeT], RequiresReturnTypeT]
    ) -> Callable[[RequiresArgTypeT], Optional[RequiresReturnTypeT]]:  # noqa: D202
        """Wrap the decorated function."""

        def wrapped(arg: RequiresArgTypeT) -> Optional[RequiresReturnTypeT]:
            """Return None if not all props are available."""
            for prop in self.props:
                prop = [prop] if isinstance(prop, str) else prop
                if all([getattr(arg, prop_i) is None for prop_i in prop]):
                    return None
            return func(arg)

        return wrapped

I'm happy to do whatever you think is best here. We could easily inline the None checks for that one multi-parameter function and then make @requires take only a single argument, and that would probably be the option I'd prefer if I were the maintainer of this code.

@obi1kenobi
Copy link
Contributor Author

I just realized there's an alternative that, while a bit clunky, covers both the 1-argument and the 2-argument cases: explicit @overload definitions for the decorator.

@overload  # decorating 1-argument function
def __call__(
    self, func: Callable[[RequiresArg1TypeT], RequiresReturnTypeT]
) -> Callable[[RequiresArg1TypeT], Optional[RequiresReturnTypeT]]:
    ...

@overload  # decorating 2-argument function
def __call__(
    self, func: Callable[[RequiresArg1TypeT, RequiresArg2TypeT], RequiresReturnTypeT]
) -> Callable[[RequiresArg1TypeT, RequiresArg2TypeT], Optional[RequiresReturnTypeT]]:
    ...

def __call__(self, func):
    # actual implementation goes here
    pass

For more info, check out typing.overload's docs: https://docs.python.org/3/library/typing.html#typing.overload

I just checked this locally and it seems to work as far as I can tell. It's a bit repetitive, and doesn't solve for cases that use 0 arguments or more than 2 arguments, but it does work for all current uses of the @requires decorator without changing code (only adding type hints). It should also be pretty obvious how to extend this pattern to accommodate 0 arguments or 3+ arguments in the function.

@OriolAbril
Copy link
Member

There is one function that takes multiple arguments and is currently using @requires:

def _extract_log_likelihood(self, trace):

If this is the only function with arguments that uses requires I would simply remove the requires from this function and be done with it. It is called only from log_likelihoods_to_xarray which already has the requires

@obi1kenobi
Copy link
Contributor Author

My thinking with the @overload approach is that it could be used to signal the intent behind the @requires decorator — that it's intended to work with any function — and only has the current limitations because of the underlying limitations in the Python type system. This would communicate to the reader that if/when that mypy issue is resolved in the future, they should go ahead and update the type signature of @requires, without worrying that they might be causing a bug somewhere or that @requires for some reason is only supposed to work in the 1-argument case.

We might also not want people to have to contort their code in the future if they need @requires on a multi-parameter function, and have to instead combine the parameters into one parameter to fit the @requires type signature.

If you still think that only handling only the 1-argument case is best, I'm happy to do that. Just wanted to put my 2 cents in :)

@OriolAbril
Copy link
Member

I am leaning towards removing the requires in _extract_log_likelihood (which should probably be removed whatever the final outcome) and going with original option 2, maybe commenting there is the overload option or linking here. It is not written anywhere, but the <name>_to_xarray(self) methods using requires decorators are a de facto convention of ArviZ converters. Thus, even when not constrained by requires, probably one way or another we would have searched for a way to keep this exact call signature anyways.

Base automatically changed from master to main January 26, 2021 19:44
@obi1kenobi
Copy link
Contributor Author

I've gotten side-tracked by a couple of other things so I haven't had a chance to update this PR; I hope to do so in the coming couple of weeks.

I wanted to share a recent update on the topic of decorator type-hinting that might be of interest. Python 3.10 will ship with typing.ParamSpec (PEP 612) which will resolve the problems we're running into here. Assuming ParamSpec gets backported into typing_extensions for prior Python versions, this should be the ultimate solution to the "how do I properly type-hint decorators" problem for all currently-maintained Python versions.

@OriolAbril
Copy link
Member

Sounds good! Thanks!

@obi1kenobi obi1kenobi force-pushed the type_hint_requires_decorator branch from 847d68d to 1005d2d Compare February 22, 2021 05:33
@OriolAbril OriolAbril merged commit cc89ade into arviz-devs:main Feb 22, 2021
@obi1kenobi obi1kenobi deleted the type_hint_requires_decorator branch July 4, 2021 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants