Use Python's typing instead of attaching `Type` objects to `Variable`s #134

brandonwillard · 2020-10-28T17:10:52Z

Instead of creating "type" classes (i.e. Types) and manually attaching them to Aesara objects (i.e. via Variable.type), we should actual use Python's built-in type/class inheritance. I'm not immediately aware of any required functionality in the current Aesara type system that wouldn't allow this.

Here's an illustration of the current situation:

import aesara.tensor as at


x = at.vector()

We created a simple Aesara vector, x; now, let's inspect its (Python) types:

>>> type(x).mro()

[aesara.tensor.var.TensorVariable,
 aesara.tensor.var._tensor_py_operators,
 aesara.gof.graph.Variable,
 aesara.gof.graph.Node,
 aesara.gof.utils.object2,
 object]

Oddly, there's also a separate Variable.type field/member/property introduced by inheriting from Variable:

>>> x.type

TensorType(float64, vector)

>>> x.type.Variable

aesara.tensor.var.TensorVariable

>>> x.type.Constant

aesara.tensor.var.TensorConstant

As we can see, this extra Type object holds information about the variable's dtype and broadcastable dimensions (i.e. the vector means x.type.broadcastable equals [False]/[True]). From the latter two properties, it's also clear that the actual Python type (i.e. TensorVariable) is directly associated with the Aesara type (i.e. TensorType) as illustrated by the value of TensorType.Variable.

This leads to the question: why doesn't TensorVariable just inherit from its x.type type/class (i.e. some TensorType)?

If anything, the mechanics behind such inheritance might seem non-standard, as it could require some form of automatic type generation for sub-classes like TensorVariable, but—even so—that's pretty straightforward to do in a __new__ class method (e.g. we already do that in symbolic-pymc).

Regarding actual code changes, it seems like we would at least need to change all the Variable.type invocations with the standard type(...) calls, and/or more direct calls to isinstance(x, TensorType) instead of things like x.type == TensorType(...).

We could also keep Variable.type and have it return type(self) just to make the initial transition easy.

The text was updated successfully, but these errors were encountered:

brandonwillard · 2021-03-12T19:07:24Z

@LegrandNico, since you're becoming Aesara's type expert, I would like to introduce you to this potentially revolutionary refactoring topic...

brandonwillard · 2022-04-07T20:17:54Z

The issue fixed in #892 could possibly have been caught by Mypy's type checking if the approach described here was in place.

In that example, the problematic at.minimum(x, y)-like expression could've been noticed by Mypy, since at.minimum could be declared to have a (FloatLikeType, FloatLikeType) inputs signature, which would've failed when given the incompatible y: BoolLikeType that it was being given.

brandonwillard · 2022-04-07T23:17:34Z

It looks like we would need to do something exactly like NumPy is currently doing with its ndarray typing: i.e. use typing.Generic parameterized on dtype and shape information.

Here's a simple example:

import numpy as np
import typing
import typing_inspect


DataType = typing.TypeVar("DataType")


class TensorType(typing.Generic[DataType]):

    @classmethod
    @property
    def dtype(cls) -> DataType:
        generic_bases = typing_inspect.get_generic_bases(cls)
        type_base = next((b for b in generic_bases if typing_inspect.get_origin(b) == TensorType))
        dtype = typing_inspect.get_args(type_base)[0]
        return dtype


Float32TensorType = TensorType[np.float32]


assert Float32TensorType == TensorType[np.float32]


class Float32Variable(Float32TensorType):
    pass


scalar_float_inst = Float32Variable()

assert scalar_float_inst.dtype is np.float32

assert isinstance(scalar_float_inst, TensorType)
assert isinstance(scalar_float_inst, Float32Variable)

Unfortunately, checks based on the parameterized TensorTypes won't work, but it might not matter given that checks like the preceding Float32Variable one work:

# Fails
assert isinstance(scalar_float_inst, Float32TensorType)

dhirschfeld · 2022-04-08T01:05:41Z

There might possibly be some useful typing examples in the array-api project:

https://github.com/data-apis/array-api/tree/main/spec/API_specification/signatures

brandonwillard · 2022-04-08T19:25:39Z

To clarify, adding this kind of parameterized typing will allow type checking (i.e. Mypy) to catch issues involving incompatible dtypes and (static) shapes. Currently, these kinds of issues require explicit unit tests to catch, and that's quite a burden on our testing requirements.

Also, it should be possible to add this on top of the existing codebase without making the changes implied by this issue.

In other words, we don't need to make the change from using TensorType instances (i.e. where TensorType is a standard class and we create instance of that class for each specific dtype and static shape/ndim) to explicit TensorType types (i.e. where TensorType is a "base" type and "instances" are new type-compatible objects specific to each dtype and static shape/ndim).

markusschmaus · 2022-09-22T13:18:28Z

Based on @brandonwillard's code example above I developed a more complete example of how a TensorType based on Python's typing system could look like.

https://github.com/markusschmaus/Documents/blob/main/aesara_typing_demo.py

One key difference between his code and mine, is that @brandonwillard used inspection to extract the dtype information from the GenericAlias. Since python's typing system is still rapidly evolving, runtime behavior of GenericAlias has changed in the past (isinstance no longer works with GenericAlias), and typing_inspection is still marked experimental, I decided to use a different approach. Since, as he suggested, we need to dynamically generate sub-classes, we can store the relevant information in these sub-classes when they are created in a way which can be easily retrieved later.

The demo passes mypy without errors and contains a few asserts which also succeed.

I tried to get TensorType((3, 7), np.float64) to work without mypy errors, but I believe this is currently impossible using __call__ (see python/mypy#6721 (comment)).

Based on this demo I believe that implementing this suggestion is possible and I would be interested in working on this. However this would be a major refactoring and only makes sense if there is support from existing aesara developers.

One aspect I would like to explore, when working on this, is the role of Variables as part of the shape specification and how they interact with broadcasting, and how they could be leveraged for aesara to support dimensions (see #1013).

brandonwillard · 2022-09-22T16:38:05Z

One key difference between his code and mine, is that @brandonwillard used inspection to extract the dtype information from the GenericAlias. Since python's typing system is still rapidly evolving, runtime behavior of GenericAlias has changed in the past (isinstance no longer works with GenericAlias), and typing_inspection is still marked experimental, I decided to use a different approach. Since, as he suggested, we need to dynamically generate sub-classes, we can store the relevant information in these sub-classes when they are created in a way which can be easily retrieved later.

Yes, that example was completely experimental; we definitely don't need to take an approach like that. As long as what we're currently calling Variables become instances of Types, and we can adequately track "static"/type-level information like dtype and (partial) shapes using those new Types, we're much better off.

Based on this demo I believe that implementing this suggestion is possible and I would be interested in working on this. However this would be a major refactoring and only makes sense if there is support from existing aesara developers.

I'm ready to make these changes sooner than later, so, if you want to work on this, we can start now. A draft PR is the next step.

One aspect I would like to explore, when working on this, is the role of Variables as part of the shape specification and how they interact with broadcasting, and how they could be leveraged for aesara to support dimensions (see #1013).

Be sure to read the comments here: #1013 (reply in thread). In other words, we shouldn't need anything at the type-level to support most kinds of dimension tracking. Regardless, these changes would help any related efforts a lot.

brandonwillard · 2022-09-22T21:31:45Z

Before I forget, here's some related (and interesting) work on Python typing for shapes: https://peps.python.org/pep-0646/

brandonwillard · 2022-09-22T21:37:39Z

Come to think of it, we could start using TypeVarTuple and Unpack from typing_extensions right now.

markusschmaus · 2022-09-23T12:39:42Z

Here is the PR
#1207

brandonwillard added important refactor This issue involves refactoring help wanted Extra attention is needed labels Oct 28, 2020

brandonwillard changed the title ~~Use actual types for Theano types~~ Inherit from Theano types Oct 28, 2020

michaelosthege added this to the 2.0 milestone Dec 15, 2020

brandonwillard removed this from the 2.0 milestone Feb 2, 2021

brandonwillard added the question Further information is requested label Feb 2, 2021

brandonwillard mentioned this issue Feb 26, 2021

Add mypy to pre-commit hooks AmpersandTV/pymc3-hmm#61

Merged

brandonwillard changed the title ~~Inherit from Theano types~~ Use actual Python types instead of separate Type classes Mar 12, 2021

brandonwillard changed the title ~~Use actual Python types instead of separate Type classes~~ Use actual Python types instead of attaching Type objects to Variables Mar 12, 2021

brandonwillard changed the title ~~Use actual Python types instead of attaching Type objects to Variables~~ Use Python's inheritance instead of attaching Type objects to Variables Mar 12, 2021

brandonwillard changed the title ~~Use Python's inheritance instead of attaching Type objects to Variables~~ Use Python's OOP typing instead of attaching Type objects to Variables Mar 12, 2021

LegrandNico mentioned this issue Mar 26, 2021

Add type hints and apply mypy fix for function attributes where needed #350

Closed

brandonwillard mentioned this issue May 21, 2021

Create a fixed-length/shape TensorType and TensorVariable #431

Closed

brandonwillard changed the title ~~Use Python's OOP typing instead of attaching Type objects to Variables~~ Use Python's OOP typing instead of attaching Type objects to Variables Dec 31, 2021

brandonwillard added the graph objects label Jan 7, 2022

brandonwillard changed the title ~~Use Python's OOP typing instead of attaching Type objects to Variables~~ Use Python types instead of attaching Type objects to Variables Apr 7, 2022

brandonwillard changed the title ~~Use Python types instead of attaching Type objects to Variables~~ Use Python's typing instead of attaching Type objects to Variables Apr 7, 2022

brandonwillard added the typing Work related to type checking label Apr 7, 2022

This comment was marked as off-topic.

Sign in to view

markusschmaus linked a pull request Sep 23, 2022 that will close this issue

Unify Type with Python's type system #1207

Draft

10 tasks

brandonwillard linked a pull request Sep 23, 2022 that will close this issue

Unify Type with Python's type system #1207

Draft

10 tasks

This comment was marked as off-topic.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use Python's typing instead of attaching `Type` objects to `Variable`s #134

Use Python's typing instead of attaching `Type` objects to `Variable`s #134

brandonwillard commented Oct 28, 2020 •

edited

Loading

brandonwillard commented Mar 12, 2021

brandonwillard commented Apr 7, 2022 •

edited

Loading

brandonwillard commented Apr 7, 2022 •

edited

Loading

dhirschfeld commented Apr 8, 2022

brandonwillard commented Apr 8, 2022 •

edited

Loading

markusschmaus commented Sep 22, 2022

brandonwillard commented Sep 22, 2022 •

edited

Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

brandonwillard commented Sep 22, 2022

brandonwillard commented Sep 22, 2022

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

markusschmaus commented Sep 23, 2022

This comment was marked as off-topic.

This comment was marked as off-topic.

Use Python's typing instead of attaching Type objects to Variables #134

Use Python's typing instead of attaching Type objects to Variables #134

Comments

brandonwillard commented Oct 28, 2020 • edited Loading

brandonwillard commented Mar 12, 2021

brandonwillard commented Apr 7, 2022 • edited Loading

brandonwillard commented Apr 7, 2022 • edited Loading

dhirschfeld commented Apr 8, 2022

brandonwillard commented Apr 8, 2022 • edited Loading

markusschmaus commented Sep 22, 2022

brandonwillard commented Sep 22, 2022 • edited Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

brandonwillard commented Sep 22, 2022

brandonwillard commented Sep 22, 2022

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

markusschmaus commented Sep 23, 2022

This comment was marked as off-topic.

This comment was marked as off-topic.

Use Python's typing instead of attaching `Type` objects to `Variable`s #134

Use Python's typing instead of attaching `Type` objects to `Variable`s #134

brandonwillard commented Oct 28, 2020 •

edited

Loading

brandonwillard commented Apr 7, 2022 •

edited

Loading

brandonwillard commented Apr 7, 2022 •

edited

Loading

brandonwillard commented Apr 8, 2022 •

edited

Loading

brandonwillard commented Sep 22, 2022 •

edited

Loading