Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: Arraylike #31574

Closed
wants to merge 24 commits into from
Closed

Conversation

simonjayhawkins
Copy link
Member

@simonjayhawkins simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Feb 2, 2020
@jbrockmendel
Copy link
Member

makes sense from the part i understand (i.e. not Protocol)

Does it matter that MultiIndex doesnt have ._data? (that would be inaccurate in the status quo, not unique to this PR)

@@ -584,7 +585,7 @@ def _shallow_copy(self, obj, **kwargs):
return self._constructor(obj, **kwargs)


class IndexOpsMixin:
class IndexOpsMixin(Generic[ArrayLike]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I wouldn't really consider a Mixin to be a Generic class - have you seen this pattern used elsewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I wouldn't really consider a Mixin to be a Generic class

see errors in OP of #31518

pandas/core/indexes/base.py:246: error: Type variable "pandas._typing.ArrayLike" is unbound
pandas/core/indexes/base.py:246: note: (Hint: Use "Generic[ArrayLike]" or "Protocol[ArrayLike]" base class to bind "ArrayLike" inside a class)
pandas/core/indexes/base.py:246: note: (Hint: Use "ArrayLike" in function signature to bind "ArrayLike" inside a function)

The goal here is to use the ArrayLike alias from pandas._typing to replace uses of Union[ExtensionArray, np.ndarray]. ArrayLike is a typevar and hence mypy reports errors if the typevar is not bound.

In Index we want the return type of _values to be the same type as self._data. The way to bind typevars within a class is to use Generic.

More value would come from having the correct type available in calling code instead of a Union type to reduce casts, asserts and type ignores to perform narrowing.

I'll mark this as draft till this is needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. So my question was more around the general approach of dealing with Mixins here. The mypy suggested approach I think is using Protocol:

https://mypy.readthedocs.io/en/stable/more_types.html#mixin-classes

So just curious if you found this approach elsewhere in the docs or are using another project as a precedent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mypy suggested approach I think is using Protocol:

https://mypy.readthedocs.io/en/stable/more_types.html#mixin-classes

from those docs

one can define a protocol that defines common functionality for host classes instead of adding required abstract methods to every mixin

At the moment _values has already been added (not in this PR) as an abstract method on L680 so changing to Protocol is orthogonal to this PR.

PEP 544 states

Programmers are free to not use them even if they use type annotations.
There is no intent to make protocols non-optional in the future.

we can have the discussion regarding the use of protocol and the migration path in #33274

This PR is about binding Typvars.

In common with the Generic abc, Protocol is generic and would still need the typevar in the (Protocol) definition class...
(example from PEP 544)

class Box(Protocol[T_co]):
def content(self) -> T_co:
...

so if we adopt Protocols in the future we would use SomeClassWithJustTheAbstractMethods[Protocol[ArrayLike]] instead of IndexOpsMixin(Generic[ArrayLike]) with the abstract methods included.

So my question was more around the general approach of dealing with Mixins here

hopefully answered as not relevant.

So just curious if you found this approach elsewhere in the docs or are using another project as a precedent

following advice from mypy pandas/core/indexes/base.py:246: note: (Hint: Use "Generic[ArrayLike]" or "Protocol[ArrayLike]" base class to bind "ArrayLike" inside a class)

@simonjayhawkins simonjayhawkins marked this pull request as draft April 11, 2020 09:02
@simonjayhawkins
Copy link
Member Author

Does it matter that MultiIndex doesnt have ._data? (that would be inaccurate in the status quo, not unique to this PR)

something we need to watch out for.

@simonjayhawkins
Copy link
Member Author

simonjayhawkins commented Apr 11, 2020

There are many occurrences of isinstance(..., MultiIndex) that would presumably be unnecessary if MutliIndex obeyed Liskov but that's another issue.

Comment on lines +1145 to +1148

def map_f(values, f):
return values.map(f)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is changed to silence pandas/core/base.py:1151: note: Internal mypy error checking function redefinition

@@ -321,7 +321,9 @@ def test_constructor_mixed(self):
def test_constructor_simple_new(self):
idx = period_range("2007-01", name="p", periods=2, freq="M")

with pytest.raises(AssertionError, match="<class .*PeriodIndex'>"):
with pytest.raises(
TypeError, match="_simple_new expects PeriodArray, got PeriodIndex"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: i prefer msg = "_simple_new ..." on the previous line so the with pytest.raises ` can be a 1-liner. not a big deal

if not isinstance(values, PeriodArray):
raise TypeError(
f"_simple_new expects PeriodArray, got {type(values).__name__}"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assertion is correct here since this is internal and annotated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assertion is correct here since this is internal and annotated

we have tests to test these.

@jbrockmendel
Copy link
Member

At least some of the test failures look related to the exception mesages

@simonjayhawkins
Copy link
Member Author

At least some of the test failures look related to the exception mesages

don't really want to be changing these.

from https://docs.python.org/3/library/typing.html

Changed in version 3.7: Generic no longer has a custom metaclass.

when using Generic, we get test failures on the output of type() when using assert, e.g. from assert isinstance(values, PeriodArray), type(values)

The test failures are only 3.6, so assume it is due to the metaclass.

from https://www.python.org/dev/peps/pep-0484/#instantiating-generic-classes-and-type-erasure

Outside the class definition body, a class attribute cannot be assigned, and can only be looked up by accessing it through a class instance that does not have an instance attribute with the same name:

the errors we are getting are presumably from the way assert gets the repr from a class attribute in python 3.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants