Make package PEP 561 compatible #28831

Closed

@c4f3a0ce c4f3a0ce commented Oct 7, 2019

Pandas has a growing number of type annotations, but these are not properly advertised to type-checking tools. According to PEP 561, packages that provide annotations should contain a py.typed file in the corresponding package.

py.typed applies recursively if it is placed in the package root, and it can accept missing annotations if it contains partial\n.

This PR adds such a file and includes it in package_data.
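
As a rough illustration of the change described above, the sketch below creates an empty py.typed marker and builds setup() keyword arguments that list it under package_data. The helper names (write_py_typed, typed_setup_kwargs) and the package name mypkg are hypothetical and are not taken from this PR; zip_safe=False follows the advice in mypy's documentation for setuptools-based packages.

```python
from pathlib import Path
from typing import Any, Dict


def write_py_typed(package_dir: Path) -> Path:
    """Create the empty PEP 561 marker file inside a package directory."""
    marker = package_dir / "py.typed"
    marker.touch()
    return marker


def typed_setup_kwargs(package_name: str) -> Dict[str, Any]:
    """Build setup() keyword arguments that ship py.typed with the package."""
    return {
        "name": package_name,
        "packages": [package_name],
        # Without this entry the marker file is left out of built distributions.
        "package_data": {package_name: ["py.typed"]},
        # Assumption: follows mypy's guidance for setuptools-based packages.
        "zip_safe": False,
    }


# In a real setup.py (hypothetical package name):
#     from setuptools import setup
#     setup(**typed_setup_kwargs("mypkg"))
```

Once a package built this way is installed, a PEP 561-aware checker should pick up its inline annotations instead of treating the import as untyped.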

@WillAyd
Member

WillAyd commented Oct 8, 2019

Cool thanks for submitting this - makes sense to do in the long run.

Any chance you have personal experience doing this on other projects? My main concern would be the maturity of annotations; we have them but they aren't necessarily comprehensive yet.

That may or may not be a blocker to doing this; I'm just curious whether you or anyone you know has the experience to weigh the pros and cons of putting these out there.

@simonjayhawkins simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Oct 8, 2019
@WillAyd
Member

WillAyd commented Oct 8, 2019

@simonjayhawkins might have thoughts as well

@zero323

zero323 commented Oct 8, 2019

> we have them but they aren't necessarily comprehensive yet.

I think that a bigger problem is that Pandas doesn't type check at the moment. Right now (bee17d5) there are 55 errors. Some are easy fixes (for example #28843, or some other minor fixes I am working on); others might be much harder to tackle (there is a bunch of inheritance issues, which are a real pain).

Nonetheless, I'd say that, conditional on fixing or type-ignoring the aforementioned problems, exposing (even partial) annotations is useful, as long as individual modules (not necessarily all) provide complete, even if mostly dynamic, annotations. (Please bear in mind that I am not objective, as the lack of Pandas annotations is a blocker for my own work.)
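
To make "type-ignoring" concrete, here is a minimal sketch of an immutable list subclass that replaces mutating methods with a single disabled method. The class name FrozenListSketch is hypothetical and this is not the pandas implementation; the expectation that mypy reports these assignments is an assumption based on how it checks overrides against the base class.

```python
from typing import Any


class FrozenListSketch(list):
    """Hypothetical immutable list; not the pandas implementation."""

    def _disabled(self, *args: Any, **kwargs: Any) -> Any:
        # Every mutating method is routed here and refuses to run.
        raise TypeError(
            f"{type(self).__name__} does not support mutable operations."
        )

    # The signature differs from list's own methods, so a checker such as mypy
    # is expected to report "Incompatible types in assignment" on these lines.
    # The trailing comment silences the report without changing behaviour.
    append = _disabled  # type: ignore
    extend = _disabled  # type: ignore
    pop = _disabled  # type: ignore
```

The trade-off is that an ignore comment hides the mismatch rather than resolving it, so it suits cases where the incompatibility is deliberate.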

@WillAyd
Member

WillAyd commented Oct 8, 2019

> I think that a bigger problem is that Pandas doesn't type check at the moment. Right now (bee17d5) there are 55 errors

We do type check with mypy as part of our CI - are those failures accounting for what's in our setup.cfg to ignore some third party libs without annotations?

@zero323

zero323 commented Oct 8, 2019

> > I think that a bigger problem is that Pandas doesn't type check at the moment. Right now (bee17d5) there are 55 errors
>
> We do type check with mypy as part of our CI - are those failures accounting for what's in our setup.cfg to ignore some third party libs without annotations?

I am running Mypy directly with --ignore-missing-imports --no-implicit-optional (I believe these are the ones that you use in the CI pipeline).

$ git rev-parse HEAD                                                    
39602e7d7c5b663696f5a6eca9e66e65483bc868
$ mypy -V    
mypy 0.740+dev.beec11a28fd835acb1e4ee8b2bf14d969c4e5922

The result is

pandas/io/formats/css.py:232: error: Argument 1 to "_side_expander" has incompatible type "str"; expected "CSSResolver"
pandas/io/formats/css.py:233: error: Argument 1 to "_side_expander" has incompatible type "str"; expected "CSSResolver"
pandas/io/formats/css.py:234: error: Argument 1 to "_side_expander" has incompatible type "str"; expected "CSSResolver"
pandas/io/formats/css.py:235: error: Argument 1 to "_side_expander" has incompatible type "str"; expected "CSSResolver"
pandas/io/formats/css.py:236: error: Argument 1 to "_side_expander" has incompatible type "str"; expected "CSSResolver"
pandas/core/dtypes/dtypes.py:220: error: Incompatible types in assignment (expression has type "dtype", base class "PandasExtensionDtype" defined the type as "None")
pandas/core/dtypes/dtypes.py:657: error: Incompatible types in assignment (expression has type "dtype", base class "PandasExtensionDtype" defined the type as "None")
pandas/core/dtypes/dtypes.py:818: error: Incompatible types in assignment (expression has type "dtype", base class "PandasExtensionDtype" defined the type as "None")
pandas/core/dtypes/dtypes.py:977: error: Incompatible types in assignment (expression has type "dtype", base class "PandasExtensionDtype" defined the type as "None")
pandas/core/dtypes/common.py:144: error: Function "numpy.array" is not valid as a type
pandas/core/dtypes/common.py:144: note: Perhaps you need "Callable[...]" or a callback protocol?
pandas/io/common.py:576: error: Incompatible types in assignment (expression has type "str", variable has type "bytes")
pandas/io/common.py:583: error: Incompatible return value type (got "bytes", expected "str")
pandas/core/nanops.py:312: error: Incompatible types in assignment (expression has type "Type[int64]", variable has type "dtype")
pandas/core/nanops.py:314: error: Incompatible types in assignment (expression has type "Type[float64]", variable has type "dtype")
pandas/core/nanops.py:666: error: Incompatible types in assignment (expression has type "Union[bool, Any]", variable has type "ndarray")
pandas/core/nanops.py:1121: error: Incompatible types in assignment (expression has type "int", variable has type "ndarray")
pandas/core/nanops.py:1124: error: Incompatible types in assignment (expression has type "int", variable has type "ndarray")
pandas/core/arrays/sparse/dtype.py:67: error: Incompatible default for argument "dtype" (default has type "Type[float64]", argument has type "Union[str, dtype, ExtensionDtype]")
pandas/core/construction.py:267: error: Incompatible types in assignment (expression has type "Union[Type[ExtensionDtype], str]", variable has type "Union[str, dtype, ExtensionDtype, None]")
pandas/core/ops/dispatch.py:130: error: Argument 1 to "array" has incompatible type "ndarray"; expected "Sequence[object]"
pandas/core/ops/__init__.py:153: error: "datetime64" has no attribute "astype"
pandas/core/arrays/datetimelike.py:453: error: Argument 1 of "__setitem__" is incompatible with supertype "ExtensionArray"; supertype defines the argument type as "Union[int, ndarray]"
pandas/core/arrays/datetimes.py:604: error: Return type "Union[dtype, DatetimeTZDtype]" of "dtype" incompatible with return type "ExtensionDtype" in supertype "ExtensionArray"
pandas/core/indexes/frozen.py:71: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, Any], Any]", base class "list" defined the type as "Callable[[List[Any], List[Any]], List[Any]]")
pandas/core/indexes/frozen.py:113: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as overloaded function)
pandas/core/indexes/frozen.py:113: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as "Callable[[List[Any], Union[int, slice]], None]")
pandas/core/indexes/frozen.py:114: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as "Callable[[List[Any], int], Any]")
pandas/core/indexes/frozen.py:114: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as "Callable[[List[Any], Any], None]")
pandas/core/indexes/frozen.py:114: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as "Callable[[List[Any], Iterable[Any]], None]")
pandas/core/indexes/frozen.py:114: error: Incompatible types in assignment (expression has type "Callable[[FrozenList, VarArg(Any), KwArg(Any)], Any]", base class "list" defined the type as "Callable[[List[Any], DefaultNamedArg(Optional[Callable[[Any], Any]], 'key'), DefaultNamedArg(bool, 'reverse')], None]")
pandas/core/indexes/frozen.py:138: error: Incompatible types in assignment (expression has type "Callable[[FrozenNDArray, VarArg(Any), KwArg(Any)], Any]", base class "ndarray" defined the type as overloaded function)
pandas/core/arrays/categorical.py:521: error: Incompatible return value type (got "Categorical", expected "ndarray")
pandas/core/arrays/categorical.py:522: error: Incompatible return value type (got "Categorical", expected "ndarray")
pandas/core/arrays/categorical.py:528: error: Incompatible return value type (got "ndarray", expected "ExtensionArray")
pandas/core/arrays/categorical.py:528: error: Argument "dtype" to "array" has incompatible type "Union[str, dtype, ExtensionDtype]"; expected "Union[dtype, None, type, str, Tuple[Any, int], Tuple[Any, Union[int, Sequence[int]]], List[Union[Tuple[Union[str, Tuple[str, str]], Any], Tuple[Union[str, Tuple[str, str]], Any, Union[int, Sequence[int]]]]], Dict[str, Union[Sequence[str], Sequence[Any], Sequence[int], Sequence[Union[bytes, str, None]], int]], Dict[str, Tuple[Any, int]], Tuple[Any, Any]]"
pandas/core/indexes/interval.py:1367: error: Argument 1 to "_setop" has incompatible type "str"; expected "IntervalIndex"
pandas/core/indexes/interval.py:1368: error: Argument 1 to "_setop" has incompatible type "str"; expected "IntervalIndex"
pandas/core/indexes/interval.py:1369: error: Argument 1 to "_setop" has incompatible type "str"; expected "IntervalIndex"
pandas/io/formats/format.py:871: error: Argument 2 to "_binify" has incompatible type "Optional[int]"; expected "Union[int32, int]"
pandas/io/formats/format.py:1523: error: Unsupported operand types for >= ("List[Union[int, float]]" and "int")
pandas/io/formats/format.py:1523: error: Unsupported operand types for >= ("List[float]" and "int")
pandas/io/formats/format.py:1523: error: Unsupported operand types for >= ("List[Union[str, float]]" and "int")
pandas/io/formats/format.py:1523: note: Left operand is of type "Union[ndarray, List[Union[int, float]], List[float], List[Union[str, float]]]"
pandas/io/formats/format.py:1524: error: Unsupported operand types for <= ("List[Union[int, float]]" and "int")
pandas/io/formats/format.py:1524: error: Unsupported operand types for <= ("List[float]" and "int")
pandas/io/formats/format.py:1524: error: Unsupported operand types for <= ("List[Union[str, float]]" and "int")
pandas/io/formats/format.py:1524: note: Left operand is of type "Union[ndarray, List[Union[int, float]], List[float], List[Union[str, float]]]"
pandas/io/formats/format.py:1529: error: Item "List[Union[int, float]]" of "Union[Any, List[Union[int, float]], List[Union[str, float]]]" has no attribute "astype"
pandas/io/formats/format.py:1529: error: Item "List[Union[str, float]]" of "Union[Any, List[Union[int, float]], List[Union[str, float]]]" has no attribute "astype"
pandas/io/formats/format.py:1532: error: Item "List[Union[int, float]]" of "Union[Any, List[Union[int, float]], List[Union[str, float]]]" has no attribute "astype"
pandas/io/formats/format.py:1532: error: Item "List[Union[str, float]]" of "Union[Any, List[Union[int, float]], List[Union[str, float]]]" has no attribute "astype"
pandas/core/series.py:803: error: "Type[ndarray]" has no attribute "__array_ufunc__"
pandas/core/frame.py:472: error: Argument "dtype" to "array" has incompatible type "Union[str, dtype, ExtensionDtype, None]"; expected "Union[dtype, None, type, str, Tuple[Any, int], Tuple[Any, Union[int, Sequence[int]]], List[Union[Tuple[Union[str, Tuple[str, str]], Any], Tuple[Union[str, Tuple[str, str]], Any, Union[int, Sequence[int]]]]], Dict[str, Union[Sequence[str], Sequence[Any], Sequence[int], Sequence[Union[bytes, str, None]], int]], Dict[str, Tuple[Any, int]], Tuple[Any, Any]]"
pandas/core/frame.py:482: error: Argument 1 to "len" has incompatible type "Iterable[Any]"; expected "Sized"
pandas/core/groupby/groupby.py:1889: error: Incompatible types in assignment (expression has type "str", variable has type "Optional[Type[int64]]")
pandas/compat/pickle_compat.py:71: error: Incompatible return type for "__new__" (returns "Series", but must return a subtype of "_LoadSparseSeries")
pandas/compat/pickle_compat.py:85: error: Incompatible return type for "__new__" (returns "DataFrame", but must return a subtype of "_LoadSparseFrame")
pandas/io/excel/_odfreader.py:63: error: Cannot find replacement for named format specifier "name"
pandas/io/excel/_odfreader.py:63: error: Not all arguments converted during string formatting
pandas/core/window/rolling.py:243: error: Item "None" of "Union[ndarray, Any, None]" has no attribute "dtype"
pandas/core/window/rolling.py:250: error: Incompatible return value type (got "Optional[ndarray]", expected "ndarray")
pandas/core/window/rolling.py:448: error: Argument 3 to "_get_roll_func" of "_Window" has incompatible type "Optional[ndarray]"; expected "ndarray"
pandas/core/window/rolling.py:788: error: Missing return statement
pandas/core/window/rolling.py:788: error: Return type "ndarray" of "_get_window" incompatible with return type "int" in supertype "_Window"
Found 62 errors in 21 files (checked 805 source files)

(The difference in numbers is due to things I fixed locally, sorry about that.)

As far as I can tell, none of these is directly caused by missing annotations. Some are just incompatible annotations (created by MonkeyType?); others are small correctness issues or the already mentioned inheritance problems. I'm not sure about the rest yet, as I have just started reviewing these.
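
The inheritance problems in the log (a base class that "defined the type as None") can be reproduced in miniature. The sketch below uses hypothetical stand-in classes rather than the real pandas or numpy types, and the comments describe what I would expect mypy to report, not verified output.

```python
from typing import Optional


class DTypeLike:
    """Stand-in for numpy.dtype so the sketch has no third-party imports."""

    def __init__(self, name: str) -> None:
        self.name = name


class BaseDtypeBefore:
    # Unannotated: a checker infers the attribute's type as None.
    base = None


class FlaggedSubclass(BaseDtypeBefore):
    # Runs fine, but mypy is expected to report an incompatible assignment
    # here because the base class declared the attribute as None.
    base = DTypeLike("M8[ns]")


class BaseDtypeAfter:
    # Annotating the base class widens the declared type for subclasses.
    base: Optional[DTypeLike] = None


class AcceptedSubclass(BaseDtypeAfter):
    # DTypeLike is compatible with Optional[DTypeLike], so this should pass.
    base = DTypeLike("O")
```

Fixing the declaration in the base class addresses every subclass at once, which is usually preferable to ignoring each flagged line individually.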

@simonjayhawkins
Member

@c4f3a0ce if you merge master the Linux py36_32bit build should pass

@simonjayhawkins
Member

@WillAyd

The mypy errors reported above are not a blocker for this PR.

Is this something we want to do at this early stage? See #28843 (comment) onwards.

@WillAyd
Member

WillAyd commented Oct 8, 2019

I could be convinced otherwise, but I think this is premature. I'd rather annotate the entire API internally first, before releasing it to the public.

@jreback
Contributor

jreback commented Oct 8, 2019

I agree; we are not ready to release typing, even for external dev use.
Maybe the next release will be.

@simonjayhawkins
Member

@c4f3a0ce Thanks for looking into this.

This effort is definitely not wasted as it provides a good basis for further discussion in #28142

but closing for now.

Labels
Typing type annotations, mypy/pyright type checking
Development

Successfully merging this pull request may close these issues.

Typing Stubs and PEP 561 compatibility
5 participants