Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of NumPy 1.21 style npt.NDArray typing hints #3748

Closed
hmacdope opened this issue Jul 6, 2022 · 8 comments
Closed

Use of NumPy 1.21 style npt.NDArray typing hints #3748

hmacdope opened this issue Jul 6, 2022 · 8 comments
Labels

Comments

@hmacdope
Copy link
Member

hmacdope commented Jul 6, 2022

Problem

With the merge of #3737 NumPy 1.20 is now our minimum supported NumPy version as per NEP29.
However some open PRs (eg @hmacdope #3730 and @umak1106 #3746) are using the NumPy 1.21 style typing:

import numpy.typing as npt
def blah(arr: npt.NDArray)

This is not going to play nice with our CI or advertised supported versions so we will have to decide how to proceed.

See the release notes for NumPy 1.21 to see what npt.NDArray actually is, but long and the short of it is that it is an alias for

np.ndarray[Any, np.dtype[~Scalar]]

from looking at the NEP29 table the important dates are:

Jun 21, 2022 3.8+ 1.20+
Jan 31, 2023 3.8+ 1.21+
-- -- --

Some possible ideas to consider:

  • Roll back some of the typing to the plain old arr: np.ndarray
  • Embed the alias ourselves until we have NumPy 1.21 (bit hacky)
  • Make typing a future branch and postpone merging it until we have NumPy 1.21 underneath
  • postpone a release until we have NumPy 1.21 underneath (6 Months)
  • Break NEP29
  • Anything else?

Tagging @MDAnalysis/coredevs for input? This is probably also relevant to all GSOC and Outreachy people @MDAnalysis/gsoc-mentors (particularly for @umak1106's project).

@hmacdope hmacdope changed the title Use of NumPY 1.21 style NDArray typing hints Use of NumPy 1.21 style NDArray typing hints Jul 6, 2022
@hmacdope hmacdope changed the title Use of NumPy 1.21 style NDArray typing hints Use of NumPy 1.21 style npt.NDArray typing hints Jul 6, 2022
@tylerjereddy
Copy link
Member

For one data point, I grepped around the SciPy codebase a bit and found that we indeed don't use NDArray in any .py files, opting for np.ndarray as you note above, but we do use the fancy new stuff in the static type stubs like .pyi files. The latter must be safe then, since it would only be used by mypy and you can just require a newer NumPy for the static checks that leverage the stub files in CI.

I guess it depends what mentors think, but maybe we could adopt a similar compromise for the next 6 months or so.

@tylerjereddy
Copy link
Member

I think it is common to use the stub files for things like type hinting Cython modules. That said, I believe it is also possible to use the static stub files alongside regular Python modules, though I've personally almost never seen the latter. A big exception is the Python standard library I believe--it is entirely typed by a separate stubbing project if I'm reading correctly. I suppose if we really want we could do that for regular Python modules as well, though that seems debatably quite complex vs. using inline typing for conventional Python modules.

So if it were a vote I'd probably favor using the slightly older syntax inline for now rather than producing a bunch of stub files for plain Python modules, but perhaps good to be clear that both may be options IIUC.

@tylerjereddy
Copy link
Member

Ok, I stand corrected--pandas just merged the various third-party type stubbing projects and created their own official separate type stubbing library: https://github.com/pandas-dev/pandas-stubs

It cites this part of PEP561: https://peps.python.org/pep-0561/#stub-only-packages

In short:

For package maintainers wishing to ship stub files containing all of their type information, it is preferred that the *.pyi stubs are alongside the corresponding *.py files. However, the stubs can also be put in a separate package and distributed separately.

Anyway, lots of interesting options/flexibility there.

@jbarnoud
Copy link
Contributor

jbarnoud commented Jul 7, 2022

As long as npt.NDArray does not provide ways of specifying shapes, I would just roll back to np.ndarray.

I did consider stub files but I am concerned new contributors will not realise they're there and older ones will occasionally forget about them. Already the documentation is not always up to date.

@IAlibay
Copy link
Member

IAlibay commented Jul 7, 2022

So my 2 cent here is that we'll be dropping 1.20 for 1.21 in exactly 6 months. As much as I want to do 3 monthly releases, we could wait the 6 month cycle.

Otherwise sticking to np.ndarray is fine, but we should accompany those entries with some kind of comment tag. That way we can quickly find & replace them all once we move to 1.21?

It's a bit cumbersome but maybe something like # typing: numpy comment before the method/class def?

@hmacdope
Copy link
Member Author

I'm happy with using arr: np.ndarray for the meantime with the tag @IAlibay proposes. I will follow this approach in #3730

@Shankhadeep2000
Copy link

Hi I am a beginner in open source can I contribute in this project. I got the skills of python, data science, machine learning.
please help me with this..

@orbeckst
Copy link
Member

@Shankhadeep2000 welcome to MDAnalysis. This particular issue is effectively part of an ongoing Outreachy project so I don’t think we’re looking for any additional help with type hinting. But there’s are many other issues that need help, as you can see when looking through the issue tracker. I suggest you join the developer list and our discord (see https://www.mdanalysis.org/#participating ) and start a discussion there.

Please also see the contributor guide https://userguide.mdanalysis.org/stable/contributing.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants