Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-commit hook for running numpydoc validation #454

Merged
merged 35 commits into from
Jun 29, 2023

Conversation

stefmolin
Copy link
Contributor

@stefmolin stefmolin commented Mar 4, 2023

Closes #450

I've included information on setting up the pre-commit hook in the validation page in the docs. If you want to test on a repo of your own before this is merged, you can add the following to your .pre-commit-config.yaml file:

  - repo: https://github.com/stefmolin/numpydoc
    rev: <latest commit hash>
    hooks:
      - id: numpydoc-validation

The hook can also be run on the command line after you reinstall the package:

$ python -m numpydoc.hooks.validate_docstrings --help

Specific checks can be ignored with inline comments (# numpydoc ignore=GL08), or globally ignored by passing them in to the hook via a command line argument or a configuration file (pyproject.toml or setup.cfg). I've added options for regex to determine overrides for GL08 and SS05, which frequently come up in development; the former for inherited docstrings or dunder methods, and the latter for valid starting words like "Process" or "Access."

Is there any interest in running the hook on further development of numpydoc? I can set that up here as well, if desired.

@akshara08
Copy link

Any update on this? Would be cool to get be able to use it!

@stefmolin
Copy link
Contributor Author

@akshara08 - could you help test this?

Follow the instructions here and add the following to your .pre-commit-config.yaml file:

  - repo: https://github.com/stefmolin/numpydoc
    rev: 6a1a606
    hooks:
      - id: numpydoc-validation

Here's an example of that within a separate repository.

@HealthyPear
Copy link

What is the status of this PR?

I was looking for exactly this functionality!

@stefmolin
Copy link
Contributor Author

@HealthyPear - We are looking for someone to test it on their repo. Are you able to test?

@HealthyPear
Copy link

@HealthyPear - We are looking for someone to test it on their repo. Are you able to test?

Sure! Though from the instructions it's not clear to me if I should install numpydoc from here checking out this PR or from your fork.

@HealthyPear
Copy link

To start I installed numpydoc from here checking out this PR and added the pre-commit configuration as explained in the instructions.

I get this error when trying to committing (and failing),

An error has occurred: InvalidManifestError:
=====> /Users/michele/.cache/pre-commit/repogsdv64xc/.pre-commit-hooks.yaml is not a file
Check the log at /Users/michele/.cache/pre-commit/pre-commit.log

This is the log file
pre-commit.log

@HealthyPear
Copy link

Never mind I was using a wrong commit hash, it seems to work!

Though I think the documentation needs to be clearer on the pyproject.toml possible configurations with more examples and an explanation of the syntax.

@HealthyPear
Copy link

Also, you should probably release a new minor version after this, it's worth having it installed after an update without going for the development version.

@stefmolin
Copy link
Contributor Author

@HealthyPear – thanks for testing. I'm not a maintainer of this package, but @jarrodmillman and @larsoner are. This has been added to the 1.6.0 milestone – rest assured that just as these docs won't be published until that release, this functionality won't be advertised until it is released.

@jarrodmillman jarrodmillman requested a review from rossbar June 21, 2023 06:42
Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @stefmolin , generally speaking this looks great! I'm no pre-commit expert, so I'm inclined to lean on your expertise w.r.t. the implementation and packaging of the hook, assuming you'll be able to help maintain it :)

The only two (minor) concerns I have at this stage are:

  1. The added dependencies, and
  2. The configuration differences between pre-commit and sphinx.

I just want to get your thoughts on these two points. Re: 1. - first and foremost numpydoc is a sphinx extension, and I just want to be certain that adding more dependencies is not going to interfere with ppl's doc builds. I'm not familiar with tabulate nor tomli - they seem to be supported (though tomli hasn't had a release in over a year). If these are well-known, widely-used, Pure-python packages then I'm not too concerned. Even better would be if there were reasonable alternatives in the standard library (e.g. pprint? I know toml parsing was added in 3.11, so that's less of a concern)

Item 2 is mostly about reducing cognitive load on users - if someone wants to switch from running validation during sphinx build -> using a pre-commit hook (or vice-versa, or both) it'd be a huge benefit if the configuration options supported by both the sphinx-validation and the pre-commit validation were consistent. The sphinx configuration supports adding & excluding checks, but does not currently support e.g. the overrides. It'd be great if it did! One possibility would be to break the overriding functionality into a separate PR and add support on both the pre-commit and sphinx sides. However, if that's a ton of work then I don't think it's worth the undue burden. WDYT?

Copy link
Contributor

@stefanv stefanv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is a fantastic improvement, and am in favor of merging it as-is. I'm not concerned about the added pure-Python dependency (tomli dependency is not required for Python >= 3.11).

I agree with @rossbar that it would be ideal to reuse the rules definitions in Sphinx. @stefmolin do you have any interest on bringing that piece in with another PR (I don't think we should hold up this PR for that reason alone)? It's one of the long-outstanding parts of the numpydoc roadmap, and I'd love to see it done.

Also, I think now having two CLIs is not ideal: we could reasonably unify them both into python -m numpydoc, and do a better job at communicating to the user what those do. It would also be convenient to have numpydoc be installed as a user script, so it can be invoked directly.


.. code-block:: bash

$ python -m numpydoc.hooks.validate_docstrings --help
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now, but longer term I'm wondering if we shouldn't consolidate everything under python -m numpydoc, which currently has questionable semantics. E.g., to a new user this must be clear as mud:

$ python -m numpydoc --help
usage: __main__.py [-h] [-c CONFIG] [--validate] import_path

Implementing `python -m numpydoc` functionality.

positional arguments:
  import_path           e.g. numpy.ndarray

options:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        key=val where val will be parsed by literal_eval, e.g. -c use_plots=True. Multiple -c can be used.
  --validate            validate the object and report errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I'm happy to work on this, but I think we first need to discuss whether we are comfortable changing this interface (for example, if we had subparsers to split up the current functionality and the new hook). Probably best to save this discussion for an issue dedicated to the topic though 😊

``pyproject.toml``::

[tool.numpydoc_validation]
ignore = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loving this!

@HealthyPear
Copy link

@jarrodmillman @larsoner this functionality will be very useful in the future and I think it's a notable step forward for this project.

Do you plan a minor release after merging this PR?

@jarrodmillman
Copy link
Member

jarrodmillman commented Jun 28, 2023

I will merge this and make the 1.6.0 release by early next week:
https://github.com/numpy/numpydoc/milestone/13

Before merging, I will wait a bit longer to see if @stefmolin has time to respond or comment on any of the above comments. But it looks like this is good to go as is and can be improved in subsequent PRs.

@stefmolin
Copy link
Contributor Author

In response to comments from @rossbar:

Thanks for this @stefmolin , generally speaking this looks great! I'm no pre-commit expert, so I'm inclined to lean on your expertise w.r.t. the implementation and packaging of the hook, assuming you'll be able to help maintain it :)

Absolutely!

The only two (minor) concerns I have at this stage are:

  1. The added dependencies, and

Using tomli for Python < 3.11 is the standard, since a version of tomli was added to the standard library in Python 3.11 as tomllib. From the tomli README:

A version of Tomli, the tomllib module, was added to the standard library in Python 3.11 via PEP 680. Tomli continues to provide a backport on PyPI for Python versions where the standard library module is not available and that have not yet reached their end-of-life.

While I have not tried using the standard library to print out the findings in tabular form, I imagine it would be reinventing the wheel. tabulate is a pure-Python package. It is also used by pandas for formatting into Markdown (see here and here).

  1. The configuration differences between pre-commit and sphinx.

I'm happy to look into adding the ability to override checks in the sphinx configuration in a separate PR. I do agree it would be a good idea, but I would have to look into it to see how much work it would be.

@stefmolin
Copy link
Contributor Author

In response to @stefanv

I agree with @rossbar that it would be ideal to reuse the rules definitions in Sphinx. @stefmolin do you have any interest on bringing that piece in with another PR (I don't think we should hold up this PR for that reason alone)? It's one of the long-outstanding parts of the numpydoc roadmap, and I'd love to see it done.

Yes, I'm more than happy to work on this 😊 If you have an open issue, feel free to assign it to me.

Also, I think now having two CLIs is not ideal: we could reasonably unify them both into python -m numpydoc, and do a better job at communicating to the user what those do. It would also be convenient to have numpydoc be installed as a user script, so it can be invoked directly.

I'm happy to handle this as well in a separate PR. We can definitely make an entry point to just numpydoc directly on the command line, and since numpydoc uses argparse, we can maybe explore using subparsers.

@stefanv
Copy link
Contributor

stefanv commented Jun 29, 2023

Well, that's all very exciting, thanks @stefmolin.

@stefanv stefanv merged commit a58ab90 into numpy:main Jun 29, 2023
@stefmolin stefmolin deleted the ast-validate-hook branch June 30, 2023 01:12
@stefmolin stefmolin restored the ast-validate-hook branch June 30, 2023 01:12
@stefmolin stefmolin deleted the ast-validate-hook branch September 26, 2023 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorporate a pre-commit hook for numpydoc validation
8 participants