Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: merge duplicate sections (rather than raise) #67

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

tacaswell
Copy link
Contributor

This allows the docs for matplotlib 1.5.x to continue building.

This is a follow on to #65

The ability to merge sections enables decorators to easily add sections to the docstring without having to pay the cost (at import time) of parsing the docstring and re-writing it.

attn @NelleV (who I know disagrees).

@NelleV
Copy link
Contributor

NelleV commented Oct 16, 2016

FYI, I am working on a patch to remove the docstring appending on matplotlib.

As @tacaswell I think this is a bad idea (apart from the fact it does allow our doc to build right now). Automatic appending of elements create docstring that are hard to read in a terminal and hard to maintain. Having two sections should never happen elsewiise.

@NelleV
Copy link
Contributor

NelleV commented Oct 16, 2016

More exactly, I think this patches makes matplotlib developers life much easier, but it a really bad patch for all of the other projects that use numpydoc as it allows a flexibility that should not be tolerated.
The bare minimum would be to raise a warning (which would also break matplotlib's documentation, as warnings are not allowed).

@tacaswell
Copy link
Contributor Author

I do not understand the hostility to this flexibility. Non-trivial decorators may add functionality which needs to be documented in the docstring of the final function, and injecting it via the decorator makes a good deal of sense.

I am still not convinced that mpl is going to drop docstring appending / parameterization.

@rgommers
Copy link
Member

Let's not confuse the two issues too much. There are use cases for decorators in docstrings, but the merits of those are orthogonal to this PR. There are no good reasons to have duplicate sections in docstrings (even if decorators are used), but if MPL has those now we can consider to postpone raising an exception.

This is only an issue only with numpydoc master right, after gh-65? And since there's no release of numpydoc yet with gh-65, there's no issue yet right now?

Can MPL just use numpydoc==0.6.0 in the 1.5.x maintenance branch, and fix things in master?

@tacaswell
Copy link
Contributor Author

Even with 0.6.0 it is doing something funny in it is only keep the last version of a section in the docstring so it is still broken.

The reason there needs to be two Notes section is that (without parsing the docstring) the decorator can not know what the last section in the wrapped functions docstring is. The only safe way to tack more text onto the docstring is to put it under a Notes section (for example, if the last section is Returns a paragraph of text causes parsing issues). If there is already a Notes section it in v0.6.0 over-writes in when rendered (which is not great).

As near as I can tell the only way to allow decorators to append to the the docstring safely is to merge duplicates like this.

@rgommers
Copy link
Member

Why can't the decorator look for a Notes section itself, append if there is one, and create one if there isn't (i.e. put the merge in a decorator instead of in numpydoc)? Can you point me to an example?

A decorator anyway can't just blindly append, even if numpydoc does the merge. Sections will be in the wrong order, and the docstring will be incorrectly ordered in a terminal.

@tacaswell
Copy link
Contributor Author

https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/__init__.py#L1616

The decorator is adding a kwarg to the decorated function. We do not want to pay the import time cost to parse the docstrings (this happen on basically every plotting method).

I am not super concerned about duplicate / misordered sections in the terminal (in fact it was not obvious to me that there was a defined order after the summary).

@rgommers
Copy link
Member

The decorator is adding a kwarg to the decorated function. We do not want to pay the import time cost to parse the docstrings (this happen on basically every plotting method).

That use case makes sense. Adding the _DATA_DOC_APPENDIX will basically put the info that should be in the Parameters section in the Notes section instead (at best, if we merge duplicate sections). So with your current decorator you'd get a better result when putting the two lines

data : indexable instance, optional
    Replaces the keywords that follow it by ``data[<arg>]``.

(in fact it was not obvious to me that there was a defined order after the summary).

There is, it's the ordering specified in https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#id5

Long term I think numpydoc should raise on duplicate sections, this is in almost every case a clear bug. Issuing a warning in the meantime I don't really care about either way, because with the hundreds of warnings that Sphinx typically spits out those are really hard to spot anyway.

@stefanv
Copy link
Contributor

stefanv commented Oct 17, 2016

Parsing a docstring can be done very quickly if all you are interested in is finding the notes section. Is the delay of locating the notes section really enough to make this unworkable?

From the numpydoc perspective, duplicate sections should raise errors.

@tacaswell
Copy link
Contributor Author

I agree it would be better to inject that into the parameters section, however not all of mpl's docs are in proper numpydoc format yet so we can not assume that we have a Parameters section or an existing Notes section (which is why we settled on just tacking it onto the end).

It is not clear to me (and this may be my reading comprehension at fault) that the numbering in HOWTO_DOCUMENT is prescriptive, rather than just enumerated by happenstance. That should probably get its own issue?

@stefanv
Copy link
Contributor

stefanv commented Oct 18, 2016

The are prescriptive, because that is the standard. Would you have preferred them be more flexible?

@tacaswell
Copy link
Contributor Author

I have been reading it as flexible for a few years now, this conversation really is the first time I even considered the order of the sections mattering.

It mostly needs some extra words indicating that it is meant to be prescriptive, that is all.

@NelleV
Copy link
Contributor

NelleV commented Oct 18, 2016

I don't think numpydoc even builds if the order isn't preserved. At least, it used to be this way.

@NelleV
Copy link
Contributor

NelleV commented Oct 18, 2016

Anyhow, numpydoc should have a much proper documentation (cf #66)
@stefanv We should put some students on this.

@stefanv
Copy link
Contributor

stefanv commented Oct 18, 2016 via email

@rgommers
Copy link
Member

The SciPy docs also fail to build with current master, there's at least one docstring with two Notes sections. I think we should merge this after the addition to raise a warning on doing a section merge.

@jnothman
Copy link
Member

jnothman commented Nov 1, 2017

Would it help to have this explicit, where the library can from numpydoc.docscrape import merge_docstrings and then merge_docstrings(base_doc, additional_doc)?

I think this would be better than a warning. Also, importing numpydoc.docscrape no longer has any sphinx dependencies, or dependencies beyond standard library...

This allows the docs for matplotlib 1.5.x to continue building.
The default value was a string, if there is input these will be
filled as lists.
@tacaswell tacaswell force-pushed the enh_merge_duplicate_sections branch from 832d2ee to ed9f5e6 Compare November 2, 2017 14:56
@tacaswell
Copy link
Contributor Author

Did the rebase (which mostly reverts #95).

@jnothman what would you expect the return value of merge_docstrings to be?

@jnothman
Copy link
Member

jnothman commented Nov 2, 2017 via email

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Aug 12, 2018

I would like to chime in saying that I think the ability to merge docstrings would be really cool.

I’m working on a few decorators that can make it easier to transition functions to new behaviors.

The ability to add to the docstring is quite an important feature as it would avoid writing even more boiler plate warning messages.

The docstring appending would happen at import time and not run time so I don’t think it is a big issue.

I was thinking of using the numpydoc structure, but I think making numpydoc a hard dependency of the “deprecation-factory” would be a mistake. Adding a badly formatted string (appending it) is better than adding a hard dependency.

Edit: finished my sentence

@rgommers rgommers removed this from the v0.9.0 milestone Apr 21, 2019
Base automatically changed from master to main March 5, 2021 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants