Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change single backticks in Sphinx to render as code #13519

Open
asmeurer opened this issue Oct 23, 2017 · 48 comments · May be fixed by #26137
Open

Change single backticks in Sphinx to render as code #13519

asmeurer opened this issue Oct 23, 2017 · 48 comments · May be fixed by #26137

Comments

@asmeurer
Copy link
Member

Right now single backticks, like `this` in a docstring render as math. You need double backticks like ``this`` to render as code.

This confuses just about everyone because it's different from markdown, which is what GitHub uses for comments and so on.

Apparently you can set the single backticks to anything. I think the mathjax extension just sets it to math by default.

We do have a decent amount of math in our docstrings, but you can also use :math:`math`. I wonder if it would also be possible to make it use dollar signs. I suspect for every docstring currently using single backquotes for math there are two using it incorrectly for code.

@gxyd
Copy link
Contributor

gxyd commented Oct 23, 2017

Yup a lot of places make use of single and double backticks inconsistently. Add 'Easy to Fix'?

@asmeurer
Copy link
Member Author

Well first we should discuss if we actually want to do this.

Also I don't know how easy this is to fix, since it requires modifying the Sphinx configuration. Hopefully it's not too hard, though.

@skirpichev
Copy link
Contributor

This confuses just about everyone because it's different from markdown

More important, probably, that this break numpy docstring standard. Using single backticks required to enclose variables, e.g. in Parameters section.

I think the mathjax extension just sets it to math by default.

It's not mathjax, it's sympy devs (default_role sphinx option).

@jksuom
Copy link
Member

jksuom commented Oct 24, 2017

Changing the role of single backticks is easy to do in Sphinx config file, but I think that should be done only if single dollars can be used for inline math. MathJax can be configured to do that; in fact, this is in its config/default.js file:

    inlineMath: [
//    ['$','$'],      // uncomment this for standard TeX math delimiters
      ['\\(','\\)']
    ],

The hard part is making the MathJax server use the modified configuration. It is not necessary to completely replace the standard combined configuration file (TeX-AMS-HTML-full); it will suffice to add a small local file to the config list provided the server can access it. That can apparently be arranged in two ways, either having the file locally on the server location or having it at some publicly accessible URL. There is already a skeleton local config file that does nothing. We could prepare a similar file local/inline-math.js for inline math. Our problem would then be solved if that could be placed at some public location. It could even be possible to have it at cdn.mathjax.org since we would certainly not be the only ones interested in it.

@asmeurer
Copy link
Member Author

It might be simpler to use a Sphinx extension to convert $math$ to the appropriate inline math directive.

@asmeurer
Copy link
Member Author

@certik
Copy link
Member

certik commented Oct 24, 2017

I agree, I think `this` should render code, and $ can render math. One can use the extension, that seems to work pretty well.

@vishalg2235
Copy link
Contributor

Should I change `this` to render code and $this$ to render math part in matrices.py docstring ?

@asmeurer
Copy link
Member Author

asmeurer commented Nov 6, 2017

The change here will need to happen in all modules at once when we make it, as it would change a single backtick to do code everywhere.

@vishalg2235
Copy link
Contributor

ok.. is there something that I could help ?

@asmeurer
Copy link
Member Author

asmeurer commented Nov 6, 2017

Since it will be a lot of work to change all single backticks throughout the code, a good first step would be to add the ability to use $math$, without messing with backticks just yet. Then we can transition over a few PRs instead of trying to do it all in one.

@vishalg2235
Copy link
Contributor

can you please guide me how to add that ability ?

@asmeurer
Copy link
Member Author

asmeurer commented Nov 6, 2017

We need to copy the Sphinx extension from @certik that I mentioned above.

@vishalg2235
Copy link
Contributor

vishalg2235 commented Nov 7, 2017

Ok so we pass every module from that extension and then every math part will be changed to $math$ . can we also use https://regexr.com/ for that

@vishalg2235
Copy link
Contributor

@asmeurer I have gone through extension code. So how can I help (what is my task) ?
@certik https://github.com/certik/sphinx-jax/blob/master/exts/math_dollar.py#L36 I didn't understood this ?

@certik
Copy link
Member

certik commented Nov 8, 2017

@vishalg2235 I tried to document this part here:

https://github.com/certik/sphinx-jax/blob/master/exts/math_dollar.py#L15

There is an example there ($f(n) = 0 \text{ if $n$ is prime}$) where you don't want to change the inner $n$ into math, since mathjax or latex themselves will do that. So then the comment here:

https://github.com/certik/sphinx-jax/blob/master/exts/math_dollar.py#L27

explains that you substitute $n$ for a temporary string, then convert the rest of dollar signs to math, and then substitute back.

@vishalg2235 let me know if it is clear, and then once it is, can you please improve the documentation and comments in the code using your own words so that it's clear to others as well? That would be very helpful.

@skirpichev
Copy link
Contributor

Besides backticks vs dollar, using latex in docstring - probably a bad idea in general (except for optional sections like suggested by numpy standard, e.g. Notes).

@certik
Copy link
Member

certik commented Nov 8, 2017

Yeah, I agree, I think it's best is to keep latex to minimum for docstrings.

@asmeurer
Copy link
Member Author

asmeurer commented Nov 8, 2017

I like the use of LaTeX math. It makes docstrings like the ones in the integral transforms docs easier to read. And even for small things, I think using LaTeX makes the docs look more professional. I think most people consume the SymPy documentation online.

Even for those that use ?, I think most people use the notebook. It is possible to render docstrings as html in the notebook. Unfortunately, it doesn't yet support LaTeX math. It also isn't enabled by default. So we should work with the IPython guys to get that fixed. But when we do the docs will render nicely with LaTeX math for almost everyone.

The main downside to too much LaTeX is for those of us who primarily look at docstrings in the terminal, or in the SymPy sources (myself included). It may actually be possible to extend the above docrepr idea to the terminal IPython so that ? opens the docstring in the browser.

Regardless, I think we should primarily focus on the majority of users, who use the html docs or the notebook.

@certik
Copy link
Member

certik commented Nov 8, 2017

I meant especially things like like $f(n) = 0 \text{ if $n$ is prime}$, which looks complicated, i.e., I think we should use rest or markdown to do the formatting, not latex. A simple latex math, like the one used in the integrals page, is fine I think.

@skirpichev
Copy link
Contributor

It is possible to render docstrings as html in the notebook. Unfortunately, it doesn't yet support LaTeX math.

numpy docstring from example seems to be too ugly even without LaTeX rendering. Yet this feature looks promising.

But when we do the docs will render nicely with LaTeX math for almost everyone.

For Diofant, I'm thinking about using unicode pretty-printing for math, e.g. for above integral transform definitions (except for optional heavy-math sections). This should work both for sphinx docs and in any Jupyter frontend. (Or even in plain CPython console.)

@moorepants
Copy link
Member

I would also vote for moving most, if not all, latex into the "Notes" section of the docstrings. The main portion of the docstrings should be readable in ascii or unicode. I, for example, almost exclusively read SymPy doc strings in the terminal or notebook which only show the ascii/unicode. We followed this pattern in the mechanics package and the balance works pretty good. You can understand the docstrings when working interactively and look up the rendered versions if you want to know more details. The primary use of the docstring is to figure out quickly what to type as arguments to a method/function. The first lines of the docstring should reflect this.

@asmeurer
Copy link
Member Author

asmeurer commented Nov 9, 2017

If you want to refactor our docstrings, go for it. It's a separate question of whether we should use ` for code and $ for math, which I assume we all agree on.

@vishalg2235
Copy link
Contributor

@certik Ok I got that.. I'm getting an idea what math_dollar.py is doing. Still I don't understand many lines because regular expressions is new to me.
I can improve documentation and comments in code. So should I send a PR in sphinx_jax repository ?

@vishalg2235
Copy link
Contributor

vishalg2235 commented Nov 10, 2017

also while checking $...$ why we need r"(?<!$)(?<!\)$([^\$]+?)$" in Line 41 . What is it doing

@asmeurer
Copy link
Member Author

I suggest a multi-stage process for changing this:

  1. Make $math$ work for inline math.
  2. Change default_role to "error" and modify all the warnings to use double backticks if they should be code or dollar signs if they should be math. The recently added --keep-going flag to Sphinx should help with this.
  3. Do a mass find and replace of all double backticks with single backticks. Double backticks will continue to work, but single backticks will be preferred.

Step 1 can and should be done as a separate pull request.

The default_role="error" should only be done to find and replace, not merged into master. Steps 2 and 3 can be done in the same PR.

The idea is that there are a lot of misuses of single backticks still in the codebase that should be double backticks, so we should do an audit first to fix them all. We know that audit will be complete if they are all replaced with double backticks or dollar signs, which shouldn't be in error (hence the default_role="error").

As a side note, let's discuss whether or not LaTeX should be used at #14964, and not do any mass changes as part of this issue.

@asmeurer
Copy link
Member Author

I added math_dollar at #17605.

I don't plan on replacing the backticks throughout SymPy with dollars, so if someone wants to take that up, please do. Note that it is not as simple as constructing a regular expression replacement, because many backticks in the current docstrings are actually supposed to be code (double backticks), so each replacement needs to be manually checked.

@asmeurer asmeurer added this to the Docstring Style Guide milestone Sep 23, 2019
@moorepants
Copy link
Member

The numpydoc spec does say "Enclose variables in single backticks. The colon must be preceded by a space, or omitted if the type is absent." But I guess changing single backticks to code is better than having them render as math.

@mgeier
Copy link
Contributor

mgeier commented Oct 11, 2019

The numpydoc spec does say "Enclose variables in single backticks. [...]"

I think that's an unfortunate choice from their part.

This assumes that backticks have the default docutils meaning of "emphasis".

I think it would be much better to recommend enclosing variables in asterisks, like this:

Description of parameter *x*.

This has the same result as intended by the NumPy people, but it doesn't depend on the default_role setting.

Ideally, this should be changed in the NumPy docstring guidelines (https://numpydoc.readthedocs.io/en/latest/format.html#sections).

I personally always simply ignored this specific recommendation (and used asterisks), and I think SymPy should do so, too.

@asmeurer
Copy link
Member Author

To me, variables should be rendered as code, not italics. So it should be double backticks. Unless the variable can be cross referenced, then you can use the colon-backtick syntax.

@mgeier
Copy link
Contributor

mgeier commented Oct 11, 2019

This is of course a matter of taste (and my taste isn't important here), but I wanted to stress that both NumPy and Python use "emphasis" instead of "code" in this case.

@asmeurer
Copy link
Member Author

"code" is obviously the correct thing to use in terms of strict markup elements. Most importantly, code with single backticks matches the way things are typically written in Markdown, which is used in more and more places, including here on GitHub issues. In my experience, people are so used to Markdown that they tend to write things that way in docstrings. Sometimes they do it even if they know better (I do it by accident myself all the time). So for that reason alone I think we should use single backticks.

Note that it's quite easy to change the way that code elements look in the final HTML. It's just a matter of tweaking the CSS in the theme. IMO the best formatting is to make variables monospace, with the exact same font as is used for any source code examples such as doctests and for the function parameters (the current docs don't quite do this right). That way, it is clear that they refer to elements of code, as opposed to say a mathematical variable. And it helps avoid ambiguities for variables that are also normal English words (like a).

@moorepants
Copy link
Member

This statement in the sphinx docs is relevant: https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#roles

The default role (content) has no special meaning by default. You are free to use it for anything you like, e.g. variable names; use the default_role config value to set it to a known role – the any role to find anything or the py:obj role to find Python objects are very useful for this.

@mgeier
Copy link
Contributor

mgeier commented Oct 11, 2019

I don't want to argue about taste, I just want to point out what others (NumPy, Python) are doing.

Philosophically, one could argue that variables (at least positional arguments) are not actually code, right?

In fact, Sphinx (or the autodoc extension, to be more specific) doesn't format them like code, see for example:

I'm talking about the highlighted lines, BTW.

So in reality it would be less consistent to format function arguments as "code" in the docstrings!

@asmeurer
Copy link
Member Author

I guess you mean because the user wouldn't use those variable names directly?

Technically, any positional argument can be used as a keyword argument. It also is "code" in the sense that it appears literally in the source code, which is useful for anyone reading the documentation at that level.

But either way, I think that's looking at the wrong thing. The important thing is the user knowing what a piece of text in the documentation refers to. In SymPy in particular there is a lot of potential for ambiguity, because something could be referring to a variable name (including a function parameter), or a mathematical variable, or something else. And sometimes they would all use the exact same spelling, so you have to use formatting to distinguish them. There are some examples of this in the style guide PR.

In fact, Sphinx (or the autodoc extension, to be more specific) doesn't format them like code, see for example:

I guess it uses different formatting to distinguish the different parts of the function definition. Also the function definition line in Sphinx isn't strictly valid Python (at least the ones in the Python docs aren't), so it wouldn't be correct for them to be monospace code formatted.

@mgeier
Copy link
Contributor

mgeier commented Oct 12, 2019

I guess it uses different formatting to distinguish the different parts of the function definition. Also the function definition line in Sphinx isn't strictly valid Python (at least the ones in the Python docs aren't), so it wouldn't be correct for them to be monospace code formatted.

Yeah, I guess that was the original idea behind formatting them with "emphasis".
And then, to be consistent, they also used "emphasis" for function arguments mentioned in the description text.

This is now the default in Sphinx, and just about everyone uses this. Most importantly, NumPy and probably most of the scientific Python projects use it like this.

Now if you (@asmeurer) want to format function arguments as "code" in the description text, to be consistent, you should also change the formatting of the (auto-generated) line with the function definition.

That's one of my points: consistency.

My other point: why not just do it like everybody else?

But either way, I think that's looking at the wrong thing. The important thing is the user knowing what a piece of text in the documentation refers to. In SymPy in particular there is a lot of potential for ambiguity, because something could be referring to a variable name (including a function parameter), or a mathematical variable, or something else. And sometimes they would all use the exact same spelling, so you have to use formatting to distinguish them.

AFAICT, there is no ambiguity:

  • variables, meaning function arguments: "emphasis", e.g. *a* (typeset as slanted)
  • variables, meaning math symbols: "math", e.g. $a$ (typeset with MathJax font, italics)
  • Python objects: e.g. :attr:`a` (typeset as link with a bold upright font)
  • actual code: e.g. `a` (typeset as monospaced, assuming that default_role will be changed)
  • normal text: e.g. a (typeset in the default font)

Did I forget something?
All those have different and distinguishable formatting.

One counter-argument: people might want to use "emphasis" (using asterisks) just to emphasize things in the description text, which could be confused with function arguments.
Response: I think it should be obvious from context what is what. And it won't happen that often anyway (I guess?).

@moorepants (#13519 (comment)):

This statement in the sphinx docs is relevant: https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#roles

The default role (content) has no special meaning by default. You are free to use it for anything you like, e.g. variable names; use the default_role config value to set it to a known role – the any role to find anything or the py:obj role to find Python objects are very useful for this.

Sadly, this isn't really solid advice (and should be changed in the Sphinx docs).

If you use default_role = any, you'll get warnings like this (referring to a parameter `a` in the example):

WARNING: 'any' reference target not found: a

So this is a bad choice, because you would flood your Sphinx output with warnings, hiding more important warnings.

If you use default_role = py:obj, you won't get any warnings, and variables will be typeset in an upright bold font like links, but they won't be actual links, because no Python object will be found with that name (or it would be an unrelated Python object that happens to have the same name).
But even worse, if you use this for actual Python objects, but make a typo, you will not get a warning (whereas with default_role = any you will get a warning).

So neither one of those choices should be recommended.

Apart from that, it doesn't make sense anyway, because function argument names simply aren't part of the linkable API. They aren't even Python objects in the module namespace, they are only Python objects within the function scope (which isn't really relevant for the documentation).

Anyway, this is just a response to @moorepants, and it is not really relevant in the discussion because using any or py:obj as default_role wasn't actually suggested.

@asmeurer
Copy link
Member Author

So when you say "variables" you specifically mean things that are function parameters, not Python variables in general?

@mgeier
Copy link
Contributor

mgeier commented Oct 13, 2019

Oh, sorry if that has been confusing ... yes, I was using the word "variables" for "function parameters", so were other people in the comments above.
It is used like this in https://numpydoc.readthedocs.io/en/latest/format.html#sections, and we were just repeating it. I'm pretty sure they do not mean Python variables in general. This should probably be changed in the NumPy styleguide.

And no, I don't mean Python variables in general. Those would be "actual code" (as mentioned in my comment above), and I would suggest formatting those in a monospace font, by using backticks (once the default_role is changed, or by double backticks before that).
However, I don't think that "normal" Python variables will appear much in docstrings, right?
Probably when talking about a code snippet in the "Examples" section?

@asmeurer
Copy link
Member Author

If you only mean parameters I'm not opposed to it. My main concern is it would be another rule that will require more work to actually maintain the consistency.

@theWiseAman
Copy link
Contributor

@asmeurer I have a little question. Why we didn't use single backticks as "code" snippets in the first place? Let's say I am willing to undo the double backticks, can we now implement single backticks as "code" so that it is consistent with the GitHub markdown typing habit styles? Can't the style configuration for single backticks be changed to render as "code" rather than "math"? Is there some problem with changing the style configuration?

I acknowledge it would be much simpler to correct single backtick uses but if we consider long-term perspective using single backticks as code would mean lesser errors from future contributors. Anyways correcting single backticks or converting double backticks amounts to the same effort at the end of the day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants