Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the make_rst.py parser for BBCode tags #64230

Merged

Conversation

YuriSizov
Copy link
Contributor

@YuriSizov YuriSizov commented Aug 10, 2022

This PR reorganizes make_rst.py, adds some comments, and tries to make the parser logic clearer overall. It also introduces additional checks, that help catch some issues with documentation formatting.

While working on #64164 we've discovered an annoying formatting issue that is hard to detect with our current BBCode parser in make_rst.py. Documentation contributors sometimes improperly close [code]foo[/code] tags and instead of the closing tag write the name of the property, e.g. [code]foo[/foo]. This goes under the radar, because this bit is often followed by another [code]foo[/code] sequence, and together they form one valid code block. At least that's what our formatter thinks.

And it looks like this in the end: [code]foo[/foo] some text in between [code]foo[/code], which results in this:

chrome_2022-08-10_19-45-05

Now, we actually need to support writing tag-like sequences inside of the code block and inline code strings. We use this to document RichTextLabel, among other related things. So we can't explicitly forbid this kind of sequence inside code blocks. The introduction of [param foo] should make it less likely for these mistakes to happen, but I've decided to try and improve the logic of the parser anyway, and add new warnings that identify if a bit inside of a code tag looks like a closing tag.

It has 16 false positives in the current code base, but at least it's some form of reporting to work from. If needed, we could add some kind of extra argument, e.g. [code unsafe] that would remove warnings from these places and require the writer to pay attention to formatting and disabling the check.

I've also removed some unused code that has been introduced back in 2016 in the initial implementation 3ee4f4f. I checked, and the removed bits never get hits in the current state of the documentation. The refactoring also helped me to fix another formatting issue that wouldn't be detectable without a rework:

chrome_2022-08-10_19-43-07

Besides that and the fixes from the XMLs, the resulting RSTs are identical to the existing ones. So I count it as no regressions.

@YuriSizov
Copy link
Contributor Author

YuriSizov commented Aug 10, 2022

To however is going to review this, note that make_type, make_enum, make_method_signature, make_heading, make_footer, escape_rst, format_codeblock were simply moved around and shouldn't have any functional changes.

doc/tools/make_rst.py Show resolved Hide resolved
doc/tools/make_rst.py Outdated Show resolved Hide resolved
@asmaloney
Copy link
Contributor

I think if you're adding warnings, adding a way to suppress them for false positives is essential because otherwise you have to work through the entire list every time and it gets hard to spot new ones.

I would make it very explicit (unsafe is a bit obscure) - something like [code ignore-warning].

@YuriSizov
Copy link
Contributor Author

I think if you're adding warnings, adding a way to suppress them for false positives is essential because otherwise you have to work through the entire list every time and it gets hard to spot new ones.

I agree in principle, but I wanted to keep the changes from being intrusive. Which is why it's a warning. If we add a way suppress them, then we can as well make them errors, since the effect would be the same. We can wait for Akien to return from his holidays and see what he thinks.

@YuriSizov YuriSizov force-pushed the makerst-more-robust-tag-parser branch 2 times, most recently from dff2df2 to f17c845 Compare August 11, 2022 16:00
@YuriSizov
Copy link
Contributor Author

I've tried replacing the [, ] characters inside of code blocks with their HTML entities, but that didn't work out. We already do this for some characters that would make XMLs invalid, such as <, but in case of brackets it really impacts the readability with either &rbrack; format or &#93; one. Not only that, but our Python XML parser actually converts HTML entities into normal strings on read, so the issue remains unhandled, because BBCode parser gets the same exact input.

So that's one idea that didn't work.

@YuriSizov YuriSizov force-pushed the makerst-more-robust-tag-parser branch from f17c845 to d953d95 Compare August 15, 2022 14:45
@mhilbrunner
Copy link
Member

mhilbrunner commented Aug 15, 2022

I reviewed this twice now and I like the changes. I didn't spot anything of note yet. At first I was a bit hesistant because there aren't too many of these errors, but to me there is not much additional complexity and the reorg makes sense IMO.

Would love if @Calinou could give this a look too maybe :)

@mhilbrunner mhilbrunner merged commit 5347732 into godotengine:master Aug 17, 2022
@mhilbrunner
Copy link
Member

Tested it again locally, found no issues. Merging! Thanks 🎉

@YuriSizov YuriSizov deleted the makerst-more-robust-tag-parser branch August 17, 2022 12:53
asmaloney added a commit to asmaloney/godot that referenced this pull request Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants