nixos-render-docs: add manual html renderer, use it for the nixos manual #217342
{manpage} already expands to a link, but akkoma wants to link to a specific setting. split the mention for clarity. networkd just straight up duplicated what {manpage} generates anyway, so that link can go away completely.
our renderers carry significantly more state than markdown-it wants to easily cater for, and the html renderer will need even more state still. relying on the markdown-it-provided rendering functions has already proven to be a nuisance, and since parsing and rendering are split well enough we can just replace the rendering part with our own stuff outright. this also frees us from the tyranny of having to set instance variables before calling super().__init__ just to make sure that the renderer creation callback has access to everything it needs.
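as a rough illustration of the idea (not the code from this PR), a renderer that owns its dispatch table and receives its state through an ordinary constructor could look like the sketch below. `Token`, `SketchRenderer`, the rule names and the `manpage_urls` argument are all assumptions made for this example; the real tokens come from markdown-it.

```python
import html
from dataclasses import dataclass


@dataclass
class Token:
    # simplified stand-in for a parser token; only what this sketch needs
    type: str
    content: str = ""


class SketchRenderer:
    """Hypothetical renderer: dispatches on token type, keeps its own state."""

    def __init__(self, manpage_urls):
        # state arrives as a plain constructor argument, no parser callback involved
        self._manpage_urls = manpage_urls
        self.rules = {
            "paragraph_open": self.paragraph_open,
            "paragraph_close": self.paragraph_close,
            "text": self.text,
            "manpage": self.manpage,
        }

    def render(self, tokens):
        out = []
        for tok in tokens:
            if tok.type not in self.rules:
                raise NotImplementedError(f"no rule for token type {tok.type!r}")
            out.append(self.rules[tok.type](tok))
        return "".join(out)

    def paragraph_open(self, tok):
        return "<p>"

    def paragraph_close(self, tok):
        return "</p>"

    def text(self, tok):
        return html.escape(tok.content)

    def manpage(self, tok):
        # link the mention if a url is known, otherwise emit the bare name
        name = html.escape(tok.content)
        url = self._manpage_urls.get(tok.content)
        if url is None:
            return name
        return f'<a href="{html.escape(url)}">{name}</a>'
```

parsing stays untouched in this arrangement: the parser only has to produce the token list, and everything stateful lives in the renderer object.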
these weren't used for anything. options never was (and does not contain any information for the renderer that we *want* to honor), and env is not used because typed renderer state is much more useful for all our cases.
ultimately it's the renderer that needs it, for the options rendering that will be simplified in a bit.
we should really be rendering options at *rendering* time, not at parse time. currently this is just an academic exercise, but the html renderer will have to inspect the options.json data after the entire document has been parsed, but before anything gets rendered.
the html renderer will need all of these functions as well. some extensions will be needed, but we'll add those as they become necessary.
for most of our data classes we can use dataclasses.dataclass with frozen=True or even plain named tuples. the TOC structure we'll need to generate proper navigation links is most easily represented and used as a cyclic structure though, and for that we can use neither. if we want to make the TOC structures immutable (which seems like a good idea) we'll need a hack of *some* kind, and this hack seems like the least intrusive.
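one way such a hack can look, shown only as a sketch and not necessarily the variant chosen in this PR: keep the dataclass frozen and close the cycle once, during construction, with `object.__setattr__`. `TocNode` and `build` are hypothetical names. `eq=False` is an assumption made here so that equality and hashing stay identity-based instead of recursing through the parent/child cycle.

```python
import dataclasses
from typing import Optional, Tuple


@dataclasses.dataclass(frozen=True, eq=False)
class TocNode:
    title: str
    # parent is left out of the repr to keep printed output short
    parent: Optional["TocNode"] = dataclasses.field(default=None, repr=False)
    children: Tuple["TocNode", ...] = ()


def build(title, child_titles):
    """Create a root with children that point back at it."""
    root = TocNode(title)
    kids = tuple(TocNode(t, parent=root) for t in child_titles)
    # frozen dataclasses reject normal assignment; object.__setattr__ bypasses
    # that check. this is the only place the sketch mutates a node.
    object.__setattr__(root, "children", kids)
    return root
```

after `build` returns, regular attribute assignment on any node raises `dataclasses.FrozenInstanceError`, so consumers see the structure as immutable.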
check that all required headings are present during parsing, not during rendering. building a correct TOC will need this since every TOC entry needs a heading to set its title, and every included substructure needs a title. also improve the error message on repeated title headings slightly, giving the end line turns out to not be very useful.
while not technically necessary for correct rendering of *contents* we do need to disallow heading levels being skipped to build a correct TOC. treating headings that have skipped a number of levels to actually be headings that many levels up only gets confusing, and inserting artificial intermediate headings suffers from problems, such as which ids to use and what to call them.
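the check itself is small. a minimal sketch, assuming headings are available as a list of integer levels in document order and that a document starts at level 1 (`check_heading_levels` is a hypothetical name, not a function from this PR):

```python
def check_heading_levels(levels):
    """Raise if any heading is more than one level deeper than its predecessor."""
    prev = 0  # virtual level above the first heading
    for index, level in enumerate(levels):
        if level > prev + 1:
            raise ValueError(
                f"heading {index} skips from level {prev} to level {level}"
            )
        prev = level
```

going back up by any number of levels stays legal; only descending by more than one level at a time is rejected.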
without this we cannot build a TOC to arbitrary depth without generating ids for headings, but generated ids are fragile and liable to either break or point to different things if the manual changes shape. we already have the convention that all headings should have an id, this formalizes it.
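to illustrate why generated ids are fragile, here is a made-up id scheme (slug of the title, numbered on collisions). it is an assumption for this example only and not something nixos-render-docs does.

```python
import re


def generated_ids(titles):
    """Hypothetical scheme: slugify the title, append a counter on repeats."""
    seen = {}
    ids = []
    for title in titles:
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        count = seen.get(slug, 0)
        seen[slug] = count + 1
        ids.append(slug if count == 0 else f"{slug}-{count}")
    return ids


# the same manual before and after someone adds a second "Options" section
before = generated_ids(["Installation", "Options", "Upgrading", "Options"])
after = generated_ids(["Installation", "Options", "Options", "Upgrading", "Options"])
```

in `before` the id `options-1` belongs to the section after "Upgrading"; in `after` the same id belongs to the newly inserted section, so existing links silently point somewhere else. explicit ids on every heading avoid this.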
text content in the toplevel file of a book will not render properly. the first proper element will be a preface, part, or chapter anyway, and those require includes to produce. parts do not currently allow headings in the part file itself, but that's mainly a renderer limitation. we can add support for headings in part intros when we need them. in all other cases includes must be followed by either another include, a heading, or end of file. text content could not be properly linked to from a TOC without a preceding heading.
while docbook relies on external chunk-toc info to do chunking of the rendered manual we have nothing of the sort for html. there it seems easiest to add annotations to blocks to create new chunks. such annotations could be extended to docbook to create the chunk-toc instead of passing it in externally, but with docbook on the way out that seems like a waste of effort.
the docbook toolchain uses docbook-xsl to generate its TOC; our html renderer will have to do this on its own. this generator uses a very straightforward algorithm of only inspecting headings, but anything else could be inspected as well. (examples come to mind, but those do not have titles and would thus make for bad toc entries.) we also use path information (that will be taken from include block args in the html renderer) to produce navigation information. the algorithm we use mirrors what docbook does, linking to the next/previous files in depth-first toc order. toc entries are linked to the tokens they refer to for easy use later.
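a minimal sketch of that algorithm, under simplifying assumptions: headings arrive as `(level, title, path)` tuples in document order, no levels are skipped, and all entries of one output file are adjacent in depth-first order. `Entry`, `build_toc`, `flatten` and `link_files` are hypothetical names, and the link back to the originating token is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Entry:
    level: int
    title: str
    path: str  # output file this heading is rendered into
    children: List["Entry"] = field(default_factory=list)
    prev: Optional["Entry"] = field(default=None, repr=False)
    next: Optional["Entry"] = field(default=None, repr=False)


def build_toc(headings):
    """Nest headings by level using a stack of currently open ancestors."""
    roots, stack = [], []
    for level, title, path in headings:
        entry = Entry(level, title, path)
        # close every open heading that is not an ancestor of this one
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(entry)
        stack.append(entry)
    return roots


def flatten(entries):
    """Yield entries in depth-first (document) order."""
    for entry in entries:
        yield entry
        yield from flatten(entry.children)


def link_files(roots):
    """Link the first entry of each file to the previous/next file's first entry."""
    firsts = []
    for entry in flatten(roots):
        if not firsts or firsts[-1].path != entry.path:
            firsts.append(entry)
    for a, b in zip(firsts, firsts[1:]):
        a.next, b.prev = b, a
    return firsts
```

a renderer could then read `prev`/`next` off the first entry of each file to emit navigation links in the page header and footer.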
the basic html renderer. it doesn't have all the docbook compatibility codes embedded into it, but there is a good amount. this renderer is unaware of manual structure and does not traverse structural include tokens (if it finds any it'll just fail), that task falls to derived classes. once we have more uses for structural includes than just the manual we may revisit this decision.
it's not hooked up to anything yet, but that will come soon. there's a bit of docbook compat here that must be interoperable with the actual docbook exporter, but luckily it's not all that much.
this will be necessary for html since there we have to do chunking into multiple files ourselves. writing one file from the caller of the converter and all others from within the converter is unnecessarily spread out, and returning a dict of file names and their contents is not quite as meaningful for docbook (which has only one file to begin with).
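the shape of that interface can be sketched as follows. this is a toy, not the converter from this PR: `ChunkingConverter` and the `@file NAME` marker are invented for the example, and real chunking is driven by the parsed document rather than by text markers.

```python
import tempfile
from pathlib import Path


class ChunkingConverter:
    """Toy converter that writes the main file and every chunk itself."""

    def _render(self, source):
        # hypothetical rule for this sketch only: a line "@file NAME" starts a
        # new output file; everything before the first marker is the main file.
        chunks = {None: []}
        current = None
        for line in source.splitlines():
            if line.startswith("@file "):
                current = line[len("@file "):].strip()
                chunks[current] = []
            else:
                chunks[current].append(line)
        return {name: "\n".join(lines) + "\n" for name, lines in chunks.items()}

    def convert(self, infile, outfile):
        # the caller only names the main output; chunks land next to it
        rendered = self._render(Path(infile).read_text())
        for name, content in rendered.items():
            target = Path(outfile) if name is None else Path(outfile).parent / name
            target.write_text(content)
```

a single-file backend fits the same `convert(infile, outfile)` signature by writing only `outfile`, so callers do not need to know whether chunking happened.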
this converter is currently supposed to be able to reproduce the docbook-generated html DOMs exactly, though not necessarily the html *files*. it mirrors many docbook behaviours that seem rather odd, such as top-level sections in chapters using the same heading depth as understood by html as their parent chapters do. over time we can hopefully remove all special casing needed to reproduce docbook rendering, but for now at least it doesn't hurt *too* much.
this reproduces the docbook-generated html manual exactly enough to appease the compare workflows while we still support both toolchains. it's also a lot faster than the docbook toolchain, rendering the entire html manual in about two seconds on this machine (while docbook needs about 20).
Yay! Thanks so much for putting in this huge effort. It's a lot to review, so I started with trying to build on Build failed
A local build with Make output
that seems unrelated at first glance? so far we've done nothing to the nixpkgs manual (that we can recall), it's all been localized to edit cannot reproduce that error on x86_64-linux it seems. we did manage to get a failure from a non-cleaned
That definitely looks like a "dirty tree" issue. There's already a I think this should have been fixed by #215121? EDIT: there's no EDIT: #217865 |
Sorry, accidentally linked my PR to this one so that it closed it... |
@pennae yeah sorry for the noise, I've built the Nixpkgs manual because I followed the wrong manual's instructions. It happens they look almost exactly the same and the bookmarks sit next to each other in my browser... The NixOS manual build works and looks correct on superficial inspection; checked off We may want to update the build instructions for the manual though, because it currently says to build |
Went through the commit messages to follow design decisions, without inspecting the actual code too deeply, only ensuring the gist of each diff matches the intention laid out by the corresponding commit message.
I really like the approach and the result. Thank you so much, it's a great improvement on many fronts, and will enable lots of future work to improve technical coherence and readability of the manuals.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-02-23-documentation-team-meeting-notes-29/25731/1 |
barring new reviews or complaints we'd like to merge this tomorrow or friday. it's currently opt-in and thus probably won't hurt anyone, and the real test will only come with post-23.05 disabling docbook by default anyway. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/tweag-nix-dev-update-45/26397/1 |
Description of changes
at last! the html renderer for the nixos manual (if ! allowDocBook). we reproduce the docbook-generated manuals almost exactly (as compared by the manual compare workflow scripts, which run each candidate through html-tidy first to allow for whitespace changes). we also do that in about 10% of the time needed by the docbook manual toolchain (realizing a similar speedup to what manpages saw).
there's a good bit of docbook compat code in here, most of which can be either dropped when docbook goes away or formalized as the new (old) nixos manual style.
an epub converter should be not too far off from here, there's already infrastructure in place to allow arbitrary chunking of the input document and as far as we can tell our present epub manual is little more than a more aggressively chunked html manual packed in a zip file.
draft because this depends on #214709 for a couple options rendering internals, but we could extract those and merge this first if someone wants to expedite this.
Things done
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)