Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate modules docs to markdown #175586

Closed
roberth opened this issue May 31, 2022 · 47 comments
Closed

Migrate modules docs to markdown #175586

roberth opened this issue May 31, 2022 · 47 comments
Labels
0.kind: enhancement Add something new 6.topic: documentation Meta-discussion about documentation and its workflow 6.topic: module system About "NixOS" module system internals

Comments

@roberth
Copy link
Member

roberth commented May 31, 2022

Project description

Migrate module docs to markdown.

Pandoc should be able to mix and match 🤞

Suggested pipeline

Suggested migration path

  1. Implement the pipeline, so we support multiple languages
  2. Start the conversion to description = lib.docMD "...", etc
  3. Forbid non-wrapped description (and other such attrs)
  4. Switch the default to markdown, remove the lib.docMD calls.
  5. Optionally, simplify the pipeline

From the broader ecosystem perspective, it would be good to keep the flexible pipeline, as we can use it to dismantle the cottage industry of custom module docs generators. It's a matter of wrapping it with the right pandoc syntax.

Technical details

(just to get a feel for it)

description = lib.docDB "This is <literal>docbook</literal>.";

->

description = { type = "doc"; lang = "docbook"; text = "This is <literal>docbook</literal>."; };

->

Some XML

->

Markdown:

    Description:

    ```{=docbook}
    This is <literal>docbook</literal>.
    ```

Alternatives

  • Maybe XML + XSLT should be JSON + Python. @pennae has already converted some things to python as part of their eval performance work.

Metadata

  • homepage URL: module system does not have a dedicated homepage
  • source URL: lib/, nixos/, pkgs/
  • license: mit, bsd, gpl2+ , ...
  • platforms: unix, linux, darwin, ...
@jtojnar
Copy link
Member

jtojnar commented May 31, 2022

It would be nice but there was opposition to including pandoc in the NixOS system closure. So we would either need to decouple the manual from the system, choose some lighter parser (maybe lowdown + HTML→DocBook via XSL), or cache pre-generated DocBook like we currently do with the NixOS manual.

@roberth
Copy link
Member Author

roberth commented May 31, 2022

opposition [to increase in build closure size because of pandoc]

Sure! If we don't want pandoc in the build closure for NixOS, we can choose to implement step 5. NixOS' dependency on pandoc will be a temporary affair.

lowdown + HTML→DocBook via XSL

We'll need a tool that can convert tons of snippets at pace. Pandoc can do this (as linked before). Shelling out will not be an option, so I suspect that this would involve a custom tool. Could be worth it.

cache pre-generated DocBook

This is terrible developer UX. If we want people to write less docs, this is one way to do it.
It will be even more complicated, because or custom tooling can't process this file-by-file like we can in the manual.

@pennae
Copy link
Contributor

pennae commented May 31, 2022

Maybe XML + XSLT should be JSON + Python. @pennae has already converted some things to python as part of their eval performance work.

probably, yeah. there's not that much literal docbook in nixos modules that's properly labelled as such, and among the great many docbook-using descriptions that are not properly labelled by far the most use for docbook uses only a few tags (literal, option, filename, programlisting, ...). parsing descriptions as xml fragments and running a custom transform over them actually looks quite feasible!

@roberth
Copy link
Member Author

roberth commented May 31, 2022

Hmm yeah, that makes me think that perhaps my migration path is overly correct. We could just do it and fix the occurrences of the small set of tags used after switching.

parsing descriptions as xml fragments and running a custom transform over them actually looks quite feasible!

I would avoid writing any and all parsing logic, and only go from semi-structured data to {markdown + pandoc raw attribute} in our own tooling.
It's not worth writing and maintaining a complex solution just to have ok-looking docs during the migration. Of course this is subjective, but the temporary nature is evident. If such a custom transform is a migration tool rather than an integral part of the docs pipeline, I'm all for it. That also takes the pressure off its correctness, as we can fix a few things up by hand.

@zimbatm
Copy link
Member

zimbatm commented May 31, 2022

I bet it's possible to introduce and get rid of pandoc until the next release. It would be a shame to block the refactor for a temporary dependency, as long as it doesn't stay around forever.

@jtojnar
Copy link
Member

jtojnar commented May 31, 2022

We do not have a time frame, so it might take longer than the next release. Especially when we have not even settled on a toolchain yet (currently the candidates are #105036 and #108063). And people who would be annoyed by the closure do often use unstable.

So, while deciding that having them choose between having pandoc in their system closure and disabling local docs temporarily can be a reasonable trade off for the long term goal of improving the docs infrastructure, it might still turn out few sed invocations is actually a better interim solution.

@edolstra
Copy link
Member

edolstra commented Jun 1, 2022

I would keep it simple and just automatically convert all description strings to Markdown in one commit, so without introducing docMD, and drop support for Docbook descriptions immediately. That way we don't need pandoc in the system closure.

P.S. NixOS/nix#6583 already assumes that option descriptions are in markdown. Supporting other markup languages would just make life more complicated.

@pennae
Copy link
Contributor

pennae commented Jun 1, 2022

we've made an attempt at producing a script to migrate docs.

perl -pi -e 's,<literal>([^`]*?)</literal>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<replaceable>([^»]*?)</replaceable>,«$1»,g' nixos/modules/**/*.nix
perl -pi -e 's,<filename>([^`]*?)</filename>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<code>([^`]*?)</code>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<option>([^`]*?)</option>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<command>([^`]*?)</command>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<link xlink:href="(.+?)" ?/>,$1,g' nixos/modules/**/*.nix
perl -pi -e 's,<link xlink:href="(.+?)">(.*?)</link>,[$2]($1),g' nixos/modules/**/*.nix
perl -pi -e 's,<package>([^`]*?)</package>,`$1`,g' nixos/modules/**/*.nix
perl -pi -e 's,<emphasis>([^*]*?)</emphasis>,*$1*,g' nixos/modules/**/*.nix

perl -pi -e 'BEGIN{undef $/;} s,<citerefentry>\s*<refentrytitle>\s*(.*?)\s*</refentrytitle>\s*<manvolnum>\s*(.*?)\s*</manvolnum>\s*</citerefentry>,`$1 ($2)`,smg' nixos/modules/**/*.nix

does a reasonable job of replacing most docbook tag with markdown constructs. around 2000 matches that look like xml tags remain after that, but a lot of that is either actual xml than can be safely ignored (eg fontconfig) or extended lists that have to converted with more thought anyway.

@pennae
Copy link
Contributor

pennae commented Jun 2, 2022

having spent a couple hours trying to hack together something to process the docbook xml snippets, it doesn't quite seem to be worth the trouble over using regexes. descriptions are not well-formed xml to begin with, interpolations split them up even further (and are a mess to track), some descriptions aren't even valid xml fragments—and the main benefit of automatic translation (being able to translate eg varlists) requires all of that to not be like it is. and we'll need sapient inspection of the results anyway. :/

@roberth
Copy link
Member Author

roberth commented Jun 3, 2022

descriptions are not well-formed xml to begin with, interpolations split them up even further (and are a mess to track), some descriptions aren't even valid xml fragments

This is why I proposed doing it as part of the existing pipeline, where we do have quality data.

Another option is to write a script that takes writes out a long markdown document with markers to delineate the individual descriptions, etc, split that up, and merge it back into the XML. This way, we don't need pandoc in the build closure.

I guess though, docs quality during the transition is not a priority, so then I suppose leaving some descriptions in a broken state is more acceptable than having pandoc in the closure?


Let me also mention literalDocBook, which occurs in example, defaultText. We'll want to have the same for markdown. At least these are easy to spot for the migration.

@pennae
Copy link
Contributor

pennae commented Jun 3, 2022

maybe we should do something that's a mix of all of the above: temporarily extend/augment the doc merging script we already have to render markdown to docbook, add literalMarkdown and (eg) mdDoc markers, and convert descriptions over time. once the conversion is done we could drop mdDoc and the docbook markers/converters. not going from docbook to markdown but from markdown to docbook seems easier, since docbook is somewhat richer.

(as we've learned from the split docs changes, a flag day event really doesn't work very well. the split docs builds at least broke to call attention to breakages, docbook somehow still making it into the MD future might not be as loud)

@pennae
Copy link
Contributor

pennae commented Jun 4, 2022

while trying some things out we've noticed we can't change some docbook tags without changing eg the manpages. one we noticed in particular was <command>, which gets rendered specially both in manpages and in the html manual. not quite sure what to do about that. presumably we'll need a new filter like already exists for manpages, eg {command}`stuff`‍?

@roberth
Copy link
Member Author

roberth commented Jun 12, 2022

@pennae has found a great solution for the migration. It appears to be ready for use by other contributors. I'd like to merge it in a couple of days, but perhaps more of us could give it a try?

#176146

@pennae
Copy link
Contributor

pennae commented Jun 20, 2022

interest seems low, but that might be a discoverability issue. perhaps a round through discourse would be useful, what do you think?

@ncfavier
Copy link
Member

ncfavier commented Jun 20, 2022

Forgot to share my thoughts: this is great. There are currently at least three places where the "NixOS module system option documentation rendering logic" is defined: nixpkgs, nmd (used by Home Manager, ping @rycee), and nixos-search. It would be nice to unify the three. Merging this now would break some options at nixos-search. Maybe I'll find the energy to port the changes there soon.

@roberth
Copy link
Member Author

roberth commented Jun 20, 2022

nixpkgs

No problems here.

nmd

Home Manager doesn't have to use mdDoc until they've updated their tooling, or they could make an instantaneous switch if they like. They control both their modules and their tooling, so I don't see a problem here.

nixos-search

Flakes are experimental anyway, so I don't think this is a blocker?


a round through discourse

I don't think the PR needs more review. If anything pops up, we can fix it as we go. It'd be a nicer experience for contributors if their changes are actually mergeable. I'd say merge and then announce on discourse.

@roberth
Copy link
Member Author

roberth commented Jun 20, 2022

A tracking issue for the conversion project would also be helpful for discoverability.

@ncfavier
Copy link
Member

ncfavier commented Jun 20, 2022

Home Manager doesn't have to use mdDoc until they've updated their tooling, or they could make an instantaneous switch if they like. They control both their modules and their tooling, so I don't see a problem here.

I think this situation is not optimal: since Home Manager inherits e.g. lib.literalMD from nixpkgs it should probably also inherit the associated logic. I feel like there's no good reason to have more than one way to render option documentation. But indeed this is tangential to this issue.

Flakes are experimental anyway, so I don't think this is a blocker?

The confusingly-named flake-info tool also indexes nixpkgs options and packages, BUT I just remembered that since NixOS/nixos-search#460 we use options.json directly, and the MarkDown translation here happens before options.json, so we should be fine.

@rycee
Copy link
Member

rycee commented Jun 20, 2022

FWIW, nmd has been moving towards asciidoc for some time and much of the HM manual is written in asciidoc. So it is likely that option descriptions in HM will go that way as well. I've been meaning to switch over but have never found the time to do so. I'll have more spare time sometime in July so might make some moves then.

@pennae
Copy link
Contributor

pennae commented Jun 20, 2022

the HM manual is written in asciidoc

that's a point, actually. nixos does have an asciidoc exporter for module docs, but it's kind of not useful. same goes for the MD exporter at this point, but that will become more useful in the future at least. the asciidoc exporter will likely have to go away though at some point, unless we want to write multiple renderers.

@roberth
Copy link
Member Author

roberth commented Jun 20, 2022

Arion's docs are also asciidoc based and it's the first consumer of the Nixpkgs asciidoc renderer. I'm not aware of any others.

Asciidoc was a good contender before markdown was chosen for the manuals, but I'm avoiding it for new projects. Migrating arion's docs to markdown wouldn't be bad.

@roberth
Copy link
Member Author

roberth commented Jun 20, 2022

Do I read correctly that we don't have any blockers for #176146?

@pennae
Copy link
Contributor

pennae commented Jun 21, 2022

don't think there are any, especially since the scope is still rather limited and changes to the rendering tool are easy enough should problems pop up.

@roberth
Copy link
Member Author

roberth commented Jun 28, 2022

@pennae, do you think the script you wrote in #175586 (comment) is worth promoting to a maintainers/ script, so people can use it as a helper?

It may create a better first iteration than I would, because I still forget to add mdDoc when writing docs.

@ehmry
Copy link
Contributor

ehmry commented Nov 1, 2022

I am not going to waste my time prefixing every documentation string in my modules with lib.docMD. There is no incentive for me to add extra processing to strings that need no processing when nixos is already expensive to evaluate.

@roberth
Copy link
Member Author

roberth commented Nov 1, 2022

@ehmry the markers were added to facilitate a smooth transition, taking into account 3rd party repos. The "extra processing" is very minimal, because conversion happens at build time, not eval time. The markers will be removed in due time.

@ncfavier
Copy link
Member

ncfavier commented Dec 30, 2022

Home Manager doesn't have to use mdDoc until they've updated their tooling, or they could make an instantaneous switch if they like. They control both their modules and their tooling, so I don't see a problem here.

Except for stuff like mkEnableOption, mkAliasOptionModule, etc: nix-community/home-manager#3543 (comment)

I guess nmd could just use (pkgs.nixosOptionsDoc { ... }).optionsJSON.

@pennae
Copy link
Contributor

pennae commented Dec 30, 2022

mkEnableOption should be able to handle both and emit options in kind. if mkAliasOptionModule (or others) can't do that during the transitional period that could well be regarded as a bug. nobody should be forced to support markdown yet

@ncfavier
Copy link
Member

ncfavier commented Dec 30, 2022

AFAICT that leaves two three such bugs: _module.args, mkPackageOption and mkAliasOptionModule (other places where mdDoc is used in the module system are invisible or internal).

@pennae
Copy link
Contributor

pennae commented Dec 30, 2022

what's wrong with _module.args? the other two are easy enough to fix with ${fn}MD variants.

@ncfavier
Copy link
Member

It uses mdDoc so its description also does not show up in the HM manual.

@pennae
Copy link
Contributor

pennae commented Dec 30, 2022

ah, well. that we probably can't fix without either adding special cases to the "no docbook descriptions in nixpkgs" rule, or dropping the rule altogether. neither of those sound like a good path forward. 🙁 maybe this (very special) option could be hidden for now and moved to the manual instead?

@ncfavier
Copy link
Member

Yes, we could make it invisible/internal (can't remember the distinction) for now and add a { options._module.args = mkOption { internal = false; }; } module to the main options doc.

@pennae
Copy link
Contributor

pennae commented Dec 30, 2022

unfortunately that doesn't seem to be possible. we can either hide it completely or not hide it at all. (we also tried other hacks, like setting a special module argument in the docs build. no dice.)

@rycee
Copy link
Member

rycee commented Dec 31, 2022

FWIW I've put together a preliminary PR to support AsciiDoc descriptions in nmd. See https://gitlab.com/rycee/nmd/-/merge_requests/7/diffs

Since CommonMark to some extent is similar I also use this for the mdDoc description type and it doesn't look entirely awful. I'll see about improving it a bit more and then make use of that in Home Manager.

@rycee
Copy link
Member

rycee commented Jan 1, 2023

Just to justify the use of AsciiDoc a bit. The primary reason that HM uses that instead of Markdown is that it simplifies the build pipeline a great deal and reduces the build closure size quite a lot. It also integrates well with the existing DocBook setup. For example, you can see that it was reasonably simple to support AsciiDoc module descriptions in the nmd merge request above without noticeably expanding the closure or build time.

The build closure and build time is more important for HM documentation since it will almost never be fetched from a central binary cache while the Nixpkgs/NixOS documentation will nearly always be available from cache.nixos.org.

That said, nmd uses the legacy Python version of the asciidoc tooling, which does add a dependency on Python and does not support recent AsciiDoc features. The more up-to-date Asciidoctor pulls in Ruby, which is even heavier so I would be very reluctant to move to that. With the recent resurgence of AsciiDoc development (see https://asciidoc.org/) perhaps there will be some more light-weight tool that can do AsciiDoc → DocBook in the future, perhaps based on libasciidoc.

Of course, if somebody knows of a light-weight Markdown → DocBook converter (that supports description lists) then we could equally well use that in nmd. It's just that I haven't found anything like that in my, admittedly shallow, investigations.

@roberth
Copy link
Member Author

roberth commented Jan 1, 2023

I would highly recommend to follow the module system's migration to markdown, so that all markers can be removed and the contributor experience is consistent.

@ncfavier
Copy link
Member

ncfavier commented Jan 1, 2023

@rycee did you have a look at the script we use in nixpkgs? It depends on Python + mistune, would that be small enough for your purposes?

I also think we should keep in mind the goal to make consumers able to use pkgs.nixosOptionsDoc (even for non-NixOS-specific modules) (which uses the above script to handle the MD conversion) to transform a bunch of modules into a standalone documentation snippet. As far as I can tell, the only thing left that would prevent NMD from using it (thereby removing a pile of duplicated code) is its handling of declaration locations, which I've started working on in #203994 (but don't look at that PR yet, there are more things I need to think about).

+1 on strongly recommending not adding a third adDoc format in the wild, things are complicated enough.

@rycee
Copy link
Member

rycee commented Jan 1, 2023

@ncfavier Ah yeah, the mergeJSON.py file looks promising! I'll see if I can integrate that one. It doesn't seem to support definition lists, if I understand correctly, but I think that's OK for now, can always use plain DocBook for that.

@pennae
Copy link
Contributor

pennae commented Jan 1, 2023

It doesn't seem to support definition lists, if I understand correctly

correct. we tried to add those, but the mistune plugin API seems to not be powerful enough to support the syntax nixpkgs already uses. since these lists were rather rare in nixos options we decided to convert them to regular lists instead, but if something pops up that lets us support them: all for it! :)

pennae added a commit that referenced this issue Jan 5, 2023
unfortunately we can't unconditionally make this text markdown without
impacting downstream users of docs generation (as noted in #175586).
hide it entirely until the transition is complete.
@pennae
Copy link
Contributor

pennae commented Jan 26, 2023

for the record: option docs now support all features of nixpkgs-flavored markdown, including definition lists.

@pennae
Copy link
Contributor

pennae commented Feb 12, 2023

can this be considered done, or only once docbook in option docs is no longer allowed?

@roberth
Copy link
Member Author

roberth commented Feb 13, 2023

It's done when "mdDoc" is the default behavior of the description field. I don't think we're already in a position where we can do that, are we? literalDocBook hasn't been deprecated in the latest release yet, so the next release should still support it, but considering that the alternative, lib.mdDoc, is already available in all releases, we can remove proper support for docbook in master after the 23.05 branch-off.

4. Switch the default to markdown, remove the lib.docMD calls.

@pennae
Copy link
Contributor

pennae commented Feb 13, 2023

I don't think we're already in a position where we can do that, are we?

technically we can do it, but that would go against the deprecation process for docbook descriptions that we agreed on elsewhere. so let's keep this open until after 23.05, when we can disable docbook support. :)

sloane-shark pushed a commit to sloane-shark/nixpkgs that referenced this issue Feb 17, 2023
unfortunately we can't unconditionally make this text markdown without
impacting downstream users of docs generation (as noted in NixOS#175586).
hide it entirely until the transition is complete.
@pennae
Copy link
Contributor

pennae commented Jul 3, 2023

we can remove proper support for docbook in master after the 23.05 branch-off.

#239636 removed the last vestiges of the docbook toolchain in our manual builds, and mdDoc behavior has been the default since #237557. removing mdDoc should probably wait for another release to keep backporting effort reasonable both within and without the nixpkgs repo, but the migration is now finally done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: enhancement Add something new 6.topic: documentation Meta-discussion about documentation and its workflow 6.topic: module system About "NixOS" module system internals
Projects
None yet
Development

No branches or pull requests

10 participants