simpler doc links: `getFilePermissions <#getFilePermissions,string>`_ => $`getFilePermissions` #125

timotheecour · 2019-01-30T09:40:04Z

this PR nim-lang/Nim#10492 adds a lot of great documentation how ever it also adds a lot of things like:

`getFilePermissions <#getFilePermissions,string>`_

which are required today by docgen to produce valid links.
This is not DRY for 2 reasons:

the name is consistently repeated twice
the args are mentioned which is a different kind of DRY violation (args mention in source code as well as in that doc link)

other drawbacks:

this makes for less readable docs when browsing source code
it increases risk of links becoming out of sync when proc signature changes (eg an extra (optional) param is later added, which can often happen)
discourages using proper doc links as it adds a barrier to "get it right" (which explains why so few procs have these links)
EDIT: doc links are order-dependent:

proc fn*[T: SomeFloat](a: T)=discard
proc fn*[T: SomeInteger](a: T)=discard

generates:

fn,T
fn,T_2

proposal

add a DRY and easy to remember syntax to refer to a symbol (see below for concrete syntax)
let docgen resolve it to an actual doc link exactly as it would if user typed getFilePermissions <#getFilePermissions,string>_ in full
the only problem is when symbol is overloaded; there are 2 approaches
- make docgen issue an ambiguity error and require full syntax getFilePermissions <#getFilePermissions,string>_ or something a bit DRY-er getFilePermissions <string>_
- make docgen resolve to the 1st symbol it finds; shouldn't be a big deal; overloads are generally placed next to each other and oftentimes only 1 is documented anyway;

in any case this is a practicaly tradeoff: we improve situation for the 95% case and make the 5% case well, as bad as it already is today

concrete syntax for the proposal

open to suggestions;
maybe:

see also $`getFilePermissions` which is related # docgen will resolve to same as `getFilePermissions <#getFilePermissions,string>`_
see also $`strutils.foo` which is related # docgen will resolve to same as `strutils.foo <#strutils.foo,string,int>`_ # ie, can use modules in other symbols

benefits:

would fix Fix broken links in docs Nim#16337
broken links would give CT errors when running nim doc
links would be easier to read/write/edit
each link would give rise to an entry in the index
links wouldn't break when you change a proc to a func or add optional params
there are easy ways to handle overloaded procs in the same module, I can describe what I have in mind

The text was updated successfully, but these errors were encountered:

ringabout · 2020-12-29T08:41:34Z

Yeah, I agree.
Original doc links way makes proc + nosideEffect => func a bit annoying.

a-mr · 2021-02-11T10:37:35Z

RST has the mechanism of custom roles for "interpreted text" (between ``).

So proper RST-style syntax would be something like:

:nim:`proc getFilePermissions`
:nim:proc:`getFilePermissions`

We can parse the text inside `` to generate proper link so we can get close to actual definition of the proc:

:nim:`proc getFilePermissions(string)`
:nim:proc:`getFilePermissions(string)`

Probably we can also study how it's done in sphinx

cc @konsumlamm

timotheecour · 2021-03-05T02:56:32Z

@a-mr

Probably we can also study how it's done in sphinx

we can do better than requiring users to write :py:const: or :py:class:, because nim is strongly typed and nim has access to resolved symbols when running nim doc, in particular nim doc --project can be expected to resolve all references including ones outside a module (or even nim doc without --project thanks to index files; theindex.html is proof this can all work)

:nim:proc getFilePermissions
:nim:proc getFilePermissions(string)

that's not DRY though, and still suffers from being sensitive to things like proc<=>func edits which would break links.

For nim files, all that's needed IMO is a single char to trigger link resolution. eg:

## See also: :`delete` # a reference to `delete` in same module
## See also: :`strutils.delete` # a reference to `delete` in a potentially different module

overloads

When a symbol is overloaded, IMO following is good enough:

## See also: :`httpclient.delete` # references 1st overload of `delete` in `httpclient`

docgen can, for overloads within a module, make it easy to "click to go to next overload". In the rare cases where showing 1st overload isn't good enough, we can extend the syntax, there are many ways, but this isn't needed in 1st version of this feature.

benefits

easy to read/write/edit such links in code
automatically generate warning when such symbols are not found
links are robust to future API additions such as adding params, adding overloads, etc

bikeshedding

I've tried to see which special char would work, see

criterion:

the links should use same syntax in rst and nim files
it should be as simple as possible, just 1 char to trigger resolving a link

conclusion:
2 options seem to render well in both nim and rst in github (note that nim rst2html and nim doc can always be customize as needed):

option1: `strutils.delete`_

option2: :`strutils.delete`

option 1 renders in rst github as a link that points to itself. This is fine since the main use case is nim doc and nim rst2html. Note that the linke can point to something if there's another definition somewhere, for eg:

see also: `strutils.delete`_.

.. _strutils.delete: https://github.com/nim-lang

but that's not the intended use (although it opens door for including an auto-generated rst file which would define all those links if we want to make those links work from github viewer).

option 2 renders in rst github as a code block preceded by : (so long .. default-role:: code appears on top, which is what we're going for anyways, see rst: single backticks now render correctly in both rst2html and github Nim#17028); which also seems acceptable but a tad less visually pleasing.

conclusion

I prefer option 1. Any downside I haven't thought about?

links

timotheecour · 2021-03-18T01:10:54Z

after nim-lang/Nim#17372, there's now another, possibly better option:

strutils.delete_

instead of:

`strutils.delete`_

example:

##[
foo1_

foo.bar2_

works with a comma separating links too:

foo.bar3_, foo.bar4_

`foo.bar5`_

`foo.bar6`_, `foo.bar7`_
##]

example of how it renders in github rst: https://github.com/timotheecour/vitanim/blob/3ae7daf173b4e074774af7062476beb28418de5b/testcases/tests/t10468b.rst#id11

timotheecour · 2021-07-26T17:46:54Z

@a-mr if you want to tackle this I'm happy to help, I think your previous contributions to docgen/rst (2 pass approach, underscore link support, sorting, etc) should make this more within reach; I'd be ok with the syntax strutils.delete_ which is as simple as it gets (simply adds an underscore).

For semantics we could do this:

if symbol is not overloaded, the hyperlink points to the symbol definition directly
if symbol is overloaded, the hyperlink points to the 1st overload; thanks to docgen: sort symbols (fix #17910) Nim#18560, the other overloads will all appear in order right below. In the rare cases where we absolutely must point to a specific overload, we can devise a syntax for that later, but I don't see it as critical

a-mr · 2021-07-26T20:53:23Z

@timotheecour ,
I think I need to clarify 2 things first (note I know very little about the compiler). Let us assume that we are talking about `strutils.delete`_

is this new syntax going to "resolve" doc links to modules that are not imported? E.g. if strutils was not imported in the module being processed then docgen would know nothing about delete, right?
- if no then we should just mechanically fall back to some default HTML link that will be generated e.g. for every proc?
what is its interaction with modules system/paths? E.g. where is strutils searched?
- considering that there can be module with same name with different paths: should not we add possibility to create the same directory hierarchy as main code? I'm a bit bothered by the fact that currently koch docs dumps *.html to web/upload/1.5.1 (except the separate compiler directory). Also should not our link look something like `std/strutils.delete`_ then?

timotheecour · 2021-07-26T21:31:01Z

Let us assume that we are talking about `strutils.delete`_

shouldn't that be:
strutils.delete_ ? (as enabled by your merged PR allow short-style RST references with symbols Nim#17372)

eg:

# good:
## See also strutils.delete_
# bad:
## See also `strutils.delete`_

is this new syntax going to "resolve" doc links to modules that are not imported? E.g. if strutils was not imported in the module being processed then docgen would know nothing about delete, right?

yes, this shall resolve doc links that are not imported in current module (and in fact, this point should be irrelevant), so that in system.nim you can write:

proc compileOption(option, arg: string): bool {.magic: "CompileOptionArg".} =
  ## See also compilesettings.querySetting_

then docgen will transform this into a link (that may not exist yet, depending on how you run nim doc vs nim doc --project):

  ## See also <a class="reference external" href="compilesettings.html#_querySetting">std/compilesettings.querySetting</a>

if no then we should just mechanically fall back to some default HTML link that will be generated e.g. for every proc?

This indeed requires adding an additional "short-name anchor" for each symbol foo, eg:

<a id="foo,string,string"></a> # classical anchor
<a id="_foo"></a> # short-name anchor

(it's easy to generate 2 anchors per symbol)
I'm using a prefix _ to disambiguate short-name anchors from other ones; it's an implementation detail as doc comments in user code shall use an intermediate syntax which docgen translates to those.

what is its interaction with modules system/paths? E.g. where is strutils searched?

this is yet another benefit of this RFC: it lets docgen handle resolving of external module links, so that you can write compilesettings.querySetting_ and then docgen transforms it to a correct href regardless of how you run nim doc (so long it's consistent obviously for all modules that are docgen'd).

for nim doc --project foo.nim (the newer way to build docs, which is used for building compiler docs, fusion etc), the layout follows the filesystem (and .. is transformed into _._ to avoid going out of outdir, refs nim doc --project: @@ caused issues; use _._ instead Nim#14454)
for nim doc foo.nim (which is used by stdlib, currently), layout is flattened

in either way though, docgen should know where to find a module relatively to another module (by inspecting whether user passed in --project, ie optWholeProject flag to see if it's not flattened, and also checking --docRoot)

What's needed is this API:

type DocMode = enum
  dmFlat # for `nim doc`, eg used by stdlib
  dmFilesystem # for `nim doc --project`, eg used by compiler docs, fusion etc

proc genRelativeLink(docMode: DocMode, toModule: string, fromModule: string): string = ...
  runnableExamples:
    assert genRelativeLink(dmFlat, "std/system", "std/compilesettings") == "system.html"
    assert genRelativeLink(dmFilesystem, "compiler/plugins/locals", "compiler/pathutils") == "locals.html"
    assert genRelativeLink(dmFilesystem, "compiler/pathutils", "compiler/plugins/locals") == "_._/pathutils.html"

note 1

for external modules, this should also be possible, but would need some way to register location of external docs in some place; this can be discussed in another RFC:

# we can use a clean syntax for external links:
## See also pkg/regex.findAllBounds_

note 2

it's simplest to do everything in terms of canonical paths (introduced in nim-lang/Nim#16999), ie:

system/assertions, fusion/matching, std/tables

happy to discuss more

Araq · 2021-07-27T07:28:42Z

Please ensure backwards compatibility, links are shared and kept around for good.

timotheecour · 2021-07-30T20:16:12Z

we will emit a warning that there is an ambiguity

at what time do you issue a warning? it's about whether the context is known to nim doc or not.

eg, if you call nim doc system.nim which has a comment see also strutils.split_ ; how can nim doc know that split_ could resolve to both iterator or proc, given that it's in another module it hasn't seen yet?

if you issue a warning when compiling nim doc strutils.nim, that means you'd systematically get a warning whenever an iterator overloads a proc, regardless whether you use the link split_

the warning can make sense if all the context is known at the time of nimdoc, eg if see also split_ is used within strutils (pointing to same module), or, maybe in other cases with nim doc --project

in that case we will also select the "winning" overload according to fixed priority "procs > iterators > templates" because people normally imply procs

that's fine, and a rare case anyways (iterator overloading a proc), and if symbols are listed alphabetically regardless of kind, it's really not a big problem

split(string, char, int)_ anyway

for the explicit overload syntax, we could require the symbol kind, eg:

iterator split(string, char, int)_

a-mr · 2021-07-30T21:21:48Z

at what time do you issue a warning?

I've thought that through only for local resolution — I will issue a warning in rst 2nd pass (finishGenerateDoc). Regarding different-module symbol I'm not yet sure but I guess it does not matter, I will need to retrieve all the symbols of that module anyway so it can be done at any stage (including finishGenerateDoc)

timotheecour · 2021-07-30T21:55:35Z

oh actually there's a really simple solution to this problem: we can rely on index files (or reuse/augment the one that's generated via nim doc which IIRC implies --index; preferably json format) to list all the references there as well as the anchors generated; then nim buildIndex (called after all the nim doc foo commands) can issue warnings for each problem found:

# foo.nim:
## see bar.baz_
## see bar.baz2_
## see bar.baz2(int, float)_

# bar.nim:
iterator baz(): int = discard
proc baz(): int = discard

nim doc foo # generates htmldocs/foo.idx
nim doc bar # generates htmldocs/bar.idx
nim buildIndex # generates htmldocs/theindex.html from all the idx files

nim buildIndex will now report the following warnings:

bar.baz_ should be disambiguated as either iterator bar.baz_ or proc bar.baz_
bar.baz2_ link reference invalid, did you mean xxx ?
bar.baz2(int, float)_ link reference invalid, did you mean xxx ?

note 1

nim buildIndex is an existing command that's already called to generate theindex.html; this simply enhances what it does to also report link inconsistencies

note 2

it will work the same regardless of separate compilation (nim doc foo followed by nim doc bar) or joint (via nim doc --project all), and regardless of same module references vs cross-modules references; no need to distinguish those cases (simpler!)

note 3

this will allow fixing nim-lang/Nim#16337 by reporting all broken links automatically

a-mr · 2021-07-30T22:28:23Z

sounds like a great idea. But I cannot figure out how to use this command:

$ bin/nim buildIndex lib/pure 
Hint: used config file '/home/amakarov/activity-shared/Nim/config/nim.cfg' [Conf]
Hint: used config file '/home/amakarov/activity-shared/Nim/config/config.nims' [Conf]
Hint: used config file '/home/amakarov/activity-shared/Nim/config/nim.cfg' [Conf]
Hint: used config file '/home/amakarov/activity-shared/Nim/config/config.nims' [Conf]
Hint: used config file '/home/amakarov/activity-shared/Nim/config/nimdoc.cfg' [Conf]
fatal.nim(53)            sysFatal
Error: unhandled exception: options.nim(651, 3) `not conf.outFile.isEmpty`  [AssertionDefect]

docgen.rst says it accepts a directory.

I know alternative: for the compiler we can run bin/nim doc --project --index:on compiler/main.nim but what about stdlib...

timotheecour · 2021-07-30T23:16:49Z

for separate nim doc invocations, see how it's done in kochdocs; this works (but needs better documentation, PR welcome):

nim doc --outdir:/tmp/d19 --index lib/pure/strutils
nim doc --outdir:/tmp/d19 --index lib/pure/os
nim buildIndex -o:/tmp/d19/theindex.html /tmp/d19

a-mr · 2021-08-02T21:29:26Z

During implementation/writing a test I found a problem with interpretation of spaces and case sensitivity.
I'm implementing it by adding additional anchors from docgen.nim.

According to RST spec, links are case-insensitive and any number of whitespace is equivalent to one space character. Current rst.nim takes that into account.

This opens a question how to deal with links like SortOrder_ and `fill(openArray[T], T)`_. Changing openArray -> openarray is harmless but it's not so for SortOrder -> sortorder and T -> t. Also what to do with a space after comma (openArray, T)?

My current plan is to normalize both references from the rst.nim side and generated anchors from docgen.nim side in this way:

any spaces before and after punctuation — ,, (, ), [, ] — are deleted.
tokens inside this references/anchors are normalized the Nim way — first char is preserved — in these cases:
- if they are at the very 1st char of a reference/anchor
- if they are after keywords like proc and const
- if they are after the mentioned punctuation signs

So the examples above become Sortorder and fill(openarray[T],T).
Both points are not exactly RST spec-compliant but what else can be done?

a-mr · 2021-08-02T22:01:00Z

It seems that having a separate set for docgen-generated anchors in rst.nim is the way to go. Then we can match references with anchors by dedicated rules independently.

timotheecour · 2021-08-02T22:10:50Z

can you clarfiy?

will the following work?

type SortOrder* = enum k0, k1
proc sortOrder*() = discard
proc foo*[T](a: int, sortOrder: SortOrder) = discard
proc foo*[T](a: int, sortOrder: SortOrder, c: int) = discard
proc foo*() =
  ## See: SortOrder_ (will link to type)
  ## See: sortOrder_ (will link to proc)
  ## See: foo_ (will link to 1st foo overload)
  ## See: `foo[T](int,SortOrder,int)` (will link to 2nd overload)

can we preserve case as describe above, even at expense of RST compliance (makes it easier to copy paste from code)? what would break?

if links generated cannot distinguish between SortOrder_ and sortOrder_, we can think about an internal encoding for caps, which would be hidden from users (ie references in docs would not need the encoding, it'd be generated automatically)

sortOrder_ => sortorder_
SortOrder_ => ^sortorder_

but i don't think it's needed, is it?

if you have a WIP PR i can test against, this would clarify things

a-mr · 2021-08-02T22:56:46Z

yes, it will work more or less. sortOrder and SortOrder will resolve OK. The other 2 cases with work the following nuances:

foo_ is actually referring not to first overload but to whole group of procs foo (it will add a wrapper group <div id="foo-procs-all">, likewise for all same-name overloads)
last case should be spelled as `foo(int,SortOrded,int)`_

If you meant using T inside input parameters like

proc foo*[T](a: int, sortOrder: SortOrder, c: T) = discard

then the link should be spelled as `foo(int,SortOrder,T)`_ without any [T]. It's because that's how anchors are preserved, you can check that the generated anchor id = foo,int,SortOrder,T. So T in parameter is just inserted literally. I guess this cannot be changed without breaking reverse compatibility or introducing extra level of wrappers.

Regarding copy-pasting: I don't expect supporting parameters names since by the same reason they are redundant, wdyt?

I'm not sure what to do with [T], I did not expect to add any semantic checking in incorrect cases like `proc foo*[U](T)`_. So everything I can do for this support is to delete [T] after after foo* in my internal anchor.

I'll try to send the PR tomorrow.

timotheecour · 2021-08-02T23:58:11Z

can foo,int,SortOrder,T introduce ambiguities?
eg:

proc foo[T](a: int, b: SortOrder, c: T)
proc foo[T](a: int, b: SortOrder[T])

future work:

osproc.nim#L1230 has:

  ## * `posix_utils.sendSignal(pid: Pid, signal: int) <posix_utils.html#sendSignal,Pid,int>`_

which is not DRY; I wonder if we can support writing:

  ## * posix_utils.sendSignal_

and then show in the docs a correct link along with this link name:

proc posix_utils.sendSignal(pid: Pid, signal: int)

this probably require a 2-pass algorithm, hence "future work"

a-mr · 2021-08-08T22:53:51Z

can foo,int,SortOrder,T introduce ambiguities?
e.g.

No, it cannot. Only the first example has this signature, for second one it will be foo,int,SortOrder[T].

which is not DRY; I wonder if we can support writing:

Yes the current plan (for the second part of work) is to support both variants:

posix_utils.sendSignal_
`posix_utils.sendSignal(Pid,int)`_

and then show in the docs a correct link along with this link name:

A good news is that current idx files already preserve them so it shouldn't be hard to implement.

I think you are right, it's the most logical behaviour.

a-mr · 2021-08-11T18:57:45Z

@timotheecour
Current plan is to set a full link name like you wrote above:

proc sendSignal(pid: Pid, signal: int)

What if we want to provide our own name for the link like:

this Posix function

?
Is there any syntax expected for that?

timotheecour · 2021-08-11T19:11:41Z

Is there any syntax expected for that?

ya this should be discussed (but the preferred/encouraged way should be the simplest See posix_utils.sendSignal_)

some options:

  # this should work but not recommended (brittle links etc)
  ## See `this <posix_utils.html#sendSignal,Pid,int>`_

  # maybe this? (general group + individual overload)
  ## See `this <posix_utils.sendSignal_>`_ posix function.
  ## See `this <posix_utils.sendSignal(Pid,int)_>`_ posix function.

  # or this? (general group + individual overload)
  ## See [this](posix_utils.sendSignal_) posix function.
  ## See [this](posix_utils.sendSignal(Pid,int)_) posix function.

a-mr · 2021-08-12T17:02:41Z

@timotheecour
2 more suggestions:

What about this DRY but limited form (without arbitrary names) with using " to delimit a part for inclusion into link text:

`"proc posix_utils.sendSignal" (Pid, int)`_ -> proc posix_utils.sendSignal
`proc posix_utils."sendSignal" (Pid, int)`_ -> sendSignal
`"proc " posix_utils."sendSignal" (Pid, int)`_ -> proc sendSignal
?
What if we specify proc after the name as a request to use this name literally:
`posix_utils.sendSignal proc`_ -> posix_utils.sendSignal proc
Format "short name + proc" is widespread in os.nim and other files in Nim repository, so it makes sense to add special syntax for it. And we can still show a tip with full name when hovering over the link.

timotheecour · 2021-08-12T17:38:05Z

What about this DRY but limited form (without arbitrary names) with using " to delimit a part for inclusion into link text:

i don't really like these; if i decide to use a custom link name, I shouldn't be limited to using the symbol name (otherwise might as well stick to the default See posix_utils.sendSignal_)

What if we specify proc after the name as a request to use this name literally:

is the problem you're trying to solve to allow a syntax to show the short-form (without the full declaration), or to allow showing a custom link name?

I think we should allow writing arbitrary custom link name, which neither 1 nor 2 do (but by default encourage the simplest See posix_utils.sendSignal_ which auto-generates link name from declaration, showing nb of overloads if there are overloads, else showing the resolved declaration if there is a single symbol)

Format "short name + proc" is widespread in os.nim and other files in Nim repository

But See posix_utils.sendSignal_ could be used in each such instance, advantageously.

So all that's needed is to agree on a syntax to allow arbitrary custom link names; i've suggested one in #125 (comment) but feel free to suggest alternatives

a-mr · 2021-08-12T19:52:00Z

is the problem you're trying to solve to allow a syntax to show the short-form

yes, in most cases the full form is overwhelming IMHO. Consider this modified example from os.nim (splitPath):

  ## See also: BEFORE
  ## * `joinPath(head, tail) proc <#joinPath,string,string>`_
  ## * `joinPath(varargs) proc <#joinPath,varargs[string]>`_
  ## * `/ proc <#/,string,string>`_
  ## * `/../ proc <#/../,string,string>`_
  ## * `relativePath proc <#relativePath,string,string>`_
  ##
  ## See also: AFTER
  ## * `joinPath(string, string)`_
  ## * `joinPath(varargs[string])`_
  ## * `/`_
  ## * `/../`_
  ## * relativePath_

To my taste the AFTER case is unnecessarily detailed.

timotheecour · 2021-08-12T23:31:55Z

3 possibilities:

how about controlling in the UI whether the long or short form is shown (just like we have a switch in UI for dark mode vs light mode)
or, use a similar technique as we use for expanding the pragmas by clicking on it
or just simply always show the short-form for links

the main point is to allow keeping the nice, short form for all links and not have to introduce a decision making each and every time we write a link to a proc in docs

a-mr · 2021-08-28T23:44:25Z

In the updated edition of PR link text is what is actually displayed:

Referencing by just function names is also working now:

* `joinPath(head, tail)`_

Full signature is displayed as a tooltip on hovering over the link.

a-mr · 2021-11-06T16:38:18Z

There is one more design decision regarding cross-module doc links: whether .idx should be read implicitly or not.

If we input `mod1.proc1`_ link then the mod1.idx module should be loaded.

I'd prefer to introduce explicit syntax for that:

.. import:: mod1.idx

This is a new dedicated RST directive import here. It is supposed to be placed at the 1st doc comment in .nim file or somewhere in the beginning of .rst file. Then all symbols exported from mod1.nim (and recorded in mod1.idx) will be available for referencing in the current .nim or .rst file.

a-mr · 2022-09-17T08:56:04Z

Updated plan

No separate stage for link checking (like nim buildIndex proposed above) is required.
No explicit import for .idx files.

Instead:

doc processing will be done in 2 runs — a) with .idx generation b) with link resolution.
we'll have a rule that specially formatted link string like module: linkTarget (note the colon :) will load a <module>.idx file implicitly. This will work for .md/.rst files also, so e.g. it will be possible to reference section headings (and other anchors) in documentation between .md files in the simple format.

Using a recently introduced Markdown syntax [link string] (instead of RST `link string`_) the referencing will look like:

Ref. [manual: Lexical Analysis]
  -- reference section "Lexical Analysis" of manual.md from any `.nim/.md` file
Ref. [posix_utils: sendSignal(Pid, int)] 
  -- reference the procedure of module posix_utils.nim from any file

In this example manual.idx and strutils.idx will be sought for in output dir (default htmldocs or one configured manually in cmdline).
During the first run, when .idx files are not yet generated, nim doc/md2html will emit warnings. To suppress them a new option --noCheckLinks will be introduced. So 2 runs will typically look like:

nim doc --noCheckLinks --index:on <file> for all files in the project
nim doc <file> for all files in the project

In practice double running should not be too bothersome because all .idx can be generated once (e.g. in Nim repository — by ./koch docs ) and then one can write docs doing referencing with link checking using existing .idx files.

Araq · 2022-09-21T08:41:32Z

Think about the consequences regarding algorithmic complexity. It used to be the case that index generation was done without .idx files entirely but it caused an O(n^2) algorithm or even some exponential explosion.

Fully implements nim-lang/RFCs#125 Follow-up of: nim-lang#18642 (for internal links) and nim-lang#20127. Overview -------- Explicit import-like directive is required, called `.. importdoc::`. (the syntax is % RST, Markdown will use it for a while). Then one can reference any symbols/headings/anchors, as if they were in the local file (but they will be prefixed with a module name or markup document in link text). It's possible to reference anything from anywhere (any direction in `.nim`/`.md`/`.rst` files). See `doc/docgen.md` for full description. Working is based on `.idx` files, hence one needs to generate all `.idx` beforehand. A dedicated option `--index:only` is introduced (and a separate stage for `--index:only` is added to `kochdocs.nim`). Performance note ---------------- Full run for `./koch docs` now takes 185% of the time before this PR. (After: 315 s, before: 170 s on my PC). All the time seems to be spent on `--index:only` run, which takes almost as much (85%) of normal doc run -- it seems that most time is spent on file parsing, turning off HTML generation phase has not helped much. (One could avoid it by specifying list of files that can be referenced and pre-processing only them. But it can become error-prone and I assume that these linke will be **everywhere** in the repository anyway, especially considering nim-lang/RFCs#478. So every `.nim`/`.md` file is processed for `.idx` first). But that's all without significant part of repository converted to cross-module auto links. To estimate impact I checked the time for `doc`ing a few files (after all indexes have been generated), and everywhere difference was **negligible**. E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large `os.idx` and hence should have been a case with relatively large performance impact, but: * After: 0.59 s. * Before: 0.59 s. So Nim compiler works so slow that doc part basically does not matter :-) Testing ------- 1) added `extlinks` test to `nimdoc/` 2) checked that `theindex.html` is still correct 2) fixed broken auto-links for modules that were derived from `os.nim` by adding appropriate ``importdoc`` Implementation note ------------------- Parsing and formating of `.idx` entries is moved into a dedicated `rstidx.nim` module from `rstgen.nim`. `.idx` file format changed: * fields are not escaped in most cases because we need original strings for referencing, not HTML ones (the exception is linkTitle for titles and headings). Escaping happens later -- on the stage of `rstgen` buildIndex, etc. * all lines have fixed number of columns 6 * added discriminator tag as a first column, it always allows distinguish Nim/markup entries, titles/headings, etc. `rstgen` does not rely any more (in most cases) on ad-hoc logic to determine what type each entry is. * there is now always a title entry added at the first line. * add a line number as 6th column * linkTitle (4th) column has a different format: before it was like `module: funcName()`, now it's `proc funcName()`. (This format is also propagated to `theindex.html` and search results, I kept it that way since I like it more though it's discussible.) This column is what used for Nim symbols resolution. * also changed details on column format for headings and titles: "keyword" is original, "linkTitle" is HTML one

* docgen: implement cross-document links Fully implements nim-lang/RFCs#125 Follow-up of: #18642 (for internal links) and #20127. Overview -------- Explicit import-like directive is required, called `.. importdoc::`. (the syntax is % RST, Markdown will use it for a while). Then one can reference any symbols/headings/anchors, as if they were in the local file (but they will be prefixed with a module name or markup document in link text). It's possible to reference anything from anywhere (any direction in `.nim`/`.md`/`.rst` files). See `doc/docgen.md` for full description. Working is based on `.idx` files, hence one needs to generate all `.idx` beforehand. A dedicated option `--index:only` is introduced (and a separate stage for `--index:only` is added to `kochdocs.nim`). Performance note ---------------- Full run for `./koch docs` now takes 185% of the time before this PR. (After: 315 s, before: 170 s on my PC). All the time seems to be spent on `--index:only` run, which takes almost as much (85%) of normal doc run -- it seems that most time is spent on file parsing, turning off HTML generation phase has not helped much. (One could avoid it by specifying list of files that can be referenced and pre-processing only them. But it can become error-prone and I assume that these linke will be **everywhere** in the repository anyway, especially considering nim-lang/RFCs#478. So every `.nim`/`.md` file is processed for `.idx` first). But that's all without significant part of repository converted to cross-module auto links. To estimate impact I checked the time for `doc`ing a few files (after all indexes have been generated), and everywhere difference was **negligible**. E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large `os.idx` and hence should have been a case with relatively large performance impact, but: * After: 0.59 s. * Before: 0.59 s. So Nim compiler works so slow that doc part basically does not matter :-) Testing ------- 1) added `extlinks` test to `nimdoc/` 2) checked that `theindex.html` is still correct 2) fixed broken auto-links for modules that were derived from `os.nim` by adding appropriate ``importdoc`` Implementation note ------------------- Parsing and formating of `.idx` entries is moved into a dedicated `rstidx.nim` module from `rstgen.nim`. `.idx` file format changed: * fields are not escaped in most cases because we need original strings for referencing, not HTML ones (the exception is linkTitle for titles and headings). Escaping happens later -- on the stage of `rstgen` buildIndex, etc. * all lines have fixed number of columns 6 * added discriminator tag as a first column, it always allows distinguish Nim/markup entries, titles/headings, etc. `rstgen` does not rely any more (in most cases) on ad-hoc logic to determine what type each entry is. * there is now always a title entry added at the first line. * add a line number as 6th column * linkTitle (4th) column has a different format: before it was like `module: funcName()`, now it's `proc funcName()`. (This format is also propagated to `theindex.html` and search results, I kept it that way since I like it more though it's discussible.) This column is what used for Nim symbols resolution. * also changed details on column format for headings and titles: "keyword" is original, "linkTitle" is HTML one * fix paths on Windows + more clear code * Update compiler/docgen.nim Co-authored-by: Andreas Rumpf <rumpf_a@web.de> * Handle .md and .nim paths uniformly in findRefFile * handle titles better + more comments * don't allow markup overwrite index title for .nim files Co-authored-by: Andreas Rumpf <rumpf_a@web.de>

* docgen: implement cross-document links Fully implements nim-lang/RFCs#125 Follow-up of: nim-lang#18642 (for internal links) and nim-lang#20127. Overview -------- Explicit import-like directive is required, called `.. importdoc::`. (the syntax is % RST, Markdown will use it for a while). Then one can reference any symbols/headings/anchors, as if they were in the local file (but they will be prefixed with a module name or markup document in link text). It's possible to reference anything from anywhere (any direction in `.nim`/`.md`/`.rst` files). See `doc/docgen.md` for full description. Working is based on `.idx` files, hence one needs to generate all `.idx` beforehand. A dedicated option `--index:only` is introduced (and a separate stage for `--index:only` is added to `kochdocs.nim`). Performance note ---------------- Full run for `./koch docs` now takes 185% of the time before this PR. (After: 315 s, before: 170 s on my PC). All the time seems to be spent on `--index:only` run, which takes almost as much (85%) of normal doc run -- it seems that most time is spent on file parsing, turning off HTML generation phase has not helped much. (One could avoid it by specifying list of files that can be referenced and pre-processing only them. But it can become error-prone and I assume that these linke will be **everywhere** in the repository anyway, especially considering nim-lang/RFCs#478. So every `.nim`/`.md` file is processed for `.idx` first). But that's all without significant part of repository converted to cross-module auto links. To estimate impact I checked the time for `doc`ing a few files (after all indexes have been generated), and everywhere difference was **negligible**. E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large `os.idx` and hence should have been a case with relatively large performance impact, but: * After: 0.59 s. * Before: 0.59 s. So Nim compiler works so slow that doc part basically does not matter :-) Testing ------- 1) added `extlinks` test to `nimdoc/` 2) checked that `theindex.html` is still correct 2) fixed broken auto-links for modules that were derived from `os.nim` by adding appropriate ``importdoc`` Implementation note ------------------- Parsing and formating of `.idx` entries is moved into a dedicated `rstidx.nim` module from `rstgen.nim`. `.idx` file format changed: * fields are not escaped in most cases because we need original strings for referencing, not HTML ones (the exception is linkTitle for titles and headings). Escaping happens later -- on the stage of `rstgen` buildIndex, etc. * all lines have fixed number of columns 6 * added discriminator tag as a first column, it always allows distinguish Nim/markup entries, titles/headings, etc. `rstgen` does not rely any more (in most cases) on ad-hoc logic to determine what type each entry is. * there is now always a title entry added at the first line. * add a line number as 6th column * linkTitle (4th) column has a different format: before it was like `module: funcName()`, now it's `proc funcName()`. (This format is also propagated to `theindex.html` and search results, I kept it that way since I like it more though it's discussible.) This column is what used for Nim symbols resolution. * also changed details on column format for headings and titles: "keyword" is original, "linkTitle" is HTML one * fix paths on Windows + more clear code * Update compiler/docgen.nim Co-authored-by: Andreas Rumpf <rumpf_a@web.de> * Handle .md and .nim paths uniformly in findRefFile * handle titles better + more comments * don't allow markup overwrite index title for .nim files Co-authored-by: Andreas Rumpf <rumpf_a@web.de>

timotheecour changed the title ~~DRY doc links: getFilePermissions <#getFilePermissions,string>_ => getFilePermissions~~ DRY doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions Jan 30, 2019

timotheecour mentioned this issue Jan 30, 2019

better docs: os nim-lang/Nim#10492

Merged

Araq changed the title ~~DRY doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions~~ simpler doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions Oct 28, 2020

timotheecour mentioned this issue Nov 13, 2020

Add parseFloatThousandSep nim-lang/Nim#15421

Closed

timotheecour changed the title ~~simpler doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions~~ simpler doc links: getFilePermissions <#getFilePermissions,string>_ => ``$getFilePermissions```` Dec 7, 2020

timotheecour changed the title ~~simpler doc links: getFilePermissions <#getFilePermissions,string>_ => ``$getFilePermissions````~~ simpler doc links: getFilePermissions <#getFilePermissions,string>_ => "$getFilePermissions`" Dec 7, 2020

timotheecour changed the title ~~simpler doc links: getFilePermissions <#getFilePermissions,string>_ => "$getFilePermissions`"~~ simpler doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions Dec 7, 2020

This was referenced Dec 7, 2020

use funcs and fix links in strutils nim-lang/Nim#16277

Merged

Fix broken links in docs nim-lang/Nim#16337

Open

timotheecour mentioned this issue Jan 1, 2021

std/lists: Various changes to lists (RFC #303) nim-lang/Nim#16536

Merged

timotheecour mentioned this issue Feb 10, 2021

rst: runnableExamples affects rst underline headings nim-lang/Nim#16990

Closed

This was referenced Mar 4, 2021

rst: rst2html and doc now have clickable anchors for each paragraph/list element nim-lang/Nim#17251

Closed

Added assertion to clamp nim-lang/Nim#17248

Merged

timotheecour added a commit to timotheecour/vitanim that referenced this issue Mar 5, 2021

refs nim-lang/RFCs#125

b54988d

This was referenced Mar 6, 2021

misc high priority timotheecour/Nim#377

Open

fix #17260 render \ properly in nim doc, rst2html nim-lang/Nim#17315

Merged

This was referenced Mar 13, 2021

RST minor bugs nim-lang/Nim#17340

Open

allow short-style RST references with symbols nim-lang/Nim#17372

Merged

timotheecour mentioned this issue Mar 20, 2021

new genAst as replacement for quote do nim-lang/Nim#17426

Merged

timotheecour mentioned this issue Mar 28, 2021

[docs]close #12580 nim-lang/Nim#17549

Merged

a-mr mentioned this issue May 13, 2021

misc doc2tex timotheecour/Nim#727

Open

5 tasks

a-mr mentioned this issue Jun 13, 2021

docgen: move to shared RST state (fix #16990) nim-lang/Nim#18256

Merged

a-mr mentioned this issue Aug 3, 2021

docgen: implement doc link resolution in current module nim-lang/Nim#18642

Merged

7 tasks

a-mr mentioned this issue Aug 1, 2022

Implement Markdown automatic links extension nim-lang/Nim#20127

Closed

a-mr mentioned this issue Dec 1, 2022

docgen: implement cross-document links nim-lang/Nim#20990

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

simpler doc links: `getFilePermissions <#getFilePermissions,string>`_ => $`getFilePermissions` #125

simpler doc links: `getFilePermissions <#getFilePermissions,string>`_ => $`getFilePermissions` #125

timotheecour commented Jan 30, 2019 •

edited

Loading

ringabout commented Dec 29, 2020

a-mr commented Feb 11, 2021

timotheecour commented Mar 5, 2021 •

edited

Loading

timotheecour commented Mar 18, 2021 •

edited

Loading

timotheecour commented Jul 26, 2021 •

edited

Loading

a-mr commented Jul 26, 2021

timotheecour commented Jul 26, 2021 •

edited

Loading

Araq commented Jul 27, 2021

timotheecour commented Jul 30, 2021

a-mr commented Jul 30, 2021

timotheecour commented Jul 30, 2021 •

edited

Loading

a-mr commented Jul 30, 2021

timotheecour commented Jul 30, 2021

a-mr commented Aug 2, 2021 •

edited

Loading

a-mr commented Aug 2, 2021

timotheecour commented Aug 2, 2021 •

edited

Loading

a-mr commented Aug 2, 2021

timotheecour commented Aug 2, 2021

a-mr commented Aug 8, 2021

a-mr commented Aug 11, 2021

timotheecour commented Aug 11, 2021

a-mr commented Aug 12, 2021

timotheecour commented Aug 12, 2021

a-mr commented Aug 12, 2021

timotheecour commented Aug 12, 2021 •

edited

Loading

a-mr commented Aug 28, 2021

a-mr commented Nov 6, 2021

a-mr commented Sep 17, 2022

Araq commented Sep 21, 2022

simpler doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions #125

simpler doc links: getFilePermissions <#getFilePermissions,string>_ => $getFilePermissions #125

Comments

timotheecour commented Jan 30, 2019 • edited Loading

proposal

concrete syntax for the proposal

benefits:

ringabout commented Dec 29, 2020

a-mr commented Feb 11, 2021

timotheecour commented Mar 5, 2021 • edited Loading

overloads

benefits

bikeshedding

conclusion

links

timotheecour commented Mar 18, 2021 • edited Loading

timotheecour commented Jul 26, 2021 • edited Loading

a-mr commented Jul 26, 2021

timotheecour commented Jul 26, 2021 • edited Loading

note 1

note 2

Araq commented Jul 27, 2021

timotheecour commented Jul 30, 2021

a-mr commented Jul 30, 2021

timotheecour commented Jul 30, 2021 • edited Loading

note 1

note 2

note 3

a-mr commented Jul 30, 2021

timotheecour commented Jul 30, 2021

a-mr commented Aug 2, 2021 • edited Loading

a-mr commented Aug 2, 2021

timotheecour commented Aug 2, 2021 • edited Loading

a-mr commented Aug 2, 2021

timotheecour commented Aug 2, 2021

future work:

a-mr commented Aug 8, 2021

a-mr commented Aug 11, 2021

timotheecour commented Aug 11, 2021

a-mr commented Aug 12, 2021

timotheecour commented Aug 12, 2021

a-mr commented Aug 12, 2021

timotheecour commented Aug 12, 2021 • edited Loading

a-mr commented Aug 28, 2021

a-mr commented Nov 6, 2021

a-mr commented Sep 17, 2022

Updated plan

Araq commented Sep 21, 2022

simpler doc links: `getFilePermissions <#getFilePermissions,string>`_ => $`getFilePermissions` #125

simpler doc links: `getFilePermissions <#getFilePermissions,string>`_ => $`getFilePermissions` #125

timotheecour commented Jan 30, 2019 •

edited

Loading

timotheecour commented Mar 5, 2021 •

edited

Loading

timotheecour commented Mar 18, 2021 •

edited

Loading

timotheecour commented Jul 26, 2021 •

edited

Loading

timotheecour commented Jul 26, 2021 •

edited

Loading

timotheecour commented Jul 30, 2021 •

edited

Loading

a-mr commented Aug 2, 2021 •

edited

Loading

timotheecour commented Aug 2, 2021 •

edited

Loading

timotheecour commented Aug 12, 2021 •

edited

Loading