How to style-check comments in all packages in opam (a.k.a. "odoc lint")

Some thoughts about a separate tool for this purpose, perhaps a shell or OCaml script in this repo, an improved version of the current [`corpus.sh`](https://github.com/ocaml/odoc/blob/935c2115bd829237387378a13b5f0d53d4e9795a/src/util/corpus.sh).

TL;DR for @avsm, @rizo: I'm leaning towards a script that works on one package at a time, to run in opam's CI. It will:

1. Run over the package's `.cmti` files naturally generated by its own build system.
2. Blindly run over the package's `.mli` and `.mld` files, trying to compile them individually.
3. Filter out duplicate errors.
4. Show the result and succeed/fail according to what we think about the severity of having doc syntax errors.

The purpose of (1) is to use the build system's knowledge about project layout and preprocessing. The purpose of (2) is to stupidly try to cover anything that wasn't included by (1), and cover `.mld` files.

Note that we can do (1) and (2) whether the package build succeeds or not. (1) can still produce some `.cmti` files, and (2) will work for some `.mli` files.

 

Thought process starting from the last time I did a mass style check (more like whether-odoc-is-reasonable check), using `corpus.sh`:

1. `corpus.sh` wants as large an opam switch as possible.

 The reason for this is that opam files already contain each package's own build instructions, and we want to build as many `.cmti`s that way as we can.

 This can be done with a command like

 ```
 opam install odoc --criteria='+count(solution)'
 ```

 It then goes over all the `.cmti`s and runs `odoc compile` on each one.

 This covers, to some extent, both packages that use Dune, and packages that don't, as long as they pass `-bin-annot`. In my experience, using a "typical" package as the "center" of the switch (`odoc` in the command above) installs about 75% of the packages.

The rest of the thoughts are about covering packages that aren't installed in (1), or files that aren't compiled.

 

2. List all packages, and repeat (1) with "centers" that are outside already-installed switches.

 These consistent switches centered around different packages might have large intersections of packages that can be installed in both. We may want to install only each package and its dependencies in each next switch.

 There is also the complication that packages installed in each switch might not be at their latest version, so we probably need to check that.

 Also, some packages will not be buildable. This whole process needs to be done on a dedicated (virtual) machine or in a container, because some packages have system dependencies that need to be installed first.

 **(recommended)** A yet other way to do this is to rely on opam's CI, or a variant of it, and either take `.cmti` files from it, or just the generated error messages. We likely don't actually need to ever run this new `corpus.sh` outside opam CI anyway, and it's probably fine to run the script on one package at a time.

After solving (2), we should be checking as many `.cmti` files as we can get automatically compiled by each package's own build system.

 

3. Keep the sources of each package. Also, if it is possible that opam ever doesn't download the sources of any package after (2), we should try to download them separately.

 Go through the sources, find `.mld` files, and run `odoc compile` on them.

 Find all `.mli` files in sources, subtract `.mli` files that we already have `.cmti` files for. Try to compile these `.mli` files using separate `ocamlc` commands, and then run `odoc compile` on the resulting `.cmti`s.

 This should cover all `.mld` files, and all `.mli`s that either were not used during successful builds in (1) or (2), or `.mli`s in projects where the builds failed, or could not be started due to constraints, except those `.mli`s that require preprocessing. We probably can't do anything intelligent about the latter.

 **(not recommended by me)** We should be able to find the source file of each `.cmti` by loading it using `compiler-libs`, and checking locations. At least Dune seems to put correct locations into the files, even in case of preprocessing (though this will need to be checked). However, even if the source file names are wrong, the only harm is that we will effectively try to process the corresponding `.mli` file twice, as long as no `.cmti` ever claims that the source file is a real `.mli` from which it was *not* compiled (thus shadowing it).

 **(recommended)** Alternatively, we can just blindly try to compile all `.mli`s in (3), and use postprocessing to remove duplicate errors.

 

We may want a mode that works on the `master` branch of each repo, as that might be more useful for knowing what still needs to be fixed (rather than only released). If going with doing this per-package, then the same script that will work in opam's CI should also work from the root directory of a repo checkout.

 

EDIT: This was prompted by #226.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

How to style-check comments in all packages in opam (a.k.a. "odoc lint") #228

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

How to style-check comments in all packages in opam (a.k.a. "odoc lint") #228

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions