Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The SAW Documentation Revolution (Part 1) #2202

Merged
merged 82 commits into from
Feb 10, 2025

Conversation

ChrisEPhifer
Copy link
Contributor

@ChrisEPhifer ChrisEPhifer commented Jan 28, 2025

Overview

This is a pretty large PR, but is focused entirely on the contents and structure of doc.

TL;DR for reviewers: The following commits/commit ranges correspond to relatively high-level changes to SAW's documentation structure:

  • 51c669e
    Moves 'internal' documentation out of the way. A temporary measure modified later to include this documentation in the overall hierarchy.
  • dc42594 to e050f28
    Initial Markdown normalization and cleanup of doc/'s stale files.
  • 0df6156 to e98eea5
    Refactors the LLVM/Java tutorial into multiple Markdown sources, so maintenance is easier.
  • 9a72549 to ba8c180
    Similar to the above, but for the Rust tutorial
  • 81a6250 to a416bc9
    Ditto for the SAW User Manual.
  • dba1454 to c4444d1
    The tutorials/manual are given more memorable names.
  • c3eed8f to c318c83
    Rename 'internal' to 'development', and structure these documents to be included in the overall hierarchy.
  • a235e05
    Adds all of the necessary index.md files to tie the documentation together.
  • f808628
    Implements a pygments lexer for SAWScript (see full commit message for details - this can be removed eventually).
  • 71a2d4a
    Prepares doc/ for a Sphinx installation.
  • b7e0232 and 954ab67
    Add helper scripts to doc/ to build the documentation.
  • a15e5b7
    Add newly-rendered PDFs of existing resources.
  • 4266590
    Document the documents :)

TODOs not covered in this PR

  • Resolving pointers to doc/ that exist elsewhere in saw-script as appropriate

  • Moving other existing docs to doc/ (see the README for details)

  • Deploying documentation to the GitHub pages site for saw-script in CI

    The Cryptol setup to get this working with multiple versions is... complicated.
    I think we can adapt this SO answer for our purposes - it's far simpler than what Cryptol does currently, and should probably be considered as a solution in that context as well.

  • Replacing in-repo PDFs with appropriate CI/Action artifacts

Related issues

A few of the Markdown files that sat in doc/ are not intended for
public consumption (i.e. deployment to HTML/PDF versions of
documentation).
These include:

- doc/extcore.md -> doc/internal/extcore.md
  This is documentation for the SAWCore dump format, which
  is (hopefully) not something typical users are having to mess with.
- doc/limitations.md -> doc/internal/limitations.md
  A five-year-old attempt at documenting the limitations of SAW's use
  for verification.
  It should be updated or scrapped.
- doc/RELEASING.md -> doc/internal/releasing.md
  A (partially?) outdated documentation of SAW's release process.
  This needs some updating as well.
None of the files in doc/manual/code were referenced or used anywhere
in doc/manual, and at least one of these files made reference to
something completely nonexistent (see #2185).
One was ill-formed (links based on headers are written with a '#'
followed by a normalization of the header text), and another was
completely non-existent / referred to a section that had been renamed
at some point in the past.
This document is (in my opinion) fundamentally ill-formed, as there
are multiple level-1 headings.
Before fixing that, however, we might as well use the more-correct
incorrect thing.
I'm not entirely sure where the syntax `{.c}`/`{.rs}`/etc. came from,
but it was not recognized by the Sphinx Markdown engine while testing
locally, so I've used the conventional syntax everywhere instead.

This change doesn't break the existing PDF builds, either, so perhaps
it has to do with how the old website pages were built?
This temporarily makes this Markdown violate the "thou shalt only have
one <h1>" rule, however: Logically, we want this part of the SAW
documentation to be split into chapters, similar to the LLVM/Java
tutorial.
Before, this part of the documentation rendered to PDF (via Sphinx) as
one giant chapter.
Specifically, to 'Introduction'. Consistency is good!
Not in that order :)

In an effort to start standardizing the Markdown (e.g. removing
multiple top-level headers), this commit primarily restructures the
LLVM/Java tutorial into multiple source files.

This additionally makes individual parts of the tutorial easier to
maintain.
A similar change will be implemented for the manual and Rust tutorial
for the same reason.

This restructuring naturally broke the original PDF generation;
fortunately, the eventual introduction of Sphinx will give us both
HTML and PDF generation for free, and this generation strategy is no
longer necessary to maintain.

Note that this commit does _not_ perform any changes to the content of
the tutorial or indeed any other structural property of the Markdown.
Additional changes (e.g. standardizing the Markdown syntax used) will
be the subject of future commits.
This is part of a larger effort to standardize the Markdown sources
for SAW's various pieces of documentation.
Namely: Fenced with ``` ... ```, no extra space between the fence
marker and the language specifier, and _always_ labeled with a
language specifier.
Because we are using Sphinx/MyST, there is no need to use the
docode.hs script to insert code fragments, as we have the
literalinclude directive.

tutorial/using-smt-lib-solvers: Repair broken code references.

tutorial/compositional-proofs.md: Repair broken code references.

tutorial/interactive-interpreter: Repair broken code references.

tutorial/other-examples: Repair broken code references.

tutorial: Remove docode.hs.
And delete the PNG version, as the PDF will suffice to add the Galois
logo to PDF documentation.
A much more descriptive name :)
This prevents rendering issues, and is more consistent with the
directive mechanism.
It also allows for the use of 'admonitions', which may be helpful in
improving the documentation.
- Mismatched inline code elements
- Code blocks without whitespace in between
- Incorrect link (introduced by splitting up rust-tutorial.md)
- ::: fences
- {literalinclude} with :lines: for code/ sources (docode.hs defunct)
- {code-block} <lang> for everything else
- Invalid whitespace around headings / code blocks
- Bare URLs
- Incorrect emphasis notation (** instead of __)
- Invalid links to other pages/sections
- Code blocks (::: fences)
- __ for emphasis instead of **
- '-' instead of '*' for unordered lists
- Unordered -> Ordered (and vice versa) where appropriate
- Fix broken link(s)
For consistency with the actual title of this resource.
Analogous to dba1454 - more consistent with the title of the book
whose sources are contained within.
These documents may not be intended for most users, but they ought to
be available to read in a nicer format than Markdown files in the
repository.
Since we control which documents are rendered to PDF (i.e. when we
`make latexpdf`) explicitly, these developer documents will only be
rendered as part of the HTML (that will eventually be our GitHub Pages
site for saw-script).

This change reflects that these are documents related to the
development of SAW (rather than generically 'internal').
Additional restructuring will follow in additional commits.
This maintains the original internal linking of the affected
documents while using  explicit _reference labels_ rather than links
based on document names / headers.

In addition to removing dependence on references generated from
headers, which are subject to change during content editing (thereby
breaking links), reference labels allow all documents (eventually
managed by Sphinx) to straightforwardly cross-reference one another,
and are one of the documentation engine's most compelling features.

There is, however, a catch: For Sphinx to correctly render
cross-referenced documents, reference labels must be unique _to the
entirety of the documentation_.
In other words, the SAW User Manual and Rust Verification with SAW
tutorial could not both define a reference label `my-reference`.

There is no local scoping mechanism for reference labels, so there is
necessarily a tradeoff between creating a highly-connected, wiki-like
documentation (where items are given reference labels more often than
not), and having a massive namespace of reference labels to manage.
This is something to consider for the future, but certainly not our
biggest concern regarding documentation at the moment.
And normalize the Markdown formatting in the process.

This is analogous to earlier commits relating to the manual and
tutorials (but simpler, since there was no existing infrastructure
around this document).
And by 'fix', I mean: Add a document containing a verbatim rendering
of the 'real' `CHANGES.md` at the SAWScript repository root, and link
to that.

An alternative that was considered that would have resulted in the
'real' `CHANGES.md` being rendered directly was the `{include}`
directive.
Not only is it the case that "the include directive represents a
potential security hole"[^1], Sphinx will attempt (and fail) to
resolve the links within, which will fail fantastically since most of
them are outside of the doc/ root.

[^1]:
https://docutils.sourceforge.io/docs/ref/rst/directives.html#include
It will be a convenient place to link to from the index of the SAWCore
external format description in the developer materials.
In particular:

- Explicitly note that `make` is required to build documentation
- Replace the description of `make_docs.sh` with a description of the `Makefile`
  and its targets, as well as the `scripts/` directory and contents
- Update troubleshooting recommendations to match
- Update descriptions of the `doc/` directory contents:
  - In particular, remove note about removing PDFs from the repository.
    This was discussed out-of-band, and we aligned on leaving the PDFs around
    for developer benefit while browsing the repository
- Remove the no-longer-necessary disclaimer about packaging code
This reduces the 'escaped' dollar signs and produces more sensible
output (i.e. the output shows the result of expansion).
This Python package implements a new Sphinx role, download-dir, that can be used
to indicate that a referenced directory should be archived and made
downloadable.

For example: If we write {download-dir}`/my/cool/dir`, the role attempts to
create `$SOURCEDIR/my/cool/dir.tar.gz` from the contents of
`$SOURCEDIR/my/cool/dir`, where $SOURCEDIR is the Sphinx project root (for us,
`saw-script/doc/`).

In short, this means that you can only create downloads out of directories
somewhere under in `doc/`, including `doc/` itself.

As is Sphinx's convention, then, paths starting with `/` indicate that the
path is rooted at $SOURCEDIR, and relative paths are relative to the document
where the role was used.

The role also supports
{download-dir}`links with explicit titles <thing/to/download>`; when no title is
provided, the link will be the name of the archive (i.e. the stem of the target
directory + ".tar.gz").

This is an improvement over our current solution, which is inflexible (requiring
that only specific directories be used as download targets) and doesn't fully
utilize Sphinx's/docutils's extensible role framework (instead using scripts).
And, remove a spurious comment.

This makes sphinx-download-dir available in the build environment created by
scripts/setup_env.sh.
This will keep workspaces clean when building locally, and encourages handling
the archiving process via sphinx-download-dir.

There is currently one exception in the Rust tutorial sample code, where a
tar.gz of a large JSON file is provided.
Of course, one can use git add -f to commit additional archives as necessary.
This replaces uses of MyST's <path:...> link format used to generate download
links to individual files with uses of the role introduced by the
sphinx-download-dir package.
Given that we now use sphinx-download-dir (which handles the archiving of code
samples during the Sphinx build), we have no need for the packaging script /
make target.

Furthermore, we make the mostlyclean target interactive in its deletion of
tar.gz archives, to prevent unintentionally deleting a checked-in archive that
is part of a sample directory (the Rust tutorial is a case like this).

This is mostly necessary since we no longer force the sample directory / archive
to have a specific name, and simply attempt to clean up _all_ tar.gz archives in
and under `doc/`.
Mostly focuses on the effects on users (namely, that the PDFs they get with
releases have changed a bit).
Copy link
Contributor Author

@ChrisEPhifer ChrisEPhifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I believe I addressed all of the review comments, and in addition I was able to further clean things up by replacing the somewhat hacked-together code sample packaging with a flexible Sphinx role.

doc/README.md Outdated Show resolved Hide resolved
doc/README.md Outdated Show resolved Hide resolved
doc/README.md Show resolved Hide resolved
doc/README.md Outdated Show resolved Hide resolved
doc/README.md Outdated Show resolved Hide resolved
doc/development/releasing.md Outdated Show resolved Hide resolved
doc/development/releasing.md Outdated Show resolved Hide resolved
doc/rust-tutorial/case-study-salsa20.md Outdated Show resolved Hide resolved
doc/tutorial/sawScriptTutorial.bib Outdated Show resolved Hide resolved
Copy link
Contributor

@sauclovian-g sauclovian-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more minutiae, and there's one leftover from the last batch, but in general looks good. Thanks for taking care of all this!

doc/saw-lexer/src/saw_lexer.py Outdated Show resolved Hide resolved
doc/Makefile Outdated Show resolved Hide resolved
doc/Makefile Outdated Show resolved Hide resolved
doc/.env Outdated Show resolved Hide resolved
doc/README.md Outdated
This can be used to provide downloads in the HTML rendering of the
documentation (see [below](#code-examples)).
- `install-pdf`: Build and install PDF renderings to `doc/pdfs/`.
- `mostlyclean`: Clear out all packaged code and Sphinx-generated files.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference for these two would be clean and distclean respectively, since the general convention is that clean removes build products and distclean also removes configuration. FWIW, may not matter...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved: 615e5ee

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing: doc/README.md still refers to the old names of these 😐

doc/.gitignore Show resolved Hide resolved
doc/Makefile Outdated Show resolved Hide resolved
doc/README.md Show resolved Hide resolved
This is in addition to the 'builtin' functions we removed special highlighting
for previously.
Per @sauclovian-g: This will prevent any accidental leakage of the Python
virtual environment into other commands.
1. Move the definition from doc/.env to the more appropriate
   doc/scripts/epoch.mk.
2. Add some documentation about how the time is / can be generated.
These better match the conventional naming of make rules for cleanup.
In particular, to be explicit about the fact that the build will create a
Python virtual environment and download dependencies.
As @sauclovian-g rightfully points out, we'll probably want to be able to run
this in CI.

This commit additionally addresses that we had a checked-in .tar.gz while
adding *.tar.gz to the .gitignore, by replacing that .tar.gz with a .gz (also
taking this opportunity to update the reference to this file in the text of the
Rust tutorial it accompanies).
Copy link
Contributor Author

@ChrisEPhifer ChrisEPhifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I took care of all the remaining items -- apologies for the delay!

doc/.gitignore Show resolved Hide resolved
doc/README.md Show resolved Hide resolved
doc/README.md Outdated
This can be used to provide downloads in the HTML rendering of the
documentation (see [below](#code-examples)).
- `install-pdf`: Build and install PDF renderings to `doc/pdfs/`.
- `mostlyclean`: Clear out all packaged code and Sphinx-generated files.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved: 615e5ee

doc/Makefile Outdated Show resolved Hide resolved
doc/.env Outdated Show resolved Hide resolved
doc/Makefile Outdated Show resolved Hide resolved
doc/saw-lexer/src/saw_lexer.py Outdated Show resolved Hide resolved
To match the renaming of mostlyclean -> clean and clean -> distclean.
@ChrisEPhifer ChrisEPhifer added the PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run label Feb 10, 2025
@ChrisEPhifer ChrisEPhifer merged commit 8fdfce6 into master Feb 10, 2025
34 checks passed
@ChrisEPhifer ChrisEPhifer deleted the 2197-deploying-saws-written-materials branch February 10, 2025 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nonexistent java_symexec in doc/manual/code
3 participants