-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Demos in readthedocs have broken links #1292
Labels
Comments
Merged
kjun9
added a commit
that referenced
this issue
Apr 21, 2020
This adds our demo notebooks to readthedocs documentation using `nbsphinx` and `nbsphinx-link`. There are a variety of follow-ups mostly to improve formatting and make the automation more robust: * messy sections #1294 * binder/colab buttons #1293 * broken links #1292 * readme as indices #1298 * validate script in CI #1299 See #1159
This was referenced Apr 24, 2020
I think there's two causes here:
|
huonw
added a commit
that referenced
this issue
Apr 28, 2020
Our notebooks are designed to be narrated examples, which means they contain a bunch of text and headings. It's helpful to have consistent formatting for these for a few reasons: - our "cloud-runner" cells look best if they're immediately after the notebook title, so the first cell should be only the title - that title should be a H1 element (`# ...`), since it's semantically the main title and it ends up as the page title in the HTML docs (such as that shown on Read the Docs) - there should only be one H1 element (`# ...`), since the others end up being considered siblings to the main title and thus appear in the HTML docs tables of contents. Also, it's better practice: for instance, MDN <https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements#Usage_notes>: > You should only use one `<h1>` per page. Using more than one will not result > in an error, but using only one is seen as a best practice. It makes logical > sense — `<h1>` is the most important heading, and tells you what the purpose > of the overall page is. You wouldn't have a book with more than one title, > or a movie with more than one name! Having a single top-level title is also > arguably better for screenreader users, and SEO. - headings shouldn't skip, meaning `## ...` (`<h2>` in HTML) should be followed by `### ...` (`<h3>`, or anything higher like `## ...`), not by `#### ...` (`<h4>`). For instance, reStructured Text requires "consistent" heading levels <https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#sections> without skipping: > However, a document must be consistent in its use of section titles: once a > hierarchy of title styles is established, sections must use that hierarchy. <https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements#Usage_notes>: > Avoid skipping heading levels: always start from `<h1>`, next use `<h2>` > and so on. This pull request adds a script that validates these rules, uses it on CI, and thus fixes all of notebooks that violate these guidelines. In order to keep this PR focused, it tries to _only_ fix the violations of the guidelines and not fix other typos and so on. The script has specific output about the problem, tries to have guidance for how one might fix the problems and includes links to the NBViewer-rendered notebooks. For instance, https://buildkite.com/stellar/stellargraph-public/builds/3448 shows all of the violations in our current `develop` notebooks. For the the changes to notebooks themselves: - GitHub comparison: - Before: https://github.com/stellargraph/stellargraph/blob/a310143ff/demos/embeddings/stellargraph-metapath2vec.ipynb - After: https://github.com/stellargraph/stellargraph/blob/9a1fd36/demos/embeddings/stellargraph-metapath2vec.ipynb - HTML docs table-of-contents comparison (see #1363 for screenshots): - Before: https://stellargraph.readthedocs.io/en/v1.0.0rc1/demos/index.html many of the links are links to internal sections of the main demos - After: https://stellargraph--1363.org.readthedocs.build/en/1363/demos/index.html This PR also resolves 13 of the 75 warnings in #1360 (all of the "Title level inconsistent" ones). There's potential future work we can do here, like: - validating links to other notebooks are relative (this might play into #1292 too) - ensuring there's a "conclusion" section at the end of each notebook - adding spellchecking (possibly overly annoying): #1368 See: #1294
This was referenced Apr 28, 2020
huonw
added a commit
that referenced
this issue
Apr 30, 2020
This is a building block for #1292, since part of that will require validating the contents of link elements (see: #1394), which will be much more reliable with a real parser. I chose the commonmark library https://github.com/readthedocs/commonmark.py because it (a) provides easy access to an AST node type, and (b) that AST records the source positions of block elements (in the form of the `sourcepos` property, which is a list like `[start, end]`, where `start` and `end` are lists like `[line_number, column_number]` (one-based)). The source positions make it easy to report nice error messages with lines from the cells, without having to reconstruct/guess at which part of the code was broken. This does increase the amount of code. Most of that is in support code, while the `@checker`s themselves and even error message construction get (arguably) simpler. The code can now reason in terms of actual markdown entities, instead of having to do regex parsing/guessing. This also catches one instance of a heading that the regexes did not, in the PPNP demo. Validation: I added some broken-on-purpose notebooks and built them in: https://buildkite.com/stellar/stellargraph-public/builds/3541 See: #1292
huonw
added a commit
that referenced
this issue
Apr 30, 2020
Our demo notebooks contain links to READMEs. These links don't work on Read the Docs/in Sphinx, because we have `index` (per the web server convention) files for each directory, not READMEs. This PR writes a basic sphinx extension that rewrites any relative links to a file called "README.md" to instead be a link to "index.txt", matching our Sphinx index files. I modelled this after: - https://www.sphinx-doc.org/en/master/extdev/index.html#dev-extensions - https://github.com/spatialaudio/nbsphinx/blob/93398b9d8ba3922e860486d7eacf9eeb3b036580/src/nbsphinx.py#L1527-L1586 This approach allows us to retain links to READMEs in the notebooks, so that they work locally and on Github. These links are probably useful even with the move to Read the Docs being the canonical display of demos (#1391), due to being useful locally. (A simpler approach that breaks this would be to change the links to be correct for Read the Docs within each notebook.) This also adjusts the case of one of the `connector/neo4j` README file, so that it's consistent with the rest of the repo, to make this code simpler. This fixes 12 out of 66 of the remaining warnings in #1360; all of the ones like: ``` WARNING: File not found: 'demos/README.md' WARNING: File not found: 'demos/node-classification/README.md' ``` Example: in the GCN node classification notebook (https://github.com/stellargraph/stellargraph/blob/1cca6b6c6f4cd1fea7c6932918cba18d15eef6d4/demos/node-classification/gcn/gcn-cora-node-classification-example.ipynb), there's a conclusion that includes links to both the parent node-classification README and the all notebooks README. Its Markdown source is: ```markdown StellarGraph includes [other algorithms for node classification](../README.md) and [algorithms and demos for other tasks](../../README.md). ``` Before: https://stellargraph.readthedocs.io/en/latest/demos/node-classification/gcn/gcn-cora-node-classification-example.html#Conclusion These links stayed pointing to `README.md` in the respective parent directories, which don't exist in the Sphinx docs/on Read the Docs, and so they're dead links. Generated HTML (note the `external` CSS class, and the `href=.../README.md`): ```html StellarGraph includes <a class="reference external" href="../README.md">other algorithms for node classification</a> and <a class="reference external" href="../../README.md">algorithms and demos for other tasks</a>. ``` After: https://stellargraph--1401.org.readthedocs.build/en/1401/demos/node-classification/gcn/gcn-cora-node-classification-example.html#Conclusion The links were rewritten and now work. Generated HTML (note the `internal` CSS class, and the `href=.../index.html`): ```html StellarGraph includes <a class="reference internal" href="../index.html"><span class="doc">other algorithms for node classification</span></a> and <a class="reference internal" href="../../index.html"><span class="doc">algorithms and demos for other tasks</span></a>. ``` See: #1292
huonw
added a commit
that referenced
this issue
Apr 30, 2020
This moves us towards using Read the Docs as the main source of demos, by duplicating the indexing information from READMEs into the Sphinx docs. We've decided to move to using Read the Docs like this because: - it gives us more control over how things are shown (e.g. #1386) - the display is faster - it allows us to version properly, including with internal links to documentation of the same version as the demo I used [m2r](https://miyakogi.github.io/m2r/index.html) to convert each of our demo READMEs to a reStructuredText `index.txt` file, and then fixed them up by hand. For each index, I've gone with: ```rst Title ====== ... description with subsections etc ... Table of contents ------------------ ... full ToC listing of subfolders/demos of this index ... ``` (Where the description is empty in some cases, for directories without READMEs: #1139.) The thinking is we'll replace the README.md files with a much simplified set of information, potentially just a link to the corresponding Read the Docs. However, the READMEs are still useful as an index if someone has a copy of the demos locally in Jupyter lab. Thus, this PR also extends the `demo_table.py` script to render the demo indexing table to both a raw-HTML markdown version into `demos/README.md` and a rST `list-table` version into `docs/demos/index.txt`. This means we can still retain that demo table in an otherwise reduced/not-human-maintained `demos/README.md`. Other than extending `demo_table.py`, this tries to do the minimal work to get this working. Future work: - optimising the display of each of the tables (at the moment they scroll sideways a lot): #1418 - handling abbreviations in the main indexing table, because it doesn't seem easy to get hover text in rST: #1417 - validating links (likely covered by #1360) - updating the demos to have relative links pointing to the index files, not README.mds (likely covered by #1292 and #1360) - updating the README files to remove their content: #1419 - updating any links to the demos on GitHub to point to Read the Docs: #1420 Rendered demo landing page: https://stellargraph--1391.org.readthedocs.build/en/1391/demos/index.html See: #1298
huonw
added a commit
that referenced
this issue
May 1, 2020
This means that all instances of nested formatting in our notebooks don't work (spatialaudio/nbsphinx#301), including examples like the above and also, more often, formatting within or around links: ```markdown [some `code` in a link](url) ***[bold italic link](url)*** ``` This PR removes all instances of this, in two ways: | description | before | after | |------------------------------------|-------------------------------|------------------------------| | removing formatting | ***text*** ***[link](#url)*** | **text**<br>[link](#url) | | moving link (especially doc links) | [some `method`](#url) | some `method` ([docs](#url)) | This removal was driven by extending the `notebook_text_checker` script to find instances of this, using a real Markdown parser (#1393). The script will ensure that we don't have more instances come into our notebooks in future. Example of failure on CI (with the notebooks that were updated here): https://buildkite.com/stellar/stellargraph-public/builds/3610#annotation-notebook_text_checker I checked some of the instances I fixed here in the existing documentation, and all of them were rendered incorrectly, validating that this is useful. Examples: - links in GCN node classification: Before: https://stellargraph.readthedocs.io/en/v1.0.0rc1/demos/node-classification/gcn/gcn-cora-node-classification-example.html#Loading-the-CORA-network After: https://stellargraph--1394.org.readthedocs.build/en/1394/demos/node-classification/gcn/gcn-cora-node-classification-example.html#Loading-the-CORA-network - paper reference in Attri2Vec node classification Before: https://stellargraph.readthedocs.io/en/v1.0.0rc1/demos/node-classification/attri2vec/attri2vec-citeseer-node-classification-example.html After: https://stellargraph--1394.org.readthedocs.build/en/1394/demos/node-classification/attri2vec/attri2vec-citeseer-node-classification-example.html All new docs: https://stellargraph--1394.org.readthedocs.build/en/1394/demos/index.html See: #1292
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
Follow-up from #1279
Notebooks were added to our readthedocs documentation by using
nbsphinx
, but seems like quite a number of links got broken. For example inloading-saving-neo4j
:The text was updated successfully, but these errors were encountered: