Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mkdocs rendering symbols in notebook code #2033

Merged
merged 2 commits into from
Mar 20, 2024

Conversation

ADBond
Copy link
Contributor

@ADBond ADBond commented Mar 6, 2024

Type of PR

  • BUG
  • FEAT
  • MAINT
  • DOC

Is your Pull Request linked to an existing Issue or Pull Request?

Closes #1767.

Was also ultimately the root cause of #2018.

The issue is that the mknotebooks plugin converts .ipynb files into .html files, via nbconvert. As such, in code-blocks when we have a '<' literal (such as in custom SQL for comparisons, or blocking rules) it is converted to '&lt;' so that it can be correctly displayed as part of the html.
But at some later stage this html is processed by mkdocs itself, and it escapes everything in a code-block (so that for instance blocks of html code would display correctly), so this then gets converted to '&amp;lt;' - i.e. the ampersand gets rendered as-is.

[NB: I'm not certain on all of the details here, but this seems to at least approximately be what is occurring]

Ultimately this means we end up with code that appears as e.g.

deterministic_rules = [
    "l.first_name = r.first_name and levenshtein(r.dob, l.dob) &lt;= 1",
    "l.surname = r.surname and levenshtein(r.dob, l.dob) &lt;= 1",
    "l.first_name = r.first_name and levenshtein(r.surname, l.surname) &lt;= 2",
    "l.email = r.email"
]

which is incorrect, and as the code will copy as such, will lead to confusing errors for anyone who tries to run it.

Give a brief description for the solution you have provided

In the end the most straightforward solution was to override the mkdocs config option which mknotebooks uses to convert notebook files with a class that converts to markdown, rather than html (with the md -> html conversion handled by mkdocs as usual).
This runs as a hook after any other on_config hooks/plugins run, so we make sure to override the mknotebooks version.

This may not be a fully general solution for any configuration of mknotebooks, but as we are using only a simple set of its features there should not be a problem.

See the built version on my fork.

Additionally in the course of researching this I discovered that mkdocs-simple-hooks (which I introduced in #1913) is deprecated, as the functionality is included in core mkdocs since 1.4, so I ditched this dependency and used the native hooks feature.

PR Checklist

  • Added documentation for changes
  • Added feature to example notebooks or tutorial (if appropriate)
  • Added tests (if appropriate)
  • Updated CHANGELOG.md (if appropriate)
  • Made changes based off the latest version of Splink
  • Run the linter

`mkdocs-simple-hooks` is deprecated as the functionality now lives in `mkdocs` directly.
Missed that when I added it!
We convert notebooks to .md rather than .html (leaving the latter to mkdocs itseld).
Without this, some symbols such as '<' end up as '&lt;'
@ADBond ADBond added bug Something isn't working documentation Improvements or additions to documentation labels Mar 6, 2024
@ADBond ADBond requested review from RobinL and RossKen March 6, 2024 14:21
@RobinL
Copy link
Member

RobinL commented Mar 6, 2024

@RossKen are you ok to look at this one? It's been a while since i've looked at the docs build so not sure i'm in a great place to understand this!

@RossKen
Copy link
Contributor

RossKen commented Mar 6, 2024

@RossKen are you ok to look at this one? It's been a while since i've looked at the docs build so not sure i'm in a great place to understand this!

Yep it is on my todo list

Copy link
Member

@RobinL RobinL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ross didn't get time to look at this before he went on hol so I propose a YOLO approval. I'll just assume this works, because if it doesn't its easy to change back!

@ADBond ADBond merged commit bfaf8d1 into moj-analytical-services:master Mar 20, 2024
4 of 5 checks passed
@ADBond ADBond deleted the fix-mkdocs-rendering branch March 20, 2024 01:34
@RobinL
Copy link
Member

RobinL commented Jun 10, 2024

An addedum. I was trying to get this:

tag_remove_configs:
        remove_cell_tags:
          - Remove_cell
        remove_all_outputs_tags:
          - Remove_all_output
        remove_single_output_tags:
          - Remove_single_output
        remove_input_tags:
          - Remove_input

syntax to work

Which would allow us to have jupyter cells that create markdown tables, without the code showing

Unfortunately it's not compatible with the use of the markdown converter:

def on_config(config: MkDocsConfig) -> MkDocsConfig:
    # convert ipynb to md rather than html directly
    # this ensures we render symbols such as '<' correctly
    # in codeblocks, instead of '%lt;'
    md_exporter = MarkdownExporter(config=config)

I thought I'd investigate further to see if its's possible to get it working without the markdown config change above

I wondered whether it was something in our config, but I've confirmed it's not. Specifically I made a minimal mkdocs site as follows:

.
├── about.md
├── example.ipynb
├── index.md
├── introduction.md
└── some_section
    ├── page1.md
    └── page2.md
site_name: My Documentation
theme:
  name: material

plugins:
  - mknotebooks
nav:
  - Home: index.md
  - About: about.md
  - Docs:
      - Introduction: introduction.md
      - Some Section:
          - Page 1: some_section/page1.md
          - Page 2: some_section/page2.md
      - Example Notebook: example.ipynb

with a cell a = "a < 2" in the ipynb

And you still get an error, so pretty sure this is just a bug, it's not some weird interactino with some other part of our setup

See also: greenape/mknotebooks#1118

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DOCS] Some symbols not rendering properly in docs
3 participants