Improve the API script to handle `|` characters #718
Nice work! I've re-generated the docs with the new script and it all LGTM 🚀
Unfortunately, this resulted in some pages crashing, per #673 (comment). This is because we broke column alignment settings for matrices, like this from
If you use
Maybe we were over-zealous with always replacing |
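To illustrate the kind of breakage described here, the following is a minimal sketch. It assumes the affected expressions use a LaTeX `array` column specification in which `|` draws a vertical rule. The sample matrix is hypothetical and is not the expression from the original comment. The regex is the one quoted in the PR description.

```typescript
// Regex from the PR description: matches pipes not preceded by a backslash.
const UNESCAPED_PIPE = /(?<!\\)\|/gm;

// Hypothetical matrix (assumption): the "|" in the column spec {cc|c} is a
// column separator, not a math symbol.
const matrix = "\\begin{array}{cc|c} 1 & 0 & a \\\\ 0 & 1 & b \\end{array}";

// Blindly replacing every unescaped pipe also rewrites the column spec.
const converted = matrix.replace(UNESCAPED_PIPE, "\\vert ");

console.log(converted);
```

After the replacement the column specification reads `{cc\vert c}`, which is no longer a plain sequence of alignment letters and rule markers, so a math renderer may reject it.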
This reverts commit f6d67a1.
Reverts #718. Turns out that breaks certain pages, as explained at #718 (comment). Instead, we fix the problematic pages by changing the original Sphinx HTML in Box directly in this PR's last commit. We have to manually revert the 0.45 docs due to #755.
Summary
This PR changes how we handle pipes inside math expressions when we convert Sphinx HTML into markdown. Some math expressions use pipes to define quantum states in Dirac notation, and we need to escape those characters so they don't break the page when they appear inside a markdown table.
Details
One solution would be to handle only the `|` characters used inside a table. However, math expressions can be nested inside other tags (e.g. `<td> <p> <span class="math"> SOME_EXPRESSION </span></p></td>`), so that approach would make the script more complex without guaranteeing that we fix every case where a page could fail to render. I decided instead to change how that character is handled in all math expressions.

The PR replaces the `|` character with `\vert `, which renders as the same character. `\vert` needs a trailing space to handle cases where the pipe is directly followed by a letter. In those cases, we must avoid converting `|x` to `\vertx`, which is not a valid command (`\vert x` is the correct conversion).

We also need to take into account that, in some cases, the `|` character must be left alone: when escaped (`\|`), it represents a double pipe (`||`), which is used in mathematical expressions such as the length of a vector.

This is the regex used, which only matches pipe characters not preceded by a backslash:

`/(?<!\\)\|/gm`
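As a rough sketch of how this replacement behaves, the snippet below applies the regex to a few sample strings. The helper name `escapePipesInMath` and the sample inputs are assumptions made for illustration; the actual script may structure this differently.

```typescript
// Regex quoted above: matches "|" only when it is not preceded by "\".
const PIPE_NOT_ESCAPED = /(?<!\\)\|/gm;

// Hypothetical helper (name is an assumption, not taken from the script).
// Note the trailing space after \vert so that "|x" becomes "\vert x".
function escapePipesInMath(expression: string): string {
  return expression.replace(PIPE_NOT_ESCAPED, "\\vert ");
}

// Dirac notation: the leading pipe is replaced.
console.log(escapePipesInMath("|0\\rangle")); // \vert 0\rangle

// A pipe directly followed by a letter keeps a separating space.
console.log(escapePipesInMath("|x")); // \vert x

// An escaped pipe (double bar, e.g. a vector norm) is left untouched.
console.log(escapePipesInMath("\\|v\\|")); // \|v\|
```

The negative lookbehind is what keeps `\|` intact. Without it, the norm example would be rewritten to `\\vert v\\vert `, which changes its meaning.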
A new test was added to verify the different cases where we can find a pipe character. The test checks `|` characters outside math expressions, in math expressions inside a table, and in math expressions outside a table. It also covers a case where we intentionally use `\|` to create a double pipe. The following screenshot shows the rendered result of the test.

![test-example](https://github.com/Qiskit/documentation/assets/47946624/9603819c-4fb6-4d6a-997d-dc4fdcf1c3cd)

Closes #488