Discussion thread for #792 #809

Open
yzhang-gh opened this issue Sep 9, 2020 · 6 comments
Labels
Area: Exporting To HTML. Probably also other formats someday. Area: Link (Reserved) Markdown link processing, URI recognition, slugification. Area: Table of contents Pertaining to table of contents (TOC generation and detection, related heading operations). Needs Discussion We haven't decided what to do.

Comments

@yzhang-gh
Owner

It's a relatively big bug. Major features, including TOC generation, printing, and code completion, rely on slugify() and are therefore affected.

I've refactored mdHeadingToPlaintext() on branch slugify. Then, we can implement a CommonMark-compliant (assuming the input string is in pure CommonMark) method first.

If someday someone requests support for a platform whose "Markdown to plain text conversion" method is different, we can simply append the new method there.

Originally posted by @Lemmingh in #792 (comment)

@yzhang-gh yzhang-gh added the Needs Discussion We haven't decided what to do. label Sep 9, 2020
@yzhang-gh
Owner Author

@Lemmingh Thanks.

What is the difference between the legacy and commonmark versions of mdHeadingToPlaintext()? I cannot remember clearly, but I did have some thoughts about it:

  • Syntaxes to remove: _italic_, **bold**, [text][link], [text](link), …
  • Do not change: 1. First Heading, 1) Me Too
  • ???: $math$ (which may be what you said, non-CommonMark)

Not sure whether you can edit my comment. Feel free to help us sort it out.

@Lemmingh
Collaborator

You can check my early thoughts at https://github.com/users/Lemmingh/projects/2#card-45105512
before I post them formally.

Tasks

  • How markdown-it plugins work.
  • Test/Inspection tools
    • README.md
    • clickable.md
    • generate-test-files.ps1
    • test-cases.ps1
  • GitHub
    • Docs
    • Code
    • Test
  • GitLab
    • Docs
    • Code
    • Test
  • VS Code
    • Docs
    • Code
    • Test
  • Gitea
    • Docs
    • Code
    • Test
  • goldmark
    • Docs
  • blackfriday
    • Docs
    • Code
    • Test

@Lemmingh
Collaborator

I'm able to edit your comment.

Since you cannot recall all the details, let's start from the beginning. Correct me if I miss any point.

Background

Slugify

The goal of slugify() is to generate a proper ID value from the content of a Markdown heading.

A typical slugify process looks like:

  1. Let S be the heading content, which may contain Markdown inline elements.
  2. Convert S to plain text, that is, convert every Markdown inline element to its plain text form.
  3. Apply slug filters to S.
  4. Return S.
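
The four steps above can be sketched in TypeScript as follows. This is only an illustration of the shape of the pipeline, not the extension's code: `makeSlugify`, `naivePlainText`, and the three filters are hypothetical names and rules.

```typescript
type TextTransform = (text: string) => string;

// Builds a slugify function: step 2 (to plain text), then step 3 (filters, in order).
function makeSlugify(
    toPlainText: TextTransform,
    filters: readonly TextTransform[]
): TextTransform {
    return (heading) => filters.reduce((s, filter) => filter(s), toPlainText(heading));
}

// Hypothetical and deliberately naive: strips only paired * / _ emphasis markers.
// A real converter needs a Markdown parser.
const naivePlainText: TextTransform = (text) =>
    text.replace(/(\*\*|__|\*|_)(.+?)\1/g, "$2");

// Hypothetical filters; every platform has its own set.
const demoSlugify = makeSlugify(naivePlainText, [
    (s) => s.trim(),
    (s) => s.toLowerCase(),
    (s) => s.replace(/\s+/g, "-"),
]);

console.log(demoSlugify("foo _italic_ bar")); // foo-italic-bar
```

Keeping the plain text conversion and the filters as separate parameters makes it visible which of the two a given platform actually differs in.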

Basically, a slugify method and its rules are tied to a particular Markdown processor, and the methods diverge greatly.

Indeed, this extension implicitly has its own processor (similar to VS Code's) and its own rules, especially after #658.

markdown-it plugins

I'm not familiar with markdown-it plugins.

It seems that markdown-it-emoji processes the document at a very early stage. By the time the document reaches MarkdownIt.renderer, the emoji codes have already been replaced with emoji characters.

Markdown to plain text conversion

Theoretically, "convert Markdown to plain text" is a step of "slugify" as described above. mdHeadingToPlaintext() performs this step. (Honestly, markdownToPlainText would be a better name.)

  • The commonMark mode is the default, or fallback. It only processes elements defined by the CommonMark Spec.

  • The legacy mode needs a better name. It is intended to retrieve what is sent to the renderer and then behave in the same way as commonMark, although the current implementation extracts the inner text from the rendered HTML.

    This is necessary in some cases.

    For example, markdown-it generates heading IDs at the rendering stage. However, as mentioned above, markdown-it-emoji transforms the document content before that. Thus, the document that the renderer receives differs from the raw Markdown file. Both VS Code's built-in Markdown preview and the printing feature of this extension are affected.

    GitLab is said to do the same thing sometimes:

    Note that the emoji processing happens before the header IDs are generated, so the emoji is converted to an image which is then removed from the ID.

Slug filter

The biggest difference between slugify methods is here.

slugifyMethods holds these filters. Although they are called "filters", they perform many operations besides simple filtering, including case conversion, encoding, and even additional character mapping.

Problems

GitHub

The reference implementation last modified on 2019-07-18 says GitHub only downcases ASCII characters.

However, a document full of Greek characters indicates that GitHub does full Unicode case conversion now.

We don't know what happened behind the scenes.
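
To make the difference concrete, here is a small sketch of the two behaviors. The function names are hypothetical, and neither function is GitHub's actual code; the sketch only shows what "ASCII-only" versus "full Unicode" downcasing means for a heading with Greek capitals.

```typescript
// Downcases only A-Z, which is what the 2019 reference implementation is said to do.
function asciiDowncase(text: string): string {
    return text.replace(/[A-Z]/g, (ch) => ch.toLowerCase());
}

// Full Unicode case conversion using the default (locale-insensitive) mapping.
function unicodeDowncase(text: string): string {
    return text.toLowerCase();
}

// The first three letters are Greek capitals (U+0391 to U+0393), not Latin A, B.
const sample = "\u0391\u0392\u0393 Intro";

console.log(asciiDowncase(sample));   // Greek capitals stay as they are
console.log(unicodeDowncase(sample)); // Greek capitals become lowercase
```

A heading like this produces two different slugs under the two rules, so a TOC link generated with the wrong one would not resolve.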

GitLab

According to #469, #312, and GitLab's documentation, there are many different Markdown processors on GitLab.

Visual Studio Code

See "Markdown to plain text conversion" above.

Gitea & Hugo

They changed implementations a few times. Maybe someone with a good command of Go can figure it out.

@yzhang-gh
Owner Author

yzhang-gh commented Sep 10, 2020

Thanks for the input.

As I recall, we only had two versions of the slugify function, vscode and github, in the very beginning.

mdHeadingToPlaintext() and textInHtml() were introduced at that time to mimic GitHub's behavior, where Markdown is first rendered to HTML and the text is then extracted from the HTML and slugified.

Thereafter, mdHeadingToPlaintext() was also used somewhere else (to show an outline view).
(We want the title "some italic text" rather than "some _italic_ text", not to mention a heading containing <kbd>Ctrl</kbd>.)

However, that feature was removed later, when VSCode implemented its own outline view, and mdHeadingToPlaintext() has remained until now without drawing much attention.

After checking the four slugify functions, I believe it is time to remove the above two functions (move them into the github mode). They were only for the github mode in the beginning, and the other slugify functions don't need them at all.

We don't need to change slugify functions except for github. Without careful thinking, we just need to stop using markdown-it plugins (for the process of Markdown to HTML in the github mode) and we are done.

@Lemmingh
Collaborator

we only had two versions of the slugify function, vscode and github, in the very beginning.

Yeah, I remember there was githubCompatibility before version 3. It was superseded by slugifyMode to support GitLab.


I believe it is time to remove the above two functions (move them into the github mode).

I do agree with merging the "to plain text" conversion into the respective functions in slugifyMethods, which also makes the flow of slugify() clearer.

Slugify process
  1. Let S be the heading content.
  2. Apply slug filters to S, which often involves:
    • Converting all the Markdown inline elements to their plain text form.
    • Case conversion.
    • Removing some characters.
    • Additional character mapping.
    • Encoding.
  3. Return S.
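
Under this revised flow, each entry of slugifyMethods would own its whole process, and slugify() would only dispatch. A minimal sketch, assuming hypothetical mode names (`demo-lower`, `demo-keep-case`) and a naive stand-in for the shared plain text utility; none of these rules are the real vscode, github, or gitlab rules:

```typescript
type SlugifyMethod = (heading: string) => string;

// Hypothetical stand-in for a shared utility such as commonmarkToPlainTextInline();
// this naive version only strips paired * / _ emphasis markers.
function toPlainTextInline(text: string): string {
    return text.replace(/(\*\*|__|\*|_)(.+?)\1/g, "$2");
}

// Each entry performs conversion, case handling, mapping, and encoding itself.
const slugifyMethods: Record<string, SlugifyMethod> = {
    "demo-lower": (heading) =>
        toPlainTextInline(heading).trim().toLowerCase().replace(/\s+/g, "-"),
    "demo-keep-case": (heading) =>
        encodeURIComponent(toPlainTextInline(heading).trim().replace(/\s+/g, "-")),
};

// slugify() only selects the method for the configured mode.
function slugify(heading: string, mode: string): string {
    if (!Object.prototype.hasOwnProperty.call(slugifyMethods, mode)) {
        throw new Error("Unknown slugify mode: " + mode);
    }
    return slugifyMethods[mode](heading);
}

console.log(slugify("Foo _Bar_", "demo-lower"));     // foo-bar
console.log(slugify("Foo _Bar_", "demo-keep-case")); // Foo-Bar
```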

But I prefer to keep the CommonMark mode as separate text utilities, commonmarkToPlainText() and commonmarkToPlainTextInline(), since it looks quite common:

AFAIK, both GitHub and GitLab do "CommonMark to plain text conversion".

Source

From: tools/slugify/test-cases.ps1

[[_TOC_]]

## foo \_text_ bar
## foo _italic_ bar
## foo <em>emphasis</em> bar
## `<em>code</em>`
## foo [link](https://www.bing.com/) bar
## image ![moz](https://www.mozilla.org/media/protocol/img/logos/mozilla/black.40d1af88c248.svg) logo
## :sweat_smile: with emoji code
## 😅 with emoji character
GitHub (2020-09-11)
<p>[[<em>TOC</em>]]</p>
<h2><a id="user-content-foo-_text_-bar" class="anchor" aria-hidden="true" href="#foo-_text_-bar"></a>foo _text_ bar</h2>
<h2><a id="user-content-foo-italic-bar" class="anchor" aria-hidden="true" href="#foo-italic-bar"></a>foo <em>italic</em> bar</h2>
<h2><a id="user-content-foo-emphasis-bar" class="anchor" aria-hidden="true" href="#foo-emphasis-bar"></a>foo <em>emphasis</em> bar</h2>
<h2><a id="user-content-emcodeem" class="anchor" aria-hidden="true" href="#emcodeem"></a><code>&lt;em&gt;code&lt;/em&gt;</code></h2>
<h2><a id="user-content-foo-link-bar" class="anchor" aria-hidden="true" href="#foo-link-bar"></a>foo <a href="https://www.bing.com/" rel="nofollow">link</a> bar</h2>
<h2><a id="user-content-image--logo" class="anchor" aria-hidden="true" href="#image--logo"></a>image <a target="_blank" rel="noopener noreferrer" href="https://camo.githubusercontent.com/349c3efbe11a677096b332496c45f770cb493fbf/68747470733a2f2f7777772e6d6f7a696c6c612e6f72672f6d656469612f70726f746f636f6c2f696d672f6c6f676f732f6d6f7a696c6c612f626c61636b2e3430643161663838633234382e737667"><img src="https://camo.githubusercontent.com/349c3efbe11a677096b332496c45f770cb493fbf/68747470733a2f2f7777772e6d6f7a696c6c612e6f72672f6d656469612f70726f746f636f6c2f696d672f6c6f676f732f6d6f7a696c6c612f626c61636b2e3430643161663838633234382e737667" alt="moz" data-canonical-src="https://www.mozilla.org/media/protocol/img/logos/mozilla/black.40d1af88c248.svg" style="max-width:100%;"></a> logo</h2>
<h2><a id="user-content-sweat_smile-with-emoji-code" class="anchor" aria-hidden="true" href="#sweat_smile-with-emoji-code"></a><g-emoji class="g-emoji" alias="sweat_smile" fallback-src="https://github.githubassets.com/images/icons/emoji/unicode/1f605.png">😅</g-emoji> with emoji code</h2>
<h2><a id="user-content--with-emoji-character" class="anchor" aria-hidden="true" href="#-with-emoji-character"></a><g-emoji class="g-emoji" alias="sweat_smile" fallback-src="https://github.githubassets.com/images/icons/emoji/unicode/1f605.png">😅</g-emoji> with emoji character</h2>
GitLab (13.4.0-pre)
<ul class="section-nav">
<li><a href="#foo-_text_-bar">foo _text_ bar</a></li>
<li><a href="#foo-italic-bar">foo italic bar</a></li>
<li><a href="#foo-emphasis-bar">foo emphasis bar</a></li>
<li><a href="#emcodeem">&lt;em&gt;code&lt;/em&gt;</a></li>
<li><a href="#foo-link-bar">foo link bar</a></li>
<li><a href="#image-logo">image  logo</a></li>
<li><a href="#sweat_smile-with-emoji-code"><gl-emoji title="smiling face with open mouth and cold sweat" data-name="sweat_smile" data-unicode-version="6.0">😅</gl-emoji> with emoji code</a></li>
<li><a href="#-with-emoji-character"><gl-emoji title="smiling face with open mouth and cold sweat" data-name="sweat_smile" data-unicode-version="6.0">😅</gl-emoji> with emoji character</a></li>
</ul>
<h2 data-sourcepos="3:1-3:18" dir="auto">
<a id="user-content-foo-_text_-bar" class="anchor" href="#foo-_text_-bar" aria-hidden="true"></a>foo _text_ bar</h2>
<h2 data-sourcepos="4:1-4:19" dir="auto">
<a id="user-content-foo-italic-bar" class="anchor" href="#foo-italic-bar" aria-hidden="true"></a>foo <em>italic</em> bar</h2>
<h2 data-sourcepos="5:1-5:28" dir="auto">
<a id="user-content-foo-emphasis-bar" class="anchor" href="#foo-emphasis-bar" aria-hidden="true"></a>foo <em>emphasis</em> bar</h2>
<h2 data-sourcepos="6:1-6:18" dir="auto">
<a id="user-content-emcodeem" class="anchor" href="#emcodeem" aria-hidden="true"></a><code>&lt;em&gt;code&lt;/em&gt;</code>
</h2>
<h2 data-sourcepos="7:1-7:40" dir="auto">
<a id="user-content-foo-link-bar" class="anchor" href="#foo-link-bar" aria-hidden="true"></a>foo <a href="https://www.bing.com/" rel="nofollow noreferrer noopener" target="_blank">link</a> bar</h2>
<h2 data-sourcepos="8:1-8:101" dir="auto">
<a id="user-content-image-logo" class="anchor" href="#image-logo" aria-hidden="true"></a>image <a class="no-attachment-icon" href="https://user-content.gitlab-static.net/ca8c0ef92f8dfffc9fbf51ce9cffdca50182adc9/68747470733a2f2f7777772e6d6f7a696c6c612e6f72672f6d656469612f70726f746f636f6c2f696d672f6c6f676f732f6d6f7a696c6c612f626c61636b2e3430643161663838633234382e737667" target="_blank" rel="nofollow noreferrer noopener" data-canonical-src="https://www.mozilla.org/media/protocol/img/logos/mozilla/black.40d1af88c248.svg"><img alt="moz" data-canonical-src="https://www.mozilla.org/media/protocol/img/logos/mozilla/black.40d1af88c248.svg" class="js-lazy-loaded qa-js-lazy-loaded" loading="lazy" src="https://user-content.gitlab-static.net/ca8c0ef92f8dfffc9fbf51ce9cffdca50182adc9/68747470733a2f2f7777772e6d6f7a696c6c612e6f72672f6d656469612f70726f746f636f6c2f696d672f6c6f676f732f6d6f7a696c6c612f626c61636b2e3430643161663838633234382e737667"></a> logo</h2>
<h2 data-sourcepos="9:1-9:32" dir="auto">
<a id="user-content-sweat_smile-with-emoji-code" class="anchor" href="#sweat_smile-with-emoji-code" aria-hidden="true"></a><gl-emoji title="smiling face with open mouth and cold sweat" data-name="sweat_smile" data-unicode-version="6.0">😅</gl-emoji> with emoji code</h2>
<h2 data-sourcepos="10:1-10:28" dir="auto">
<a id="user-content--with-emoji-character" class="anchor" href="#-with-emoji-character" aria-hidden="true"></a><gl-emoji title="smiling face with open mouth and cold sweat" data-name="sweat_smile" data-unicode-version="6.0">😅</gl-emoji> with emoji character</h2>
Azure (2020-09-06)

Note: Sensitive information is removed.

<p mlp="1"></p><div class="toc-container"><div class="toc-container-header">Contents</div><ul><li><a href="#foo-%5C_text_-bar">foo \text bar</a></li><li><a href="#foo-_italic_-bar">foo italic bar</a></li><li><a href="#foo-%3Cem%3Eemphasis%3C%2Fem%3E-bar">foo emphasis bar</a></li><li><a href="#%60%3Cem%3Ecode%3C%2Fem%3E%60">code</a></li><li><a href="#foo-%5Blink%5D(https%3A%2F%2Fwww.bing.com%2F)-bar">foo link bar</a></li><li><a href="#image-!%5Bmoz%5D(https%3A%2F%2Fwww.mozilla.org%2Fmedia%2Fprotocol%2Fimg%2Flogos%2Fmozilla%2Fblack.40d1af88c248.svg)-logo">image moz logo</a></li><li><a href="#%3Asweat_smile%3A-with-emoji-code">:sweat_smile: with emoji code</a></li><li><a href="#%F0%9F%98%85-with-emoji-character">😅 with emoji character</a></li></ul></div><p></p>
<h2 id="user-content-foo-%5C_text_-bar" mlp="3">foo _text_ bar<a href="https://server/path?anchor=foo-%5C_text_-bar" class=" shareHeaderAnchor" aria-labelledby="user-content-foo-%5C_text_-bar"></a></h2>
<h2 id="user-content-foo-_italic_-bar" mlp="4">foo <em mlp="5">italic</em> bar<a href="https://server/path?anchor=foo-_italic_-bar" class=" shareHeaderAnchor" aria-labelledby="user-content-foo-_italic_-bar"></a></h2>
<h2 id="user-content-foo-%3Cem%3Eemphasis%3C%2Fem%3E-bar" mlp="6">foo <em>emphasis</em> bar<a href="https://server/path?anchor=foo-%3Cem%3Eemphasis%3C%2Fem%3E-bar" class=" shareHeaderAnchor" aria-labelledby="user-content-foo-%3Cem%3Eemphasis%3C%2Fem%3E-bar"></a></h2>
<h2 id="user-content-%60%3Cem%3Ecode%3C%2Fem%3E%60" mlp="7"><code mlp="8">&lt;em&gt;code&lt;/em&gt;</code><a href="https://server/path?anchor=%60%3Cem%3Ecode%3C%2Fem%3E%60" class=" shareHeaderAnchor" aria-labelledby="user-content-%60%3Cem%3Ecode%3C%2Fem%3E%60"></a></h2>
<h2 id="user-content-foo-%5Blink%5D(https%3A%2F%2Fwww.bing.com%2F)-bar" mlp="9">foo <a href="https://www.bing.com/" class="" rel="noopener noreferrer" target="_blank">link</a> <span class="fabric-icon ms-Icon--NavigateExternalInline font-size" role="presentation" aria-hidden="true"> </span> bar<a href="https://server/path?anchor=foo-%5Blink%5D(https%3A%2F%2Fwww.bing.com%2F)-bar" class=" shareHeaderAnchor" aria-labelledby="user-content-foo-%5Blink%5D(https%3A%2F%2Fwww.bing.com%2F)-bar"></a></h2>
<h2 id="user-content-image-!%5Bmoz%5D(https%3A%2F%2Fwww.mozilla.org%2Fmedia%2Fprotocol%2Fimg%2Flogos%2Fmozilla%2Fblack.40d1af88c248.svg)-logo" mlp="10">image <img src="https://www.mozilla.org/media/protocol/img/logos/mozilla/black.40d1af88c248.svg" alt="moz"> logo<a href="https://server/path?anchor=image-!%5Bmoz%5D(https%3A%2F%2Fwww.mozilla.org%2Fmedia%2Fprotocol%2Fimg%2Flogos%2Fmozilla%2Fblack.40d1af88c248.svg)-logo" class=" shareHeaderAnchor" aria-labelledby="user-content-image-!%5Bmoz%5D(https%3A%2F%2Fwww.mozilla.org%2Fmedia%2Fprotocol%2Fimg%2Flogos%2Fmozilla%2Fblack.40d1af88c248.svg)-logo"></a></h2>
<h2 id="user-content-%3Asweat_smile%3A-with-emoji-code" mlp="11">😅 with emoji code<a href="https://server/path?anchor=%3Asweat_smile%3A-with-emoji-code" class=" shareHeaderAnchor" aria-labelledby="user-content-%3Asweat_smile%3A-with-emoji-code"></a></h2>
<h2 id="user-content-%F0%9F%98%85-with-emoji-character" mlp="12">😅 with emoji character<a href="https://server/path?anchor=%F0%9F%98%85-with-emoji-character" class=" shareHeaderAnchor" aria-labelledby="user-content-%F0%9F%98%85-with-emoji-character"></a></h2>
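
For these eight headings, the GitHub and GitLab IDs recorded above are consistent with the two filters sketched below, applied to the plain text of each heading. The rules are inferred from these samples only; they are an assumption, not either platform's actual implementation, and `githubLikeFilter` and `gitlabLikeFilter` are hypothetical names. Azure appears to percent-encode the raw source instead and is not covered.

```typescript
// Matches anything that is not a letter, mark, number, underscore, hyphen, or space.
// Built with the RegExp constructor so the "u" flag does not depend on the compile target.
const STRIP = new RegExp("[^\\p{L}\\p{M}\\p{N}_\\- ]", "gu");

// Lowercase, drop other characters, then turn every space into a hyphen.
function githubLikeFilter(plainText: string): string {
    return plainText.toLowerCase().replace(STRIP, "").replace(/ /g, "-");
}

// Same, but runs of hyphens collapse into one
// (compare "image--logo" on GitHub with "image-logo" on GitLab).
function gitlabLikeFilter(plainText: string): string {
    return githubLikeFilter(plainText).replace(/-{2,}/g, "-");
}

// Plain text of some test headings; the removed image leaves two spaces behind.
const samples = [
    "foo _text_ bar",
    "<em>code</em>",
    "image  logo",
    ":sweat_smile: with emoji code",
    "😅 with emoji character",
];

for (const text of samples) {
    console.log(githubLikeFilter(text), "|", gitlabLikeFilter(text));
}
```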

Without careful thinking, we just need to stop using markdown-it plugins (for the process of Markdown to HTML in the github mode) and we are done.

Yeah, creating a new instance without plugins works.

After correcting slugify(), there is another problem, which is best fixed before delivery:

The challenge of printing (plugin-related)

Since #658, the HTML generating behavior of printing has been almost aligned with that of VS Code's preview. The key difference is:

VS Code has only one slugify function, but this extension allows changing it.

Then, imagine a scenario:

  1. Install Markdown Emoji which is powered by markdown-it-emoji.
  2. Set "markdown.extension.toc.slugifyMode": "github".
  3. Use emoji code in headings, e.g. # :book:
  4. Create TOC. (markdown.extension.toc.create)
  5. Print. (markdown.extension.printToHtml)

Then something embarrassing (as bad as #792) happens:

  • "TOC generation" assumes the Markdown source file will be committed to the target platform, and reads the original document.
  • "Printing" produces HTML in the same way as VS Code's preview, and reads the document after markdown-it-emoji has preprocessed it at the parsing stage.
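
The mismatch can be reproduced in isolation. In the sketch below, `expandEmojiCodes` is a hypothetical stand-in for what markdown-it-emoji does to the text, and `demoFilter` is a simplified GitHub-like filter (an assumption, not the extension's github function):

```typescript
// Hypothetical stand-in for markdown-it-emoji: replaces known :codes: with characters.
const EMOJI: Record<string, string> = { book: "📖" };

function expandEmojiCodes(text: string): string {
    return text.replace(/:([a-z0-9_]+):/g, (whole, name) =>
        Object.prototype.hasOwnProperty.call(EMOJI, name) ? EMOJI[name] : whole
    );
}

// Simplified filter: lowercase, strip everything except letters, numbers,
// "_", "-", and spaces, then turn spaces into hyphens.
function demoFilter(text: string): string {
    const strip = new RegExp("[^\\p{L}\\p{N}_\\- ]", "gu");
    return text.toLowerCase().replace(strip, "").replace(/ /g, "-");
}

const source = ":book: Guide";
const tocSlug = demoFilter(source);                     // "TOC generation" reads the raw source
const printSlug = demoFilter(expandEmojiCodes(source)); // "printing" reads the preprocessed text

console.log(tocSlug);   // book-guide
console.log(printSlug); // -guide
```

The TOC entry would link to `#book-guide`, while the printed heading would carry the ID `-guide`, so the link is broken.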

It sounds hard to make "printing" generate the same slug (heading ID value) as "TOC generation". I have some ideas, but got stuck:

  1. (Possible) Add a core rule before other plugins. The rule wraps each heading with <span>, and assigns the ID to the <span>. Then we don't assign IDs to headings when rendering. But I'm not sure how to implement it at all.
  2. (Failed) Create a heading-to-slug map (named slugLut, slugMap, preferredSlug or whatever) at the beginning of print(), store it in the Markdown engine, and look it up in renderer.rules.heading_open. But we cannot distinguish between emoji code and emoji character, due to markdown-it-emoji.

Sorry for the delay.

@yzhang-gh
Owner Author

But I prefer to keep the CommonMark mode as separate text utilities, commonmarkToPlainText() and commonmarkToPlainTextInline(), since it looks quite common:

AFAIK, both GitHub and GitLab do "CommonMark to plain text conversion".

👍 I agree as it is at least used by both GitHub and GitLab.


The challenge of printing (plugin-related)

So far, I guess we have reached this consensus: the slugify function is not affected by markdown-it plugins. (VSCode's is not; GitHub and GitLab should(?) use CommonMark and thus are not either.)

If we define "platform" as a way of converting MD to HTML (and then presenting it), then printing is one more platform in addition to those we have now:

  • VSCode('s preview) uses vscode slugify function
  • GitHub uses github ...
  • ...
  • Printing uses ... your choice!

What makes the printing platform special is that it allows dynamic markdown-it plugins. I think your concern is whether markdown-it plugins will change a heading before a certain slugify function reads it. (There is nothing to worry about in the slugify process itself.)

In the printing platform, the slugify function is used here.

private addNamedHeaders(md: any): void {
    const originalHeadingOpen = md.renderer.rules.heading_open;
    md.renderer.rules.heading_open = function (tokens, idx, options, env, self) {
        const title = tokens[idx + 1].children.reduce((acc: string, t: any) => acc + t.content, '');
        let slug = slugify(title);
        if (mdEngine._slugCount.has(slug)) {
            mdEngine._slugCount.set(slug, mdEngine._slugCount.get(slug) + 1);
            slug += '-' + mdEngine._slugCount.get(slug);
        } else {
            mdEngine._slugCount.set(slug, 0);
        }
        tokens[idx].attrs = tokens[idx].attrs || [];
        tokens[idx].attrs.push(['id', slug]);
        if (originalHeadingOpen) {
            return originalHeadingOpen(tokens, idx, options, env, self);
        } else {
            return self.renderToken(tokens, idx, options, env, self);
        }
    };
}

I'm not sure whether it is always executed before other markdown-it plugins. (If it is the case, we are happy. If not, well, empirically it won't cause any problem if you don't use that syntax in the heading. Wait, you wanna use footnotes in a heading? 😵)
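
The duplicate handling in the excerpt above can be tried on its own. The following is a standalone sketch of the same counting rule; `createSlugDeduplicator` is a hypothetical name, and a plain dictionary stands in for `_slugCount`:

```typescript
// Returns a function that leaves a slug unchanged on first use and
// appends "-1", "-2", ... on repeats, following the _slugCount logic.
function createSlugDeduplicator(): (slug: string) => string {
    // Null-prototype object, so slugs like "constructor" are not found by accident.
    const seen: Record<string, number> = Object.create(null);
    return (slug) => {
        if (slug in seen) {
            seen[slug] += 1;
            return slug + "-" + seen[slug];
        }
        seen[slug] = 0;
        return slug;
    };
}

const dedupe = createSlugDeduplicator();
console.log(dedupe("intro")); // intro
console.log(dedupe("intro")); // intro-1
console.log(dedupe("intro")); // intro-2
console.log(dedupe("usage")); // usage
```

Like the excerpt, this sketch does not record the generated values, so a later heading whose own slug is `intro-1` would still collide.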

Lemmingh added a commit to Lemmingh/vscode-markdown that referenced this issue Sep 15, 2020
## Summary

This commit combines `baec028fcbb0a807da23aeb847ee8e4b539643c0` to `cadab204c2f28ae935498f1dddb5fdc4e3f72766`, and `f9fea764f89cc490f01e890d0a59041868dc9a89` to `2ab1a6ac17635c93aa6b863a856f55cd52e473b3`.

* ⬆ Update dependencies
  * Enable new language features:
    * ts-loader: 6.2.2 -> 8.0.3
    * typescript: 3.5.2 -> 4.0.2
  * Improve HTML entities decoding:
    * entities: 2.0.0 -> 2.0.3
  * Improve syntax highlighting when printing:
    * highlight.js: 9.15.6 -> 10.2.0
    * @types/highlight.js: 9.12.3 -> 9.12.4
  * Others are just regular upgrades.
* 🎨 Refactor function `slugify`
  * Move all the slugify functions to `slugifyMethods`.
  * Add a few comments.
* 🐛 Perform user-required case conversion before calling slugify function
* 🎨 Add type annotations to `markdownEngine.ts`
* ✨ Add `CommonmarkEngine`
* 🐛 Fix the `github` slugify function
  * Use `commonmarkEngine.engine.renderInline()`.
  * Use `entities.decodeHTML()` to decode HTML entities.
  * Correct `getTextInHtml()`.
  * Correct `PUNCTUATION_REGEXP`.
  * Add "Perform full Unicode case conversion".
* 🎨 Optimize the `gitlab` slugify function
* 🔧 Reorganize vscodeignore
* 🔧 Update metadata
  * `categories`.
  * Description of configuration.

## Known issues

"Printing" may not generate correct heading ID.

Failed cases:

```markdown
## `<em>code</em>`
## 😅
```

The second case is related to `markdown-it-emoji`. See
yzhang-gh#809 (comment)
Lemmingh added a commit that referenced this issue Sep 16, 2020
## Summary

* ⬆ Update dependencies
  * Enable new language features:
    * ts-loader: 6.2.2 -> 8.0.3
    * typescript: 3.5.2 -> 4.0.2
  * Improve HTML entities decoding:
    * entities: 2.0.0 -> 2.0.3
  * Improve syntax highlighting when printing:
    * highlight.js: 9.15.6 -> 10.2.0
    * @types/highlight.js: 9.12.3 -> 9.12.4
  * Others are just regular upgrades.
* 🎨 Refactor function `slugify`
  * Move all the slugify functions to `slugifyMethods`.
  * Add a few comments.
* 🐛 Perform user-required case conversion before calling slugify function
* 🎨 Add type annotations to `markdownEngine.ts`
* ✨ Add `CommonmarkEngine`
* 🐛 Fix the `github` slugify function
  * Use `commonmarkEngine.engine.renderInline()`.
  * Use `entities.decodeHTML()` to decode HTML entities.
  * Correct `getTextInHtml()`.
  * Correct `PUNCTUATION_REGEXP`.
  * Add "Perform full Unicode case conversion".
* 🎨 Optimize the `gitlab` slugify function
* 🔧 Reorganize vscodeignore
* 🔧 Update metadata
  * `categories`.
  * Description of configuration.
* ✅ Correct unit test
* 👷 Allow to run CI manually

## Known issues

1. "Printing" may not generate correct heading ID.

Failed cases:

```markdown
## `<em>code</em>`
## 😅
```

The second case is related to `markdown-it-emoji`. See
#809 (comment)

2. Some headings may lead to weird TOC text. Links are not affected. For example:

a][b

Co-authored-by: Lemmingh <43396014+Lemmingh@users.noreply.github.com>
@Lemmingh Lemmingh added Area: Exporting To HTML. Probably also other formats someday. Area: Table of contents Pertaining to table of contents (TOC generation and detection, related heading operations). labels Oct 29, 2020
@Lemmingh Lemmingh added the Area: Link (Reserved) Markdown link processing, URI recognition, slugification. label Aug 25, 2021