Improve markdown + HTML parsing #3

Merged: nvlang merged 6 commits into main from 2-bad-markdown-parsing on Jul 17, 2024

nvlang (Owner) commented on Jul 17, 2024

This PR significantly improves SvelTeX's parsing of content mixing markdown and HTML syntax. Among other things, it includes the following changes:

  • Use sanitize-html to ensure that HTML generated by Markdown
    processor is valid.

  • Refine whitespace adjustment performed before passing markup to the
    Markdown processor.

  • Remove <p> tags within HTML elements or Svelte components that
    cannot contain paragraphs (e.g., <span><p>*text*</p></span>
    becomes <span><em>text</em></span> now, ignoring insignificant
    whitespace).

  • Add markdown.components option to SvelTeX configuration to specify
    how SvelTeX treats each Svelte component during whitespace
    adjustment.

  • Auto-import components "registered" in the markdown.components array
    from the SvelTeX configuration if they are used in the markup and not
    already imported in the file's <script> tag.

    Note: a component is "registered" in the markdown.components array
    iff there exists an object obj in the markdown.components array such
    that all of the following hold:

    • obj.name equals the name of the component (case-sensitive).
    • obj.importPath is not undefined.

  • Add tests for all of the above features and fixes.

  • Add markdown.remarkRehypeOptions and
    markdown.rehypeStringifyOptions options to the SvelTeX
    configuration, for use when the unified Markdown backend is used.

Fixes #2.
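
To illustrate the `<p>`-removal item in the list above, here is a minimal sketch that operates on HTML already produced by the Markdown processor. It is not SvelTeX's actual implementation: the helper name `unwrapParagraph`, the `PHRASING_ONLY` set, and the regex-based approach are all assumptions made for illustration, and the sketch only handles an element whose sole child is a single paragraph.

```typescript
// Hypothetical helper (not part of SvelTeX's API): unwraps a lone <p>…</p>
// that is the only child of an element that cannot contain paragraphs.
// The tag set below is an illustrative subset, not SvelTeX's real list.
const PHRASING_ONLY = new Set(['span', 'em', 'strong', 'a', 'code']);

function unwrapParagraph(html: string): string {
    // Matches <tag attrs> <p>inner</p> </tag>, tolerating insignificant
    // whitespace around the paragraph.
    const pattern = /<([a-z][a-z0-9]*)([^>]*)>\s*<p>([\s\S]*?)<\/p>\s*<\/\1>/gi;
    return html.replace(
        pattern,
        (match: string, tag: string, attrs: string, inner: string) => {
            const cannotHoldParagraphs = PHRASING_ONLY.has(tag.toLowerCase());
            // Leave the markup alone if more than one paragraph is involved.
            const singleParagraph = !/<\/?p\b/i.test(inner);
            return cannotHoldParagraphs && singleParagraph
                ? `<${tag}${attrs}>${inner}</${tag}>`
                : match;
        },
    );
}

console.log(unwrapParagraph('<span><p><em>text</em></p></span>'));
// prints "<span><em>text</em></span>"
console.log(unwrapParagraph('<div><p>kept</p></div>'));
// prints "<div><p>kept</p></div>"
```

A real implementation would more likely walk a parsed tree than use a regular expression; the sketch only shows the intended input and output.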

nvlang added 6 commits July 13, 2024 18:53
micromark is CommonMark compliant by default, whereas marked isn't. This
makes it a more reliable reference parser for our purposes.
-   Use `sanitize-html` to ensure that HTML generated by Markdown
    processor is valid.
-   Refine whitespace adjustment performed before passing markup to the
    Markdown processor.
-   Remove `<p>` tags within HTML elements or Svelte components that
    cannot contain paragraphs (e.g., `<span><p>*text*</p></span>`
    becomes `<span><em>text</em></span>` now, ignoring insignificant
    whitespace).
-   Add `markdown.components` option to SvelTeX configuration to specify
    preferences in regards to how each Svelte component is treated by
    SvelTeX when it comes to whitespace adjustments.
-   Add tests for all of the above features and fixes.
-   Add `markdown.remarkRehypeOptions` and
    `markdown.rehypeStringifyOptions` to SvelTeX configuration when the
    `unified` Markdown backend is used.

Fixes #2.
Auto-import components "registered" in the `markdown.components` array
from the SvelTeX configuration if they are used in the markup and not
already imported in the file's `<script>` tag.

Note: a component is "registered" in the `markdown.components` array
iff there exists an object `obj` in the `markdown.components` array such
that all of the following hold:

-   `obj.name` equals the name of the component (case-sensitive).
-   `obj.importPath` is not `undefined`.
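
The "registered" condition above can be sketched as a small predicate. This is an illustration under stated assumptions, not SvelTeX's code: only `name` and `importPath` come from the description, while the `ComponentEntry` type, the function names, the default-import form, and the regex-based usage check are hypothetical.

```typescript
// Hypothetical entry shape; only `name` and `importPath` are taken from the
// PR description. Real entries may carry further fields.
interface ComponentEntry {
    name: string;
    importPath?: string;
}

// A component is "registered" iff some entry has exactly this name
// (case-sensitive) and a defined importPath.
function isRegistered(components: ComponentEntry[], componentName: string): boolean {
    return components.some(
        (obj) => obj.name === componentName && obj.importPath !== undefined,
    );
}

// Assumed auto-import logic: emit a default import for each registered
// component that appears in the markup and is not already imported.
// Component names are assumed to be plain identifiers (no regex metacharacters).
function missingImports(
    components: ComponentEntry[],
    markup: string,
    script: string,
): string[] {
    return components
        .filter((obj) => obj.importPath !== undefined)
        .filter((obj) => new RegExp(`<${obj.name}[\\s/>]`).test(markup))
        .filter((obj) => !new RegExp(`import\\s+${obj.name}\\s+from`).test(script))
        .map((obj) => `import ${obj.name} from '${obj.importPath}';`);
}

const components: ComponentEntry[] = [
    { name: 'Note', importPath: '$lib/components/Note.svelte' },
    { name: 'Card' }, // no importPath, so not registered
];

console.log(isRegistered(components, 'Note')); // prints true
console.log(isRegistered(components, 'note')); // prints false
console.log(missingImports(components, '<Note>hi</Note> <Card />', ''));
// prints [ "import Note from '$lib/components/Note.svelte';" ]
```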
nvlang added the bug and enhancement labels on Jul 17, 2024
nvlang self-assigned this on Jul 17, 2024
nvlang linked an issue on Jul 17, 2024 that may be closed by this pull request
nvlang merged commit e4018bb into main on Jul 17, 2024
6 checks passed
nvlang deleted the 2-bad-markdown-parsing branch on July 17, 2024 14:57
Labels: bug, enhancement
Linked issue: Bad markdown parsing