fix: ignore legacy magic blocks when stripping comments #1228
Important
I have a provisional band-aid fix for this issue up at https://github.com/readmeio/readme/pull/16356 to relieve the pressure while we get this figured out.
This PR attempts to address the `@readme/markdown` portion of https://linear.app/readme-io/issue/CX-2524/legacy-customer-sumsub-has-pages-with-large-html-blocks-not-loading, where a few legacy projects with very large HTML blocks aren't loading.
This is a regression from https://github.com/readmeio/readme/pull/16342
In the readme app, we strip HTML comments from a document's markdown body. We do this on the server before the markdown is used in the `rdmd` render. See this diff.
The function we use to strip HTML comments is a plain `remarkParse` pipeline that handles either markdown or MDX content:
markdown/lib/stripComments.ts
Lines 15 to 24 in 39cb012
I think there's something going on with the pipeline modifying the magic block syntax, but we've only seen it on html blocks with tons of code within them, so it's hard to pinpoint. I think the best solution is to just skip magic blocks altogether in the parsing.
I made an attempt at doing this in the legacy package but couldn't find a clear path. Here's the PR with my attempt at that. We don't want to actually transform the magic blocks into HTML -- we want to alter the surrounding markdown and return it with the magic blocks untouched. Not sure how to do that with an ast without a lot of complexity! But maybe I'm missing something obvious.
So the best I could do is update the `stripComments` function in `@readme/markdown` to skip magic block syntax. It was much easier to just use regex to extract the blocks before the remark pipeline runs and then restore them afterwards.
🧰 Changes
Updates `stripComments` to use a new `extractMagicBlocks` util that swaps magic blocks with placeholders before parsing the markdown. After the markdown is parsed, the placeholders are replaced with the original magic block text.
🧬 QA & Testing
Do the tests pass? I did link a markdown build from this branch to my readme repo locally and it fixed the issue on clones of the projects in question.
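For reviewers, here is a rough sketch of the placeholder swap described under Changes. It is illustrative only and not the code in this PR: the placeholder format, the `restoreMagicBlocks` helper, and `stripCommentsSketch` are hypothetical names, and a naive comment regex stands in for the real remark pipeline. It assumes the legacy `[block:type] ... [/block]` magic block syntax.

```typescript
// Hypothetical sketch: names, regex, and placeholder format are assumptions.
const MAGIC_BLOCK = /\[block:[^\]]+\][\s\S]*?\[\/block\]/g;
const PLACEHOLDER = /MAGICBLOCKPLACEHOLDER(\d+)END/g;

interface Extracted {
  text: string; // markdown with each magic block swapped for a placeholder
  blocks: string[]; // original magic block text, in document order
}

// Swap every magic block for an indexed placeholder so the parser never sees it.
function extractMagicBlocks(markdown: string): Extracted {
  const blocks: string[] = [];
  const text = markdown.replace(MAGIC_BLOCK, (match) => {
    blocks.push(match);
    return `MAGICBLOCKPLACEHOLDER${blocks.length - 1}END`;
  });
  return { text, blocks };
}

// Put the original magic block text back, byte for byte.
// A replacer function is used so "$" sequences in a block are not interpreted.
function restoreMagicBlocks(text: string, blocks: string[]): string {
  return text.replace(PLACEHOLDER, (whole, index: string) => blocks[Number(index)] ?? whole);
}

// Stand-in for the real pipeline: a naive regex removes HTML comments here,
// whereas the actual function runs the markdown through remark.
function stripCommentsSketch(markdown: string): string {
  const { text, blocks } = extractMagicBlocks(markdown);
  const stripped = text.replace(/<!--[\s\S]*?-->/g, '');
  return restoreMagicBlocks(stripped, blocks);
}

const input = 'Intro <!-- note -->\n[block:html]\n{"html":"<!-- keep -->"}\n[/block]\nOutro';
const output = stripCommentsSketch(input);
console.log(output);
```

In this sketch the comment outside the magic block is removed, while the comment inside the block's JSON payload is returned untouched, which is the behavior the fix is after. A placeholder that could collide with real document text would need a more unique token.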