fix: ignore legacy magic blocks when stripping comments #1228

kevinports · 2025-11-07T04:49:54Z

Important

I have a provisional band-aid fix for this issue up at https://github.com/readmeio/readme/pull/16356 to relieve the pressure while we get this figured out.

This PR attempts to address the @readme/markdown portion of https://linear.app/readme-io/issue/CX-2524/legacy-customer-sumsub-has-pages-with-large-html-blocks-not-loading where a few legacy projects with very large html blocks aren't loading.

This is a regression from https://github.com/readmeio/readme/pull/16342.
In the readme app, we strip HTML comments from a document's markdown body. We do this on the server before the markdown is used in the rdmd render. See this diff.

The function we use to strip html comments is a plain remarkParse pipeline that handles either markdown or mdx content:

markdown/lib/stripComments.ts

Lines 15 to 24 in 39cb012

    
           async function stripComments (doc: string, { mdx }: Opts = {}): Promise<string> { 
        
             const processor = unified() 
        
               .use(remarkParse) 
        
               .use(mdx ? remarkMdx : undefined) 
        
               .use(stripCommentsTransformer) 
        
               .use(remarkStringify); 
        
             const file = await processor.process(doc); 
        
             return String(file); 
        
           }

I think there's something going on with the pipeline modifying the magic block syntax, but we've only seen it on html blocks with tons of code within them, so it's hard to pinpoint. I think the best solution is to just skip magic blocks altogether in the parsing.

I made an attempt at doing this in the legacy package but couldn't find a clear path. Here's the PR with my attempt at that. We don't want to actually transform the magic blocks into HTML -- we want to alter the surrounding markdown and return it with the magic blocks untouched. I'm not sure how to do that with an AST without a lot of complexity! But maybe I'm missing something obvious.

So the best I could do is update the stripComments function in @readme/markdown to skip magic block syntax. It was much easier to just use regex to extract the blocks before the remark pipeline runs and then restore them afterwards.

🧰 Changes

Modify stripComments to use a new extractMagicBlocks util that swaps magic blocks with placeholders before parsing the markdown. After the markdown is parsed the placeholders are restored with the original magic block text.
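For illustration, here is a minimal sketch of the extract-and-restore round trip described above. The regex pattern, the `ExtractedBlock` type, and the `restoreMagicBlocks` helper are assumptions made for this example and may not match the util in this PR; the pattern assumes legacy magic blocks of the form `[block:TYPE] ... [/block]`.

```typescript
// Hypothetical sketch: names and the regex below are assumptions,
// not necessarily what lib/utils/extractMagicBlocks.ts contains.
interface ExtractedBlock {
  token: string; // placeholder left in the markdown while remark runs
  raw: string; // original magic block text, restored verbatim afterwards
}

// Assumed shape of a legacy magic block: [block:TYPE] ...anything... [/block]
const MAGIC_BLOCK_REGEX = /\[block:[^\]]+\][\s\S]*?\[\/block\]/g;

function extractMagicBlocks(markdown: string): { replaced: string; blocks: ExtractedBlock[] } {
  const blocks: ExtractedBlock[] = [];
  let index = 0;

  const replaced = markdown.replace(MAGIC_BLOCK_REGEX, (match) => {
    // Backticks turn the token into a code span so the underscores
    // are not parsed as emphasis by remarkParse.
    const token = `\`__MAGIC_BLOCK_${index}__\``;
    blocks.push({ token, raw: match });
    index += 1;
    return token;
  });

  return { replaced, blocks };
}

function restoreMagicBlocks(markdown: string, blocks: ExtractedBlock[]): string {
  // split/join substitutes literally, so "$" sequences inside the raw
  // block are not treated as replacement patterns.
  return blocks.reduce((out, { token, raw }) => out.split(token).join(raw), markdown);
}
```

In `stripComments`, the idea would be to call `extractMagicBlocks(doc)`, run the existing processor on `replaced`, and pass the stringified result through `restoreMagicBlocks`. This only works if the placeholder code span survives `remarkStringify` unchanged.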

🧬 QA & Testing

Do the tests pass? I did link a markdown build from this branch to my readme repo locally and it fixed the issue on clones of the projects in question.

lib/utils/extractMagicBlocks.ts

kevinports · 2025-11-07T05:02:19Z

+  const replaced = markdown.replace(MAGIC_BLOCK_REGEX, (match) => {
+    // Use backticks so it becomes a code span, preventing remarkParse from 
+    // parsing special characters in the token as markdown syntax
+    const token = `\`__MAGIC_BLOCK_${index}__\``; 
+
+    blocks.push({ token, raw: match });
+    index += 1;
+    return token;
+  });


Tried to fix this here: 81c5885

fix: ignore legacy magic blocks when stripping comments

1b67c9f

github-advanced-security bot found potential problems Nov 7, 2025

lib/utils/extractMagicBlocks.ts (fixed)

cleanup

37c0820

github-advanced-security bot found potential problems Nov 7, 2025

fix: improve regex

81c5885

kevinports marked this pull request as ready for review November 7, 2025 05:26

kevinports requested a review from kellyjosephprice as a code owner November 7, 2025 05:26
