Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: case-insensitive matching of tags #13556

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

benmccann
Copy link
Member

GitHub is giving us some code scanning alerts. I'm not sure these matter that much, but thought I'd take a stab at fixing one: https://github.com/sveltejs/kit/security/code-scanning/8

Copy link

changeset-bot bot commented Mar 6, 2025

🦋 Changeset detected

Latest commit: 5017825

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@sveltejs/amp Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Comment on lines 10 to +12
return html
.replace(/<style([^]+?)<\/style>/, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/g, '')
.replace(/<link[^>]+>/g, (match) => {
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/gi, '')

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data High

This
regular expression
that depends on
library input
may run slow on strings starting with '<script' and with many repetitions of '<script'.
Comment on lines 10 to +11
return html
.replace(/<style([^]+?)<\/style>/, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/g, '')
.replace(/<link[^>]+>/g, (match) => {
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data High

This
regular expression
that depends on
library input
may run slow on strings starting with '<style' and with many repetitions of '<stylea'.
Comment on lines 10 to +12
return html
.replace(/<style([^]+?)<\/style>/, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/g, '')
.replace(/<link[^>]+>/g, (match) => {
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/gi, '')

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain
<script
, which may cause an HTML element injection vulnerability.

Copilot Autofix AI 16 days ago

To fix the problem, we need to ensure that all instances of <script> tags are removed from the input, even if they are nested or malformed. The best way to achieve this is to apply the regular expression replacement repeatedly until no more replacements can be performed. This approach ensures that all occurrences of the targeted pattern are effectively removed.

We will modify the transform function to repeatedly apply the replacement for <script> tags until the input no longer changes. This will be done by introducing a loop that continues to replace <script> tags until the input remains the same.

Suggested changeset 1
packages/amp/index.js

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/amp/index.js b/packages/amp/index.js
--- a/packages/amp/index.js
+++ b/packages/amp/index.js
@@ -9,20 +9,25 @@
 export function transform(html) {
-	return html
-		.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
-		.replace(/<script[^]+?<\/script>/gi, '')
-		.replace(/<link[^>]+>/gi, (match) => {
-			if (/rel=('|")?stylesheet\1/.test(match)) {
-				if (/ disabled /.test(match)) return '';
-				throw new Error(
-					'An AMP document cannot contain <link rel="stylesheet"> — ensure that inlineStyleThreshold is set to Infinity, and remove links from your page template and <svelte:head> elements'
-				);
-			}
+	let previous;
+	do {
+		previous = html;
+		html = html
+			.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
+			.replace(/<script[^]+?<\/script>/gi, '')
+			.replace(/<link[^>]+>/gi, (match) => {
+				if (/rel=('|")?stylesheet\1/.test(match)) {
+					if (/ disabled /.test(match)) return '';
+					throw new Error(
+						'An AMP document cannot contain <link rel="stylesheet"> — ensure that inlineStyleThreshold is set to Infinity, and remove links from your page template and <svelte:head> elements'
+					);
+				}
 
-			return match;
-		})
-		.replace(/<meta[^>]+>/g, (match) => {
-			if (match.includes('http-equiv')) return '';
-			return match;
-		})
-		.replace('</head>', boilerplate + '</head>');
+				return match;
+			})
+			.replace(/<meta[^>]+>/g, (match) => {
+				if (match.includes('http-equiv')) return '';
+				return match;
+			})
+			.replace('</head>', boilerplate + '</head>');
+	} while (html !== previous);
+	return html;
 }
EOF
@@ -9,20 +9,25 @@
export function transform(html) {
return html
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/gi, '')
.replace(/<link[^>]+>/gi, (match) => {
if (/rel=('|")?stylesheet\1/.test(match)) {
if (/ disabled /.test(match)) return '';
throw new Error(
'An AMP document cannot contain <link rel="stylesheet"> — ensure that inlineStyleThreshold is set to Infinity, and remove links from your page template and <svelte:head> elements'
);
}
let previous;
do {
previous = html;
html = html
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/gi, '')
.replace(/<link[^>]+>/gi, (match) => {
if (/rel=('|")?stylesheet\1/.test(match)) {
if (/ disabled /.test(match)) return '';
throw new Error(
'An AMP document cannot contain <link rel="stylesheet"> — ensure that inlineStyleThreshold is set to Infinity, and remove links from your page template and <svelte:head> elements'
);
}

return match;
})
.replace(/<meta[^>]+>/g, (match) => {
if (match.includes('http-equiv')) return '';
return match;
})
.replace('</head>', boilerplate + '</head>');
return match;
})
.replace(/<meta[^>]+>/g, (match) => {
if (match.includes('http-equiv')) return '';
return match;
})
.replace('</head>', boilerplate + '</head>');
} while (html !== previous);
return html;
}
Copilot is powered by AI and may make mistakes. Always verify output.
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
.replace(/<script[^]+?<\/script>/g, '')
.replace(/<link[^>]+>/g, (match) => {
.replace(/<style([^]+?)<\/style>/i, (match, $1) => `<style amp-custom${$1}</style>`)
.replace(/<script[^]+?<\/script>/gi, '')

Check failure

Code scanning / CodeQL

Bad HTML filtering regexp High

This regular expression does not match script end tags like </script >.

Copilot Autofix AI 16 days ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

@Rich-Harris
Copy link
Member

can we just disable the alerts? they're pretty stupid

@benmccann
Copy link
Member Author

I wanted to review the alerts individually before deciding how many were good practice or not. Like I don't think there's any security issue here, but this seems like a decent change to make anyway, so I don't know why we wouldn't merge this PR?

@Rich-Harris
Copy link
Member

I don't think it is a decent change to make. Svelte itself doesn't recognise non-lowercased <script> or <style>, and the elements that are being transformed are (possibly with rare exceptions? I don't recall if it's possible to add scripts/styles in userland in this context) from SvelteKit and are known to be lowercase.

So the most likely outcome is that this change makes no difference whatsoever, but if that doesn't happen it's because we introduced a discrepancy between how tags are treated in different parts of the ecosystem, and added untested/undocumented surface area that would probably become an example of Hyrum's Law one day.

In other words everything CodeQL has found so far is wrong or irrelevant. Feels like a waste of time to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants