feat: decode html entities before sanitizing #40

Merged
merged 1 commit into from
Mar 1, 2022

crookedneighbor
Contributor

Jira

Addresses a reported XSS vulnerability in this package.

If the URLs provided are HTML-encoded, they remain HTML-encoded when the element is created dynamically through the DOM API. However, if they are rendered as HTML from the server, the browser decodes the entities back into readable characters, and the resulting URL has never been sanitized.

This update decodes the HTML entities in the URL before sanitizing it.
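
The decode-then-sanitize order can be sketched roughly as follows. This is only an illustration, not the code in this PR: the names `decodeNumericEntities` and `sanitizeUrlSketch`, the list of blocked schemes, and the `about:blank` fallback are all assumptions made for the example, and only numeric entities are handled.

```javascript
// Hypothetical sketch: decode numeric HTML entities first, then check the scheme.
// Matches decimal (&#106;) and hex (&#x6A;) references, with or without ";".
const NUMERIC_ENTITY = /&#(?:x([0-9a-f]+)|(\d+));?/gi;

// Assumed block list and fallback value, chosen for illustration only.
const UNSAFE_SCHEME = /^(javascript|data|vbscript):/i;
const FALLBACK = "about:blank";

function decodeNumericEntities(input) {
  return input.replace(NUMERIC_ENTITY, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave references outside the Unicode range untouched.
    if (codePoint > 0x10ffff) return match;
    return String.fromCodePoint(codePoint);
  });
}

function sanitizeUrlSketch(url) {
  const decoded = decodeNumericEntities(url);
  // Strip control characters and spaces only for the scheme check,
  // so "jav\tascript:" and "  javascript:" are still recognized.
  const probe = decoded.replace(/[\u0000-\u0020\u007F-\u009F]/g, "");
  return UNSAFE_SCHEME.test(probe) ? FALLBACK : decoded.trim();
}

console.log(sanitizeUrlSketch("&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3Aalert(1)"));
// → about:blank
console.log(sanitizeUrlSketch("https://example.com/?a=1&#38;b=2"));
// → https://example.com/?a=1&b=2
```

Note that the second call returns a URL that differs from its input, which is the kind of transformation that makes the change breaking.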

It's technically a breaking change, since it's transforming the URL, so it'll go out as v6.

crookedneighbor requested a review from a team as a code owner on February 22, 2022, 16:16
"&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29",
"jav	ascript:alert('XSS');",
"  javascript:alert('XSS');",
];

qq: Is there a reference to how these attack vector urls are generated?

Contributor Author

They are HTML-encoded characters: each character can be rendered as itself or as an HTML entity. For instance, j can be written as j, &#106;, &#x6A;, or &#0000106 (see https://www.htmlsymbols.xyz/unicode/U+006A)

Contributor Author

I wish I had an easy go-to site for generating them, but since j is a valid character on its own, encoders don't convert it by default. I had trouble finding a tool that would encode it as an entity.

Most of these values came from the XSS report, and they do decode to the expected values.
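
For anyone who wants to produce similar test values locally, a small encoder along these lines would work. It is a sketch only: `toHexEntities` and `toPaddedDecimalEntities` are hypothetical helper names, and this is not how the values in the XSS report were produced.

```javascript
// Hypothetical helpers for generating entity-encoded test strings.

// Encode every character as a hex entity, e.g. "j" -> "&#x6A".
// The trailing semicolon is optional because HTML parsers accept both forms.
function toHexEntities(text, { semicolon = false } = {}) {
  return Array.from(text, (ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase();
    return `&#x${hex}${semicolon ? ";" : ""}`;
  }).join("");
}

// Encode every character as a zero-padded decimal entity, e.g. "j" -> "&#0000106".
function toPaddedDecimalEntities(text, width = 7) {
  return Array.from(text, (ch) => {
    const dec = ch.codePointAt(0).toString(10).padStart(width, "0");
    return `&#${dec}`;
  }).join("");
}

console.log(toHexEntities("javascript:"));
// → &#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A
console.log(toPaddedDecimalEntities("j"));
// → &#0000106
```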
