Can't sanitize some DOM snippets, e.g. <td>text</td> #190

Closed
edg2s opened this issue Dec 1, 2016 · 4 comments

Comments

@edg2s
Contributor

edg2s commented Dec 1, 2016

input: <td>text</td>
expected: <td>text</td>
actual: text

@cure53
Owner

cure53 commented Dec 1, 2016

That is because certain elements cannot stand on their own, particularly table data elements. They need a parent element to be able to function and make sense. The browser throws away elements that cannot stand on their own - and we use the browser to sanitize :)

This cannot easily be fixed in the core, but a fix should be easy to implement via a hook. Basically, check during element sanitization whether it's a table data element and react accordingly.

Would that work for you?

@edg2s
Contributor Author

edg2s commented Dec 1, 2016

> The browser throws away elements that cannot stand on their own

Only if you parse the string as an entire document using DOMParser. Other techniques such as jQuery.parseHTML will give you a single table cell node from <td>text</td>. If you don't want to change the parser, you could allow DOM nodes as an input, as well as HTML strings.
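
For illustration, here is a minimal sketch of one way to keep a bare table fragment intact when the parser cannot be changed: wrap it in the ancestors it needs before sanitizing, then strip that wrapper from the result. This is similar in spirit to wrapper-based parsing, but it is not jQuery's or DOMPurify's actual code. The names `TABLE_CONTEXT`, `wrapFragment` and `sanitizeFragment` are hypothetical, the sanitizer is passed in as a function, and the unwrapping assumes the sanitizer serializes the wrapper back unchanged.

```javascript
// Hypothetical helper, not part of DOMPurify or jQuery.
// Maps a leading tag name to the ancestor markup it needs so that an
// HTML parser keeps the element instead of dropping it.
const TABLE_CONTEXT = new Map([
  ['td', ['<table><tbody><tr>', '</tr></tbody></table>']],
  ['th', ['<table><tbody><tr>', '</tr></tbody></table>']],
  ['tr', ['<table><tbody>', '</tbody></table>']],
  ['tbody', ['<table>', '</table>']],
  ['thead', ['<table>', '</table>']],
  ['tfoot', ['<table>', '</table>']],
]);

// Look at the first tag of the fragment and add the required ancestors.
function wrapFragment(html) {
  const match = /^\s*<([a-zA-Z][a-zA-Z0-9]*)/.exec(html);
  const tag = match ? match[1].toLowerCase() : '';
  const [open, close] = TABLE_CONTEXT.get(tag) || ['', ''];
  return { open, close, wrapped: open + html + close };
}

// `sanitize` is any string-to-string sanitizer supplied by the caller.
function sanitizeFragment(html, sanitize) {
  const { open, close, wrapped } = wrapFragment(html);
  const clean = sanitize(wrapped);
  // Assumption: the sanitizer returns the wrapper markup unchanged.
  // If it does not, return the sanitized string as-is.
  if (open && clean.startsWith(open) && clean.endsWith(close)) {
    return clean.slice(open.length, clean.length - close.length);
  }
  return clean;
}
```

In a browser this could be called as `sanitizeFragment('<td>text</td>', (s) => DOMPurify.sanitize(s))`. Whether the wrapper survives serialization exactly depends on the browser and the sanitizer configuration, so treat the unwrapping step as something to verify rather than rely on.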

@cure53
Owner

cure53 commented Dec 1, 2016

Indeed.

But jQuery.parseHTML works around that issue with several dirty tricks that we don't wanna employ :)

Being able to throw in DOM nodes and get a sanitized node back is indeed an interesting feature, and we might implement that at some point. But not anytime soon, so in the meantime I recommend using a hook to fix this problem.

@edg2s
Contributor Author

edg2s commented Dec 1, 2016

FYI our use case is Wikipedia's visual editor, where we'd like to be able to pass around user-generated transactions that may include generated HTML fragments. In normal usage these fragments are safe, but there's nothing to stop someone injecting malicious HTML via the client side APIs.
