HTML code changed to character when sanitizing #190
Comments
This is expected behavior. HtmlSanitizer uses AngleSharp, a standards-compliant HTML parser that parses the input before it is sanitized. This results in some entities getting expanded. It works the same way in a browser:
var e = document.createElement("div");
e.innerHTML = "&atilde; &amp;";
e.innerHTML // -> "ã &amp;"
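The parse-then-serialize round trip described above can be sketched without a DOM. This is a simplified model, not AngleSharp's or a browser's actual implementation: the entity table is a tiny hypothetical subset, and `decodeEntities`, `serializeText`, and `roundTrip` are names made up for illustration. It shows why "&amp;" survives (the serializer must re-escape "&") while "&atilde;" comes back as a plain character (the serializer has no reason to escape it).

```javascript
// Tiny, hypothetical subset of the named character references.
const NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">",
  atilde: "\u00E3", deg: "\u00B0", nbsp: "\u00A0",
};

// Parsing step: turn character references into the characters they stand for.
function decodeEntities(html) {
  return html.replace(
    /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g,
    (match, body) => {
      if (body[0] === "#") {
        const isHex = body[1] === "x" || body[1] === "X";
        const code = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
        // Leave out-of-range references untouched instead of throwing.
        return code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      // Unknown names are left as-is in this sketch.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
        ? NAMED_ENTITIES[body]
        : match;
    }
  );
}

// Serializing step: only the characters that are unsafe in text content
// are escaped again; everything else is written out literally.
function serializeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function roundTrip(html) {
  return serializeText(decodeEntities(html));
}

console.log(roundTrip("&atilde; &amp;"));
```

Both the entity form and the literal character denote the same text, so a consumer that decodes the document with the correct character encoding should render them identically.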
Hmm, OK. The issue originally arose from a signature creator tool for Outlook, which seems to have problems resolving the 'ã' character (along with a couple of other characters). Apparently this might be more of an issue with Outlook than with the sanitized HTML.
Perhaps this is an encoding issue, either within Outlook or the code that is generating the email. I'd rather not add code to HtmlSanitizer to work around these kinds of issues in third-party software. Maybe you could add a short snippet of your
When sanitizing an HTML string that contains "&amp;", the entity is left unchanged, as expected.
However, when the same thing is done for "&atilde;", the returned value is ã.
This is, from what I can tell, incorrect behavior.
I haven't tested this extensively, but it also seems to occur for "&deg;" and probably for many other entities.
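If a downstream consumer cannot handle the literal characters, one caller-side option is to re-encode non-ASCII characters as numeric character references after sanitizing. The sketch below is an assumption about how that could be done, not part of HtmlSanitizer; `encodeNonAscii` is a hypothetical helper name, and it expects a string that has already been sanitized.

```javascript
// Replace every non-ASCII character with a decimal numeric character
// reference, leaving existing markup and ASCII entities such as &amp; alone.
function encodeNonAscii(html) {
  // The "u" flag makes the regex match whole code points, so characters
  // outside the Basic Multilingual Plane are not split into surrogates.
  return html.replace(/[^\x00-\x7F]/gu, (ch) => `&#${ch.codePointAt(0)};`);
}

console.log(encodeNonAscii("\u00E3 &amp; 20\u00B0"));
```

The output is pure ASCII, so it no longer depends on the receiving application guessing the character encoding correctly.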