WebUtility.HtmlDecode does not decode all HTML5 character entities #19103
Labels: area-System.Net, design-discussion, enhancement, help wanted (up-for-grabs)
The character entity lookup table currently used inside WebUtility contains only the entity set defined in HTML4, see https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/src/System/Net/WebUtility.cs#L757.
However, the HTML5 spec defines additional entities, see https://www.w3.org/TR/html5/syntax.html#named-character-references.
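A minimal sketch of how the difference could be observed, assuming a console project that references System.Net.WebUtility. The two sample entities are my own picks for illustration: `&hellip;` is part of the HTML4 set, while `&check;` appears only in the HTML5 named character reference list. The sketch prints whatever the runtime returns rather than asserting a result.

```csharp
using System;
using System.Net;

class HtmlDecodeRepro
{
    static void Main()
    {
        // Sample inputs chosen for illustration (not taken from WebUtility's source):
        // "&hellip;" is defined in HTML4, "&check;" only in HTML5.
        string[] samples = { "&hellip;", "&check;" };

        foreach (string input in samples)
        {
            string decoded = WebUtility.HtmlDecode(input);

            // If the entity name is not in the lookup table, the input
            // comes back unchanged, so comparing the two strings shows
            // whether the entity was recognized.
            bool recognized = decoded != input;
            Console.WriteLine($"{input} -> {decoded} (recognized: {recognized})");
        }
    }
}
```

An entity that the lookup table does not know is returned as-is, so any `recognized: False` line points at a name missing from the table.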
I'm willing to send a PR to include the new named character references, if that would be an acceptable change to the behavior of WebUtility.HtmlDecode. Is there any specific guidance for making this change? Are there specific areas that need to be addressed?