With 1.16.2, "<iframe src=\"https://example.com\"/>" remains "<iframe src=\"https://example.com\"/>".
Is it intentional that tags that are not allowed to be self-closing now only get fixed when using the htmlParser?
The intention of the XML parser is to be a generic parser, and not follow the specific rules of the HTML parser.
Your snippet originally worked by a fluke of the implementation: the Tag object that holds these formatting rules was intended for the HTML parser, not the XML parser. In HTML we know what an iframe element is and how it should behave. In the XML parser we shouldn't assume that.
I didn't really intend for the XML parser to be used as a kind of semi-HTML parser, using some of the rules from HTML and ignoring others.
So this changed in #2008 when I added basic namespace support, to enable Math and SVG tags. Now the Tag set is namespaced, and in the XML parser, the HTML namespace is not set and therefore there is no matching iframe tag to return, and so there's no setting to disable self-closing.
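To make the effect of namespacing concrete, here is a minimal sketch of a tag lookup keyed by namespace and name. It is not jsoup's code: `TagRegistrySketch`, `TagRule`, `register`, and `lookup` are hypothetical names chosen for illustration, and the fallback rule for unknown tags is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and are NOT jsoup's API.
final class TagRegistrySketch {
    static final String HTML_NS = "http://www.w3.org/1999/xhtml";

    // The one formatting rule of interest here: may the tag be written as <x/>?
    static final class TagRule {
        final String name;
        final boolean selfClosingAllowed;

        TagRule(String name, boolean selfClosingAllowed) {
            this.name = name;
            this.selfClosingAllowed = selfClosingAllowed;
        }
    }

    private final Map<String, TagRule> known = new HashMap<>();

    private static String key(String namespace, String name) {
        return namespace + "|" + name;
    }

    void register(String namespace, String name, boolean selfClosingAllowed) {
        known.put(key(namespace, name), new TagRule(name, selfClosingAllowed));
    }

    // Assumption: an unknown (namespace, name) pair gets a generic rule
    // that leaves a self-closing tag exactly as it was written.
    TagRule lookup(String namespace, String name) {
        TagRule rule = known.get(key(namespace, name));
        return rule != null ? rule : new TagRule(name, true);
    }

    static TagRegistrySketch withHtmlDefaults() {
        TagRegistrySketch registry = new TagRegistrySketch();
        // iframe is only known inside the HTML namespace
        registry.register(HTML_NS, "iframe", false);
        return registry;
    }
}
```

With this shape, a lookup of `iframe` under the HTML namespace finds the rule that disables self-closing, while the same lookup with no namespace set finds nothing and falls through to the generic rule, which matches the behaviour described above.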
A couple of ways I could think of changing this:
- Implement a namespace stack in the XML parser (there is an implementation in the W3CDom which we might shift out to the XML parser), detect the namespace from an xmlns attribute, and look up tags by that.
- Add a default namespace parameter to the XML parser.
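A rough sketch of how the two options could fit together: a stack seeded with a configurable default namespace, where an xmlns attribute on a start tag overrides the inherited value for that element and its children. This is not the W3CDom implementation; `NamespaceStackSketch`, `enter`, `exit`, and `current` are hypothetical names, and prefixed declarations such as xmlns:svg are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical sketch; NOT jsoup's W3CDom code. Handles only the default
// (unprefixed) xmlns declaration.
final class NamespaceStackSketch {
    private final Deque<String> stack = new ArrayDeque<>();

    // The default namespace parameter: "" for generic XML, or the HTML
    // namespace URI if the caller wants HTML tag rules applied.
    NamespaceStackSketch(String defaultNamespace) {
        stack.push(defaultNamespace);
    }

    // Call on each start tag. An xmlns attribute replaces the inherited
    // namespace; otherwise the element inherits from its parent.
    String enter(Map<String, String> attributes) {
        String namespace = attributes.getOrDefault("xmlns", stack.peek());
        stack.push(namespace);
        return namespace;
    }

    // Call on each end tag. The seeded default is never popped.
    void exit() {
        if (stack.size() > 1) {
            stack.pop();
        }
    }

    String current() {
        return stack.peek();
    }
}
```

The namespace returned by `enter` would then be the key used to look up the tag's formatting rules, so an iframe nested under an element that declares the HTML namespace could pick up the HTML rule without the XML parser assuming HTML everywhere.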
Questions for you:
Can you tell me more about your use case, and why you're not using the HTML parser?
Can you provide a full sample of the HTML (or the source URL) so I can review implementation options?
With 1.16.1 and earlier the following was true