WIP: HTML API: Add an XML serializer. #7408
The HTML Processor understands HTML regardless of how it's written, but many other functions are unable to do so. There are all sorts of syntax peculiarities and semantics that would be helpful to eliminate using the knowledge contained in the HTML Processor. This patch introduces `WP_HTML_Processor::normalize( $html )` as a method which takes a fragment of HTML as input and then returns a serialized version of the input, "cleaning it up" by balancing all tags, providing all missing optional tags, re-encoding all text, removing all duplicate attributes, and double-quote-escaping all attribute values. Core-62036
Co-authored-by: Weston Ruter <westonruter@git.wordpress.org>
If code later in the processing pipeline adds unquoted attributes and doesn't add the requisite space following that, then another parser might find that the solidus is part of the attribute value instead of serving as a self-closing flag. Co-authored-by: Weston Ruter <westonruter@git.wordpress.org>
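The hazard described in this commit message can be sketched in a few lines. This is only an illustration of the HTML rule that an unquoted attribute value runs until whitespace or `>`; the helper name `read_unquoted_value` and the sample `img` markup are made up for the example and are not part of the patch.

```python
HTML_WHITESPACE = "\t\n\f\r "


def read_unquoted_value(markup: str, start: int) -> tuple[str, int]:
    """Consume an unquoted attribute value beginning at index `start`.

    An unquoted value ends only at whitespace or `>`, so a `/` that
    directly follows the value is treated as part of the value.
    """
    end = start
    while end < len(markup) and markup[end] not in HTML_WHITESPACE + ">":
        end += 1
    return markup[start:end], end


# Hypothetical serializer output, with and without the space before the solidus.
without_space = "<img src=cat.png/>"
with_space = "<img src=cat.png />"

value_without_space, _ = read_unquoted_value(without_space, without_space.index("=") + 1)
value_with_space, _ = read_unquoted_value(with_space, with_space.index("=") + 1)

print(value_without_space)  # cat.png/
print(value_with_space)     # cat.png
```

Without the space, the solidus is swallowed into the value; with it, the value ends cleanly and the solidus can act as the self-closing flag.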
Are the above examples actually right? Is the Compare the above code to the example here: https://developer.mozilla.org/en-US/docs/Web/SVG/Element/foreignObject
The above Atom example has basically the same issue - is the Compare to the example here: https://en.wikipedia.org/wiki/Atom_(web_standard)#Example_of_an_Atom_1.0_feed |
Thanks @siliconforks. You're right, in that the new default namespace applies to the But I'm still exploring and trying to understand what needs to occur and how it can be done in order to transform as safely as possible. I'll add |
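The point under discussion, that a default namespace declared on an element also applies to that element itself, can be checked with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` and the `foreignObject` markup that appears later in this thread; it is only an illustration and is not taken from the patch.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# The default namespace is reset on an unprefixed foreignObject.
unprefixed = (
    f'<svg xmlns="{SVG_NS}">'
    f'<foreignObject xmlns="{XHTML_NS}"><p>Hi</p></foreignObject>'
    "</svg>"
)

root = ET.fromstring(unprefixed)
foreign_object = root[0]

# ElementTree reports tags as "{namespace}local-name".
print(root.tag)            # {http://www.w3.org/2000/svg}svg
print(foreign_object.tag)  # {http://www.w3.org/1999/xhtml}foreignObject
```

The `foreignObject` element ends up in the XHTML namespace rather than the SVG namespace, which is why the element needs different treatment from its children.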
@siliconforks after reviewing the XML Names spec, I believe that it's still ideal to change the default namespace on the

    <svg><foreignObject><p>Hi</p></foreignObject></svg>

Should translate into this

    <svg xmlns="http://www.w3.org/2000/svg"><svg:foreignObject xmlns="http://www.w3.org/1999/xhtml"><p>Hi</p></svg:foreignObject></svg>

and this is actually closer to why I originally reset the default namespace on the it gets more complicated with attribute names, but only because they work differently than the element does. attribute names usually don't have a namespace, and
In your example, wouldn't you also need to bind the Like this (adding whitespace to make it more readable):

    <svg xmlns="http://www.w3.org/2000/svg">
      <svg:foreignObject xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg">
        <p>Hi</p>
      </svg:foreignObject>
    </svg>

...or this:

    <svg xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg">
      <svg:foreignObject xmlns="http://www.w3.org/1999/xhtml">
        <p>Hi</p>
      </svg:foreignObject>
    </svg>
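A quick way to confirm the binding requirement is to feed the variants to a namespace-aware parser. The sketch below again assumes Python's standard `xml.etree.ElementTree`; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# The svg: prefix is used but never declared.
unbound = (
    f'<svg xmlns="{SVG_NS}">'
    f'<svg:foreignObject xmlns="{XHTML_NS}"><p>Hi</p></svg:foreignObject>'
    "</svg>"
)

# The prefix is declared on the element that uses it.
bound_on_element = (
    f'<svg xmlns="{SVG_NS}">'
    f'<svg:foreignObject xmlns="{XHTML_NS}" xmlns:svg="{SVG_NS}"><p>Hi</p></svg:foreignObject>'
    "</svg>"
)

# The prefix is declared on the root instead.
bound_on_root = (
    f'<svg xmlns="{SVG_NS}" xmlns:svg="{SVG_NS}">'
    f'<svg:foreignObject xmlns="{XHTML_NS}"><p>Hi</p></svg:foreignObject>'
    "</svg>"
)

try:
    ET.fromstring(unbound)
    unbound_error = None
except ET.ParseError as error:
    # The parser rejects a prefix with no namespace binding in scope.
    unbound_error = str(error)

print("unbound prefix rejected:", unbound_error is not None)

for document in (bound_on_element, bound_on_root):
    root = ET.fromstring(document)
    foreign_object = root[0]
    # foreignObject stays in the SVG namespace; its child is XHTML.
    print(foreign_object.tag, foreign_object[0].tag)
```

Both bound variants produce the same namespaced tree, so where the prefix gets declared is a serialization choice rather than a semantic one.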
@siliconforks yeah but I didn't want to show that in the code snippet. we could, for instance, eagerly add this to the start of any XML output and let it be, or add it piecemeal.

    <p>Out<svg><foreignObject><p>Hi</p></foreignObject></svg>

    <p
      xmlns="http://www.w3.org/1999/xhtml"
      xmlns:h="http://www.w3.org/1999/xhtml"
      xmlns:s="http://www.w3.org/2000/svg"
      xmlns:m="http://www.w3.org/1998/Math/MathML"
    >Out<svg xmlns="http://www.w3.org/2000/svg"><s:foreignObject xmlns="http://www.w3.org/1999/xhtml"><p>Hi</p></s:foreignObject></svg>

anyway, I think this is a minor detail. the point is mainly that we can rely on resetting the default namespace on the integration points, but will likely want to prefix the integration point itself.
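The strategy described in this comment (declare the prefixes eagerly on the root, reset the default namespace at integration points, and prefix the integration point itself) could look roughly like the following. This is a hypothetical Python sketch, not the PHP in the patch: the tuple-based tree, the `serialize` function, and the single-entry `INTEGRATION_POINTS` set are all assumptions made for illustration, and attributes are ignored entirely. Unlike the snippet above, the sketch also emits the closing `</p>` so that the result is well-formed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Prefixes taken from the comment above; declared once on the root element.
PREFIXES = {XHTML: "h", SVG: "s", MATHML: "m"}

# Assumption: only foreignObject is treated as an integration point here.
INTEGRATION_POINTS = {(SVG, "foreignObject")}


def serialize(node, default_ns=None, is_root=True):
    """Serialize a (namespace, name, children) tuple tree; strings are text."""
    if isinstance(node, str):
        return escape(node)

    namespace, name, children = node
    declarations = []
    if is_root:
        declarations += [f'xmlns:{prefix}="{uri}"' for uri, prefix in PREFIXES.items()]

    if (namespace, name) in INTEGRATION_POINTS:
        # Prefix the integration point and reset the default for its children.
        tag = f"{PREFIXES[namespace]}:{name}"
        inner_default = XHTML
    else:
        tag = name
        inner_default = namespace

    if inner_default != default_ns:
        declarations.append(f'xmlns="{inner_default}"')

    opening = " ".join([tag, *declarations])
    body = "".join(serialize(child, inner_default, False) for child in children)
    return f"<{opening}>{body}</{tag}>"


tree = (XHTML, "p", ["Out", (SVG, "svg", [(SVG, "foreignObject", [(XHTML, "p", ["Hi"])])])])
xml_output = serialize(tree)
print(xml_output)

# Round-trip through a namespace-aware parser to check the namespaces.
parsed = ET.fromstring(xml_output)
print(parsed[0][0].tag)     # {http://www.w3.org/2000/svg}foreignObject
print(parsed[0][0][0].tag)  # {http://www.w3.org/1999/xhtml}p
```

Tracking only the current default namespace keeps the sketch small; a real serializer would also need to handle attributes, which the thread notes work differently from element names.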
Trac ticket: Core-62091
Built from #7331
Provides a mechanism to serialize an HTML fragment to the XML syntax. YOU PROBABLY SHOULDN'T USE THIS!!!!
REMEMBER that so-called "XHTML" served without a path ending in `.xml` or without the `Content-type: application/xhtml+xml` HTTP header will render as HTML and ONE SHOULD NOT SERVE XML/XHTML as HTML!!!

Extremely rare cases when it's appropriate to use this
`<content type="html"><p>yay</></content>`, but if the document can be serialized into `<content type="xhtml" xmlns="http://www.w3.org/1999/xhtml"><p>yay</p></content>`.

HTML generally cannot be expressed in XML, and according to the HTML specification, using the XML syntax is not recommended! Prefer escaping the HTML to avoid corruption and data loss.
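As a rough illustration of the preferred escaping route, the sketch below embeds an HTML fragment as escaped text inside a `content` element and reads it back unchanged. It uses Python's standard library rather than anything from this patch, and the sample fragment and the bare `content` element (shown without the Atom namespace) are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# An HTML fragment that is not well-formed XML: unclosed <p>, void <br>.
html_fragment = "<p>yay<br>fish & chips"

# escape() replaces &, < and > so the fragment travels as plain text.
atom_content = f'<content type="html">{escape(html_fragment)}</content>'
print(atom_content)

# An XML parser hands the original HTML back byte for byte.
round_tripped = ET.fromstring(atom_content).text
print(round_tripped == html_fragment)  # True
```

Because the HTML is carried as text, nothing has to be re-expressed in XML syntax, so there is no opportunity for the fragment to be altered in transit.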