-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: HTML API: Add set_inner_html()
to HTML Processor
#7326
base: trunk
Are you sure you want to change the base?
Conversation
This method depends on other methods, but this change introduces an idea for how to accomplish setting the inner HTML, leaving supporting details unresolved.
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
* | ||
* @since 6.8.0 | ||
* | ||
* @param string $new_inner_html New HTML to inject as inner HTML for the currently-matched tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any plans to support in the future the HTML fragment as the param in addition to string
? I can see how that might slightly improve developer experience when needing to construct more complex HTML and avoid serializing to HTML before passing to this method to convert back to the HTML fragment instance when insering into document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't in my plans, mostly because I'm wearywary of passing around $processor
objects and think it's important to encourage keeping them local. as stateful objects, surprising things can happen when passing them to functions and getting them back. there's long been a discussion concerning locking them down or making them read only or limiting the work that can be done, but that doesn't exist yet.
On the other hand, when working earlier with templating I had a desire to wrap HTML content in some class indicating that it was safe. Something like that would allow deferred parsing and might be a worthwhile investigation here. On the other other hand I've also been a bit doubtful about the likelihood of people using that. Maybe it would eventually be good to have both as you suggest, where a string would be re-parsed, but a wrapped fragment could in theory be deferred.
In a wrapped model, to achieve performance wins, I think we have to be careful to defer until the last moment when rendering the outermost processor, which may not be easy and may require stacking complicated updates. For example, this code:
$processor = WP_HTML_Processor::create_fragment( '<li>' );
$processor->next_tag( 'li' );
$processor->set_inner_html( WP_HTML_Processor::create_fragment( 'Apples and Oranges' ) );
$outer_processor = WP_HTML_Processor::create_fragment( '<ul>' );
$outer_processor->next_tag( 'ul' );
$outer_processor->set_inner_html( $processor );
This is a challenge because the original processor isn't going to know that it shouldn't apply its update and advance. This also demonstrates some of the questions arising from non-local access of a processor: when passing it into another function it might scan to the end and "finish." What if calling code expects to continue using it after passing it in?
For the time being I see the eager string parsing as the more approachable option with computational weight as the tradeoff. This would leave room in the future for an optimization. Other potential optimizations might also arise, such as creating subtrees of the full tree for edits, but these do get complicated quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the spawn_fragment_parser()
method hasn't been implemented yet. I assume it's going to be very similar to the create_fragment()
method, but the $context
would be inferred from the root document. So, the improvement for the developer could be minimal for the time being in spawn_fragment_parser()
would essentially change the context if necessary on the passed instance of the created fragment. It gets serialized in the next line, so it would get evaluated anyway.
$fragment = $this->spawn_fragment_parser( $new_inner_html );
$new_markup = $fragment->serialize();
In the scenario when using only string as param, the same example would look like this:
$processor = WP_HTML_Processor::create_fragment( '<li>' );
$processor->next_tag( 'li' );
$processor->set_inner_html( (string) WP_HTML_Processor::create_fragment( 'Apples and Oranges' ) );
$outer_processor = WP_HTML_Processor::create_fragment( '<ul>' );
$outer_processor->next_tag( 'ul' );
$outer_processor->set_inner_html( (string) $processor );
So there are two steps:
- the dev needs to explicitly serialize to string
- the processor need to recreate the fragment
Aside. It made me realize one important aspect for the example provided with the outlined implementation. By eagerly processing HTML when inserting inner HTML, it will happen in a bottom up fashion. In effect, the processor will have to serialize the 'Apples and Oranges' fragment multiple times as the fragments build on top of each other. At every stage, it will know more about the entire document so in certain scenarios it might even impact the produced HTML. Just to illustrate it better:
- first fragment
<p>My text</p>
-> entire paragraph processed - second fragment
<blockquote></blockquote>
-> has to process<blockquote><p>My text</p></blockquote>
- third fragment
<ul><li></li></ul>
-> has to process<ul><li><blockquote><p>My text</p></blockquote></li></ul>
So this isn't something I initially thought about, but after thinking a bit more about it, it's another part where things are getting interesting.
This is a challenge because the original processor isn't going to know that it shouldn't apply its update and advance. This also demonstrates some of the questions arising from non-local access of a processor: when passing it into another function it might scan to the end and "finish." What if calling code expects to continue using it after passing it in?
I didn't think about that earlier. Yes, that is indeed interesting. When passing as a string, you have to call get_updated_html()
so that means the fragment will end up at the end of the document. So that part becomes predictable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've sketched an implementation for the spawn_fragment_parser
method in dmsnell#22. The algorithm is well described in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #7144 some prior work on different fragment contexts.
/** | ||
* Normalize an HTML string by serializing it. | ||
* | ||
* This removes any partial syntax at the end of the string. | ||
* | ||
* @since 6.7.0 | ||
* | ||
* @param string $html Input HTML to normalize. | ||
* | ||
* @return string|null Normalized output, or `null` if unable to normalize. | ||
*/ | ||
public static function normalize( string $html ): ?string { | ||
return static::create_fragment( $html )->serialize(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented in #7331.
This method depends on other methods, but this change introduces an idea for how to accomplish setting the inner HTML, leaving supporting details unresolved.
Trac ticket:
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.