Add optional callback argument to serialize_block(s) #5187

ockham · 2023-09-11T16:53:28Z

For some operations, we need to traverse the parsed block tree (before it is eventually serialized). In order to minimize the number of tree traversals (which is a potentially costly operation), it seems that adding a callback function argument to serialize_block() -- which inevitably has to traverse the entire tree anyway -- is a natural fit.

Examples where this could come in handy:

theme arg injection into a block template loaded from a file.
automatic insertion of hooked blocks (see https://core.trac.wordpress.org/ticket/59313)

Spun off from #5158.

Trac ticket: https://core.trac.wordpress.org/ticket/59327

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

dlh01 · 2023-09-12T02:50:10Z

Hi! Please forgive my dropping in with a few comments, but this PR was interesting to me.

As a regular reader of the block API, I found the $callback parameter somewhat unclear. In the context #5158, of course, it makes more sense. Otherwise, $callback seems just as likely to apply at the end of the function as to the beginning. Perhaps it acts as replacement logic for the whole function. A longer name like $tree_callback or something might help disambiguate.

Next, after thinking about the variable name, I wondered why this function would introduce a dedicated parameter for what most other functions would make accessible as a filter — pre_serialize_block or something. My instinct as a developer at an agency would be that if core has a need to modify the behavior of serialize_block(), doing so with a filter would be more in keeping with the extensibility of the project and preserve the ability for developers to integrate with core.

Still, I can think of some reasons against a filter, such as being wary about adding a filter to a hot code path, a desire to keep the new behavior in #5158 "private," or disliking the pattern of adding and removing one-off filters. So that leads to my final thought, which is, rather than add a callback to serialize_block(), create a separate function for traversing the block tree, and call that before serialize_blocks(), e.g.

$pattern            = $this->registered_patterns[ $pattern_name ];
$blocks             = parse_blocks( $pattern['content'] );
$visitor            = _parsed_block_visitor( $pattern );
$blocks             = map_block_tree( $visitor, $blocks ); // Here
$pattern['content'] = serialize_blocks( $blocks );

This new function could be @access private if you thought that was important, although I can think of some cases where we could have used such a function with our clients.

Regardless, a new function would keep an added responsibility out of serialize_block(s)() and still allow you to modify the block tree without interference when Block Hooks required doing so.

Thanks for considering it!

src/wp-includes/blocks.php

tests/phpunit/tests/blocks/serialize.php

gziolo · 2023-09-12T10:38:51Z

@dlh01, you shared great feedback in your comment about $callback. It's a difficult topic as serialize_blocks needs to be extremely performant.

This new function could be @access private if you thought that was important, although I can think of some cases where we could have used such a function with our clients.

Regardless, a new function would keep an added responsibility out of serialize_block(s)() and still allow you to modify the block tree without interference when Block Hooks required doing so.

This would work, but that would require traversing the tree of blocks twice. You would essentially have to recreate recursion inside map_block_tree. What's interesting, a very similar helper is already present in the code: _inject_theme_attribute_in_block_template_content.

My instinct as a developer at an agency would be that if core has a need to modify the behavior of serialize_block(), doing so with a filter would be more in keeping with the extensibility of the project and preserve the ability for developers to integrate with core.

Yes, that would definitely be my favorite approach. I'm very hesitant to go this path though, as it's hard to predict the consequences. In this case, it's safer to start small and allow modifications only when loading the templates and patterns from the files on the server, as we know it will get processed only once. In fact, Block Hooks feature will most likely include a filter as part of the flow so plugin devs can cover more advanced use cases where they have specific limitations for automatically inserting blocks.

Anyway, let's keep the discussion going as we implement the basic functionality to enable the feature for WP 6.4 beta 1.

ockham · 2023-09-12T11:00:02Z

Hey @dlh01, thanks a lot for your thoughtful feedback! I think you're making a number of great points there, so it's much appreciated 😄

As a regular reader of the block API, I found the $callback parameter somewhat unclear. In the context #5158, of course, it makes more sense. Otherwise, $callback seems just as likely to apply at the end of the function as to the beginning. Perhaps it acts as replacement logic for the whole function. A longer name like $tree_callback or something might help disambiguate.

Yeah, I found the naming part hard. Eventually, I went with the fairly generic $callback, as is also used by somewhat similar PHP functions such as array_map.

I'm a bit reluctant to use $tree_callback as it seems to put too much focus on the tree aspect 🤔 $block_callback also comes to mind, but to me, that evokes more something like a render_block callback... Naming is hard! 😅 Maybe I can solve this by a better argument description in the PHPDoc?

Next, after thinking about the variable name, I wondered why this function would introduce a dedicated parameter for what most other functions would make accessible as a filter — pre_serialize_block or something. My instinct as a developer at an agency would be that if core has a need to modify the behavior of serialize_block(), doing so with a filter would be more in keeping with the extensibility of the project and preserve the ability for developers to integrate with core.

Still, I can think of some reasons against a filter, such as being wary about adding a filter to a hot code path, a desire to keep the new behavior in #5158 "private," or disliking the pattern of adding and removing one-off filters.

Yes, that was pretty much my thought process! I'll add that the main reason against a filter was that serialize_block is a fairly low-level function that's used in a variety of different contexts, so it seemed risky to require the user to remove the filter after serialization to make sure it wouldn't be accidentally applied when running serialization in a different context. Finally, if we have different parts of the codebase add different filters for serialize_blocks (to be used in different contexts), it could also get a bit hard to tell which ones apply to a certain code path, since the addition and removal of filters isn't necessarily colocated with the serialize_block() call.

So that leads to my final thought, which is, rather than add a callback to serialize_block(), create a separate function for traversing the block tree, and call that before serialize_blocks(), e.g. [...]

I also considered this option 😅 but ultimately decided against. Two reasons here:

Efficiency: It would duplicate tree traversal, which is a potentially costly operation. As stated in the PR description, serialize_block has to recursively traverse the tree; if we need to traverse it for another purpose, we might as well do it here 😄
Established pattern: There's arguable some precedent for a pattern like this -- application of a function to every element of a data structure while it is being traversed (the basic one being once again array_map). In cases like these, languages like PHP (but also JS) tend to accept the callback as a function argument.

FWIW, I was considering still some other alternatives/variations (e.g. even re-implementing serialize_blocks itself using a variation of your map_block_tree function, and/or accepting an array of callbacks rather than just an individual one) but eventually landed on the present approach since it seemed to strike a good balance between the requirements described above, and pretty good backwards-compatibility (via adding a new argument to an existing function). It also allowed keeping serialize_blocks perfectly generic and ignorant of applications such as hooked block insertion, as that can be completely contained in the callback.

With all that said, I could see us adding a generic filter to serialize_block like you suggested later if a strong need arises, but for the time being, I'd rather go with the callback argument, for the above reasons 😊

Edit: I only saw that @gziolo already replied to a similar effect after I submitted my comment 😅

gziolo

This looks good. We still might want to improve the part related to how the new $callback param is called, or exercise a different pattern for fulfilling the same task as discussed in the comments. Let's ensure it doesn't delay the process of integration the Block Hooks into WP 6.4 beta 1.

ockham · 2023-09-12T12:34:39Z

Thank you @gziolo!

I'll make one minor change to the PHPDoc for $callback to clarify what it's for.

ockham · 2023-09-12T13:01:36Z

Committed to Core in https://core.trac.wordpress.org/changeset/56557/.

ockham · 2023-09-12T13:07:49Z

BTW here's a basic first change that's going to leverage the new callback arg: #5192 😊

(Test coverage to guard against regressions is being added in #5186.)

dlh01 · 2023-09-12T14:39:41Z

Thanks @ockham and @gziolo for your comments! (And thanks @ockham for the generous props.)

Both of you are right, of course, that a separate function would require traversing the block tree again, which is not optimal. I would say in response, first, that a separate function leaves the door open to refactoring the logic in both functions to reduce the duplication with minimal effects on backwards compatibility.

Second, the performance effect of traversing the tree twice would be felt in only the locations where the new function was called, whereas the current implementation commits core to checking is_callable() before serializing every block everywhere. On a client project that we recently finished, for example, serialize_blocks() is called three times just within the scope of the_content as part of fulfilling various requirements for the site. I don't have the numbers to show it one way or the other, but I wonder whether these is_callable()s running all the time would soon add up to more than the cost of the traversals running some of the time, an effect that would reach our projects, at least.

ockham · 2023-09-13T17:09:38Z

Thanks @ockham and @gziolo for your comments! (And thanks @ockham for the generous props.)

Both of you are right, of course, that a separate function would require traversing the block tree again, which is not optimal. I would say in response, first, that a separate function leaves the door open to refactoring the logic in both functions to reduce the duplication with minimal effects on backwards compatibility.

But if it needs serialization in addition to another operation, it will always traverse the tree at least twice, won't it? I'd really rather avoid that as a lower boundary for file-based template processing, when it's clear that it is avoidable.
(Furthermore, I think that there's a risk that with this pattern, people will end up introducing another full tree traversal for each additional operation they would like to perform on the tree, rather than "merging" that operation into an existing traversal.)

Second, the performance effect of traversing the tree twice would be felt in only the locations where the new function was called, whereas the current implementation commits core to checking is_callable() before serializing every block everywhere. On a client project that we recently finished, for example, serialize_blocks() is called three times just within the scope of the_content as part of fulfilling various requirements for the site. I don't have the numbers to show it one way or the other, but I wonder whether these is_callable()s running all the time would soon add up to more than the cost of the traversals running some of the time, an effect that would reach our projects, at least.

That's fair enough (although my hunch is that is_callable will be a fairly quick check). I would still like to continue to stick with the API of a callback argument for serialize_blocks, and to avoid a lower boundary of two tree traversals for file-based templates.

I believe that it should be easy enough to optimize the chosen approach: If the is_callable call proves too costly, we could do something like the following:

function serialize_blocks( $blocks, $callback = null ) {
	$result = '';
	if ( is_callable( $callback ) ) {
		foreach ( $blocks as $block ) {
			$result .= serialize_block( $block, $callback );
		};
	} else {
		$result .= _serialize_block_without_callback( $block );
	}
	return $result;
}

With _serialize_block_without_callback defined like the "old" (pre-callback) serialize_block. A similar check could be added to serialize_block (singular). AFAICS, that would limit the overhead to just one is_callable check.

gziolo · 2023-09-14T08:36:06Z

I checked out of the curiosity whether this change impacted performance metrics reported in trunk with https://github.com/WordPress/wordpress-develop/actions/runs/6159812895/job/16715609415:

It gets compensated with the further refactoring as documented in #5192 (comment).

dlh01 · 2023-09-17T02:07:34Z

But if it needs serialization in addition to another operation, it will always traverse the tree at least twice, won't it? I'd really rather avoid that as a lower boundary for file-based template processing, when it's clear that it is avoidable. (Furthermore, I think that there's a risk that with this pattern, people will end up introducing another full tree traversal for each additional operation they would like to perform on the tree, rather than "merging" that operation into an existing traversal.)

I think you sort of answered your own question 😄 Traversing the tree twice is definitely avoidable. In my original comment, I suggested a separate function to apply the callback throughout the tree before serialization occurs. Another way of achieving the goal of minimizing the impact to serialize_block(s) would be to reverse your suggestion of a new function that serializes without a callback and create a _serialize_block(s)_with_callback().

Both new functions would, though, put us back to where we started, which is that the logic to recurse through the tree would be recreated.

I think at the heart of my original comment was the sense that it would be preferable to accept that duplicated logic in a new function that served the specific purposes presented by block hooks, compared to permanently encoding it in the main serialize_block(s) functions.

A new internal function can be deprecated and forgotten as the block API takes on new use cases (for example, if it becomes necessary to merge multiple operations into a traversal), but the new arguments to the existing functions will live on.

Anyway: This is fun to talk about, but I know you have other things to work on, so we can leave it here. I appreciate you indulging me in this discussion and will be happy to talk about it more if it ever comes up again!

ockham · 2023-09-19T08:20:26Z

@dlh01 Just a quick update: Since we need to add a bit more logic, we'll be reverting the change to serialize_block(s) and will introduce a new traverse_and_serialize_block(s) instead. Thanks again for the discussion, it was insightful feedback that eventually influenced this decision! 😄

gziolo · 2023-09-19T10:16:48Z

I opened #5246, which addresses the changes mentioned by @ockham.

dlh01 · 2023-09-20T05:03:52Z

Very interesting, @ockham — thanks for the heads up!

On the subject of sharing code between the functions, the decorator pattern might become helpful as the block API evolves. Here's an example implementation that I was experimenting with after our initial discussion, based on the original changes to serialize_block(s). WordPress doesn't really go for interfaces, but in this example I think they would allow for sharing the innerContent loop without conditionals.

ockham added 2 commits September 11, 2023 16:35

Add optional callback argument to serialize_block(s)

ffe785a

Add test coverage for callback argument

fd58a40

ockham self-assigned this Sep 11, 2023

ockham requested a review from gziolo September 11, 2023 16:53

ockham marked this pull request as ready for review September 11, 2023 16:56

ockham mentioned this pull request Sep 11, 2023

Add optional callback argument to serialize_block(s) #5185

Closed

gziolo reviewed Sep 12, 2023

View reviewed changes

src/wp-includes/blocks.php Outdated Show resolved Hide resolved

gziolo reviewed Sep 12, 2023

View reviewed changes

tests/phpunit/tests/blocks/serialize.php Show resolved Hide resolved

ockham added 2 commits September 12, 2023 13:07

Add docblock to test

b4dc0e5

Change callback arg type to callable|null

d5781ad

gziolo approved these changes Sep 12, 2023

View reviewed changes

Clarify PHPDoc a bit

7015be3

ockham closed this Sep 12, 2023

ockham deleted the add/serialize-blocks-callback-arg branch September 12, 2023 13:01

ockham mentioned this pull request Sep 13, 2023

Inject theme attribute during serialization #5192

Closed

ockham mentioned this pull request Sep 19, 2023

Blocks: Pass parent block, block index, chunk index to serialize_block(s) callback argument #5242

Closed

2 tasks

gziolo mentioned this pull request Sep 19, 2023

Blocks: Introduce a variation of serialize blocks helper with traversing #5246

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add optional callback argument to serialize_block(s) #5187

Add optional callback argument to serialize_block(s) #5187

Uh oh!

ockham commented Sep 11, 2023 •

edited

Loading

Uh oh!

dlh01 commented Sep 12, 2023

Uh oh!

Uh oh!

Uh oh!

gziolo commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023 •

edited

Loading

Uh oh!

gziolo left a comment

Uh oh!

ockham commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023 •

edited

Loading

Uh oh!

dlh01 commented Sep 12, 2023

Uh oh!

ockham commented Sep 13, 2023

Uh oh!

gziolo commented Sep 14, 2023

Uh oh!

dlh01 commented Sep 17, 2023

Uh oh!

ockham commented Sep 19, 2023

Uh oh!

gziolo commented Sep 19, 2023

Uh oh!

dlh01 commented Sep 20, 2023

Uh oh!

Uh oh!

Add optional callback argument to serialize_block(s) #5187

Add optional callback argument to serialize_block(s) #5187

Uh oh!

Conversation

ockham commented Sep 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dlh01 commented Sep 12, 2023

Uh oh!

Uh oh!

Uh oh!

gziolo commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gziolo left a comment

Choose a reason for hiding this comment

Uh oh!

ockham commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023

Uh oh!

ockham commented Sep 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dlh01 commented Sep 12, 2023

Uh oh!

ockham commented Sep 13, 2023

Uh oh!

gziolo commented Sep 14, 2023

Uh oh!

dlh01 commented Sep 17, 2023

Uh oh!

ockham commented Sep 19, 2023

Uh oh!

gziolo commented Sep 19, 2023

Uh oh!

dlh01 commented Sep 20, 2023

Uh oh!

Uh oh!

ockham commented Sep 11, 2023 •

edited

Loading

ockham commented Sep 12, 2023 •

edited

Loading

ockham commented Sep 12, 2023 •

edited

Loading