Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block API: Add pre_render and post_render block filters #10108

Closed
wants to merge 7 commits into from

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Sep 22, 2018

There are numerous needs to process posts at the block level without
demanding that plugin authors implement their own parsing systems.
Some needs involve structural processing - knowing the block type or
specific attributes on the block - while others are simply needing to
process the output HTML from a given block.

In this patch I'm adding an API to allow extending do_blocks
and gutenberg_render_block with a pre-processing and a
post-processing filter.

The main distinction between pre and post filtering is that pre
filtering gives us the chance to control or change how a block is rendered
while the post filtering gives us a chance to change a block's output.

One use case is in #8760 where we want to replace the HTML parts of
blocks while preserving other structure. Another use case could be
removing specific inner blocks or content based on the current user
requesting a post.

See the docblock comments in the patch itself for examples.

@dmsnell dmsnell added [Feature] Block API API that allows to express the block paradigm. [Status] In Progress Tracking issues with work in progress [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f labels Sep 22, 2018
dmsnell added a commit that referenced this pull request Nov 7, 2018
Attempt three at including positional information from the parse to enable isomorphic reconstruction of the source `post_content` after parsing.

See alternate attempts: #11082, #11309
Motivated by: #7247, #8760, Automattic/jetpack#10256
Enables: #10463, #10108

## Abstract

Add new `innerContent` property to each block in parser output indicating where in the innerHTML each innerBlock was found.

## Status

 - will update fixtures after design review indicates this is the desired approach
 - all parsers passing new tests for fragment behavior

## Summary

Inner blocks, or nested blocks, or blocks-within-blocks, can exist in Gutenberg posts. They are serialized in `post_content` in place as normal blocks which exist in between another block's comment delimiters.

```html
<!-- wp:outerBlock -->
Check out my
<!-- wp:voidInnerBlock /-->
and my other
<!-- wp:innerBlock -->
with its own content.
<!-- /wp:innerBlock -->
<!-- /wp:outerBlock -->
```

The way this gets parsed leaves us in a quandary: we cannot reconstruct the original `post_content` after parsing because we lose the origin location information for each inner block since they are only passed as an array of inner blocks.

```json
{
	"blockName": "core/outerBlock",
	"attrs": {},
	"innerBlocks": [
		{
			"blockName": "core/voidInnerBlock",
			"attrs": {},
			"innerBlocks": [],
			"innerHTML": ""
		},
		{
			"blockName": "core/innerBlock",
			"attrs": {},
			"innerBlocks": [],
			"innerHTML": "\nwith its own content.\n"
		}
	],
	"innerHTML": "\nCheck out my\n\nand my other\n\n"
}
```

At this point we have parsed the blocks and prepared them for attaching into the JavaScript block code that interprets them but we have lost our reverse transformation.

In this PR I'd like to introduce a new mechanism which shouldn't break existing functionality but which will enable us to go back and forth isomorphically between the `post_content` and first stage of parsing. If we can tear apart a Gutenberg post and reassemble then it will let us to structurally-informed processing of the posts without needing to be aware of all the block JavaScript.

The proposed mechanism is a new property as a **list of HTML fragments with `null` values interspersed between those fragments where the blocks were found**.

```json
{
	"blockName": "core/outerBlock",
	"attrs": {},
	"innerBlocks": [
		{
			"blockName": "core/voidInnerBlock",
			"attrs": {},
			"innerBlocks": [],
			"blockMarkers": [],
			"innerHTML": ""
		},
		{
			"blockName": "core/innerBlock",
			"attrs": {},
			"innerBlocks": [],
			"blockMarkers": [],
			"innerHTML": "\nwith its own content.\n"
		}
	],
	"innerHTML": "\nCheck out my\n\nand my other\n\n",
	"innerContent": [ "\nCheck out my\n", null, "\n and my other\n", null, "\n" ],
}
```

Doing this allows us to replace those `null` values with their associated block (sequentially) from `innerBlocks`.

## Questions

 - Why not use a string token instead of an array?
    - See #11309. The fundamental problem with the token is that it could be valid content input from a person and so there's a probability that we would fail to split the content accurately.

 - Why add the `null` instead of leaving basic array splits like `[ 'before', 'after' ]`?
    - By inspection we can see that without an explicit marker we don't know if the block came before or after or between array elements. We could add empty strings `''` and say that blocks exist only _between_ array elements but the parser code would have to be more complicated to make sure we appropriately add those empty strings. The empty strings are a bit odd anyway.

 - Why add a new property?
    - Code already depends on `innerHTML` and `innerBlocks`; I don't want to break any existing behaviors and adding is less risky than changing.
@dmsnell dmsnell closed this Nov 12, 2018
@dmsnell
Copy link
Member Author

dmsnell commented Nov 12, 2018

Rebased and restarted - previous head was 1014389

@dmsnell dmsnell reopened this Nov 12, 2018
lib/blocks.php Outdated
$block_type = WP_Block_Type_Registry::get_instance()->get_registered( $block['blockName'] );
$is_dynamic = $block['blockName'] && null !== $block_type && $block_type->is_dynamic();
$global_post = $post;
$pre_render = apply_filters( 'block_pre_render', $block );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All filters must be documented:

https://make.wordpress.org/core/handbook/best-practices/inline-documentation-standards/php/#4-hooks-actions-and-filters

Which would be particularly helpful here because it's not clear to me the purpose in testing for null, or what the use-case is for pre_render.

I might have seen it as something of an inverse where a non-null filtered value would be inferred as the overridden value, similar to what's done in WP_oEmbed::get_html:

https://github.com/WordPress/WordPress/blob/da7a80d67fea29c2badfc538bfc01c8a585f0cbe/wp-includes/class-oembed.php#L375-L394

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @aduth - I'm not ready to document them because I'm not sure what I would document. trying to figure out the interface first 😉

I'll make the description in the PR clearer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, I was just thinking about something along the same line as a solution for #11704.

As @aduth mentioned, the usual method would be to call the filter pre_render_block, and to pass it a null value as the first argument. $block would then be passed as the second argument.

If the pre_render_block rendered a non-null value (including an empty string, as #11704 would do), render_block() would return that value, instead of running the rest of the function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documented!

@aduth aduth mentioned this pull request Nov 15, 2018
@mtias mtias added this to the 4.7 milestone Nov 28, 2018
@dmsnell dmsnell changed the title WIP: Parser: Add filter to process blocks after parsing Block API: Add pre_render and post_render block filters Nov 28, 2018
@dmsnell dmsnell removed [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f [Status] In Progress Tracking issues with work in progress labels Nov 28, 2018
@mtias mtias modified the milestones: 4.7, 4.8 Nov 29, 2018
Allows for blocks to be structurally and textually filtered during the
rendering process. The pre-filter provides the parsed block object for
transformation while the post-filter provides the rendered output of the
block.

These filters can be used in unison to perform a variety of operations
on blocks. They are applied in a depth-first manner when rendering a
block tree so that deeper-nested blocks are processed before their
parents.

> Caveat: Inner blocks aren't structurally filtered in this patch yet
before their parent.

Actually this needs a lot of work still...
@youknowriad youknowriad modified the milestones: 4.8, 4.9 Dec 19, 2018
Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been thinking on the naming / behavior of the filters added here, considering alignment with similar filtering elsewhere in core.

For "pre":

  • In my prior comment, I noted these are sometimes used to short-circuit continued execution of a function (example). I could see this being a valid use case where someone would want to preempt the rendering of the block with their own substitution markup, not just altering attributes of the block. This is still technically achievable via the "post" filter, but only after the render callback has already been called.
  • But there's also precedent for merely "prepare a value before it's used" (example), more like what's done here.
    • Minor, but the distinctly named $source_block and $block caught me off guard, where in most filtering I've seen the assignment occurs to the value being filtered (i.e. just have $block).

For "post":

  • If its purpose is to alter the return value of the function, I would expect its name should more or less align to the name of the function

To the last point, I'm a bit more curious the workflow for changes like this moving forward; whether they ought to be submitted as a pull request to the plugin, or as a proposal in a Trac issue for core.

@dmsnell
Copy link
Member Author

dmsnell commented Jan 9, 2019

Thanks @aduth for the comments here. On pre_ and post_ I am flexible on the terms but specifically what I think they need to communicate is that the pre_ filter has the ability to influence how something renders while the post_ filter has the ability to filter something after it has rendered.

The difference is in doing something like swapping out one block type for another or selecting a specific attribute (such as a locale) or filtering the list of child blocks vs. doing something like running a text filter over the output or scanning for images to cache.

as I went to seek out examples, I notice this has already been added to core with the 'render_block' filter name

sadly the way I remember it is that this change was introduced in Trac and committed without feedback when these pre_ and post_ filters existed in the plugin repo first. I had raised concern at the time before the merge about the render_block filter but that's history now. it has been my hope since before 5.0 that we could quickly deprecate and remove render_block since it doesn't communicate much about what or how it's filtering block data.

@pento
Copy link
Member

pento commented Jan 10, 2019

As this is all in Core now, it should really be done on Trac.

The render_block filter in Core follows the same pattern as similar function/filter combinations. At the end of the function, a filter with the same name is applied to the return value.

The pre_* pattern has two usages, as @aduth noted. These broadly fall into two categories:

  • The most common usage is to short-circuit the function. These are usually provided so that plugins can override the Core behaviour of that function.
  • An older pattern, which we don't use anymore, is the style of filtering data something before it's processed, often for storing in the database. pre_update_option is one such example. The more common pattern for that now is function_name_data, so a filter on $block at the start of render_block would be render_block_data.

So anyway, there's a related core ticket (WP45451) that I've added a patch for pre_render_block and render_block_data filters on.

@aduth
Copy link
Member

aduth commented Jan 11, 2019

it has been my hope since before 5.0 that we could quickly deprecate and remove render_block since it doesn't communicate much about what or how it's filtering block data.

To me, it communicates well that it merely allows an opportunity to override what would otherwise be returned by the function, given the default return value.

@dmsnell
Copy link
Member Author

dmsnell commented Jan 11, 2019

Thanks for linking the Trac ticket @pento - I'll close this issue in favor of that one.

@dmsnell dmsnell closed this Jan 11, 2019
@dmsnell dmsnell deleted the parser/add-post-parse-filter branch January 11, 2019 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Block API API that allows to express the block paradigm.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants