Sanitize invalid children of amp-story and amp-story-page elements to prevent white story of death #3336
Minor nitpicks only.
/**
 * Sanitize the AMP elements contained by <amp-story-page> element where necessary.
 *
 * @since 0.2
 */
public function sanitize() {
	$nodes = $this->dom->getElementsByTagName( self::$tag );
	$num_nodes = $nodes->length;
	$this->amp_story_tag_spec = AMP_Allowed_Tags_Generated::get_allowed_tag( 'amp-story' )[0];
get_allowed_tag() can potentially return null, and this will then throw a notice on PHP 7.4+: https://3v4l.org/pnjnl
However, I assume we fully control the allowed tags here and those we check for here can't be filtered away?
That's correct. If we update the Validator spec and it results in these tag specs being null, then we'd catch it in the unit test.
@schlessera I hardened this in 685cfa1.
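For illustration, one way to guard against a missing tag spec is to check the whole nested path with isset() before reading it. This is only a sketch and not necessarily what 685cfa1 does: lookup_tag_specs() and get_allowed_children() are hypothetical names, and the spec array is a made-up stand-in for the generated validator spec.

```php
<?php
/**
 * Hypothetical stand-in for a spec lookup that returns null for unknown tags.
 * The array shape is an assumption modelled on the keys used in this PR.
 */
function lookup_tag_specs( $tag_name ) {
	$specs = [
		'amp-story' => [
			[ 'tag_spec' => [ 'child_tags' => [ 'child_tag_name_oneof' => [ 'amp-story-page' ] ] ] ],
		],
	];
	return isset( $specs[ $tag_name ] ) ? $specs[ $tag_name ] : null;
}

/**
 * Returns the allowed child tag names, or an empty array when the spec is missing.
 */
function get_allowed_children( $tag_name ) {
	$specs = lookup_tag_specs( $tag_name );

	// isset() on a nested path is safe even when $specs is null, so no notice is raised.
	if ( ! isset( $specs[0]['tag_spec']['child_tags']['child_tag_name_oneof'] ) ) {
		return [];
	}
	return $specs[0]['tag_spec']['child_tags']['child_tag_name_oneof'];
}
```

With this shape, a caller can bail out early when the returned list is empty instead of indexing into a possibly null value.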
	return;
$amp_story_element = $this->dom->getElementsByTagName( 'amp-story' )->item( 0 );
if ( $amp_story_element instanceof DOMElement ) {
	$this->sanitize_story_element( $amp_story_element );
The way this flows seems counterintuitive to me; it makes it look like sanitizing the story element is an edge case.
I would prefer the condition to be inverted with an early return, so that sanitize_story_element() is the default next step.
$node = $element->firstChild;
while ( $node ) {
	$next_node = $node->nextSibling;
	if ( $node instanceof DOMElement ) {
Same logic inversion here: I would prefer an early return (continue in this case) instead of making the main logic look like an edge case.
True, but the reason why I did it this way was because of $node = $next_node needing to run below. Otherwise, I'd have added:
if ( ! $node instanceof DOMElement ) {
	$node = $next_node;
	continue;
}
But that seems worse because the logic is duplicated.
How about something like this:
$node = $element->firstChild;
do {
	$next_node = $node->nextSibling;
	if ( ! $node instanceof DOMElement ) {
		continue;
	}
	if ( 'amp-story-page' === $node->nodeName ) {
		$page_number++;
		$this->sanitize_story_page_element( $node, $page_number );
	} elseif ( ! in_array( $node->nodeName, $this->amp_story_tag_spec['tag_spec']['child_tags']['child_tag_name_oneof'], true ) ) {
		$this->remove_invalid_child( $node );
	}
} while ( $node = $next_node );
Note: This is mostly just preference here. I'll approve the changes and let you decide whether you want to make changes or not.
Thanks, I like that. However, I tried it and then there is a PHPCS complaint: WordPress.CodeAnalysis.AssignmentInCondition.FoundInWhileCondition. We can revisit later.
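As a possible way to revisit this, a for loop keeps the early continue while moving the $node = $next_node step into the loop's third expression, which still runs after a continue. This is a sketch under assumptions: the allow-list and the sample markup are made up for illustration, removal is done inline rather than through remove_invalid_child(), and whether PHPCS accepts this form has not been checked here.

```php
<?php
// Hypothetical allow-list; the real one comes from the validator spec.
$allowed_children = [ 'amp-story-page', 'amp-story-bookend' ];

$dom = new DOMDocument();
$dom->loadXML( '<amp-story><span>3 min read</span><amp-story-page id="a"/>text<amp-story-page id="b"/></amp-story>' );
$element = $dom->documentElement;

$page_number = 0;
$removed     = [];

// The third expression advances the cursor, so `continue` cannot skip it.
for ( $node = $element->firstChild; $node; $node = $next_node ) {
	// Capture the sibling first, because $node may be removed below.
	$next_node = $node->nextSibling;

	if ( ! $node instanceof DOMElement ) {
		continue;
	}

	if ( 'amp-story-page' === $node->nodeName ) {
		$page_number++;
	} elseif ( ! in_array( $node->nodeName, $allowed_children, true ) ) {
		$removed[] = $node->nodeName;
		$element->removeChild( $node );
	}
}
```

Unlike the do/while form, this also copes with an element that has no children, since the loop condition is tested before the first iteration.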
$node = $element->firstChild;
while ( $node ) {
	$next_node = $node->nextSibling;
	if ( $node instanceof DOMElement ) {
I would also prefer an early return/continue here instead.
See reasoning above.
	'story_with_invalid_layer_siblings' => [
		'<amp-story-page><p>Before layer</p><amp-story-grid-layer><p>Lorem Ipsum Demet Delorit.</p></amp-story-grid-layer><p>After layer</p></amp-story-page</p>',
		'<amp-story-page><amp-story-grid-layer><p>Lorem Ipsum Demet Delorit.</p></amp-story-grid-layer></amp-story-page>',
	],
];
Test for CTA removal is missing...?
That should be covered above by story_with_cta_on_first_page and story_with_multiple_cta_on_second_page.
if ( ! isset( $rule_specs ) ) {
	continue;
}
foreach ( $rule_specs as $rule_spec ) {
Note that the $rule_specs array only has one item in it.
A compatibility issue was discovered in #3321 with the Reading Time WP plugin, but it is likely to happen with other plugins as well. The Reading Time WP plugin filters the_content to inject a span at the beginning. This results in an invalid amp-story, which restricts its children to elements like amp-story-page. When this span is a direct child and the child_tag_name_oneof constraint is violated, the result is the entire amp-story being invalid and a white story of death (where the body has no children). The validation error is not helpful at all. This problem was actually “prophesied” in #2926.

So this PR fixes the problem by extending the AMP_Story_Sanitizer to preemptively remove invalid elements under amp-story and amp-story-page. These are the two elements which have the child_tag_name_oneof constraint. This special-case sanitizer is especially important for AMP Stories since all of the markup for a story is in post_content and is prone to be mutated by the_content filters that add elements like word counts, sharing buttons, and related posts. This PR prevents such elements from being seen by the tag-and-attribute sanitizer, thus preventing the amp-story and amp-story-page as a whole from being removed.

In the case of the span which the Reading Time WP plugin adds to the_content, the validation error now becomes much more helpful, and no white story of death occurs.
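The preemptive removal described above can be sketched roughly as follows. This is an illustration under assumptions, not the sanitizer's actual code: remove_disallowed_children() is a hypothetical helper, the allow-lists are shortened stand-ins for the child_tag_name_oneof values in the validator spec, and the sample markup only imitates a span injected by a the_content filter.

```php
<?php
/**
 * Hypothetical helper: removes element children whose tag name is not allowed.
 *
 * @param DOMElement $parent  Element whose direct children are checked.
 * @param string[]   $allowed Allowed child tag names (assumed, not the real spec).
 * @return string[] Tag names of the removed children.
 */
function remove_disallowed_children( DOMElement $parent, array $allowed ) {
	$removed = [];

	// Snapshot the live node list so removals do not disturb iteration.
	foreach ( iterator_to_array( $parent->childNodes ) as $child ) {
		if ( $child instanceof DOMElement && ! in_array( $child->nodeName, $allowed, true ) ) {
			$removed[] = $child->nodeName;
			$parent->removeChild( $child );
		}
	}
	return $removed;
}

$dom = new DOMDocument();
$dom->loadXML(
	'<amp-story><span>Reading time</span><amp-story-page id="p1"><p>Before layer</p>'
	. '<amp-story-grid-layer><p>Hello</p></amp-story-grid-layer></amp-story-page></amp-story>'
);

$story = $dom->documentElement;
$page  = $story->getElementsByTagName( 'amp-story-page' )->item( 0 );

$removed_from_story = remove_disallowed_children( $story, [ 'amp-story-page' ] );
$removed_from_page  = remove_disallowed_children( $page, [ 'amp-story-grid-layer', 'amp-story-cta-layer' ] );
```

Only direct children are checked, so the p inside amp-story-grid-layer is left alone while the stray span and the p sibling of the layer are dropped before any later validation would see them.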
Fixes #3321.