Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of child_tag_name_oneof constraint in tag spec #2926

Merged
merged 1 commit into from
Aug 1, 2019

Conversation

westonruter
Copy link
Member

In reviewing the Newspack theme (https://github.com/Automattic/newspack-theme/pull/120/files#r309371902), I was seeing something very strange in regards to markup like:

<amp-sidebar id="mobile-sidebar" layout="nodisplay" side="right">
	<nav>
		<ul>
			<li><a href="https://example.com/"></a></li>
		</ul>
		<div class="some-div">
			...
		</div>
	</nav>
</amp-sidebar>

The AMP plugin was marking it as invalid as if it was matching the amp-sidebar > nav tag spec, specifically due to the child_tags constraint:

  child_tags: {
    mandatory_num_child_tags: 1
    child_tag_name_oneof: "OL"
    child_tag_name_oneof: "UL"
  }

But this made no sense because this tag spec should not be considered at all because the nav does not have the toolbar or toolbar-target attributes which are also part of that tag spec:

  attrs: {
    name: "toolbar"
    mandatory: true
    # This tagspec sometimes produces errors for non-sidebar NAVs. This
    # dispatch key helps with this somewhat.
    dispatch_key: NAME_DISPATCH
  }
  attrs: {
    name: "toolbar-target"
    mandatory: true
  }

The reason turns out to be that the DOM was being mutated while checking for candidate tag specs to validate against a given node.

When beginning to process_node, we have to gather up a list of candidate tag specs to validate the node against:

/*
* Compile a list of rule_specs to validate for this node
* based on tag name of the node.
*/
$rule_spec_list_to_validate = [];
$rule_spec_list = [];
if ( isset( $this->allowed_tags[ $node->nodeName ] ) ) {
$rule_spec_list = $this->allowed_tags[ $node->nodeName ];
}
foreach ( $rule_spec_list as $id => $rule_spec ) {
if ( $this->validate_tag_spec_for_node( $node, $rule_spec[ AMP_Rule_Spec::TAG_SPEC ] ) ) {
$rule_spec_list_to_validate[ $id ] = $this->get_rule_spec_list_to_validate( $node, $rule_spec );
}
}

This loop is only checking for candidate tag specs; it should not be doing any sanitization. Nevertheless, the validate_tag_spec_for_node() method call there was doing this in its last line:

return ! ( ! empty( $tag_spec[ AMP_Rule_Spec::CHILD_TAGS ] ) && ! $this->check_valid_children( $node, $tag_spec[ AMP_Rule_Spec::CHILD_TAGS ] ) );

The check_valid_children() method call here was erroneously mutating the document, while checking to see if the nav is a match for that tag spec. The actual tag spec that should only be considered here is the nav spec which has no constraints on the children:

# 4.3.4 The nav element
tags: {
  html_format: AMP
  html_format: AMP4ADS
  html_format: AMP4EMAIL
  html_format: ACTIONS
  tag_name: "NAV"
}

This is a bug that was introduced in v1.1 via #1860.

👉 A side-effect of the change here is the sanitization of AMP components with invalid children will be more draconian. For example, if an amp-story has an invalid child element, then the entire amp-story element will be removed, as opposed to the invalid child alone being removed. Nevertheless, since only AMP components have such validation constraints for children, it should not so common for this to occur in WordPress sites that are just outputting normal HTML. It only will become an issue when starting to use AMP components, and this makes #1420 much more important as it will be needed to explain why the element was removed.

@westonruter westonruter added Bug Something isn't working Validation Sanitizers labels Jul 31, 2019
@westonruter westonruter added this to the v1.2.1 milestone Jul 31, 2019
@googlebot googlebot added the cla: yes Signed the Google CLA label Jul 31, 2019
Copy link
Member

@amedina amedina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a tricky one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working cla: yes Signed the Google CLA Sanitizers Validation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants