Handle additional validation errors #310
…for srcset and size
…es to add their own custom tags, attributes and protocols that need to be stripped
…s it could contain valid nodes
@@ -102,6 +107,26 @@ public function get_data() {
		'<a href="http://example.com" target="boom">Link</a>',
		'<a href="http://example.com">Link</a>',
	),

	'a_with_href_mailto' => array(
		'<a href="mailto:email@domain.com">Link</a>',
`mailto` should be okay (it's included in an example here: https://github.com/ampproject/amphtml/blob/eb641b052d44f78b8b83fc91bc64394b7263ee5e/examples/article.amp.html#L203)
This was throwing validation errors in Webmaster Tools as recently as last week. It looks like they've just updated the spec to support it and a few others:
allowed_protocol: "http"
allowed_protocol: "https"
allowed_protocol: "mailto"
# Whitelisting additional commonly observed third party
# protocols which should be safe
allowed_protocol: "sms"
allowed_protocol: "tel"
allowed_protocol: "viber"
allowed_protocol: "whatsapp"
Will update accordingly.
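For reference, a minimal sketch of the protocol check implied by the spec excerpt above. The function name `is_allowed_href_protocol` and its default list are hypothetical, not the plugin's actual code; the list simply mirrors the `allowed_protocol` entries quoted. Treating hrefs without a colon (relative URLs, fragments) as allowed is an assumption of this sketch.

```php
<?php
// Hypothetical helper, not the plugin's actual API: returns true when the
// href's protocol appears in the allowed list (mirroring the spec excerpt).
function is_allowed_href_protocol( $href, $allowed = array( 'http', 'https', 'mailto', 'sms', 'tel', 'viber', 'whatsapp' ) ) {
	// Assumption: no colon means no explicit protocol (relative URL or
	// fragment), which this sketch treats as allowed.
	if ( false === strpos( $href, ':' ) ) {
		return true;
	}
	// Same approach as the PR's src check: take everything before the first colon.
	$protocol = strtolower( strtok( $href, ':' ) );
	return in_array( $protocol, $allowed, true );
}
```

Passing the list as a parameter keeps the built-in protocols in one place while still letting a caller supply a different list.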
With regard to the comment about moving the new filters […] The initial problem is that the base class is using […] However, I can't think of a scenario where you'd ever want to allow someone to modify the built-in list, since its entries will always be invalid. In essence, you'd be letting someone intentionally break validation, which seems pointless. So, the args I've added are only for additional tags/protocols/attributes and do not allow any modification of the built-in list. I think this follows the pattern established with the iframe class and also protects anyone using this plugin against doing something intentionally or accidentally invalid.
…d iframe to use new validation logic
This is ready for final consideration for merging and is no longer a work in progress. I decided to add a new function to […] If an iframe, audio or video node is found to have no valid src, it is removed altogether. This only happens if the […] I've also added a bunch of new unit tests for these scenarios.
$protocol = strtok( $src, ':' );
if ( 'https' !== $protocol ) {
	// Check if https is required
	$https_required = apply_filters( 'amp_require_https_src', false );
I think we should move this to an arg.
Fine with me. Rewrote this and the corresponding unit tests.
$this->sanitize_a_attribute( $node, $attribute );
// Sanitize the tag, but remove it entirely if the href is invalid.
// Children will be preserved as part of the parent.
if ( false === $this->sanitize_a_attribute( $node, $attribute ) ) {
We should separate out the validation and sanitization. No point in sanitizing the attributes if we're just going to drop the link. May have to do the validation before we process any attributes.
Fair point indeed. I've handled this and also added the scenario of `<a name="section2"></a>` both to the validation and unit tests.
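A rough sketch of the validate-first flow discussed here, for illustration only. All three function names are hypothetical and the protocol list is shortened; this is not the code in the PR. Links that fail validation are replaced by their children, and attribute sanitization is skipped for them.

```php
<?php
// Hypothetical: decide whether an <a> node should be kept at all.
function is_valid_a_node( DOMElement $node ) {
	$href = $node->getAttribute( 'href' );
	if ( '' === $href ) {
		// A named anchor such as <a name="section2"></a> is still valid.
		return '' !== $node->getAttribute( 'name' );
	}
	// Assumption: hrefs without a colon are relative and therefore fine.
	if ( false === strpos( $href, ':' ) ) {
		return true;
	}
	$protocol = strtolower( strtok( $href, ':' ) );
	return in_array( $protocol, array( 'http', 'https', 'mailto' ), true );
}

// Hypothetical: remove a node but keep its children in the parent.
function unwrap_node( DOMElement $node ) {
	$parent = $node->parentNode;
	// Move each child in front of the node, then drop the emptied node.
	while ( $node->firstChild ) {
		$parent->insertBefore( $node->firstChild, $node );
	}
	$parent->removeChild( $node );
}

function process_links( DOMDocument $dom ) {
	// Copy to an array first: the live DOMNodeList shifts as nodes are removed.
	$links = iterator_to_array( $dom->getElementsByTagName( 'a' ) );
	foreach ( $links as $link ) {
		if ( ! is_valid_a_node( $link ) ) {
			unwrap_node( $link );
			continue; // Dropped link: nothing left to sanitize.
		}
		// Attribute sanitization for the kept link would go here.
	}
}
```

Validating before touching any attributes means no work is spent cleaning a link that is about to be dropped.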
This is ready for another review.
// If no href is set and this isn't an anchor, it's invalid
if ( empty( $href ) ) {
	if ( ! empty( $node->getAttribute( 'name' ) ) ) {
Breaks in PHP 5.2:
Fatal error: Can't use method return value in write context in /home/travis/build/Automattic/amp-wp/includes/sanitizers/class-amp-blacklist-sanitizer.php on line 124
Fixed.
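For context: before PHP 5.5, `empty()` only accepts a variable, so passing a method's return value directly triggers the fatal error quoted above. A minimal sketch of the usual workaround, which assigns the value to a variable first. The helper name `node_has_name` is hypothetical, and this is not necessarily the exact change made in the PR.

```php
<?php
// Hypothetical helper illustrating the PHP 5.2-safe pattern.
function node_has_name( DOMElement $node ) {
	// Assign first: empty( $node->getAttribute( 'name' ) ) is a fatal
	// error on PHP versions before 5.5.
	$name = $node->getAttribute( 'name' );
	return ! empty( $name );
}
```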
This is ready for another review. Just got back from a few days off and saw/fixed the failing test.
// TODO: `source` does not have closing tag, and DOMDocument doesn't handle it well.
foreach ( $node->childNodes as $child_node ) {
	$new_child_node = $child_node->cloneNode( true );
	$new_node->appendChild( $new_child_node );
	$old_child_attributes = AMP_DOM_Utils::get_node_attributes_as_assoc_array( $new_child_node );
	$new_child_attributes = $this->filter_attributes( $old_child_attributes );
Not going to block merging (I will open a new issue for this) but we should probably filter attributes for the `source` elements separately (since the general attributes don't apply for it).
Fair point and agreed on a new issue
Spotted two things but otherwise this is good to go. Thanks again for working on this!
Fixed or commented on those items, when you're ready to take another look.
- `<a>` tags that have invalid href attributes, but preserves child elements.
- `<font>` tags, which aren't allowed, but preserves child elements.
- `target` attributes which must be lowercase for validation.