-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for amp-bind #895
Conversation
ba8c9cd
to
d266233
Compare
Nice! Generally seems reasonable, but want to see if @choumx has any additional thoughts. |
ampproject/amphtml#11115 would help but haven't had the time. 😓 |
I'm glad that an XHTML-friendly option is being considered, but we'll need to support the existing bracketed syntax as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very exciting ❤️ I left my CR below.
} | ||
|
||
// Save all alternative names in lookup to improve performance. | ||
if ( isset( $attr_spec[ AMP_Rule_Spec::ALTERNATIVE_NAMES ] ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to cache this in the conditional statement above to avoid looping through the $attr_spec[ AMP_Rule_Spec::ALTERNATIVE_NAMES ]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but there are other alternative names than just the bind ones. For example, srcset
is an alt for src
, and xlink:href
is an alternate for href
. These need to be accounted for as well. So that is why there is a separate conditional.
// Prepare whitelists. | ||
$this->allowed_tags = $this->args['amp_allowed_tags']; | ||
foreach ( AMP_Rule_Spec::$additional_allowed_tags as $tag_name => $tag_rule_spec ) { | ||
$this->allowed_tags[ $tag_name ][] = $tag_rule_spec; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is $this->allowed_tags[ $tag_name ]
always an array, amp-share-tracking
doesn't seem to be declared as an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic was just moved from the sanitize
method to the constructor. Yes, it should always an array. The $additional_allowed_tags
is an array of arrays:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, though it is still looking for $this->allowed_tags[ 'amp-share-tracking' ]
which doesn't exists isn't it? To make sure the array is conventionally declared, then we should have an:
if ( ! isset( $this->allowed_tags[ $tag_name ] ) ) {
$this->allowed_tags[ $tag_name ] = array();
}
Like you have done here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially but it's just moved from existing code and it didn't have any problem. And actually, my use of isset()
is actually overkill it seems. It's actually not required in PHP at all. For example:
<?php
error_reporting(E_ALL);
$array = array();
$array['foo']['bar']['baz'] = 'Hello!';
print_r( $array );
Results in:
Array
(
[foo] => Array
(
[bar] => Array
(
[baz] => Hello!
)
)
)
There are no notices or warnings. It works all the way back to PHP 4.3.0 (and beyond potentially): https://3v4l.org/b8ZaG
/** | ||
* Replace AMP binding attributes with something that libxml can parse (as HTML5 data-* attributes). | ||
* | ||
* This is necessary necessary because attributes in square brackets are not understood in PHP and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo mistake necessary necessary
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very necessary 😉
@@ -56,6 +56,8 @@ public static function get_dom( $document ) { | |||
|
|||
$dom = new DOMDocument(); | |||
|
|||
$document = self::convert_amp_bind_attributes( $document ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at this stage it would be beneficial to look at extending the DOMDocument
. For example here is a quick prototype which handles the amp binding conversion:
/**
* Custom AMP DOM Document.
*/
class AMP_Dom_Document extends DOMDocument {
/**
* Class constructor.
*
* @param string $version The version number of the document as part of the XML declaration.
* @param string $encoding The encoding of the document as part of the XML declaration.
*/
public function __construct( $version = null, $encoding = null ) {
parent::__construct( $version, $encoding );
}
/**
* Load HTML.
*
* @param string $source The HTML code
* @param int $options Additional Libxml parameters
* @return boolean True on success or False on failure
*/
public function loadHTML( $source, $options = 0 ) {
$source = AMP_DOM_Utils::convert_amp_bind_attributes( $source );
parent::loadHTML( $source, $options );
}
/**
* Save HTML.
*
* @param DOMNode $node Optional parameter to output a subset of the document.
* @return string The document (or node) HTML code as string
*/
public function saveHTML( DOMNode $node = null ) {
$html = parent::saveHTML( $node );
return AMP_DOM_Utils::restore_amp_bind_attributes( $html );
}
}
Then new AMP_Dom_Document
would be called instead of new DOMDocument
.
Eventually, AMP_Dom_Document
could call the entire sanitization logic making the processing workflow easier to understand and more bullet proof.
Also, instead of removing attributes causing Warning: DOMDocument::loadHTML(): error parsing attribute name
and restoring them after saveHTML()
, we could look at potentially using DOMDocument::registerNodeClass()
in our AMP_DOM_Document::__construct
to allow AMP attributes and values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good idea, but I want to wait to do that until we do the sanitization of the entire doc including the head
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, it is a good enhancement but it's not critical for this first merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is perfectly fine, good to have it as a @todo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ThierryA And this is now done! 😄 There is a shiny new \Amp\AmpWP\Dom\Document
courtesy of @schlessera via #3758.
*/ | ||
private function process_alternate_names( $attr_spec_list ) { | ||
foreach ( $attr_spec_list as $attr_name => &$attr_spec ) { | ||
if ( '[' === $attr_name[0] ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is $this->args['amp_bind_placeholder_prefix']
prefix used for spec attributes like [src]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since such attributes aren't understood by DOMDocument
we add an safe placeholder version of that same attribute (e.g. amp-binding-1234…-src
) and add it to the alternative names so it can be matched instead when the nodes are processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could use another term than binding
in the future if some of the attributes are not specific to the binding component.
* @param array $attr_spec_list Attribute spec list. | ||
* @return array Modified attribute spec list. | ||
*/ | ||
private function process_alternate_names( $attr_spec_list ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called for all tags and then for all attributes of each tag which is quite resource heavy. My understanding is that it is used to add the $this->args['amp_bind_placeholder_prefix']
to the AMP_Rule_Spec::ATTR_SPEC_LIST
which is later on used for the allowed attrs and flagging scripts.
To avoid having to do all these loops, we could perhaps do the conditional check at the is_amp_allowed_attribute()
and process_node()
level since we have access to the single node attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it should just be called once for each tag. It loops over all attributes itself.
By doing it once at the constructor we avoid double-nested foreach
loops inside the sanitize_disallowed_attributes_in_node
method below. Since the double foreach O(n^2) was being done at the node level, it could be extremely poor for performance. So now is_amp_allowed_attribute
is able to use the rev_alternate_attr_name_lookup
to do fast lookups.
Nice!
…On Thu, Dec 19, 2019 at 6:20 PM Weston Ruter ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In includes/utils/class-amp-dom-utils.php
<#895 (comment)>:
> @@ -56,6 +56,8 @@ public static function get_dom( $document ) {
$dom = new DOMDocument();
+ $document = self::convert_amp_bind_attributes( $document );
@ThierryA <https://github.com/ThierryA> And this is now done! 😄 There is
a shiny new \Amp\AmpWP\Dom\Document courtesy of @schlessera
<https://github.com/schlessera> via #3758
<#3758>.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#895?email_source=notifications&email_token=AA3KLWVMJZR6SIJSNAZMOZDQZOUMPA5CNFSM4ENLZFZKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCP2GO2Q#discussion_r359975012>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AA3KLWRIRWYPNH4PF3STMVDQZOUMPANCNFSM4ENLZFZA>
.
--
[image: Google Logo]
Thierry Muller
Developer Relations Program Manager
Engineering - News & Content Ecosystem
thierrymuller@google.com
|
[x]
binding attribute syntax withdata-amp-binding-RANDOMSTRING---x
so PHP can parse it.data-amp-binding...---
as being bracketed instead.amp-bind
component script when it comes across an AMP bound attribute.data-amp-binding-...
with the original syntax.Fixes #836.