Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent styles from being removed when in Customizer preview with Standard mode #4553

Merged
merged 10 commits into from
Apr 9, 2020
25 changes: 14 additions & 11 deletions lib/optimizer/src/Transformer/AmpBoilerplate.php
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@
use DOMElement;

/**
* Transformer that removes <style> and <noscript> tags in <head>, keeping only the amp-custom style tag. It then
* inserts the amp-boilerplate.
* Transformer that removes AMP boilerplate <style> and <noscript> tags in <head>, keeping only the amp-custom style tag.
* It then (re-)inserts the amp-boilerplate.
*
* This is ported from the Go optimizer.
*
Expand All @@ -36,14 +36,14 @@ final class AmpBoilerplate implements Transformer
*/
public function transform(Document $document, ErrorCollection $errors)
{
$this->removeStyleAndNoscriptTags($document);
list($boilerplate, $css) = $this->determineBoilerplateAndCss($document->html);

$this->removeStyleAndNoscriptTags($document, $boilerplate);

if ($this->hasNoBoilerplateAttribute($document)) {
return;
}

list($boilerplate, $css) = $this->determineBoilerplateAndCss($document->html);

$styleNode = $document->createElement(Tag::STYLE);
$styleNode->setAttribute($boilerplate, '');
$document->head->appendChild($styleNode);
Expand All @@ -70,23 +70,26 @@ public function transform(Document $document, ErrorCollection $errors)
/**
* Remove all <style> and <noscript> tags except for the <style amp-custom> tag.
*
* @param Document $document Document to remove the tags from.
* @param Document $document Document to remove the tags from.
* @param string $boilerplate Boilerplate attribute name.
*/
private function removeStyleAndNoscriptTags(Document $document)
private function removeStyleAndNoscriptTags(Document $document, $boilerplate)
{
$headNode = $document->head->firstChild;
$devMode = DevMode::isActiveForDocument($document);
while ($headNode) {
$nextSibling = $headNode->nextSibling;
if ($headNode instanceof DOMElement && (! $devMode || ! DevMode::hasExemptionForNode($headNode))) {
if ($headNode instanceof DOMElement) {
switch ($headNode->tagName) {
case Tag::STYLE:
if (! $headNode->hasAttribute(Attribute::AMP_CUSTOM)) {
if ($headNode->hasAttribute($boilerplate)) {
$document->head->removeChild($headNode);
}
break;
case Tag::NOSCRIPT:
$document->head->removeChild($headNode);
$style = $headNode->getElementsByTagName('style')->item(0);
if ($style instanceof DOMElement && $style->hasAttribute($boilerplate)) {
$document->head->removeChild($headNode);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schlessera this fixes something I noticed before when first reviewing the Optimizer code. I thought at some point the logic of removing all styles other than the amp-custom could cause problems, and it seems my suspicion was correct. So now this logic removes exclusveily the AMP boilerplate styles only, leaving other styles intact.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't too happy with this logic either, but it is what the canonical Go optimizer package uses as well:
https://github.com/ampproject/amppackager/blob/releases/transformer/transformers/ampboilerplate.go#L24-L37

Maybe an issue and/or PR for the Go package would make sense as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But otherwise, you see the changes in this PR as correct and approved?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Our needs are a bit different in that we often need to serve invalid AMP, whereas the Go version probably only ever needs to serve valid AMP. So that's perhaps why it is fine for Go use the logic it does now, but otherwise it seems more robust to do what this PR does now.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem that I see is that if there's a mismatch between the type of AMP that the <html> tag states, and the boilerplate that is already included, the wrong one will stay in place.

So, if the document states it is AMP4EMAIL, but the AMP4ADS boilerplate was added (for whatever reason), the optimizer will only try to remove the AMP4EMAIL boilerplate (and fail), and then re-insert it, with the end result being that two boilerplates are in the document.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally looping over all the possible boilerplates but then paired it down to just the identified one. You think it's best to check all?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that'd be safer. We should always keep in mind that the Optimizer is not at all coupled to the plugin, so we cannot make any assumptions about the actual HTML that gets served into the optimizer. For all we know, it could be a buggy Drupal extension or a Symfony middleware that hands it over to the Optimizer. So, while it's important to keep the needs of the plugin in mind, we also need to keep the Optimizer in a state where it will just work regardless of input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think of 6762054?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach looks good, just not happy with the naming (see code review comments).

break;
}
}
Expand Down
102 changes: 51 additions & 51 deletions lib/optimizer/tests/Transformer/AmpBoilerplateTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -26,27 +26,16 @@ final class AmpBoilerplateTest extends TestCase
*/
public function dataTransform()
{
$inputWithBoilerplate = static function ($html, $boilerplate) {
$htmlDocument = static function ($html, $headEnd = '') {
return TestMarkup::DOCTYPE . $html . '<head>' .
TestMarkup::META_CHARSET . TestMarkup::META_VIEWPORT . TestMarkup::SCRIPT_AMPRUNTIME .
TestMarkup::LINK_FAVICON . TestMarkup::LINK_CANONICAL .
$boilerplate .
$headEnd .
'</head><body></body></html>';
};

$inputWithoutBoilerplate = static function ($html) {
return TestMarkup::DOCTYPE . $html . '<head>' .
TestMarkup::META_CHARSET . TestMarkup::META_VIEWPORT . TestMarkup::SCRIPT_AMPRUNTIME .
TestMarkup::LINK_FAVICON . TestMarkup::LINK_CANONICAL .
'</head><body></body></html>';
};

$expected = static function ($html, $boilerplate) {
return TestMarkup::DOCTYPE . $html . '<head>' .
TestMarkup::META_CHARSET . TestMarkup::META_VIEWPORT . TestMarkup::SCRIPT_AMPRUNTIME .
TestMarkup::LINK_FAVICON . TestMarkup::LINK_CANONICAL .
$boilerplate .
'</head><body></body></html>';
$repeatTwice = static function ($value) {
return array_fill(0, 2, $value);
};

$ampBoilerplate = '<style ' . Attribute::AMP_BOILERPLATE . '>' . Amp::BOILERPLATE_CSS . '</style>' .
Expand All @@ -57,65 +46,76 @@ public function dataTransform()
$amp4EmailBoilerplate = '<style ' . Attribute::AMP4EMAIL_BOILERPLATE . '>' . Amp::AMP4ADS_AND_AMP4EMAIL_BOILERPLATE_CSS . '</style>';

return [
'keeps boilerplate' => [
$inputWithBoilerplate('<html ⚡>', $ampBoilerplate),
$expected('<html ⚡>', $ampBoilerplate),
],
'keeps boilerplate' => $repeatTwice(
$htmlDocument('<html ⚡>', $ampBoilerplate),
),

'keeps boilerplate for amp4ads' => [
$inputWithBoilerplate('<html amp4ads>', $amp4AdsBoilerplate),
$expected('<html amp4ads>', $amp4AdsBoilerplate),
],
'keeps boilerplate again' => $repeatTwice(
$htmlDocument('<html amp>', $ampBoilerplate),
),

'keeps boilerplate for ⚡4ads' => [
$inputWithBoilerplate('<html ⚡4ads>', $amp4AdsBoilerplate),
$expected('<html ⚡4ads>', $amp4AdsBoilerplate),
],
'leaves out boilerplate' => $repeatTwice(
$htmlDocument('<html amp i-amphtml-no-boilerplate>'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed there weren't tests for the i-amphtml-no-boilerplate condition, so I've added a few here as well.

),

'keeps boilerplate for amp4email' => [
$inputWithBoilerplate('<html amp4email>', $amp4EmailBoilerplate),
$expected('<html amp4email>', $amp4EmailBoilerplate),
'removes boilerplate' => [
$htmlDocument('<html amp i-amphtml-no-boilerplate>', $ampBoilerplate),
$htmlDocument('<html amp i-amphtml-no-boilerplate>'),
],

'keeps boilerplate for ⚡4email' => [
$inputWithBoilerplate('<html ⚡4email>', $amp4EmailBoilerplate),
$expected('<html ⚡4email>', $amp4EmailBoilerplate),
],
'keeps boilerplate for amp4ads' => $repeatTwice(
$htmlDocument('<html amp4ads>', $amp4AdsBoilerplate)
),

'keeps boilerplate for ⚡4ads' => $repeatTwice(
$htmlDocument('<html ⚡4ads>', $amp4AdsBoilerplate)
),

'keeps boilerplate for amp4email' => $repeatTwice(
$htmlDocument('<html amp4email>', $amp4EmailBoilerplate)
),

'keeps boilerplate for ⚡4email' => $repeatTwice(
$htmlDocument('<html ⚡4email>', $amp4EmailBoilerplate)
),

'adds boilerplate if missing' => [
$inputWithoutBoilerplate('<html ⚡>'),
$expected('<html ⚡>', $ampBoilerplate),
$htmlDocument('<html ⚡>'),
$htmlDocument('<html ⚡>', $ampBoilerplate),
],

'adds boilerplate if missing for amp4ads' => [
$inputWithoutBoilerplate('<html amp4ads>'),
$expected('<html amp4ads>', $amp4AdsBoilerplate),
$htmlDocument('<html amp4ads>'),
$htmlDocument('<html amp4ads>', $amp4AdsBoilerplate),
],

'adds boilerplate if missing for ⚡4ads' => [
$inputWithoutBoilerplate('<html ⚡4ads>'),
$expected('<html ⚡4ads>', $amp4AdsBoilerplate),
$htmlDocument('<html ⚡4ads>'),
$htmlDocument('<html ⚡4ads>', $amp4AdsBoilerplate),
],

'adds boilerplate if missing for amp4email' => [
$inputWithoutBoilerplate('<html amp4email>'),
$expected('<html amp4email>', $amp4EmailBoilerplate),
$htmlDocument('<html amp4email>'),
$htmlDocument('<html amp4email>', $amp4EmailBoilerplate),
],

'adds boilerplate if missing for ⚡4email' => [
$inputWithoutBoilerplate('<html ⚡4email>'),
$expected('<html ⚡4email>', $amp4EmailBoilerplate),
$htmlDocument('<html ⚡4email>'),
$htmlDocument('<html ⚡4email>', $amp4EmailBoilerplate),
],

'keeps devmode nodes when in devmode' => [
$inputWithBoilerplate('<html ⚡ data-ampdevmode>', '<style data-ampdevmode>h1: red;</style>' . $ampBoilerplate),
$expected('<html ⚡ data-ampdevmode>', '<style data-ampdevmode>h1: red;</style>' . $ampBoilerplate),
],
'leaves styles that lack boilerplate attribute' => $repeatTwice(
$htmlDocument('<html ⚡>', '<style>h1{color:red}</style><noscript><style>h2{color:blue}</style></noscript>' . $ampBoilerplate),
),

'removes devmode nodes when not in devmode' => [
$inputWithBoilerplate('<html ⚡>', '<style data-ampdevmode>h1: red;</style>' . $ampBoilerplate),
$expected('<html ⚡>', $ampBoilerplate),
'leaves styles that lack boilerplate attribute and adds boilerplate' => [
$htmlDocument('<html ⚡>', '<style>h1{color:red}</style><noscript><style>h2{color:blue}</style></noscript>'),
$htmlDocument('<html ⚡>', '<style>h1{color:red}</style><noscript><style>h2{color:blue}</style></noscript>' . $ampBoilerplate),
],

'leaves styles that lack boilerplate attribute and leaves out boilerplate' => $repeatTwice(
$htmlDocument('<html amp i-amphtml-no-boilerplate>', '<style>h1{color:red}</style><noscript><style>h2{color:blue}</style></noscript>'),
),
];
}

Expand Down