-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate manual construction of script tags in WP_Scripts and pass other scripts through wp_print_inline_script_tag() #4773
Conversation
|
||
// Ensure markup is XHTML compatible if not HTML5. | ||
if ( ! $is_html5 ) { | ||
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To ensure XML-compatibility, the $javascript
string should have any instances of ]]>
escaped. It's ugly, but apparently this is what has to be done:
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript ); | |
$javascript = str_replace( ']]>', ']]]]><![CDATA[>', $javascript ); | |
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript ); |
Nevertheless, it is likely exceedingly rare that a WordPress site is actually being served with Content-Type: application/xhtml+xml
(true confessions, I used to do it, but the draconian XML parse error handling was painful). Still, it is unlikely for ]]>
to occur in a script. I say this because if a site is being served as Content-Type: text/html
then the parser HTML parser will ignore these CDATA
sections, and it could end up causing a parse error when the script is passed to the JS interpreter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any concern from adding this now in case the $javascript
passed to this function is already wrapped with that CDATA
markup? Very unlikely I assume, but we may want to add a check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good call. I've done this in ecc29a9. This shows that it is indeed necessary to do the escaping. Without the escaping, doing the following:
wp_print_inline_script_tag( "/* <![CDATA[ */ console.log( 'Hello World!' ); /* ]]> */" );
Results in the following output:
<script type="text/javascript">
/* <![CDATA[ */
/* <![CDATA[ */ console.log( 'Hello World!' ); /* ]]> */
/* ]]> */
</script>
Pasting this into the W3C Validator as a fragment under the XHTML Strict doctype:
Results in an XML parse error:
However, when the escaping is present, the PHP outputs:
<script type="text/javascript">
/* <![CDATA[ */
/* <![CDATA[ */ console.log( 'Hello World!' ); /* ]]]]><![CDATA[> */
/* ]]> */
</script>
And this is valid:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something as bizarre as this could definitely use a link to an XML spec or the source of the reason why these have to be escaped. someone is going to look at that and "fix" it and "improve quality" by removing the escaping 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though there is a comment about the escaping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah because it mentions that it should be compatible but not how or why.
like, how does this fix ensure compatibility?
when I saw it I immediately wondered what rule necessitates this or what breaks without it. I could imagine something like this, to see if I'm properly understanding the code.
/*
* XHTML extracts the contents of the SCRIPT element and then the XML parser
* decodes character references and other syntax elements. This can lead to
* misinterpretation of the script contents or invalid XHTML documents.
*
* Wrapping the contents in a CDATA section instructs the XML parser not to
* transform the contents of the SCRIPT element before passing them to the
* JavaScript engine.
*
* Example
*
* <script>console.log('…');</script>
*
* In an HTML document this would print "…" to the console,
* but in an XHTML document it would print "…" to the console.
*
*
* <script>console.log('An image is <img> in HTML');</script>
*
* In an HTML document this would print "An image is <img> in HTML",
* but it's an invalid XHTML document because it interprets the `<img>`
* as an empty tag missing its closing `/`.
*
* @see https://www.w3.org/TR/xhtml1/#h-4.8
*/
if ( ! $is_html5 ) {
/*
* If the string `]]>` exists within the JavaScript it would break
* out of any wrapping CDATA section added here, so to start, it's
* necessary to escape that sequence which requires splitting the
* content into two CDATA sections wherever it's found.
*
* Note: it's only necessary to escape the closing `]]>` because
* an additional `<![CDATA[` leaves the contents unchanged.
*/
$javascript = str_replace( ']]>', ']]]]><![CDATA[>', $javascript );
// Wrap the entire escaped script inside a CDATA section.
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript );
}
yeah I know it's wordy and lengthy, but it was really confusing to me, and this code is now going to be responsible for general code generation, and may get a lot of eyes on it.
by the way it seems like this will not trigger if the HTTP Content-type: text/html
is what serves the document. I could not uncover these failures without serving as Content-type: application/xhtml+xml
or by directly opening the file with a .xml
extension. The exact same file contents stored with .html
as its extension leads to HTML semantics, thus this doesn't effect it.
Finally, and thankfully, it doesn't appear to be a security issue because Safari, Firefox, and Chrome all prevent escaping from the script using tricks like </script>
. It will change those things into </script>
, but that leads to a JavaScript syntax error, or data corruption if it's found within a JS string.
Including <img>
was fun because that broke the entire page render. It became invalid XML. Again, if served as .html
or with Content-type: text/html
or anything but explicit out-of-band information that it's XML, none of these problem arise.
0f687cb
to
c02d68a
Compare
c02d68a
to
b52c335
Compare
@@ -2845,8 +2862,6 @@ function wp_get_inline_script_tag( $javascript, $attributes = array() ) { | |||
*/ | |||
$attributes = apply_filters( 'wp_inline_script_attributes', $attributes, $javascript ); | |||
|
|||
$javascript = "\n" . trim( $javascript, "\n\r " ) . "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This needs to move up above the applying of the wp_inline_script_attributes
filters so that the final $javascript
is available for computing a CSP hash.
a182421
to
d05efbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @westonruter for the PR.
Do we need to update wp_print_community_events_templates() ?
Co-authored-by: Mukesh Panchal <mukeshpanchal27@users.noreply.github.com>
No. Since these are not |
src/wp-admin/includes/media.php
Outdated
ob_start(); | ||
?> | ||
<script type="text/javascript"> | ||
<script> | ||
addLoadEvent = function(func){if(typeof jQuery!=='undefined')jQuery(function(){func();});else if(typeof wpOnload!=='function'){wpOnload=func;}else{var oldonload=wpOnload;wpOnload=function(){oldonload();func();}}}; | ||
var ajaxurl = '<?php echo esc_js( admin_url( 'admin-ajax.php', 'relative' ) ); ?>', pagenow = 'media-upload-popup', adminpage = 'media-upload-popup', | ||
isRtl = <?php echo (int) is_rtl(); ?>; | ||
</script> | ||
<?php | ||
wp_print_inline_script_tag( trim( str_replace( array( '<script>', '</script>' ), '', ob_get_clean() ) ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really happy with the boilerplate ob_start()
followed by an inline script followed by:
wp_print_inline_script_tag( trim( str_replace( array( '<script>', '</script>' ), '', ob_get_clean() ) ) );
I think ideally there would be a helper to do this automatically. Consider, for example, if wp_print_inline_script_tag()
actually allowed a closure to be provided for the $javascript
parameter in addition to a string. It could handle the output buffering automatically, for instance:
wp_print_inline_script_tag( static function () {
?>
<script>
addLoadEvent = function(func){if(typeof jQuery!=='undefined')jQuery(function(){func();});else if(typeof wpOnload!=='function'){wpOnload=func;}else{var oldonload=wpOnload;wpOnload=function(){oldonload();func();}}};
var ajaxurl = '<?php echo esc_js( admin_url( 'admin-ajax.php', 'relative' ) ); ?>', pagenow = 'media-upload-popup', adminpage = 'media-upload-popup',
isRtl = <?php echo (int) is_rtl(); ?>;
</script>
<?php
} );
When a closure is passed, it could automatically start and end output buffering, strip the script
start and end tags, and trim whitespace. If the output buffer lacks a <script>
it could issue a _doing_it_wrong()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only purpose for having the script
tags in the PHP code is to enable IDEs to do syntax-highlighting, syntax checking, autocompletion, etc. This is valuable so I think we should facilitate it somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This modification to wp_get_inline_script_tag()
has since been reverted in this PR.
Update: The scope has been reduced to not include wp-admin. There are still quite a few places where scripts are being manually printed, primarily in the admin: ack output
|
…d update more non-admin scripts
ff4473a
to
8bcd7a2
Compare
<?php | ||
wp_print_inline_script_tag( | ||
// language=JavaScript | ||
sprintf( 'var _wpWidgetCustomizerPreviewSettings = %s;', wp_json_encode( $settings ) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
presumably because of the // language=JavaScript
comment, which shouldn't be there since this is still PHP?
@westonruter There seem to be number of places where we should use this function. wordpress-develop/src/wp-includes/theme-previews.php Lines 69 to 76 in 1cd9877
wordpress-develop/src/wp-includes/class-wp-embed.php Lines 91 to 95 in 1cd9877
There are also 50 other examples in /wp-admin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting close. There are 5 places in wp-includes that should be updated. We should also consider using wp_add_inline_script
in some places.
@spacedmonkey I've intentionally excluded instances that are specifically used in the wp-admin. This is to reduce scope. The changes in this PR are intended to only relate to the frontend and to the login screen. Some instances in wp-includes are only used in wp-admin (generally) so that's why they aren't included. I'll double-check the ones you identified. I think the wp-admin will need to be a separate effort. For one, there are many many more inline script tags, and secondly, the block/site editor screens have manual script construction in JS which breaks Strict CSP. So that will need to be addressed in Gutenberg. |
This is used exclusively at the
This is used exclusively at the
Since these are for the classic editor which is (usually) only used in the admin, I think it is out of scope.
See #4773 (comment) |
I agree that would make this PR too big. But I would add a Side note, |
I'm not sure this is necessary. If someone wants to fix up other instances, more power to them. Using the function won't hurt. It's just that it won't get all the benefits since there are some blockers for complete coverage. Adding comments seems like it would just create noise. As long it is clear in the ticket that the scope is limited to the frontend and wp-login, I think this is sufficient.
Yes, but since it's only used in wp-admin then we can defer it to fix later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I would like the follow on ticket for changing the admin scripts to be created before this is committed, just so we don't lose track of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@westonruter This basically looks good to me, though I have a few non-blocking concerns that would be great to get your thoughts on.
@@ -1559,7 +1559,7 @@ public function export_preview_data() { | |||
$exports = array( | |||
'navMenuInstanceArgs' => $this->preview_nav_menu_instance_args, | |||
); | |||
printf( '<script>var _wpCustomizePreviewNavMenusExports = %s;</script>', wp_json_encode( $exports ) ); | |||
wp_print_inline_script_tag( sprintf( /** @lang JavaScript */ 'var _wpCustomizePreviewNavMenusExports = %s;', wp_json_encode( $exports ) ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that why you added the /** @lang JavaScript */
here? Is that standardized somehow? Asking since I haven't seen that before.
src/wp-includes/comment-template.php
Outdated
@@ -1366,7 +1366,7 @@ function wp_comment_form_unfiltered_html_nonce() { | |||
|
|||
if ( current_user_can( 'unfiltered_html' ) ) { | |||
wp_nonce_field( 'unfiltered-html-comment_' . $post_id, '_wp_unfiltered_html_comment_disabled', false ); | |||
echo "<script>(function(){if(window===window.parent){document.getElementById('_wp_unfiltered_html_comment_disabled').name='_wp_unfiltered_html_comment';}})();</script>\n"; | |||
wp_print_inline_script_tag( /** @lang JavaScript */ "(function(){if(window===window.parent){document.getElementById('_wp_unfiltered_html_comment_disabled').name='_wp_unfiltered_html_comment';}})();" ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are language injection comments. Supported by PhpStorm, at least: https://www.jetbrains.com/help/phpstorm/using-language-injections.html
$attributes['type'] = 'text/javascript'; | ||
// Keep the type attribute as the first for legacy reasons (it has always been this way in core). | ||
$attributes = array_merge( | ||
array( 'type' => 'text/javascript' ), | ||
$attributes | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
// Ensure markup is XHTML compatible if not HTML5. | ||
if ( ! $is_html5 ) { | ||
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any concern from adding this now in case the $javascript
passed to this function is already wrapped with that CDATA
markup? Very unlikely I assume, but we may want to add a check?
$expected = str_replace( " type='text/javascript'", '', $expected ); | ||
$expected = str_replace( ' type="text/javascript"', '', $expected ); | ||
$expected = str_replace( "/* <![CDATA[ */\n", '', $expected ); | ||
$expected = str_replace( "\n/* ]]> */", '', $expected ); | ||
$expected = str_replace( ' defer="defer"', ' defer', $expected ); | ||
$expected = str_replace( ' async="async"', ' async', $expected ); | ||
|
||
$actual = str_replace( " type='text/javascript'", '', $actual ); | ||
$actual = str_replace( ' type="text/javascript"', '', $actual ); | ||
$actual = str_replace( "/* <![CDATA[ */\n", '', $actual ); | ||
$actual = str_replace( "\n/* ]]> */", '', $actual ); | ||
$actual = str_replace( ' defer="defer"', ' defer', $actual ); | ||
$actual = str_replace( ' async="async"', ' async', $actual ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I'm a fan of having all these exceptions here. Can we somehow ensure the tests pass the relevant values instead? The CDATA
one feels okay to me, but otherwise, I think starting to have exceptions here sets a bad precedent that could eventually move this assertion further away from asserting "equal markup".
Maybe instead create a more specific assertEqualScriptTag
or something like that with these exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about 3ba5135? I improved the normalization process to use the DOM instead, so it will be much more robust and safe. I also made it clear from the method description that it compares with normalizations applied. Lastly, since this method is contained in the Tests_Dependencies_Scripts
class I think this also sufficiently indicates that it is specific to checking script
tag markup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feedback wasn't really about how this was implemented, only about that there are now these "exceptions" where it's not actually equal.
I don't feel strongly about that, so happy to keep it as long as it's only used in this scripts.php
test file. The name is mostly what throws me off. I'd be on board if this was called something more specific to script markup. Maybe we can just rename the assertion since it's (as far as I can tell) only used in this class anyway?
Regarding DOMDocument
, I personally don't trust it. It has so many weird quirks, so IMO the str_replace
from before was more accessible to other developers (FWIW I don't understand half the code you added for the DOMDocument
approach) and less error-prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of DOMDocument
is PHPUnit supports comparing two instances in assertEquals
, and it normalizes non-semantic differences (e.g. attribute order). See #4773 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, assertEquals
is intentionally not strict in how it compares equality. It checks semantic equality as opposed to assertSame
. So ==
instead of ===
. So I think it makes sense.
@@ -12,6 +12,9 @@ | |||
"require": { | |||
"php": ">=7.0" | |||
}, | |||
"suggest": { | |||
"ext-dom": "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could have put this in require
since the extension is required when running tests (and not just the tests being introduced in this PR). But since it is not required to actually run WordPress, I went with suggest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another reason to probably not use this approach and stick to the simple str_replace
that doesn't rely on new dependencies :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it was already using DOMDocument
. It turns out that assertEquals
in PHPUnit supports comparing two DOMElement
instances, and this then accounts for differences in attribute order or whether single vs double quotes are used automatically. So I'm just extending that a bit further in assertEqualMarkup
to normalize a bit more for HTML5 vs XHTML.
And DOMDocument
is being used in other tests as well, so it's not a new dependency. It's just a dependency that wasn't declared before.
31b32ad
to
ecc29a9
Compare
Committed in r56687. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some post-hoc comments. I think there are opportunities to follow-up on this, and particularly do something about str_replace( '<script>', '' )
because that seems dangerous and needlessly inefficient.
overall it looks better than it was though.
@@ -2106,6 +2109,7 @@ public function remove_frameless_preview_messenger_channel() { | |||
} )(); | |||
</script> | |||
<?php | |||
wp_print_inline_script_tag( str_replace( array( '<script>', '</script>' ), '', ob_get_clean() ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, we don't want to arbitrarily replace these tags. we know they exist at the front and back of the string, so by manually removing them we can avoid unintentional HTML syntax poisoning.
$script_html = ob_get_clean();
$script_html = substr( $script_html, strlen( '<script>' ), -strlen( '</script>' ) );
the strlen
calls shouldn't add any overhead because that's stored in the string object, which here is a string literal, which we've already created in this patch inside the array.
this change is both safer, deterministic, and more efficient, particularly in the worse-case, though given that we expect very short args here I would be surprised if it's a measurable impact. still, doing less is faster than doing more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch @dmsnell!
@westonruter IMO this is worth a quick follow up commit applying this throughout.
If we feel strongly about introducing a helper at this point, I wouldn't be opposed to that either. Though I think that should remain an @access private
function if we add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Private helper function added in #5301
@@ -2106,6 +2109,7 @@ public function remove_frameless_preview_messenger_channel() { | |||
} )(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is out of scope for this PR, but looks like we could prevent some other corruption here by using URLSearchParams
instead of string-searching the query of the URL. no need for DOM either.
url = new URL( location.href );
queryParams = url.searchParams;
if ( queryParams.has( 'customize_messenger_channel' ) ) {
queryParams.delete( 'customize_messenger_channel' );
url.search = queryParams;
location.replace( url );
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, this code dates back to a time before we could rely on URL
or URLSearchParams
. I made a similar change recently to wp-embed
(r56383). Probably should be put into a new defect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed this in Core-59480
@@ -5012,6 +5019,7 @@ public function customize_pane_settings() { | |||
?> | |||
</script> | |||
<?php | |||
wp_print_inline_script_tag( str_replace( array( '<script>', '</script>' ), '', ob_get_clean() ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it help to create a helper function for this specific task?
/**
* Removes leading and trailing _empty_ script tags.
*
* This is a helper meant to be used for literal script tag construction
* within wp_print_inline_script_tag(). It removes the literal values of
* "<script>" and "</script>" from around an inline script.
*
* @since 6.4.0
*
* @param string $contents Script body with manually created SCRIPT tag literals.
* @return string Script body without surrounding script tag literals, or
* original contents if both exact literals aren't present.
*/
function wp_remove_surrounding_empty_script_tags( $contents ) {
$opener = '<script>';
$closer = '</script>';
$has_both_empty_tags = (
strlen( $opener ) + strlen( $closer ) > strlen( $contents ) ||
substr( $contents, 0, strlen( $opener ) ) !== $opener ) ||
substr( $contents, -strlen( $closer ) ) !== $closer
);
return $has_both_empty_tags
? substr( $contents, strlen( $opener ), -strlen( $closer ) )
: $contents;
}
then these repetitive calls could be marginally nicer. even with the checks for the leading and trailing SCRIPT tags, this should still be faster than str_replace()
, but more importantly, more secure and resilient against breaking syntax.
$script_body = wp_remove_surrounding_empty_script_tags( ob_get_clean() );
wp_print_inline_script_tag( $script_body );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a syntax error in the $has_both_empty_tags
condition. Could you fix and add comments explaining the conditions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure; also just a note - all my examples are almost always wrong and are only meant to convey ideas. if you ever get the impulse, please never copy code of mine from a comment and paste it into a project. don't trust me 😄 I don't test these examples or vet them for WPCS' prefences.
/**
* Removes leading and trailing _empty_ script tags.
*
* This is a helper meant to be used for literal script tag construction
* within wp_print_inline_script_tag(). It removes the literal values of
* "<script>" and "</script>" from around an inline script.
*
* @since 6.4.0
*
* @param string $contents Script body with manually created SCRIPT tag literals.
* @return string Script body without surrounding script tag literals, or
* original contents if both exact literals aren't present.
*/
function wp_remove_surrounding_empty_script_tags( $contents ) {
$opener = '<script>';
$closer = '</script>';
$has_both_empty_tags = (
strlen( $opener ) + strlen( $closer ) > strlen( $contents ) ||
substr( $contents, 0, strlen( $opener ) ) !== $opener ||
substr( $contents, -strlen( $closer ) ) !== $closer
);
/*
* What should happen if the given contents are not surrounded by
* the exact literal script tags? This question opens up a can of
* worms with no obvious or clear answer. Removing one of the tags
* could lead to just as much trouble as leaving one in place.
* This code leaves the string untouched if it can't find both the
* exact tags. Maybe someone added an attribute to the opening tag
* or maybe someone added a space; it's impossible to know from
* here.
*
* It would be possible to return `null` or `false` here to indicate
* the failure, but people probably wouldn't be checking the result
* and that could introduce corruption. An empty string could be
* another viable return and that would quickly signal that something
* is wrong and needs fixing.
*/
return $has_both_empty_tags
? substr( $contents, strlen( $opener ), -strlen( $closer ) )
: $contents;
}
Maybe it's best to return an empty string here if the syntax isn't a perfect match. Is this what you were asking for in "add comments explaining the conditions"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I meant more the conditions for $has_both_empty_tags
as I wasn't understanding exactly they were doing. But it seems that's because the logic wasn't quite right. Also seems worthwhile to normalize the prefix and suffix to upper case. So I believe the condition should actually be:
$has_both_empty_tags = (
strlen( $contents ) > strlen( $opener ) + strlen( $closer ) &&
strtolower( substr( $contents, 0, strlen( $opener ) ) ) === $opener &&
strtolower( substr( $contents, -strlen( $closer ) ) ) === $closer
);
It has both empty tags if:
- The entire string is longer than the opener and closer, and
- The string starts with
$opener
, and - The string ends with
$closer
When those conditions are satisfied, then substr( $contents, strlen( $opener ), -strlen( $closer ) )
should be done.
In testing, I see that the first line of the function should also be:
$contents = trim( $contents );
In regards to the comment comment before:
return $has_both_empty_tags
? substr( $contents, strlen( $opener ), -strlen( $closer ) )
: $contents;
What do you think about actually just doing:
if ( ! $has_both_empty_tags ) {
_doing_it_wrong( __FUNCTION__, '6.4', __( 'Expected string to begin with an empty script tag and close with a script tag.' ) );
return '';
}
I'll note that this is actually the reverse of what wp_add_inline_script()
does:
wordpress-develop/src/wp-includes/functions.wp-scripts.php
Lines 133 to 145 in 6fa2ce4
if ( false !== stripos( $data, '</script>' ) ) { | |
_doing_it_wrong( | |
__FUNCTION__, | |
sprintf( | |
/* translators: 1: <script>, 2: wp_add_inline_script() */ | |
__( 'Do not pass %1$s tags to %2$s.' ), | |
'<code><script></code>', | |
'<code>wp_add_inline_script()</code>' | |
), | |
'4.5.0' | |
); | |
$data = trim( preg_replace( '#<script[^>]*>(.*)</script>#is', '$1', $data ) ); | |
} |
Note also how it does the string replacement, so somewhat the naive approach which I originally committed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented this in #5301
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about actually just doing:
Seems great. There's never a case to call this without the SCRIPT tags, so that makes sense to me. Make it visible as early as possible.
seems worthwhile to normalize the prefix and suffix to upper case
I have no strong opinion on this. Seems fine. The one thing is that <sCRiPt>
is not exactly a string literal match, but that's fine. Whatever happens here I think it will work out, and if something goes awry, with the help of the _doing_it_wrong()
someone will figure it out quickly enough.
<?php | ||
wp_print_inline_script_tag( | ||
// language=JavaScript | ||
sprintf( 'var _wpWidgetCustomizerPreviewSettings = %s;', wp_json_encode( $settings ) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
presumably because of the // language=JavaScript
comment, which shouldn't be there since this is still PHP?
|
||
// Ensure markup is XHTML compatible if not HTML5. | ||
if ( ! $is_html5 ) { | ||
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something as bizarre as this could definitely use a link to an XML spec or the source of the reason why these have to be escaped. someone is going to look at that and "fix" it and "improve quality" by removing the escaping 🙃
@@ -3136,7 +3136,7 @@ public function test_remove_frameless_preview_messenger_channel() { | |||
ob_start(); | |||
$manager->remove_frameless_preview_messenger_channel(); | |||
$output = ob_get_clean(); | |||
$this->assertStringContainsString( '<script>', $output ); | |||
$this->assertStringContainsString( '<script', $output ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to know what this assertion is supposed to be testing. what's the point here? confirm that we have a SCRIPT
element in the output?
$processor = new WP_HTML_Tag_Processor( $output );
$this->assertTrue( $processor->next_tag( 'script' ), 'Failed to find expected SCRIPT element in output.' );
😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array( | ||
'id' => 'ms-isa-1-js-after', | ||
) | ||
); | ||
$this->assertSame( $expected, $output, 'Inline scripts in the "after" position, that are attached to a deferred main script, are failing to print/execute.' ); | ||
$this->assertEqualMarkup( $expected, $output, 'Inline scripts in the "after" position, that are attached to a deferred main script, are failing to print/execute.' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ all of these. great.
also, did you know that we're finally going to get a spec-compliant HTML5/DOM parser in PHP?
doesn't remove the need for the HTML API (mainly because of memory and performance and interface needs) but it will go a long way in our tests to eliminating false errors.
@@ -257,7 +260,7 @@ public function test_blocking_dependent_with_delayed_dependency( $strategy ) { | |||
wp_enqueue_script( 'main-script-a3', '/main-script-a3.js', array(), null, compact( 'strategy' ) ); | |||
wp_enqueue_script( 'dependent-script-a3', '/dependent-script-a3.js', array( 'main-script-a3' ), null ); | |||
$output = get_echo( 'wp_print_scripts' ); | |||
$expected = "<script type='text/javascript' src='/main-script-a3.js' id='main-script-a3-js' data-wp-strategy='{$strategy}'></script>"; | |||
$expected = str_replace( "'", '"', "<script type='text/javascript' src='/main-script-a3.js' id='main-script-a3-js' data-wp-strategy='{$strategy}'></script>" ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a assertEqualMarkup()
scenario?
if not, the HEREDOC might make it less quirky
$expected = <<<HTML
<script type="text/javascript" src="/main-script-a3.js" id="main-script-a3-js" data-wp-strategy="{$strategy}"></script>
HTML;
the initial and terminal newlines are stripped automatically, so this is a trimmed string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, well, it's actually checking for only one of the script tags of the two. But I can just add the other script so it is checking for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array( | ||
"/* <![CDATA[ */\n", | ||
"\n/* ]]> */", | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is worrisome because it conflates syntax and content. I guess it's probably low-risk in our tests
The scope here is now limited to the frontend (including Customizer preview) as well as the login screen (
wp-login.php
). Admin screens are not included since this would increase the scope significantly. Additionally, the core themes have not been updated sincewp_print_inline_script_tag()
was introduced in WP 5.7, and the last theme to include any custom scripts (Twenty Twenty-One) supports WordPress 5.3 and above. So to take advantage of the example Strict CSP plugin with core themes, you have to use one of these themes which don't manually construct scripts:Previously: 10up#58
See WordPress/gutenberg#54637 for syncing block change to Gutenberg.
Testing Instructions
Trac ticket: https://core.trac.wordpress.org/ticket/58664
Commit Message
Script Loader: Use
wp_get_script_tag()
andwp_get_inline_script_tag()
/wp_print_inline_script_tag()
helper functions to output scripts on frontend and login screen.Using script tag helper functions allows plugins to employ the
wp_script_attributes
andwp_inline_script_attributes
filters to inject thenonce
attribute to apply Content Security Policy (e.g. Strict CSP). Use of helper functions also simplifies logic inWP_Scripts
.wp_get_inline_script_tag()
to wrap inline script in CDATA blocks for XHTML-compatibility.type
attribute is printed first inwp_get_inline_script_tag()
for back-compat.<script>
tags in output buffering to retain IDE supports.wp_get_inline_script_tag()
, append the newline to$javascript
before it is passed into thewp_inline_script_attributes
filter so that the CSP hash can be computed properly.the_block_template_skip_link()
, opt to enqueue the inline script rather than print it.ext-php
tocomposer.json
undersuggest
as previously it was an undeclared dependency for running PHPUnit tests.DOMDocument
to compare script markup, normalizing unsemantic differences.Props westonruter, spacedmonkey, flixos90, 10upsimon, dmsnell, mukesh27, joemcgill, swissspidy, azaozz.
Fixes #58664.
See #39941.
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.