Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid URL protocol for local development #7508

Merged
merged 20 commits into from
Apr 6, 2023

Conversation

thelovekesh
Copy link
Collaborator

@thelovekesh thelovekesh commented Mar 31, 2023

Summary

Fixes #5402

Checklist

  • My code is tested and passes existing tests.
  • My code follows the Engineering Guidelines (updates are often made to the guidelines, check it out periodically).

// Mark the `Invalid URL protocol http:` for attribute src, action and href as being in dev mode.
$dev_mode_xpaths[] = sprintf(
/* translators: %1$s is the home URL */
'//*[contains(@src, "%1$s") or contains(@action, "%1$s") or contains(@href, "%1$s")][starts-with(@src, "http:") or starts-with(@action, "http:") or starts-with(@href, "http:")]',
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Should we include href attribute as well?

Moreover, please note that the included elements are those that have attributes with the same origin URLs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The href attribute would have to be included, yes, as it needs to handle the case of the Web App Manifest link.

However, I think it should only include the starts-with() predicates and not contains().

Nevertheless, there won't address the case where there is a protocol-relative link. For example:

<link rel="web-app-manifest" href="//localhost/sw.js">

Also, it wouldn't handle a relative path:

<link rel="web-app-manifest" href="/sw.js">

Therefore, I'm wondering if marking with dev mode is going to work reliably.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, the problematic logic in the sanitizer is the following:

} elseif ( isset( $attr_spec_rule[ AMP_Rule_Spec::VALUE_URL ][ AMP_Rule_Spec::ALLOWED_PROTOCOL ] ) &&
AMP_Rule_Spec::FAIL === $this->check_attr_spec_rule_allowed_protocol( $node, $attr_name, $attr_spec_rule ) ) {
$attrs_to_remove[] = [ $attr_node, self::INVALID_URL_PROTOCOL, null ]; // @todo A javascript: protocol could be treated differently. It should have a JS error type.

I wonder, should the check_attr_spec_rule_allowed_protocol method be modified with a special case, to always return PASS if the normalized URL points to localhost? This should also involve introducing a new arg to that sanitizer, similar to prefer_bento, which is the switch for whether to enable this special case.

The special case arg can then be passed in here:

AMP_Tag_And_Attribute_Sanitizer::class => [
'prefer_bento' => amp_is_bento_enabled(),
],

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think it should only include the starts-with() predicates and not contains().

Then we will need some other workaround to filter out elements that have same-origin URLs in their attars. We still want to show an invalid URL protocol error if the URL is cross-origin... right?


Makes more sense to update the sanitizer to handle the case. But there are two more cases:

  1. Usage of localhost IP - 127.0.0.1
  2. Change in hostname using /etc/hosts mapping. // Should we even consider this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we will need some other workaround to filter out elements that have same-origin URLs in their attars. We still want to show an invalid URL protocol error if the URL is cross-origin... right?

What's an example of this attribute? Now that I think of it, srcset is one case in which there are multiple URLs in one attribute.

Makes more sense to update the sanitizer to handle the case. But there are two more cases:

What we should follow here is what browsers allow for bypassing the HTTPS security requirement for certain web platform features. I know this includes localhost and 127.0.0.1, but it is also true for subdomain.localhost. It would not be true for custom hostnames defined in /etc/hosts. See more at https://web.dev/when-to-use-local-https/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean a case something like this:

// Should consider development mode
<amp-autocomplete filter="substring"
    src="http://localhost/static/samples/json/amp-autocomplete-cities.json">
    <input>
 </amp-autocomplete>
// Maybe not consider it as development mode?
<amp-autocomplete filter="substring"
    src="http://example.com/static/samples/json/amp-autocomplete-cities.json">
    <input>
 </amp-autocomplete>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the exception should only be granted if:

  1. The page is being served from localhost, and
  2. The attribute is linking to localhost.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would not be true for custom hostnames defined in /etc/hosts. See more at https://web.dev/when-to-use-local-https/

Perfect. Thanks for sharing. Also, I see parse_url() returns the host as localhost while using 127.0.0.1 or [::1].

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter What about amp_is_dev_mode as new arg in AMP_Tag_And_Attribute_Sanitizer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something more specific, like allow_localhost_http_protocol

@thelovekesh thelovekesh force-pushed the fix/invalid-url-protocol-in-local-env branch from 7c66bd0 to e8c6261 Compare March 31, 2023 19:33
@@ -1590,7 +1600,8 @@ function amp_get_content_sanitizers( $post = null ) {
AMP_Accessibility_Sanitizer::class => [],
// Note: This validating sanitizer must come at the end to clean up any remaining issues the other sanitizers didn't catch.
AMP_Tag_And_Attribute_Sanitizer::class => [
'prefer_bento' => amp_is_bento_enabled(),
'prefer_bento' => amp_is_bento_enabled(),
'allow_localhost_http_protocol' => amp_is_dev_mode(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point, but since we check amp_is_dev_mode() below as well, might as well save the value to a variable and use it in both places.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, since there is no caching in amp_is_dev_mode().

@thelovekesh thelovekesh force-pushed the fix/invalid-url-protocol-in-local-env branch 5 times, most recently from 1625ab0 to a62334a Compare March 31, 2023 21:42
@thelovekesh thelovekesh force-pushed the fix/invalid-url-protocol-in-local-env branch from a62334a to 695ebb2 Compare March 31, 2023 22:10
@thelovekesh thelovekesh marked this pull request as ready for review March 31, 2023 22:43
@github-actions
Copy link
Contributor

github-actions bot commented Mar 31, 2023

Plugin builds for 15655b6 are ready 🛎️!

@@ -144,6 +144,7 @@
"test:js:update-snapshots": "npm run test:js -- --updateSnapshot",
"test:js:watch": "npm run test:js -- --watch",
"test:php": "wp-env run phpunit 'env WP_PHPUNIT__TESTS_CONFIG=/wordpress-phpunit/wp-tests-config.php WORDPRESS_TABLE_PREFIX=wptests_ WP_TESTS_DIR=/var/www/wordpress-develop/tests/phpunit /var/www/html/wp-content/plugins/amp/vendor/bin/phpunit -c /var/www/html/wp-content/plugins/amp/phpunit.xml.dist $npm_config_args'",
"test:php:coverage": "wp-env run tests-wordpress 'env WP_TESTS_DIR=/var/www/wordpress-develop/tests/phpunit WP_PHPUNIT__TESTS_CONFIG=/wordpress-phpunit/wp-tests-config.php /var/www/html/wp-content/plugins/amp/vendor/bin/phpunit -c /var/www/html/wp-content/plugins/amp/phpunit.xml.dist --coverage-html /var/www/html/wp-content/plugins/amp/coverage'",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add command for generating PHP tests coverage using wp-env.

@westonruter
Copy link
Member

I tried this out with the bultin dev environment and I got as expected with the PWA plugin activated:

image

The source code included:

<link rel="manifest" href="http://localhost:8888/index.php?rest_route=/wp/v2/web-app-manifest">

Nevertheless, there's one thing which seems to be missing here, and that is the data-ampdevmode attribute on this element which would bypass the error from being reported by the AMP Validator extension in the first place. I realize there is not really anything that can be done about this using the implementation approach we worked out. Otherwise, we'd need to go back to the original approach which I suppose would involve extending AMP_Dev_Mode_Sanitizer to look for all potential elements that have href and src attributes and if they would possibly violate the protocol constraint. But this would end up being more expensive or else not be as robust.

But one thing that could be done is to at least include //link[ @rel = "web-app-manifest" ] among the elements that are included among the dev-mode exemptions. This could be done in AMP_PWA_Script_Sanitizer. But perhaps better in amp_get_content_sanitizers() to push the XPath onto $dev_mode_xpaths if ! is_ssl() && 'localhost' === $host.

@westonruter westonruter added this to the v2.4.2 milestone Apr 6, 2023
@westonruter westonruter merged commit 9531acb into develop Apr 6, 2023
@westonruter westonruter deleted the fix/invalid-url-protocol-in-local-env branch April 6, 2023 16:51
@thelovekesh thelovekesh added Bug Something isn't working Enhancement New feature or improvement of an existing one P2 Low priority Reader Mode WS:Core Work stream for Plugin core and removed Bug Something isn't working Enhancement New feature or improvement of an existing one P2 Low priority Reader Mode WS:Core Work stream for Plugin core labels Jul 13, 2023
@westonruter westonruter added the Changelogged Whether the issue/PR has been added to release notes. label Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelogged Whether the issue/PR has been added to release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[amp-autocomplete ] Fix invalid URL protocol for local development
2 participants